How to resolve ResourceExhaustedError in Google Colab

[Question title]: How to resolve ResourceExhaustedError in Google Colab
[Posted]: 2021-07-01 00:13:07
[Question]:

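I am using a RoBERTa question-answering model on Google Colab for the tweet sentiment extraction problem.

The model cannot train because I get a ResourceExhaustedError.

Here is the full error: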

ResourceExhaustedError:  OOM when allocating tensor with shape[32,16,128,64] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
     [[node model/tf_roberta_model/roberta/encoder/layer_._17/attention/self/transpose (defined at /usr/local/lib/python3.7/dist-packages/transformers/models/roberta/modeling_tf_roberta.py:218) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
 [Op:__inference_train_function_112984]...
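
As a rough aid to reading the message above, the sketch below computes the size of the one tensor that failed to allocate. It assumes that "float" means 4-byte float32 and that the shape reads as (batch size, attention heads, sequence length, per-head dimension), which fits roberta-large (16 heads of size 64). `tensor_bytes` is a hypothetical helper written for this illustration, not part of TensorFlow.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Return the memory needed for a dense tensor of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


# Shape copied from the error message; float32 (4 bytes) is an assumption.
failed = tensor_bytes((32, 16, 128, 64))
print(failed / 2**20)  # size in MiB: 16.0

# Halving the batch dimension halves this tensor (and every other activation).
halved = tensor_bytes((16, 16, 128, 64))
print(halved / 2**20)  # size in MiB: 8.0
```

A single 16 MiB tensor is small, so the failure most likely means the GPU was already almost full of other activations and weights when layer 17 asked for more. Since activation memory scales with the first dimension, a smaller batch is the most direct lever.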

The model is defined as follows:

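import tensorflow as tf
from tensorflow.keras.layers import Input, Dropout, Conv1D, Flatten, Activation
from tensorflow.keras.models import Model
from transformers import TFRobertaModel

# Assumed value: not shown in the original post, inferred from the
# sequence dimension (128) of the tensor shape in the error message.
MAX_LEN = 128

# Token ids and attention mask, one row per tweet.
ids = Input((MAX_LEN,), dtype=tf.int32)
att = Input((MAX_LEN,), dtype=tf.int32)

bert_model = TFRobertaModel.from_pretrained('roberta-large')

# x[0] is the last hidden state: (batch, MAX_LEN, hidden_size).
x = bert_model(ids, attention_mask=att)

# Head 1: softmax over token positions for the span start.
x1 = Dropout(0.1)(x[0])
x1 = Conv1D(1, 1)(x1)
x1 = Flatten()(x1)
x1 = Activation('softmax')(x1)

# Head 2: softmax over token positions for the span end.
x2 = Dropout(0.1)(x[0])
x2 = Conv1D(1, 1)(x2)
x2 = Flatten()(x2)
x2 = Activation('softmax')(x2)

model = Model(inputs=[ids, att], outputs=[x1, x2])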

Any help resolving this error would be appreciated.

[Comments]:

Tags: python tensorflow deep-learning gpu google-colaboratory

[Solution 1]:

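In my experience, you can use the gradient accumulation technique. Alternatively, if you can manage to use Google Colab Pro, that is an even better option. According to the documentation:

With Colab Pro you get priority access to high-memory VMs. These VMs generally have twice the memory and twice the CPUs of standard Colab VMs. Once subscribed, you can access a notebook setting to enable high-memory VMs. In addition, Colab may sometimes assign you a high-memory VM automatically when it detects that you are likely to need one. Colab Pro VMs also generally come with twice the disk of standard Colab VMs. However, resources are not guaranteed, and there are usage limits on high-memory VMs.

In the free version of Colab, the high-memory preference is not available, and users are rarely assigned a high-memory VM automatically.

These Transformer models need a lot of memory, so Colab Pro is very convenient. You can also use the TPU accelerator that Colab provides, but note that it is much slower than the Kaggle TPU.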
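
The gradient accumulation idea mentioned at the start of this answer can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the asker's training code: `accumulated_train_step` is a hypothetical helper, the model is a toy single-input, single-output stand-in for the RoBERTa model, and the batch size is assumed to be divisible by `accum_steps`. The point is that each forward/backward pass only sees a small micro-batch, while the optimizer update matches one large batch.

```python
import tensorflow as tf


def accumulated_train_step(model, optimizer, loss_fn, features, labels, accum_steps):
    """Apply one optimizer update computed from `accum_steps` micro-batches."""
    variables = model.trainable_variables
    # Running sum of gradients, one buffer per trainable variable.
    accumulated = [tf.zeros(v.shape, dtype=v.dtype) for v in variables]
    total_loss = 0.0
    # tf.split requires the batch size to be divisible by accum_steps.
    for x, y in zip(tf.split(features, accum_steps), tf.split(labels, accum_steps)):
        with tf.GradientTape() as tape:
            # Divide by accum_steps so the summed gradients equal the
            # gradient of the mean loss over the full batch.
            loss = loss_fn(y, model(x, training=True)) / accum_steps
        grads = tape.gradient(loss, variables)
        # Variables that do not affect the loss yield None; skip those.
        accumulated = [
            a if g is None else a + tf.convert_to_tensor(g)
            for a, g in zip(accumulated, grads)
        ]
        total_loss += float(loss)
    # A single weight update for the whole effective batch.
    optimizer.apply_gradients(zip(accumulated, variables))
    return total_loss


# Toy stand-in for the real model so the sketch runs without a GPU.
model = tf.keras.Sequential([tf.keras.Input(shape=(4,)), tf.keras.layers.Dense(1)])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)
loss_fn = tf.keras.losses.MeanSquaredError()
features = tf.random.normal((32, 4))
labels = tf.random.normal((32, 1))

# Effective batch of 32, but only 8 examples go through each forward/backward pass.
loss = accumulated_train_step(model, optimizer, loss_fn, features, labels, accum_steps=4)
print(f"loss: {loss:.4f}")
```

For the model in the question, the same loop would need to split both inputs (`ids`, `att`) and sum the losses of the two outputs; that adaptation is left out here to keep the sketch short.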

[Comments]: