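【Posted】:2021-10-26 22:07:40
【Question】:
I'm a bit confused about how to use the output of huggingface transformers to train a simple binary classifier that predicts whether Albert Einstein said a given sentence.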
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

sentences = ["Hello World", "Hello There", "Bye Bye", "Two things are infinite: the universe and human stupidity; and I'm not sure about the universe."]

for sentence in sentences:
    # Tokenize one sentence at a time; the result holds tensors with batch size 1.
    encoded = tokenizer(sentence, return_tensors="pt")
    outputs = model(**encoded)
    # outputs[0] is the last hidden state: [batch, num_tokens, hidden_size].
    print(outputs[0].shape, sentence, len(sentence))
Output:
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.transform.LayerNorm.weight', 'cls.predictions.bias', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.decoder.weight', 'cls.predictions.transform.dense.bias']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
torch.Size([1, 4, 768]) Hello World 11
torch.Size([1, 4, 768]) Hello There 11
torch.Size([1, 4, 768]) Bye Bye 7
torch.Size([1, 23, 768]) Two things are infinite: the universe and human stupidity; and I'm not sure about the universe. 95
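As you can see, the size of the output depends on the length of the input: the second dimension is the number of tokens in the sentence (4 for the short inputs, 23 for the quote). Now suppose I want to train a binary classifier that predicts whether Einstein said the input sentence, and the input to that network is the output of the BERT transformer.
How do I write a CNN model in pytorch that accepts a tensor of shape [1, None, 768]? The second dimension seems to change with the length of the input.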
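One possible way to handle the variable second dimension, shown here only as a minimal sketch: apply a 1D convolution along the token axis and then pool globally over that axis, so the classifier head always sees a fixed-size vector. The class name `QuoteCNN`, the filter count, and the kernel size are assumptions made for illustration, and random tensors stand in for the BERT output so the snippet runs without downloading a model.

```python
import torch
import torch.nn as nn


class QuoteCNN(nn.Module):
    """Hypothetical binary classifier over BERT token embeddings."""

    def __init__(self, hidden_size=768, num_filters=64, kernel_size=3):
        super().__init__()
        # Padding keeps the sequence length unchanged, so short inputs still work.
        self.conv = nn.Conv1d(hidden_size, num_filters, kernel_size,
                              padding=kernel_size // 2)
        # Global max pooling collapses the variable-length axis to size 1.
        self.pool = nn.AdaptiveMaxPool1d(1)
        self.fc = nn.Linear(num_filters, 1)

    def forward(self, embeddings):
        # embeddings: [batch, seq_len, hidden_size]
        x = embeddings.transpose(1, 2)   # [batch, hidden_size, seq_len]
        x = torch.relu(self.conv(x))     # [batch, num_filters, seq_len]
        x = self.pool(x).squeeze(-1)     # [batch, num_filters]
        return self.fc(x).squeeze(-1)    # [batch], one logit per sentence


model = QuoteCNN()
model.eval()
with torch.no_grad():
    # Stand-ins for outputs[0] of a 4-token and a 23-token sentence.
    short_logit = model(torch.randn(1, 4, 768))
    long_logit = model(torch.randn(1, 23, 768))
print(short_logit.shape, long_logit.shape)
```

The logits could be trained with `nn.BCEWithLogitsLoss`. To batch several sentences at once, the tokenizer can pad them to a common length with `padding=True`; in that case the padded positions would probably need to be masked before pooling, which this sketch does not do.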
【Discussion】:
Tags: python pytorch huggingface-transformers