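【Posted】: 2021-12-08 13:23:46
【Question】:
I want to use sentence_transformers,
but due to policy restrictions I cannot install the sentence-transformers package.
I do, however, have the transformers and torch packages.
I went to this page and tried to run the code below.
Before that, I went to the page and downloaded all of the files.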
import os
path="/yz/sentence-transformers/multi-qa-mpnet-base-dot-v1/" #local path where I have stored files
os.listdir(path)
['.dominokeep',
'config.json',
'data_config.json',
'modules.json',
'sentence_bert_config.json',
'special_tokens_map.json',
'tokenizer_config.json',
'train_script.py',
'vocab.txt',
'tokenizer.json',
'config_sentence_transformers.json',
'README.md',
'gitattributes',
'9e1e76b7a067f72e49c7f571cd8e811f7a1567bec49f17e5eaaea899e7bc2c9e']
The code I ran is:
from transformers import AutoTokenizer, AutoModel
import torch
# Load model from HuggingFace Hub
path="/yz/sentence-transformers/multi-qa-mpnet-base-dot-v1/"
"""tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/multi-qa-mpnet-base-dot-v1")
model = AutoModel.from_pretrained("sentence-transformers/multi-qa-mpnet-base-dot-v1")"""
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModel.from_pretrained(path)
The error I get is:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-18-bb33f7c519e0> in <module>
32 model = AutoModel.from_pretrained("sentence-transformers/multi-qa-mpnet-base-dot-v1")"""
33
---> 34 tokenizer = AutoTokenizer.from_pretrained(path)
35 model = AutoModel.from_pretrained(path)
36
/usr/local/anaconda/lib/python3.6/site-packages/transformers/models/auto/tokenization_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
308 config = kwargs.pop("config", None)
309 if not isinstance(config, PretrainedConfig):
--> 310 config = AutoConfig.from_pretrained(pretrained_model_name_or_path, **kwargs)
311
312 if "bert-base-japanese" in str(pretrained_model_name_or_path):
/usr/local/anaconda/lib/python3.6/site-packages/transformers/models/auto/configuration_auto.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
342
343 if "model_type" in config_dict:
--> 344 config_class = CONFIG_MAPPING[config_dict["model_type"]]
345 return config_class.from_dict(config_dict, **kwargs)
346 else:
KeyError: 'mpnet'
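The KeyError comes from the `CONFIG_MAPPING[config_dict["model_type"]]` lookup shown in the traceback: the `model_type` stored in the downloaded `config.json` has no entry in the mapping of the installed transformers. A minimal sketch of checking this before calling `from_pretrained` is given here. The helper names `read_model_type` and `is_supported` are hypothetical, and the collection of known types is supplied by the caller rather than read from the library.

```python
import json
import os


def read_model_type(model_dir):
    """Return the "model_type" entry of <model_dir>/config.json, or None if it is missing."""
    config_path = os.path.join(model_dir, "config.json")
    with open(config_path, encoding="utf-8") as f:
        return json.load(f).get("model_type")


def is_supported(model_type, known_types):
    """True if model_type is one of the types the installed library knows about."""
    return model_type is not None and model_type in known_types
```

For instance, `is_supported(read_model_type(path), CONFIG_MAPPING)`, with `CONFIG_MAPPING` imported from `transformers.models.auto.configuration_auto` (the module named in the traceback), would presumably return `False` for this model under transformers 4.0.0.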
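My questions:
- How can I resolve this error?
- Is there a way to use the same approach with MiniLM-L6-H384-uncased? I would like to use it because it appears to be faster.
=============================== Package versions:
transformers - 4.0.0
torch - 1.4.0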
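Because the outcome depends on the installed versions, a small version check can make the requirement explicit. The sketch below compares dotted version strings using only the standard library. The helper names are hypothetical, and the 4.1.0 threshold for MPNet is taken from the discussion below, not verified here.

```python
def parse_version(version):
    """Turn a dotted version such as "4.1.0" into a tuple of ints, e.g. (4, 1, 0).

    A local suffix such as "+cu101" is dropped, and parsing stops at the first
    non-numeric component, so "4.1.0.dev0" gives (4, 1, 0).
    """
    parts = []
    for piece in version.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, required):
    """True if the installed version is at least the required one."""
    return parse_version(installed) >= parse_version(required)


# Threshold as stated in the discussion (an assumption, not checked against release notes).
MPNET_MIN_TRANSFORMERS = "4.1.0"
print(meets_minimum("4.0.0", MPNET_MIN_TRANSFORMERS))  # False
```

In practice the first argument would come from `transformers.__version__`.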
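【Comments】:
-
I will share my transformers version shortly. Were you able to get MiniLM-L6-H384-uncased to work?
-
The package versions are transformers 4.0.0 and torch 1.4.0. Which version of transformers are you using?
-
MPNet was added in transformers 4.1.0. Can you upgrade your package? I haven't tried it, but MiniLM-L6-H384-uncased appears to be a BERT model, so you should be able to load it with 4.0.0.
-
Could you try MiniLM-L6-H384-uncased? I am running into problems with it. I probably cannot update my packages, so MiniLM-L6-H384-uncased seems to be my only option. I don't remember exactly, but I think I only got the tokenizer working for it; model = AutoModel.from_pretrained(path) failed :(
-
You are right, you get an error message because pytorch-model.bin was created with a newer version:
RuntimeError: version_ <= kMaxSupportedFileFormatVersion INTERNAL ASSERT FAILED at /pytorch/caffe2/serialize/inline_container.cc:132, please report a bug to PyTorch. Attempted to read a PyTorch file with version 3, but the maximum supported version for reading is 2. Your PyTorch installation may be too old. (init at /pytorch/caffe2/serialize/inline_container.cc:132)
Perhaps you can check whether someone has written a conversion script.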
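A conversion of the kind suggested in the last comment could look roughly like the following sketch. It assumes access to a second machine with PyTorch 1.6 or later, where the checkpoint can be loaded and re-saved in the older, non-zip format by passing `_use_new_zipfile_serialization=False` to `torch.save`. The function name and file paths are hypothetical, and whether the re-saved file then loads under torch 1.4.0 with transformers 4.0.0 has not been tested here.

```python
import torch


def resave_legacy(src_path, dst_path):
    """Load a checkpoint and write it back in the legacy (pre-1.6) serialization format."""
    # map_location="cpu" avoids needing a GPU on the conversion machine.
    state_dict = torch.load(src_path, map_location="cpu")
    # The zip-based format became the default in PyTorch 1.6; this flag turns it off.
    torch.save(state_dict, dst_path, _use_new_zipfile_serialization=False)


# Hypothetical usage on the machine with the newer PyTorch:
# resave_legacy("pytorch_model.bin", "pytorch_model_legacy.bin")
```

The re-saved file would then replace the weights file in the local model directory on the restricted machine.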
Tags: nlp huggingface-transformers transformer sentence-similarity sentence-transformers