I am using a HuggingFace model for a TokenClassification task. I have the following label-to-ID mapping, and I am on version 3.3.0 of the library.
label2id = {
"B-ADD": 4,
"B-ARRESTED": 7,
"B-CRIME": 2,
"B-INCIDENT_DATE": 3,
"B-SUSPECT": 9,
"B-VICTIMS": 1,
"B-WPN": 5,
"I-ADD": 8,
"I-ARRESTED": 13,
"I-CRIME": 11,
"I-INCIDENT_DATE": 10,
"I-SUSPECT": 14,
"I-VICTIMS": 12,
"I-WPN": 6,
"O": 0
}
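(For reference, the 15 IDs are contiguous from 0 to 14, so the classification head needs exactly 15 outputs — a quick sanity check:)

```python
label2id = {
    "B-ADD": 4, "B-ARRESTED": 7, "B-CRIME": 2, "B-INCIDENT_DATE": 3,
    "B-SUSPECT": 9, "B-VICTIMS": 1, "B-WPN": 5, "I-ADD": 8,
    "I-ARRESTED": 13, "I-CRIME": 11, "I-INCIDENT_DATE": 10,
    "I-SUSPECT": 14, "I-VICTIMS": 12, "I-WPN": 6, "O": 0,
}
# IDs cover 0..14 with no gaps or duplicates
assert sorted(label2id.values()) == list(range(len(label2id)))
print(len(label2id))  # 15
```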
The following scenario works fine and the model loads correctly:
from transformers import AutoModelForTokenClassification, AutoTokenizer, AutoConfig
pretrained_model_name = "bert-base-cased"
config = AutoConfig.from_pretrained(pretrained_model_name)
id2label = {y:x for x,y in label2id.items()}
config.label2id = label2id
config.id2label = id2label
config._num_labels = len(label2id)
model = AutoModelForTokenClassification.from_pretrained(pretrained_model_name, config=config)
model
I get the output below. The final layer is correctly initialized with 15 neurons (the number of token classes to predict).
.....................
(dropout): Dropout(p=0.1, inplace=False)
(classifier): Linear(in_features=768, out_features=15, bias=True)
)
But if I change pretrained_model_name to "dbmdz/bert-large-cased-finetuned-conll03-english", I get the following error:
loading weights file https://cdn.huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english/pytorch_model.bin from cache at C:\Users\anu10961/.cache\torch\transformers\4b02c1fe04cf7f7e6972536150e9fb329c7b3d5720b82afdac509bd750c705d2.6dcb154688bb97608a563afbf68ba07ae6f7beafd9bd98b5a043cd269fcc02b4
All model checkpoint weights were used when initializing BertForTokenClassification.
All the weights of BertForTokenClassification were initialized from the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english.
If your task is similar to the task the model of the checkpoint was trained on, you can already use BertForTokenClassification for predictions without further training.
RuntimeError Traceback (most recent call last)
<ipython-input-15-2969a8092bf4> in <module>
C:\ProgramData\Anaconda3\envs\arcgis183\lib\site-packages\transformers\modeling_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
1372 if type(config) in MODEL_FOR_TOKEN_CLASSIFICATION_MAPPING.keys():
1373 return MODEL_FOR_TOKEN_CLASSIFICATION_MAPPING[type(config)].from_pretrained(
-> 1374 pretrained_model_name_or_path, *model_args, config=config, **kwargs
1375 )
1376
C:\ProgramData\Anaconda3\envs\arcgis183\lib\site-packages\transformers\modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
1047 raise RuntimeError(
1048 "Error(s) in loading state_dict for {}:\n\t{}".format(
-> 1049 model.__class__.__name__, "\n\t".join(error_msgs)
1050 )
1051 )
RuntimeError: Error(s) in loading state_dict for BertForTokenClassification:
size mismatch for classifier.weight: copying a param with shape torch.Size([9, 1024]) from checkpoint, the shape in current model is torch.Size([15, 1024]).
size mismatch for classifier.bias: copying a param with shape torch.Size([9]) from checkpoint, the shape in current model is torch.Size([15]).
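(As I understand it, the error comes from copying the checkpoint's 9-output classifier weights into my 15-output layer. A minimal sketch in plain PyTorch, with made-up zero tensors standing in for the checkpoint weights, reproduces the same failure:)

```python
import torch

# stand-in for the CoNLL checkpoint's classifier head: 9 labels, hidden size 1024
checkpoint_state = {"weight": torch.zeros(9, 1024), "bias": torch.zeros(9)}

# my model's classifier head expects 15 labels
new_head = torch.nn.Linear(in_features=1024, out_features=15)

try:
    new_head.load_state_dict(checkpoint_state)
except RuntimeError as e:
    print(e)  # size mismatch for weight / bias, just like the traceback above
```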
The only difference I can see is that the dbmdz/bert-large-cased-finetuned-conll03-english model has already been fine-tuned on a token classification task, and its model config contains the following label2id mapping:
label2id = {
"B-LOC": 7,
"B-MISC": 1,
"B-ORG": 5,
"B-PER": 3,
"I-LOC": 8,
"I-MISC": 2,
"I-ORG": 6,
"I-PER": 4,
"O": 0
}
But I still think it should be possible to replace this model's final layer and use it for my specific task (although I would need to train the model before using it for inference).
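One workaround I am considering is to load the checkpoint with its own 9-label config and then swap the classifier head afterwards. The sketch below uses a tiny locally-built BertConfig as a stand-in for the downloaded checkpoint (so it runs offline); in practice the model would come from from_pretrained("dbmdz/bert-large-cased-finetuned-conll03-english"), and the hidden_size/layer numbers here are arbitrary small values, not the real ones:

```python
import torch
from transformers import BertConfig, BertForTokenClassification

num_new_labels = 15  # size of my label2id mapping

# stand-in for the downloaded 9-label CoNLL checkpoint
conll_config = BertConfig(hidden_size=64, num_hidden_layers=1,
                          num_attention_heads=2, intermediate_size=128,
                          num_labels=9)
model = BertForTokenClassification(conll_config)

# replace the 9-way CoNLL head with a freshly initialized 15-way head
model.classifier = torch.nn.Linear(model.config.hidden_size, num_new_labels)
model.num_labels = num_new_labels
model.config.num_labels = num_new_labels  # keep the config in sync
```

The encoder weights are kept from the checkpoint; only the new classifier layer starts from random initialization, which is why the model would still need fine-tuning on my data before inference.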