Change the config and load a Hugging Face model for fine-tuning on a downstream task

  •  0
  •  Anurag Sharma  ·  4 years ago

    I am using a HuggingFace model for a TokenClassification task and have the following label-to-id mapping. I am using version 3.3.0 of the library.

    label2id = {
        "B-ADD": 4,
        "B-ARRESTED": 7,
        "B-CRIME": 2,
        "B-INCIDENT_DATE": 3,
        "B-SUSPECT": 9,
        "B-VICTIMS": 1,
        "B-WPN": 5,
        "I-ADD": 8,
        "I-ARRESTED": 13,
        "I-CRIME": 11,
        "I-INCIDENT_DATE": 10,
        "I-SUSPECT": 14,
        "I-VICTIMS": 12,
        "I-WPN": 6,
        "O": 0
      }
    

    The following scenario works well and the model loads correctly:

    from transformers import AutoModelForTokenClassification, AutoTokenizer, AutoConfig
    
    pretrained_model_name = "bert-base-cased"
    config = AutoConfig.from_pretrained(pretrained_model_name)
    
    # build the reverse id-to-label mapping and attach the label set to the config
    id2label = {y: x for x, y in label2id.items()}
    config.label2id = label2id
    config.id2label = id2label
    config._num_labels = len(label2id)
    
    model = AutoModelForTokenClassification.from_pretrained(pretrained_model_name, config=config)
    
    model
    

    I get the following output. The last layer has been correctly initialized with 15 neurons (the number of token classes to predict).

    .....................
          (dropout): Dropout(p=0.1, inplace=False)
          (classifier): Linear(in_features=768, out_features=15, bias=True)
        )
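
    As a quick programmatic check of the same thing (an illustrative snippet, not part of the original post), the head size can be verified against the label set:

    # the classification head should have one output per label
    assert model.classifier.out_features == len(label2id)   # 15
    print(model.config.id2label[4])                          # 'B-ADD'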
    

    But if I change pretrained_model_name to "dbmdz/bert-large-cased-finetuned-conll03-english", I get the following error:

    loading weights file https://cdn.huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english/pytorch_model.bin from cache at C:\Users\anu10961/.cache\torch\transformers\4b02c1fe04cf7f7e6972536150e9fb329c7b3d5720b82afdac509bd750c705d2.6dcb154688bb97608a563afbf68ba07ae6f7beafd9bd98b5a043cd269fcc02b4
    All model checkpoint weights were used when initializing BertForTokenClassification.
    
    All the weights of BertForTokenClassification were initialized from the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english.
    If your task is similar to the task the model of the checkpoint was trained on, you can already use BertForTokenClassification for predictions without further training.
    
    ---------------------------------------------------------------------------
    RuntimeError                              Traceback (most recent call last)
    <ipython-input-15-2969a8092bf4> in <module>
    ----> 1 model = AutoModelForTokenClassification.from_pretrained(pretrained_model_name, config=config)
    
    C:\ProgramData\Anaconda3\envs\arcgis183\lib\site-packages\transformers\modeling_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
       1372         if type(config) in MODEL_FOR_TOKEN_CLASSIFICATION_MAPPING.keys():
       1373             return MODEL_FOR_TOKEN_CLASSIFICATION_MAPPING[type(config)].from_pretrained(
    -> 1374                 pretrained_model_name_or_path, *model_args, config=config, **kwargs
       1375             )
       1376 
    
    C:\ProgramData\Anaconda3\envs\arcgis183\lib\site-packages\transformers\modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
       1047                 raise RuntimeError(
       1048                     "Error(s) in loading state_dict for {}:\n\t{}".format(
    -> 1049                         model.__class__.__name__, "\n\t".join(error_msgs)
       1050                     )
       1051                 )
    
    RuntimeError: Error(s) in loading state_dict for BertForTokenClassification:
        size mismatch for classifier.weight: copying a param with shape torch.Size([9, 1024]) from checkpoint, the shape in current model is torch.Size([15, 1024]).
        size mismatch for classifier.bias: copying a param with shape torch.Size([9]) from checkpoint, the shape in current model is torch.Size([15]).
    

    The only difference I can see is that the dbmdz/bert-large-cased-finetuned-conll03-english model has already been fine-tuned on a token classification task, and its model config has the following label2id mapping:

    label2id = {
        "B-LOC": 7,
        "B-MISC": 1,
        "B-ORG": 5,
        "B-PER": 3,
        "I-LOC": 8,
        "I-MISC": 2,
        "I-ORG": 6,
        "I-PER": 4,
        "O": 0
      }
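
    For reference, this mapping can be read straight from the checkpoint's config on the Hub (an illustrative snippet, not part of the original post):

    from transformers import AutoConfig

    # inspect the checkpoint's own label set before loading the weights
    conll_config = AutoConfig.from_pretrained("dbmdz/bert-large-cased-finetuned-conll03-english")
    print(conll_config.num_labels)   # 9
    print(conll_config.label2id)     # the CoNLL-2003 mapping shown above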
    

    But I still think it should be possible to change the last layer of this model and use it for my specific task (although I would need to train the model before using it for inference).

    0 replies  |  4 years ago
        1
  •  4
  •   Jindřich    4 years ago

    You cannot change the hyperparameters of a part of the model once it is in the saved pretrained checkpoint. By setting the pretrained model and the config together, you are saying that you want a model that classifies into 15 classes and that you want to initialize it with a model that uses 9 classes; that does not work.

    If I understand correctly, you want to initialize the underlying BERT from a different classifier. A solution that would work is to:

    1. load only the underlying BERT, without the classification layer;
    2. initialize a classification model from scratch;
    3. replace the randomly initialized BERT inside the new classifier with the pretrained one:
    from transformers import AutoModel, AutoModelForTokenClassification

    # load only the encoder weights, without the classification head
    bert = AutoModel.from_pretrained('dbmdz/bert-large-cased-finetuned-conll03-english')
    # build a fresh token-classification model from the config (randomly initialized)
    classifier = AutoModelForTokenClassification.from_config(config)
    # swap the random encoder for the pretrained one
    classifier.bert = bert
    
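    One caveat worth making explicit: for the assignment classifier.bert = bert to yield a consistent model, config must describe the same architecture as the checkpoint (bert-large, hidden size 1024), not bert-base-cased as in the question. A minimal end-to-end sketch under that assumption (the variable names and asserts are illustrative):

    from transformers import AutoConfig, AutoModel, AutoModelForTokenClassification

    checkpoint = 'dbmdz/bert-large-cased-finetuned-conll03-english'

    # derive the config from the same checkpoint, then override the label set
    config = AutoConfig.from_pretrained(checkpoint)
    config.num_labels = len(label2id)
    config.label2id = label2id
    config.id2label = {v: k for k, v in label2id.items()}

    bert = AutoModel.from_pretrained(checkpoint)                      # encoder only
    classifier = AutoModelForTokenClassification.from_config(config)  # fresh 15-way head
    classifier.bert = bert                                            # swap in the pretrained encoder

    # sanity checks: a 15-way head on top of the pretrained bert-large encoder
    assert classifier.classifier.out_features == 15
    assert classifier.config.hidden_size == 1024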
        2
  •  0
  •   Zubaer    2 years ago

    I believe the problem you are facing can be solved with the argument ignore_mismatched_sizes=True, as shown below:

    model = AutoModelForTokenClassification.from_pretrained(pretrained_model_name, config=config, ignore_mismatched_sizes=True)
    

    More information can be found here and here.

    I also checked your code in Colab and faced a similar problem; when I added ignore_mismatched_sizes=True, it solved the problem, I believe. In addition, the RuntimeError message now contains a new line (which I think was not there before):

    You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.
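
    For completeness, a minimal self-contained version of this approach (a sketch assuming a recent transformers 4.x release, since ignore_mismatched_sizes did not exist in the 3.3.0 release used in the question; the keyword arguments below replace the manual config edits):

    from transformers import AutoModelForTokenClassification

    checkpoint = "dbmdz/bert-large-cased-finetuned-conll03-english"

    # drop the mismatched 9-way CoNLL head and initialize a fresh 15-way head
    model = AutoModelForTokenClassification.from_pretrained(
        checkpoint,
        num_labels=len(label2id),
        label2id=label2id,
        id2label={v: k for k, v in label2id.items()},
        ignore_mismatched_sizes=True,
    )
    # the new head is randomly initialized, so fine-tune before running inference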
    

    I hope this solves the problem to some extent; I am rather new to transformers and still exploring the library.

    Thanks :)