
KeyError: 'eval_loss' in Hugging Face Trainer

  • Aaditya Ura  · Tech Community  · 2 years ago

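    I am trying to build a question-answering pipeline with the Hugging Face framework, but I keep hitting a KeyError: 'eval_loss' error. My goal is to train the model, save the best checkpoint at the end, and then evaluate the loaded model on the validation set. My trainer configuration is as follows: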

    args = TrainingArguments(f'model_training',
                          evaluation_strategy="epoch",
                          label_names = ["start_positions", "end_positions"],
                          logging_steps = 1,
                          learning_rate=2e-5,
                          num_train_epochs=epochs,
                          save_total_limit = 2,
                          load_best_model_at_end=True,
                          save_strategy="epoch",
                          logging_strategy="epoch",
                          report_to="none",
                          weight_decay=0.01,
                          fp16=True,
                          push_to_hub=False)
    

    During training, the following error occurs:

    Traceback (most recent call last):
      File "qa_pipe.py", line 286, in <module>
        pipe.training(train_d, val_d, epochs = 2)
      File "qa_pipe.py", line 263, in training
        self.trainer.train()
      File "/home/admin/qa/lib/python3.7/site-packages/transformers/trainer.py", line 1505, in train
        ignore_keys_for_eval=ignore_keys_for_eval,
      File "/home/admin/qa/lib/python3.7/site-packages/transformers/trainer.py", line 1838, in _inner_training_loop
        self._maybe_log_save_evaluate(tr_loss, model, trial, epoch, ignore_keys_for_eval)
      File "/home/admin/qa/lib/python3.7/site-packages/transformers/trainer.py", line 2090, in _maybe_log_save_evaluate
        self._save_checkpoint(model, trial, metrics=metrics)
      File "/home/admin/qa/lib/python3.7/site-packages/transformers/trainer.py", line 2193, in _save_checkpoint
        metric_value = metrics[metric_to_check]
    KeyError: 'eval_loss'
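
The last frame hints at what is going on. With `load_best_model_at_end=True` and no `metric_for_best_model` set, `Trainer` compares checkpoints by `loss`, adds the `eval_` prefix, and looks that key up in the evaluation metrics. The sketch below is a simplified imitation of that lookup in plain Python, not the library's own code; `resolve_best_metric` and the sample metric dictionaries are hypothetical and exist only for illustration.

```python
from typing import Dict, Optional


def resolve_best_metric(metrics: Dict[str, float],
                        metric_for_best_model: Optional[str] = None) -> float:
    """Simplified imitation of the best-checkpoint metric lookup (hypothetical helper)."""
    # Assumption: when no metric is configured, "loss" is the default.
    name = metric_for_best_model or "loss"
    # The metric key is expected to carry the "eval_" prefix.
    if not name.startswith("eval_"):
        name = f"eval_{name}"
    # Plain dict lookup: a missing key raises KeyError, as in the traceback.
    return metrics[name]


# Made-up metrics for a run where evaluation produced a loss.
with_loss = {"eval_loss": 1.23, "eval_runtime": 4.5}
# Made-up metrics for a run where no loss was computed during evaluation.
without_loss = {"eval_runtime": 4.5}

print(resolve_best_metric(with_loss))
try:
    resolve_best_metric(without_loss)
except KeyError as err:
    print("KeyError:", err)
```

If this reading is right, the error means the evaluation step returned metrics without a loss, rather than that checkpoint saving itself is broken.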
    

    A minimal working example is available on Colab.

    How can I avoid this error and save the best model at the end?
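
One thing worth checking, assuming the validation set does not carry the label columns (an assumption; the post does not confirm it): a question-answering model only returns a loss when it receives both `start_positions` and `end_positions`, so an evaluation set without them would produce no `eval_loss`. Below is a small pre-flight check in plain Python. `missing_label_columns` is a hypothetical helper and the column lists are made up for illustration.

```python
from typing import Iterable, List

# Same values as label_names in the TrainingArguments above.
LABEL_NAMES = ["start_positions", "end_positions"]


def missing_label_columns(columns: Iterable[str],
                          label_names: Iterable[str] = LABEL_NAMES) -> List[str]:
    """Return the label columns that are absent from a dataset (hypothetical helper)."""
    present = set(columns)
    return [name for name in label_names if name not in present]


# Made-up column lists; a real run would read them from the datasets themselves.
train_columns = ["input_ids", "attention_mask", "start_positions", "end_positions"]
val_columns = ["input_ids", "attention_mask", "offset_mapping", "example_id"]

print("missing in train:", missing_label_columns(train_columns))
print("missing in validation:", missing_label_columns(val_columns))
```

With a `datasets.Dataset`, the column list would come from its `column_names` attribute. If the labels are deliberately left out at evaluation time, another option is to set `metric_for_best_model` to a metric that `compute_metrics` actually returns, so that checkpoint selection does not depend on `eval_loss`.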
