
Permutation feature importance for a multiclass classification problem

  •  0
  • roudan  · Tech Community  · 1 year ago

    Is it possible to compute permutation feature importance for a multiclass classification problem?

    from sklearn.inspection import permutation_importance
    metrics = ['balanced_accuracy', 'recall']
    pfi_scores = {}
    for metric in metrics:
        print('Computing permutation importance with {0}...'.format(metric))
        pfi_scores[metric] = permutation_importance(xgb, Xtst, ytst, scoring=metric, n_repeats=30, random_state=7)
    
    Cell In[5], line 10
          8 for metric in metrics:
          9     print('Computing permutation importance with {0}...'.format(metric))
    ---> 10     pfi_scores[metric] = permutation_importance(xgb, Xtst, ytst, scoring=metric, n_repeats=30, random_state=7)
    
    File c:\ProgramData\anaconda_envs\dash2\lib\site-packages\sklearn\utils\_param_validation.py:214, in validate_params.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
        208 try:
        209     with config_context(
        210         skip_parameter_validation=(
        211             prefer_skip_nested_validation or global_skip_validation
        212         )
        213     ):
    --> 214         return func(*args, **kwargs)
        215 except InvalidParameterError as e:
        216     # When the function is just a wrapper around an estimator, we allow
        217     # the function to delegate validation to the estimator, but we replace
        218     # the name of the estimator by the name of the function in the error
        219     # message to avoid confusion.
        220     msg = re.sub(
        221         r"parameter of \w+ must be",
        222         f"parameter of {func.__qualname__} must be",
        223         str(e),
        224     )
    ...
       (...)
       1528         UserWarning,
       1529     )
    
    ValueError: Target is multiclass but average='binary'. Please choose another average setting, one of [None, 'micro', 'macro', 'weighted'].
    

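    I then tried passing average='weighted', but I still get an error, this time saying that average is not an accepted keyword argument. How can I pass average='weighted' to permutation_importance() for multiclass classification? Thanks.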

    from sklearn.inspection import permutation_importance
    metrics = ['balanced_accuracy', 'recall']
    pfi_scores = {}
    for metric in metrics:
        print('Computing permutation importance with {0}...'.format(metric))
        pfi_scores[metric] = permutation_importance(xgb, Xtst, ytst, scoring=metric, n_repeats=30, random_state=7, average='weighted')
    
    TypeError: got an unexpected keyword argument 'average'
    
    1 answer  |  as of 1 year ago
  •  1
  •   dx2-66    1 year ago

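    The 'recall' string alias stands for recall_score(average='binary'), which is only defined for binary targets. The simplest fix is to use one of the suffixed scorer names that sklearn provides: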

    metrics = ['balanced_accuracy', 'recall_weighted']
    

    Alternatively, you can build the scorer yourself:

    from sklearn.metrics import recall_score, make_scorer
    
    recall = make_scorer(recall_score, average='weighted')
    metrics = ['balanced_accuracy', recall]
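
    For reference, here is a minimal end-to-end sketch of both options. It is not the asker's setup: the data comes from make_classification, and a RandomForestClassifier stands in for the xgb model from the question. Both are assumptions made only so the snippet runs on its own.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.metrics import make_scorer, recall_score
from sklearn.model_selection import train_test_split

# Synthetic 3-class data (assumption: stands in for the asker's Xtst / ytst).
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_redundant=0,
    n_classes=3, random_state=7,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=7
)

# Assumption: a random forest replaces the xgb model from the question.
model = RandomForestClassifier(n_estimators=50, random_state=7)
model.fit(X_train, y_train)

# The averaging mode lives in the scorer, not in permutation_importance().
scorers = {
    "balanced_accuracy": "balanced_accuracy",
    "recall_weighted": "recall_weighted",
    "recall_weighted_custom": make_scorer(recall_score, average="weighted"),
}

pfi_scores = {}
for name, scorer in scorers.items():
    pfi_scores[name] = permutation_importance(
        model, X_test, y_test, scoring=scorer, n_repeats=5, random_state=7
    )

for name, result in pfi_scores.items():
    print(name, result.importances_mean.round(3))
```

    With the same random_state, the 'recall_weighted' string and the make_scorer version shuffle the features identically, so they should report the same importances.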
    

    It is worth noting that balanced accuracy is essentially the same as 'recall_macro'.
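
    To see why, here is a small hand-computed sketch in plain Python on made-up labels (the y_true and y_pred values are purely illustrative). Macro recall is the unweighted mean of the per-class recalls, which is how balanced accuracy is defined, while weighted recall scales each class by its support.

```python
from collections import Counter

# Illustrative labels only; not taken from the question.
y_true = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 2, 0, 1]

def per_class_recall(y_true, y_pred):
    """Return ({class: recall}, {class: support}) for the given labels."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    recalls = {c: hits[c] / support[c] for c in sorted(support)}
    return recalls, support

recalls, support = per_class_recall(y_true, y_pred)

# Macro: every class counts equally (this is also balanced accuracy).
macro = sum(recalls.values()) / len(recalls)

# Weighted: each class counts in proportion to its number of true samples.
weighted = sum(recalls[c] * support[c] for c in recalls) / len(y_true)

print(recalls)   # per-class recall: 0.75, about 0.667, about 0.333
print(macro)     # about 0.583
print(weighted)  # about 0.6
```

    The two averages differ here because the classes are not the same size; on perfectly balanced data they would coincide.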