
ValueError: multiclass-multioutput format is not supported when using sklearn's roc_auc_score function

  •  3
  • stone rock  · 7 years ago

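    I am using logistic regression for prediction. My predictions are 0s and 1s. After training my model on the given data, and also after training it on only the important features (X_important_train; see screenshot), I get a score of around 70%. But when I call roc_auc_score(X, y) or roc_auc_score(X_important_train, y_train), I get a ValueError: multiclass-multioutput format is not supported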

    Code:

    # Load libraries
    from sklearn.linear_model import LogisticRegression
    from sklearn import datasets
    from sklearn.preprocessing import StandardScaler
    from sklearn.metrics import roc_auc_score
    
    # Standardize features
    scaler = StandardScaler()
    X_std = scaler.fit_transform(X)
    
    # Train the model using the training sets and check score
    model = LogisticRegression()
    model.fit(X, y)
    model.score(X, y)
    
    model.fit(X_important_train, y_train)
    model.score(X_important_train, y_train)
    
    roc_auc_score(X_important_train, y_train)
    

    Screenshot: (image not available)

    1 answer
  •  3
  •   seralouk    7 years ago

    First, the roc_auc_score function expects its two arguments, y_true and y_score, to have the same shape.

    sklearn.metrics.roc_auc_score(y_true, y_score, average='macro', sample_weight=None)
    
    Note: this implementation is restricted to the binary classification task or multilabel classification task in label indicator format.
    
    y_true : array, shape = [n_samples] or [n_samples, n_classes]
    True binary labels in binary label indicators.
    
    y_score : array, shape = [n_samples] or [n_samples, n_classes]
    Target scores, can either be probability estimates of the positive class, confidence values, or non-thresholded measure of decisions (as returned by “decision_function” on some classifiers).
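
    As a minimal sketch of the signature quoted above, the snippet below passes a 1-D array of true labels and a 1-D array of positive-class scores of the same shape. The toy values are assumptions chosen for illustration and are not taken from the question's data.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Toy values for illustration only (not from the question's dataset).
y_true = np.array([0, 0, 1, 1])            # true binary labels, shape (n_samples,)
y_score = np.array([0.1, 0.4, 0.35, 0.8])  # positive-class scores, same shape

# Both arguments describe the same samples, so their shapes match.
assert y_true.shape == y_score.shape

# Three of the four (positive, negative) pairs are ranked correctly.
auc = roc_auc_score(y_true, y_score)
print(auc)  # 0.75
```

    Note the argument order: the true labels come first and the scores second.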
    

    Second, the inputs are the true labels and the predicted scores, not the training features and labels that you passed in your example. In more detail,

    model.fit(X_important_train, y_train)
    model.score(X_important_train, y_train)
    # this is wrong here
    roc_auc_score(X_important_train, y_train)
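
    The failure can likely be reproduced without the original data. In the sketch below, X_demo and y_demo are hypothetical stand-ins for X_important_train and y_train. Because a 2-D feature matrix with more than two distinct values is passed where the true labels belong, scikit-learn should reject it with a ValueError like the one in the question.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical stand-ins for X_important_train and y_train.
X_demo = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
y_demo = np.array([0, 1, 0, 1])

error_message = None
try:
    # Wrong: the feature matrix is passed in the y_true position.
    roc_auc_score(X_demo, y_demo)
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```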
    

    You should do this instead:

    # y_true holds the true labels that correspond to X_test_data
    y_pred = model.predict(X_test_data)
    roc_auc_score(y_true, y_pred)
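
    For completeness, here is one possible end-to-end sketch of the corrected workflow. It makes several assumptions that are not in the original post: synthetic data from make_classification stands in for X and y, a held-out split is created with train_test_split, and predict_proba is used instead of predict so that roc_auc_score receives continuous scores rather than hard 0/1 labels.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic binary data standing in for the question's X and y (assumption).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Fit the scaler on the training split only, then reuse it on the test split.
scaler = StandardScaler()
X_train_std = scaler.fit_transform(X_train)
X_test_std = scaler.transform(X_test)

model = LogisticRegression()
model.fit(X_train_std, y_train)

# Column 1 of predict_proba is the estimated probability of class 1.
y_score = model.predict_proba(X_test_std)[:, 1]

# True labels first, predicted scores second.
auc = roc_auc_score(y_test, y_score)
print(f"ROC AUC on held-out data: {auc:.3f}")
```

    Passing hard predictions from model.predict also runs, but it reduces the ROC curve to a single threshold, so probability scores usually give a more informative AUC.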