First, the roc_auc_score function expects its two input arguments to have the same shape.
sklearn.metrics.roc_auc_score(y_true, y_score, average='macro', sample_weight=None)
Note: this implementation is restricted to the binary classification task or multilabel classification task in label indicator format.
y_true : array, shape = [n_samples] or [n_samples, n_classes]
True binary labels in binary label indicators.
y_score : array, shape = [n_samples] or [n_samples, n_classes]
Target scores, can either be probability estimates of the positive class, confidence values, or non-thresholded measure of decisions (as returned by "decision_function" on some classifiers).
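As a small illustration of the signature quoted above, here is a minimal sketch on toy data. The four labels and scores are made-up values chosen for illustration, not taken from the question; the point is only that y_true and y_score are two arrays with one entry per sample.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Toy data (assumed values for illustration): one true label and one score per sample.
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

# Both arguments have shape [n_samples].
assert y_true.shape == y_score.shape

auc = roc_auc_score(y_true, y_score)
print(auc)  # 0.75: three of the four (positive, negative) pairs are ranked correctly
```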
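Note that the inputs are the true labels and the predicted scores, not the training data and labels that you used in the example you posted.
In more detail: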
model.fit(X_important_train, y_train)
model.score(X_important_train, y_train)
# this is wrong here
roc_auc_score(X_important_train, y_train)
You should do this instead:
y_pred = model.predict(X_test_data)
roc_auc_score(y_true, y_pred)
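For a version you can run end to end, here is a rough sketch under a few assumptions: the synthetic dataset from make_classification and the LogisticRegression model are stand-ins for your own data and model, and the variable names are illustrative. It also uses predict_proba rather than predict, on the assumption that your classifier exposes it, because the documentation quoted above describes y_score as probability estimates or confidence values rather than hard class labels.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (a stand-in for the real dataset).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Any classifier works here; LogisticRegression is just an example.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Probability of the positive class for each test sample, shape [n_samples].
y_score = model.predict_proba(X_test)[:, 1]

# First argument: true labels. Second argument: predicted scores.
auc = roc_auc_score(y_test, y_score)
print(f"Test ROC AUC: {auc:.3f}")
```

Passing the output of model.predict also runs, but hard 0/1 predictions reduce the ROC curve to a single threshold, so the score is usually less informative than the one computed from probabilities.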