
R: confusion matrix for a random forest model returns an error: `data` and `reference` should be factors with the same levels

confusion-matrix r-caret random-forest r

Cindy · 7 years ago

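I am new to R and want to solve a binary classification task.

The dataset has a factor variable LABELS with two classes: the first is 0, the second is 1. The head of the dataset was shown in a picture; the TimeDate column is just an index. The class distribution is computed as: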

print("the number of values with % in factor variable - LABELS:")
percentage <- prop.table(table(dataset$LABELS)) * 100
cbind(freq=table(dataset$LABELS), percentage=percentage)

Class distribution result:

Also, I know that the Slot2 column is calculated with the following formula:

Slot2 = Var3 - Slot3 + Slot4

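Based on an analysis of the correlation matrix, the features Var1, Var2, Var3 and Var4 were selected.

Before modeling, I split the dataset into train and test parts. I tried to build a random forest model for the binary classification task with the following code: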

library(randomForest)

# Fit a classification forest on the four selected features
rf2 <- randomForest(LABELS ~ Var1 + Var2 + Var3 + Var4,
                    data = train, ntree = 100,
                    mtry = 4, importance = TRUE)
print(rf2)

The result is:

  Call:
     randomForest(formula = LABELS ~ Var1 + Var2  + Var3 + Var4,
     data = train, ntree = 100,      mtry = 4, importance = TRUE) 

 Type of random forest: classification
 Number of trees: 100
 No. of variables tried at each split: 4

 OOB estimate of  error rate: 0.16%

 Confusion matrix:
           0      1 class.error
    0 164957    341 0.002062941
    1    280 233739 0.001196484

When I try to predict:

# Prediction & Confusion Matrix - train data
p1 <- predict(rf2, train, type="prob")
print("Prediction & Confusion Matrix - train data")
confusionMatrix(p1, train$LABELS)

# # Prediction & Confusion Matrix - test data
p2 <- predict(rf2, test, type="prob")
print("Prediction & Confusion Matrix - test data")
confusionMatrix(p2, test$LABELS)

I get an error in R:

[1] "Prediction & Confusion Matrix - train data"
Error: `data` and `reference` should be factors with the same levels.
Traceback:

1. confusionMatrix(p1, train$LABELS)
2. confusionMatrix.default(p1, train$LABELS)
3. stop("`data` and `reference` should be factors with the same levels.", 
 .     call. = FALSE)

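I have also tried to solve it using the idea from the following question:

But it did not help me.

Can you help me resolve this error?

I would be grateful for any comments and suggestions. Thanks in advance.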

1 reply | as of 6 years ago

Cindy 7 years ago

The error in R:

Error: `data` and `reference` should be factors with the same levels.

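was solved by changing the type argument of the predict function. With type="prob", predict returns a numeric matrix of class probabilities, not a factor, so confusionMatrix rejects it. With type="response" it returns the predicted class labels as a factor with the same levels as LABELS. Correct code: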

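library(caret)

# Prediction & Confusion Matrix - train data
p1 <- predict(rf2, train, type="response")
print("Prediction & Confusion Matrix - train data")
confusionMatrix(p1, train$LABELS)

# Prediction & Confusion Matrix - test data
p2 <- predict(rf2, test, type="response")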
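
To see why the change of type matters, here is a minimal base-R sketch of the check that the error message describes. The objects prob_pred and class_pred are small made-up stand-ins for what predict returns with type="prob" and type="response"; they are not taken from the real dataset. The helper same_factor_levels is a hypothetical name used only for illustration, not a caret function.

```r
# Made-up reference labels, standing in for train$LABELS
reference <- factor(c("0", "1", "1", "0"), levels = c("0", "1"))

# Stand-in for predict(rf2, train, type = "prob"):
# a numeric matrix with one column per class level
prob_pred <- matrix(c(0.9, 0.1,
                      0.2, 0.8,
                      0.4, 0.6,
                      0.7, 0.3),
                    ncol = 2, byrow = TRUE,
                    dimnames = list(NULL, c("0", "1")))

# Stand-in for predict(rf2, train, type = "response"):
# a factor of predicted labels sharing the reference levels
class_pred <- factor(c("0", "1", "1", "0"), levels = levels(reference))

# Illustrative helper (assumed name): TRUE only when both inputs are
# factors with identical levels, the condition named in the error
same_factor_levels <- function(data, reference) {
  is.factor(data) && is.factor(reference) &&
    identical(levels(data), levels(reference))
}

same_factor_levels(prob_pred, reference)   # the matrix fails the check
same_factor_levels(class_pred, reference)  # the factor passes
```

Running class(p1) and levels(train$LABELS) on the real objects is a quick way to apply the same check before calling confusionMatrix.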

@Camille, thank you very much!
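
If the class probabilities are still needed, for example to apply a custom cutoff, they can be turned into a factor before building the confusion matrix. The sketch below is one possible way to do this in base R, not the method used in the answer above. The probability matrix, the reference labels, and the 0.5 threshold are all assumptions made up for illustration.

```r
# Made-up probability matrix, shaped like predict(rf2, test, type = "prob")
prob_pred <- matrix(c(0.90, 0.10,
                      0.20, 0.80,
                      0.45, 0.55,
                      0.70, 0.30),
                    ncol = 2, byrow = TRUE,
                    dimnames = list(NULL, c("0", "1")))

# Made-up reference labels, standing in for test$LABELS
reference <- factor(c("0", "1", "0", "0"), levels = c("0", "1"))

# Assumed cutoff on the probability of class "1"
threshold <- 0.5

# Convert probabilities to labels and force the same levels as the reference
pred_label <- factor(ifelse(prob_pred[, "1"] > threshold, "1", "0"),
                     levels = levels(reference))

# Base-R cross-tabulation of predictions against the reference
table(Prediction = pred_label, Reference = reference)

# With caret loaded, the same factor can be passed on:
# confusionMatrix(pred_label, reference)
```

Setting levels explicitly from the reference keeps both factors aligned even when one class never appears among the predictions.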
