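I am trying to fit a random forest to a data set using the caret package in R. Following earlier examples on this site, I changed the column names and factor levels, but nothing seems to work; I keep getting the same error. Below are my code, the error, and the structure of the data set: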
model_rf = train(Promoted ~ Department + Region + Education + Gender + RecruitmentChannel + TrainingNumber + Age + LengthOfService + EmployeePerformance + AvgTrainingPerformance,
                 data = train, method = 'rf', tuneLength = 5, trControl = fitControl)
model_rf
predicteds_rf <- predict(model_rf, newdata=test)
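Error: At least one of the class levels is not a valid
R variable name; This will cause errors when class probabilities are generated
because the variables names will be converted to
Not.Promoted, Promoted . Please use factor levels that can be used as
valid R variable names (see ?make.names for help).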
> str(trainData)
'data.frame': 54808 obs. of 12 variables:
$ EmployeeID : int 65438 65141 7513 2542 48945 58896 20379 16290 73202 28911 ...
$ Department : Factor w/ 9 levels "Analytics","Finance",..: 8 5 8 8 9 1 5 5 1 8 ...
$ Region : Factor w/ 34 levels "region_1","region_10",..: 32 15 11 16 19 12 13 28 13 1 ...
$ Education : Factor w/ 4 levels "","Bachelor's",..: 4 2 2 2 2 2 2 4 2 4 ...
$ Gender : Factor w/ 2 levels "f","m": 1 2 2 2 2 2 1 2 2 2 ...
$ RecruitmentChannel : Factor w/ 3 levels "other","referred",..: 3 1 3 1 1 3 1 3 1 3 ...
$ TrainingNumber : Factor w/ 5 levels "Average training",..: 5 5 5 3 5 3 5 5 5 5 ...
$ Age : Factor w/ 3 levels "Middle Age","Old",..: 1 3 1 1 1 1 1 1 3 1 ...
$ LengthOfService : Factor w/ 6 levels "Junior","Mid Level",..: 6 2 6 6 1 6 2 2 2 2 ...
$ EmployeePerformance : Factor w/ 7 levels "Average Performer",..: 4 3 5 7 5 5 5 5 3 4 ...
$ AvgTrainingPerformance: Factor w/ 6 levels "Average","Below Average",..: 5 1 2 2 4 6 2 1 6 2 ...
$ Promoted : Factor w/ 2 levels "Not Promoted",..: 1 1 1 1 1 1 1 1 1 1 ...
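
For reference, the check that the error message points to can be reproduced in isolation with base R. The following is only a minimal sketch on a toy factor, not my real data: the second level name "Promoted" is an assumption, because the str() output above truncates it. It shows how make.names() would rename the levels of the outcome, which is probably the change the message is asking for.

```r
# Toy stand-in for the Promoted column.
# Assumption: the second level is "Promoted" (str() truncates it above).
promoted <- factor(c("Not Promoted", "Promoted", "Not Promoted"),
                   levels = c("Not Promoted", "Promoted"))

# make.names() replaces characters that are not allowed in R names
# (such as the space) with "."
valid_levels <- make.names(levels(promoted))
print(valid_levels)
# "Not.Promoted" "Promoted"

# Assigning to levels() renames the levels without changing which
# observation belongs to which class
levels(promoted) <- valid_levels
print(levels(promoted))
```

If this is indeed the cause, the same renaming would presumably have to be applied to the outcome column in both the training and the test data before calling train(), so that the two sets share identical levels.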