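【Posted】: 2016-09-10 05:21:03
【Question】:
I have read a similar, related post about this problem, but I suspect that in my case the error code has a different cause. I have a CSV file with 8 observations and 10 variables: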
> str(rorIn)
'data.frame': 8 obs. of 10 variables:
$ Acuity : Factor w/ 3 levels "Elective ","Emergency ",..: 1 1 2 2 1 2 2 3
$ AgeInYears : int 49 56 77 65 51 79 67 63
$ IsPriority : int 0 0 1 0 0 1 0 1
$ AuthorizationStatus: Factor w/ 1 level "APPROVED ": 1 1 1 1 1 1 1 1
$ iscasemanagement : Factor w/ 2 levels "N","Y": 1 1 2 1 1 2 2 2
$ iseligible : Factor w/ 1 level "Y": 1 1 1 1 1 1 1 1
$ referralservicecode: Factor w/ 4 levels "12345","278",..: 4 1 3 1 1 2 3 1
$ IsHighlight : Factor w/ 1 level "N": 1 1 1 1 1 1 1 1
$ RealLengthOfStay : int 25 1 1 1 2 2 1 3
$ Readmit : Factor w/ 2 levels "0","1": 2 1 2 1 2 1 2 1
I call the algorithm like this:
library("C50")
rorIn <- read.csv(file = "RoRdataInputData_v1.6.csv", header = TRUE, quote = "\"")
rorIn$Readmit <- factor(rorIn$Readmit)
fit <- C5.0(Readmit~., data= rorIn)
Then I get:
> source("~/R-workspace/src/RoR/RoR/testing.R")
c50 code called exit with value 1
>
I am already following the other suggestions, such as using a factor as the decision variable and avoiding empty data.
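The two precautions above can be verified directly. The following is only a minimal sketch, not part of the original script: it embeds the data set from this question as inline text so that it runs on its own (an assumption, in place of reading `RoRdataInputData_v1.6.csv`), sets `stringsAsFactors = TRUE` explicitly because R 4.0.0 and later no longer convert strings to factors by default, and uses a hypothetical helper named `is_blank`.

```r
# Inline copy of the data set from this question (assumption: used instead of
# the CSV file so that the sketch is self-contained).
csv_text <- "Acuity,AgeInYears,IsPriority,AuthorizationStatus,iscasemanagement,iseligible,referralservicecode,IsHighlight,RealLengthOfStay,Readmit
Elective ,49,0,APPROVED ,N,Y,SNF ,N,25,1
Elective ,56,0,APPROVED ,N,Y,12345,N,1,0
Emergency ,77,1,APPROVED ,Y,Y,OBSERVE ,N,1,1
Emergency ,65,0,APPROVED ,N,Y,12345,N,1,0
Elective ,51,0,APPROVED ,N,Y,12345,N,2,1
Emergency ,79,1,APPROVED ,Y,Y,278,N,2,0
Emergency ,67,0,APPROVED ,Y,Y,OBSERVE ,N,1,1
Urgent ,63,1,APPROVED ,Y,Y,12345,N,3,0"

rorIn <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = TRUE)
rorIn$Readmit <- factor(rorIn$Readmit)

# Check 1: the decision variable is a factor.
outcome_is_factor <- is.factor(rorIn$Readmit)

# Check 2: no column contains missing or empty values.
# is_blank is a hypothetical helper introduced only for this sketch.
is_blank <- function(x) is.na(x) | trimws(as.character(x)) == ""
blank_counts <- vapply(rorIn, function(col) sum(is_blank(col)), integer(1))

print(outcome_is_factor)
print(blank_counts)
```

If both checks pass, as they appear to for this data, the error presumably has some other cause than the two listed here.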
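Can anyone help with this? I have read that this is one of the best machine learning algorithms, but I keep getting this error.
Here is the original data set: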
Acuity,AgeInYears,IsPriority,AuthorizationStatus,iscasemanagement,iseligible,referralservicecode,IsHighlight,RealLengthOfStay,Readmit
Elective ,49,0,APPROVED ,N,Y,SNF ,N,25,1
Elective ,56,0,APPROVED ,N,Y,12345,N,1,0
Emergency ,77,1,APPROVED ,Y,Y,OBSERVE ,N,1,1
Emergency ,65,0,APPROVED ,N,Y,12345,N,1,0
Elective ,51,0,APPROVED ,N,Y,12345,N,2,1
Emergency ,79,1,APPROVED ,Y,Y,278,N,2,0
Emergency ,67,0,APPROVED ,Y,Y,OBSERVE ,N,1,1
Urgent ,63,1,APPROVED ,Y,Y,12345,N,3,0
Thanks in advance for your help,
David
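【Comments】:
- Isn't your data set too small? You have more variables than observations, which may be a problem.
- p > n is a problem, but even setting that aside, the data set is very small. Trying to build a robust model from so few observations is generally not advisable.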
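The size concern raised in these comments can be quantified with a few lines of R. This is a rough sketch under stated assumptions: the data from the question is embedded as inline text so the example is self-contained, and the variable names `n_obs`, `n_pred`, `n_distinct` and `constant_cols` are introduced here for illustration only. It also lists predictors that take a single value, since such columns cannot contribute to any split.

```r
# Inline copy of the data set from the question (assumption: used instead of
# the CSV file so that the sketch is self-contained).
csv_text <- "Acuity,AgeInYears,IsPriority,AuthorizationStatus,iscasemanagement,iseligible,referralservicecode,IsHighlight,RealLengthOfStay,Readmit
Elective ,49,0,APPROVED ,N,Y,SNF ,N,25,1
Elective ,56,0,APPROVED ,N,Y,12345,N,1,0
Emergency ,77,1,APPROVED ,Y,Y,OBSERVE ,N,1,1
Emergency ,65,0,APPROVED ,N,Y,12345,N,1,0
Elective ,51,0,APPROVED ,N,Y,12345,N,2,1
Emergency ,79,1,APPROVED ,Y,Y,278,N,2,0
Emergency ,67,0,APPROVED ,Y,Y,OBSERVE ,N,1,1
Urgent ,63,1,APPROVED ,Y,Y,12345,N,3,0"

rorIn <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = TRUE)

# n = number of observations, p = number of predictors (all columns but Readmit).
n_obs <- nrow(rorIn)
predictors <- setdiff(names(rorIn), "Readmit")
n_pred <- length(predictors)

# Count distinct values per predictor and collect the constant ones.
n_distinct <- vapply(rorIn[predictors],
                     function(col) length(unique(col)), integer(1))
constant_cols <- names(n_distinct)[n_distinct == 1]

cat("observations:", n_obs, " predictors:", n_pred, "\n")
cat("constant predictors:", paste(constant_cols, collapse = ", "), "\n")
```

For this data set the sketch reports 8 observations against 9 predictors, three of which (`AuthorizationStatus`, `iseligible`, `IsHighlight`) are constant. Whether any of this is related to the exit code is not established here.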
Tags: r machine-learning decision-tree