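【Posted】: 2020-03-28 13:28:44
【Question】:
I am trying to train an XGBoost model with Spark in Scala for multi-label classification. I assembled the input columns into one vector column and the output (label) columns into another: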
import org.apache.spark.ml.feature.VectorAssembler

// Assemble the five input columns into a single "features" vector column.
val vectorAssembler1 = new VectorAssembler().
  setInputCols(Array("col1","col2","col3","col4","col5")).
  setOutputCol("features")
val inputFeaturesVecDF = vectorAssembler1.transform(inputDF).
  select("features","label1","label2","label3","label4")
// Assemble the four label columns into a single "labels" vector column.
val vectorAssembler2 = new VectorAssembler().
  setInputCols(Array("label1","label2","label3","label4")).
  setOutputCol("labels")
val xgbInputDF = vectorAssembler2.transform(inputFeaturesVecDF).select("features","labels")
I instantiate the model as follows and call fit on it:
import ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier

// The label column is set to the assembled "labels" vector column.
val xgbClassifier = new XGBoostClassifier().
  setFeaturesCol("features").
  setLabelCol("labels").
  setObjective("multi:softmax").
  setMaxDepth(3).
  setNumClass(4).
  setNumRound(10).
  setNumWorkers(8)
val xgbClassificationModel = xgbClassifier.fit(xgbInputDF)
When I run it, I get the following error. Any help would be greatly appreciated.
java.lang.IllegalArgumentException: requirement failed: Column labels must be of type numeric but was actually of type struct<type:tinyint,size:int,indices:array<int>,values:array<double>>.
【Comments】:
- I am running into the same problem. Were you able to solve it?
Tags: scala apache-spark apache-spark-mllib xgboost