【Question title】: How to pass multiple columns in setLabelCol to xgboost in spark mllib?
【Posted】: 2020-03-28 13:28:44
【Question description】:

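I am trying to train an XGBoost model in Scala Spark for multi-label classification. I have assembled the input columns into one vector and the output (label) columns into another: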

import org.apache.spark.ml.feature.VectorAssembler

// Assemble the five input columns into a single "features" vector.
val vectorAssembler1 = new VectorAssembler().
                        setInputCols(Array("col1","col2","col3","col4","col5")).
                        setOutputCol("features")
val inputFeaturesVecDF = vectorAssembler1.transform(inputDF).
                                                   select("features","label1","label2","label3","label4")

// Assemble the four label columns into a single "labels" vector.
val vectorAssembler2 = new VectorAssembler().
                        setInputCols(Array("label1","label2","label3","label4")).
                        setOutputCol("labels")
val xgbInputDF = vectorAssembler2.transform(inputFeaturesVecDF).select("features","labels")

I instantiate the model as follows and call fit on it:

import ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier

val xgbClassifier = new XGBoostClassifier().
                        setFeaturesCol("features").
                        setLabelCol("labels").   // "labels" is a vector column, not a numeric one
                        setObjective("multi:softmax").
                        setMaxDepth(3).
                        setNumClass(4).
                        setNumRound(10).
                        setNumWorkers(8)
val xgbClassificationModel = xgbClassifier.fit(xgbInputDF)

When I run it, I get the following error. Any help would be appreciated.

java.lang.IllegalArgumentException: requirement failed: Column labels must be of type numeric but was actually of type struct<type:tinyint,size:int,indices:array<int>,values:array<double>>.

【Question comments】:

  • I ran into the same problem. Were you able to solve it?

Tags: scala apache-spark apache-spark-mllib xgboost


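【Solution 1】:

XGBoost's XGBoostClassifier does not provide multi-label classification. setLabelCol expects a single numeric column, which is why the assembled "labels" vector is rejected with the error above. The multi:softmax objective is multi-class, not multi-label: it assigns each row exactly one class out of num_class.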
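
One common workaround, not part of the original answer, is to train one independent binary classifier per label column (often called binary relevance). The sketch below is a minimal illustration under stated assumptions: each of label1 to label4 is a numeric 0/1 column, the DataFrame passed in already contains the assembled "features" vector alongside those columns (as inputFeaturesVecDF does in the question), and the helper name trainPerLabel is hypothetical.

```scala
import ml.dmlc.xgboost4j.scala.spark.{XGBoostClassificationModel, XGBoostClassifier}
import org.apache.spark.sql.DataFrame

object BinaryRelevanceSketch {

  // Hypothetical helper: fits one binary XGBoost model per label column.
  // Assumes every column in `labelCols` is numeric with values 0 or 1 and
  // that `df` also contains a vector column named "features".
  def trainPerLabel(
      df: DataFrame,
      labelCols: Seq[String]): Map[String, XGBoostClassificationModel] =
    labelCols.map { labelCol =>
      val classifier = new XGBoostClassifier()
        .setFeaturesCol("features")
        .setLabelCol(labelCol)            // a single numeric column
        .setObjective("binary:logistic")  // binary objective instead of multi:softmax
        .setMaxDepth(3)
        .setNumRound(10)
        .setNumWorkers(8)
      labelCol -> classifier.fit(df)
    }.toMap
}

// Usage, with the DataFrame from the question:
// val models = BinaryRelevanceSketch.trainPerLabel(
//   inputFeaturesVecDF, Seq("label1", "label2", "label3", "label4"))
```

Each resulting model predicts its own label independently of the others, so correlations between labels are not modeled; merging the four predictions into one DataFrame is left out of the sketch. If instead exactly one of the four labels is set on every row, the problem is really multi-class, and a single numeric class-index column with multi:softmax would be the more natural fit.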

【Comments】:
