【Posted】: 2019-08-21 17:09:02
【Question】:
I wrote the function below:
object AgeClassification {
  def AgeCategory(age: Int): String = {
    if (age <= 30)
      return "Young"
    else if (age >= 65)
      return "Older"
    else
      return "Mid-age"
  }
}
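I am trying to pass a DataFrame column as the argument:
val df_new = df
  .withColumn("Age_Category", AgeClassification.AgeCategory(df("age")))
but I get this error:
:33: error: type mismatch;
 found   : org.apache.spark.sql.Column
 required: Int
val df_new = df.withColumn("Age_Category",AgeClassification.AgeCategory(df("age")))
How can I pass a column as the argument? My other attempts fail as well: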
val df_new = df
  .withColumn("Age_Category", AgeClassification.AgeCategory(df.age.cast(IntegerType)))
:33: error: value age is not a member of org.apache.spark.sql.DataFrame
val df_new = df.withColumn("Age_Category",AgeClassification.AgeCategory(df.age.cast(IntegerType)))
val df_new = df
  .withColumn("Age_Category", AgeClassification.AgeCategory(df("age").cast(Int)))
:33: error: overloaded method value cast with alternatives:
  (to: String)org.apache.spark.sql.Column
  (to: org.apache.spark.sql.types.DataType)org.apache.spark.sql.Column
 cannot be applied to (Int.type)
val df_new = df.withColumn("Age_Category",AgeClassification.AgeCategory(df("age").cast(Int)))
【Comments】:
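All three errors seem to come from the same mismatch: AgeCategory expects a plain Int, while df("age") is a Column, which describes a column expression rather than holding a value, so no cast on the Column can turn it into an Int at the call site. Below is a minimal sketch of two possible ways around this, not necessarily the only ones: wrapping the function in a UDF, or expressing the same rule with when/otherwise. The local SparkSession, the sample rows, and the names ageCategoryUdf and AgeCategoryExample are assumptions made for illustration only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf, when}

object AgeClassification {
  // Same rule as in the question, written without explicit returns.
  def AgeCategory(age: Int): String =
    if (age <= 30) "Young"
    else if (age >= 65) "Older"
    else "Mid-age"
}

object AgeCategoryExample {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session; in spark-shell `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("age-category")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data standing in for the question's df.
    val df = Seq(("Ann", 25), ("Bob", 45), ("Cid", 70)).toDF("name", "age")

    // Option 1: lift the Int => String function into a UDF so that it
    // accepts a Column. cast("int") uses the String overload of cast.
    val ageCategoryUdf = udf(AgeClassification.AgeCategory _)
    val viaUdf = df.withColumn("Age_Category", ageCategoryUdf(col("age").cast("int")))

    // Option 2: express the same rule with built-in column functions,
    // avoiding a UDF altogether.
    val viaWhen = df.withColumn(
      "Age_Category",
      when(col("age") <= 30, "Young")
        .when(col("age") >= 65, "Older")
        .otherwise("Mid-age")
    )

    viaUdf.show()
    viaWhen.show()
    spark.stop()
  }
}
```

The when/otherwise form is often preferred where it is enough, since Spark can optimize built-in expressions, whereas a UDF is opaque to the optimizer. For the cast itself, df("age").cast(IntegerType) should also compile once org.apache.spark.sql.types.IntegerType is imported, but it still yields a Column, not an Int.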
Tags: scala apache-spark