【问题标题】:Scala Spark streaming fileStreamScala Spark 流文件流
【发布时间】:2023-03-17 14:09:01
【问题描述】:

类似于this question 我正在尝试使用fileStream,但收到关于类型参数的编译时错误。我正在尝试使用mahout-examples 提供的org.apache.mahout.text.wikipedia.XmlInputFormat 作为我的InputFormat 类型来摄取XML 数据。

val fileStream = ssc.fileStream[LongWritable, Text, XmlInputFormat](WATCHDIR)

编译错误是:

Error:(39, 26) type arguments [org.apache.hadoop.io.LongWritable,scala.xml.Text,org.apache.mahout.text.wikipedia.XmlInputFormat] conform to the bounds of none of the overloaded alternatives of
 value fileStream: [K, V, F <: org.apache.hadoop.mapreduce.InputFormat[K,V]](directory: String, filter: org.apache.hadoop.fs.Path => Boolean, newFilesOnly: Boolean, conf: org.apache.hadoop.conf.Configuration)(implicit evidence$12: scala.reflect.ClassTag[K], implicit evidence$13: scala.reflect.ClassTag[V], implicit evidence$14: scala.reflect.ClassTag[F])org.apache.spark.streaming.dstream.InputDStream[(K, V)] <and> [K, V, F <: org.apache.hadoop.mapreduce.InputFormat[K,V]](directory: String, filter: org.apache.hadoop.fs.Path => Boolean, newFilesOnly: Boolean)(implicit evidence$9: scala.reflect.ClassTag[K], implicit evidence$10: scala.reflect.ClassTag[V], implicit evidence$11: scala.reflect.ClassTag[F])org.apache.spark.streaming.dstream.InputDStream[(K, V)] <and> [K, V, F <: org.apache.hadoop.mapreduce.InputFormat[K,V]](directory: String)(implicit evidence$6: scala.reflect.ClassTag[K], implicit evidence$7: scala.reflect.ClassTag[V], implicit evidence$8: scala.reflect.ClassTag[F])org.apache.spark.streaming.dstream.InputDStream[(K, V)]
    val fileStream = ssc.fileStream[LongWritable, Text, XmlInputFormat](WATCHDIR)
                         ^
Error:(39, 26) wrong number of type parameters for overloaded method value fileStream with alternatives:
  [K, V, F <: org.apache.hadoop.mapreduce.InputFormat[K,V]](directory: String, filter: org.apache.hadoop.fs.Path => Boolean, newFilesOnly: Boolean, conf: org.apache.hadoop.conf.Configuration)(implicit evidence$12: scala.reflect.ClassTag[K], implicit evidence$13: scala.reflect.ClassTag[V], implicit evidence$14: scala.reflect.ClassTag[F])org.apache.spark.streaming.dstream.InputDStream[(K, V)] <and>
  [K, V, F <: org.apache.hadoop.mapreduce.InputFormat[K,V]](directory: String, filter: org.apache.hadoop.fs.Path => Boolean, newFilesOnly: Boolean)(implicit evidence$9: scala.reflect.ClassTag[K], implicit evidence$10: scala.reflect.ClassTag[V], implicit evidence$11: scala.reflect.ClassTag[F])org.apache.spark.streaming.dstream.InputDStream[(K, V)] <and>
  [K, V, F <: org.apache.hadoop.mapreduce.InputFormat[K,V]](directory: String)(implicit evidence$6: scala.reflect.ClassTag[K], implicit evidence$7: scala.reflect.ClassTag[V], implicit evidence$8: scala.reflect.ClassTag[F])org.apache.spark.streaming.dstream.InputDStream[(K, V)]
    val fileStream = ssc.fileStream[LongWritable, Text, XmlInputFormat](WATCHDIR)
                     ^

我对 Scala 很陌生,所以我对类型类并不是很熟悉(我假设这就是这里发生的事情?)。任何帮助将不胜感激。

【问题讨论】:

  • 你尝试过明确表达吗?它使用 scala.xml.Text,你应该使用 org.apache.hadoop.io.Text
  • 就是贾斯汀!类型错误。非常感谢!
  • 谢谢,作为答案发布。

标签: scala apache-spark spark-streaming


【解决方案1】:

它正在搜索scala.xml.Text 的例外列表,而您需要使用org.apache.hadoop.io.Text

【讨论】:

  • 如果编译器能更好地提示它正在寻找什么,那将会很有帮助。
猜你喜欢
  • 1970-01-01
  • 1970-01-01
  • 2016-09-28
  • 1970-01-01
  • 1970-01-01
  • 2018-12-06
  • 2020-07-21
  • 1970-01-01
  • 1970-01-01
相关资源
最近更新 更多