【Question title】: Exception in Spark Java
【Posted】: 2017-03-29 19:57:39
【Question】:

I am reading a directory of text files from my local machine in Spark. When I run the job with spark-submit, I get the following exception:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
17/03/30 01:15:22 INFO SparkContext: Running Spark version 2.1.0
17/03/30 01:15:23 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/03/30 01:15:23 WARN Utils: Your hostname, Inspiron-N4050 resolves to a loopback address: 127.0.1.1; using 192.168.43.249 instead (on interface wlp9s0)
17/03/30 01:15:23 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
17/03/30 01:15:23 INFO SecurityManager: Changing view acls to: shakeel
17/03/30 01:15:23 INFO SecurityManager: Changing modify acls to: shakeel
17/03/30 01:15:23 INFO SecurityManager: Changing view acls groups to: 
17/03/30 01:15:23 INFO SecurityManager: Changing modify acls groups to: 
17/03/30 01:15:23 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(shakeel); groups with view permissions: Set(); users  with modify permissions: Set(shakeel); groups with modify permissions: Set()
17/03/30 01:15:23 INFO Utils: Successfully started service 'sparkDriver' on port 35160.
17/03/30 01:15:23 INFO SparkEnv: Registering MapOutputTracker
17/03/30 01:15:23 INFO SparkEnv: Registering BlockManagerMaster
17/03/30 01:15:23 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
17/03/30 01:15:23 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
17/03/30 01:15:23 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-ea876e3a-fd03-47df-b492-b6deccffe77d
17/03/30 01:15:23 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
17/03/30 01:15:23 INFO SparkEnv: Registering OutputCommitCoordinator
17/03/30 01:15:24 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/03/30 01:15:24 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.43.249:4040
17/03/30 01:15:24 INFO SparkContext: Added JAR file:/home/shakeel/workspace/geneselection/target/geneselection-0.0.1-SNAPSHOT.jar at spark://192.168.43.249:35160/jars/geneselection-0.0.1-SNAPSHOT.jar with timestamp 1490816724265
17/03/30 01:15:24 INFO Executor: Starting executor ID driver on host localhost
17/03/30 01:15:24 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 40585.
17/03/30 01:15:24 INFO NettyBlockTransferService: Server created on 192.168.43.249:40585
17/03/30 01:15:24 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
17/03/30 01:15:24 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.43.249, 40585, None)
17/03/30 01:15:24 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.43.249:40585 with 366.3 MB RAM, BlockManagerId(driver, 192.168.43.249, 40585, None)
17/03/30 01:15:24 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.43.249, 40585, None)
17/03/30 01:15:24 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.43.249, 40585, None)
Exception in thread "main" java.lang.ExceptionInInitializerError
    at org.apache.spark.SparkContext.withScope(SparkContext.scala:701)
    at org.apache.spark.SparkContext.wholeTextFiles(SparkContext.scala:858)
    at org.apache.spark.api.java.JavaSparkContext.wholeTextFiles(JavaSparkContext.scala:224)
    at geneselection.AttributeSelector.run(AttributeSelector.java:229)
    at geneselection.AttributeSelector.main(AttributeSelector.java:213)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Incompatible Jackson version: 2.7.5
    at com.fasterxml.jackson.module.scala.JacksonModule$class.setupModule(JacksonModule.scala:64)
    at com.fasterxml.jackson.module.scala.DefaultScalaModule.setupModule(DefaultScalaModule.scala:19)
    at com.fasterxml.jackson.databind.ObjectMapper.registerModule(ObjectMapper.java:730)
    at org.apache.spark.rdd.RDDOperationScope$.<init>(RDDOperationScope.scala:82)
    at org.apache.spark.rdd.RDDOperationScope$.<clinit>(RDDOperationScope.scala)
    ... 14 more
17/03/30 01:15:24 INFO SparkContext: Invoking stop() from shutdown hook
17/03/30 01:15:24 INFO SparkUI: Stopped Spark web UI at http://192.168.43.249:4040
17/03/30 01:15:24 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/03/30 01:15:24 INFO MemoryStore: MemoryStore cleared
17/03/30 01:15:24 INFO BlockManager: BlockManager stopped
17/03/30 01:15:24 INFO BlockManagerMaster: BlockManagerMaster stopped
17/03/30 01:15:24 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/03/30 01:15:24 INFO SparkContext: Successfully stopped SparkContext
17/03/30 01:15:24 INFO ShutdownHookManager: Shutdown hook called
17/03/30 01:15:24 INFO ShutdownHookManager: Deleting directory /tmp/spark-966721ae-388b-476b-972e-8e108c1454d9

I don't know why this happens. The directory on my machine contains some csv files. The code that produces this exception is:

public void run(String path){
    String master = "local[*]";
    SparkConf conf = new SparkConf().setAppName(AttributeSelector.class.getName()).setMaster(master);
    JavaSparkContext context = new JavaSparkContext(conf);
    try {
        context.wholeTextFiles("/home/shakeel/Parts/");
    } catch (Exception e) {
        e.printStackTrace();
    }
    System.out.println("Loaded files");
    context.close();
}

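I want to read the csv files, perform feature selection on each one, and store each file's result in a queue for further processing. Why am I getting this exception?

I ran the sample word count application the same way and it worked fine. Does this have something to do with the files being csv files rather than plain text files?

Any help is appreciated.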

【Comments】:

  • Which build tool are you using (maven, gradle, sbt, etc.)?
  • I use maven to build the jar file.

Tags: java csv apache-spark


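【Solution 1】:

You have a Jackson version conflict: the stack trace ends in "Incompatible Jackson version: 2.7.5", which is unrelated to the files being csv. To see where the incompatible version comes from, run the following command from the top-level directory of your Maven project (your Scala version will be 2.10 or 2.11).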

mvn dependency:tree -Dverbose -Dincludes=com.fasterxml.jackson.module

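Then, once you have found the dependency that causes the problem, put the following inside the <dependency> element of that artifact in your pom.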

<exclusions>
    <exclusion>
      <groupId>com.fasterxml.jackson.module</groupId>
      <artifactId>jackson-module-scala_(YOUR SCALA VERSION)</artifactId>
    </exclusion>
</exclusions> 
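
As a complementary check, the classpath can also be inspected at runtime. The following is a minimal sketch, not part of the original answer: the class name JacksonClasspathCheck and its two helper methods are hypothetical, and it only assumes the two Jackson class names that already appear in the stack trace. It uses plain JDK reflection, so it compiles without Jackson and reports which jar each class would be loaded from and, if the jar's manifest records one, its Implementation-Version.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper: reports where Jackson classes are loaded from.
public class JacksonClasspathCheck {

    /** Returns the jar or directory a class is loaded from, or a fallback message. */
    static String locate(String className) {
        try {
            // 'false' avoids running the class's static initializers.
            Class<?> cls = Class.forName(className, false,
                    JacksonClasspathCheck.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            // Classes that belong to the JDK itself may have no code source.
            return source == null ? "loaded by the JVM (no jar)"
                                  : source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "not on classpath";
        }
    }

    /** Returns the Implementation-Version from the jar manifest, if present. */
    static String version(String className) {
        try {
            Class<?> cls = Class.forName(className, false,
                    JacksonClasspathCheck.class.getClassLoader());
            Package pkg = cls.getPackage();
            String v = (pkg == null) ? null : pkg.getImplementationVersion();
            return v == null ? "unknown" : v;
        } catch (ClassNotFoundException | LinkageError e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        // Both class names are taken from the stack trace in the question.
        String[] names = {
            "com.fasterxml.jackson.databind.ObjectMapper",
            "com.fasterxml.jackson.module.scala.DefaultScalaModule"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name)
                    + " (version " + version(name) + ")");
        }
    }
}
```

Calling the two helpers at the start of main in the submitted application, before the SparkContext is created, should show whether jackson-databind comes from your own jar or from the Spark installation. If the two reported versions differ, that mismatch is the likely source of the "Incompatible Jackson version" error.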

【Comments】:

  • Where can I find the dependency that causes the problem? The output I get when I run the command is: [INFO] Scanning for projects... [INFO] Building ParallelGeneSelection 0.0.1-SNAPSHOT [INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ geneselection --- [INFO] BUILD SUCCESS [INFO] Total time: 4.743 s [INFO] Finished at: 2017-03-30T08:34:31+05:30 [INFO] Final Memory: 27M/591M