【Question title】: spark elasticsearch: Multiple ES-Hadoop versions detected in the classpath
【Posted】: 2019-01-20 11:16:30
【Question】:

I am new to Spark. I am trying to run a Spark job that loads data into Elasticsearch. I built a fat jar from my code and pass it to spark-submit:

spark-submit \
  --class CLASS_NAME \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 20 \
  --executor-cores 5 \
  --executor-memory 32G \
  --jars EXTERNAL_JAR_FILES \
  PATH_TO_FAT_JAR

The Maven dependency for elasticsearch-hadoop is:

<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch-hadoop</artifactId>
    <version>5.6.10</version>
    <exclusions>
        <exclusion>
            <groupId>org.slf4j</groupId>
            <artifactId>log4j-over-slf4j</artifactId>
        </exclusion>
    </exclusions>
</dependency>

When I do not include the elasticsearch-hadoop jar file in the EXTERNAL_JAR_FILES list, I get this error:

java.lang.ExceptionInInitializerError
Caused by: java.lang.ClassNotFoundException: org.elasticsearch.spark.rdd.CompatUtils
  at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
  at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
  at java.security.AccessController.doPrivileged(Native Method)
  at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  at java.lang.Class.forName0(Native Method)
  at java.lang.Class.forName(Class.java:344)
  at org.elasticsearch.hadoop.util.ObjectUtils.loadClass(ObjectUtils.java:73)
  ... 26 more

If I do include it in the EXTERNAL_JAR_FILES list, I get this error instead:

java.lang.Error: Multiple ES-Hadoop versions detected in the classpath; please use only one
jar:file:PATH_TO_CONTAINER/__app__.jar
jar:file:PATH_TO_CONTAINER/elasticsearch-hadoop-5.6.10.jar

  at org.elasticsearch.hadoop.util.Version.<clinit>(Version.java:73)
  at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:572)
  at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:58)
  at org.elasticsearch.spark.sql.EsSparkSQL$$anonfun$saveToEs$1.apply(EsSparkSQL.scala:97)
  at org.elasticsearch.spark.sql.EsSparkSQL$$anonfun$saveToEs$1.apply(EsSparkSQL.scala:97)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
  at org.apache.spark.scheduler.Task.run(Task.scala:108)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)

Is there a way to get around this?

【Discussion】:

    Tags: java apache-spark hadoop elasticsearch spark-submit


    【Solution 1】:

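    I solved the problem by keeping the elasticsearch-hadoop jar out of the fat jar I build. To do that, I set the dependency's scope to provided, which leaves the jar passed through --jars as the only copy on the classpath: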

        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch-hadoop</artifactId>
            <version>5.6.10</version>
            <exclusions>
                <exclusion>
                    <groupId>org.slf4j</groupId>
                    <artifactId>log4j-over-slf4j</artifactId>
                </exclusion>
            </exclusions>
            <scope>provided</scope>
        </dependency>
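
To confirm that the provided scope had the intended effect, one option is to inspect the rebuilt fat jar for leftover ES-Hadoop classes. The following is a minimal sketch and not part of the original answer: the class name FatJarCheck and the method esHadoopEntries are hypothetical, and the org/elasticsearch/hadoop/ prefix is taken from the package names in the stack traces in the question.

```java
import java.io.IOException;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

public class FatJarCheck {
    // Path prefix of ES-Hadoop classes inside a jar (assumption based on the stack traces).
    static final String ES_HADOOP_PREFIX = "org/elasticsearch/hadoop/";

    // Returns the names of all .class entries in the jar that sit under the ES-Hadoop prefix.
    static List<String> esHadoopEntries(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            // Collect inside the try block so the jar is still open while it is read.
            return jar.stream()
                    .map(JarEntry::getName)
                    .filter(name -> name.startsWith(ES_HADOOP_PREFIX) && name.endsWith(".class"))
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java FatJarCheck PATH_TO_FAT_JAR");
            return;
        }
        List<String> found = esHadoopEntries(args[0]);
        if (found.isEmpty()) {
            System.out.println("No ES-Hadoop classes bundled in " + args[0]);
        } else {
            System.out.println(found.size() + " ES-Hadoop classes are still bundled, e.g. " + found.get(0));
        }
    }
}
```

With Java 11 or later this can be run as a single source file, for example java FatJarCheck.java PATH_TO_FAT_JAR. If it still reports bundled classes, the scope change probably did not reach the build that produced the jar.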
    


      【Solution 2】:

      I solved this problem with the following dependency:

          <dependency>
              <groupId>org.elasticsearch</groupId>
              <artifactId>elasticsearch-hadoop</artifactId>
              <version>7.4.2</version>
              <scope>provided</scope>
          </dependency>
      

      Note the <scope>provided</scope> element.

      Then you can use the command:

      bin/spark-submit \
        --master 'local[*]' \
        --class xxxxx \
        --jars https://repo1.maven.org/maven2/org/elasticsearch/elasticsearch-hadoop/7.4.2/elasticsearch-hadoop-7.4.2.jar \
        /your/path/xxxx.jar
      


        【Solution 3】:

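        I ran into this problem after switching my project's build from SBT to Maven (POM). While investigating, I found two jars on the classpath, one from the .ivy2 folder and the other from .mvn. I deleted the one in .ivy2 and the problem went away. Hope this helps someone.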
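
One way to see which jars contribute a duplicate is to ask the class loader for every location of the Version class named in the stack trace. This is a rough diagnostic sketch, not something from the original answer: the class name DuplicateOnClasspath and the method locationsOf are hypothetical, and it assumes the duplicate shows up as more than one copy of the same class resource.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

public class DuplicateOnClasspath {
    // Resource path of the class that raised the error in the stack trace.
    static final String VERSION_CLASS = "org/elasticsearch/hadoop/util/Version.class";

    // Returns every classpath location from which the given resource can be loaded.
    static List<URL> locationsOf(String resourcePath) throws IOException {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = ClassLoader.getSystemClassLoader();
        }
        return Collections.list(cl.getResources(resourcePath));
    }

    public static void main(String[] args) throws IOException {
        List<URL> hits = locationsOf(VERSION_CLASS);
        System.out.println(hits.size() + " location(s) provide " + VERSION_CLASS);
        for (URL url : hits) {
            // Each URL names the jar (or directory) that contains a copy.
            System.out.println("  " + url);
        }
    }
}
```

The check only reflects the classpath it is run with, so it needs to be started with the same jars as the failing job. More than one printed location would point to the kind of duplicate described above.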
