【Posted】: 2022-01-23 20:30:19
【Question】:
I am trying to build an Apache Flink job that has to access files over HDFS. It runs fine locally, but when I submit the job to a Flink cluster I get the error:
Hadoop is not in the classpath/dependencies.
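I am using the Maven Shade plugin to build my job.jar. The Flink cluster has no Hadoop jars, so I have to bundle all of them into the job itself.
Locally, I had to enable the IDE option 'Add dependencies with "provided" scope to classpath' to make it work, but I don't know how to do the same thing with Maven.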
pom.xml
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-java</artifactId>
  <version>${dep.flink.version}</version>
</dependency>
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-clients_2.11</artifactId>
  <version>${dep.flink.version}</version>
  <scope>provided</scope>
</dependency>
...
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <version>${plugin.maven-compiler.version}</version>
      <configuration>
        <source>${project.build.targetJdk}</source>
        <target>${project.build.targetJdk}</target>
      </configuration>
    </plugin>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>${plugin.maven-shade.version}</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <minimizeJar>true</minimizeJar>
            <relocations>
              <relocation>
                <pattern>org.apache.commons.cli</pattern>
                <shadedPattern>org.test.examples.thirdparty.commons_cli</shadedPattern>
              </relocation>
            </relocations>
            <filters>
              <!-- Filters out signed files to avoid SecurityException when integrating a signed jar in the resulting jar. -->
              <filter>
                <artifact>*:*</artifact>
                <excludes>
                  <exclude>META-INF/*.SF</exclude>
                  <exclude>META-INF/*.DSA</exclude>
                  <exclude>META-INF/*.RSA</exclude>
                </excludes>
              </filter>
            </filters>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
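For the IDE half of the question, one Maven-side counterpart of that IDE option might be a profile that re-declares the `provided` dependencies with `compile` scope and is only active inside the IDE. The following is a minimal sketch, not this project's actual configuration. It assumes IntelliJ IDEA and relies on the IDE setting the `idea.version` property, a pattern borrowed from the Flink quickstart pom. The profile id is just an illustrative name.

```xml
<profiles>
  <profile>
    <!-- Illustrative id; any name works. -->
    <id>add-dependencies-for-IDEA</id>
    <activation>
      <!-- Assumption: the IDE defines idea.version when it runs the project. -->
      <property>
        <name>idea.version</name>
      </property>
    </activation>
    <dependencies>
      <!-- Same coordinates as in the main pom, but compile scope,
           so the classes are on the classpath for local runs. -->
      <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-clients_2.11</artifactId>
        <version>${dep.flink.version}</version>
        <scope>compile</scope>
      </dependency>
    </dependencies>
  </profile>
</profiles>
```

Because the profile is inactive in a plain command-line build, the shaded job.jar would still leave out the `provided` artifacts.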
【Discussion】:
Tags: java maven hadoop apache-flink
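A possible direction for bundling Hadoop into the shaded jar, sketched under several assumptions. It assumes `hadoop-client` covers what the job needs for `hdfs://` paths, and `${dep.hadoop.version}` is a hypothetical property that is not defined in the pom above. Two details of the posted shade configuration may matter here. `minimizeJar` can drop classes that are only reached through reflection or `ServiceLoader`, and without a services transformer the `META-INF/services` entries of different jars overwrite each other.

```xml
<!-- Assumption: hadoop-client in the default (compile) scope, so the
     shade plugin includes it. The version property is hypothetical. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>${dep.hadoop.version}</version>
</dependency>

<!-- The two elements below go inside the maven-shade-plugin <configuration>. -->
<filters>
  <!-- Keep every Hadoop class even with minimizeJar enabled. -->
  <filter>
    <artifact>org.apache.hadoop:*</artifact>
    <includes>
      <include>**</include>
    </includes>
  </filter>
</filters>
<transformers>
  <!-- Merge META-INF/services files instead of letting one jar's copy win. -->
  <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
</transformers>
```

Note that, depending on the Flink version, the Hadoop file system may be resolved from the cluster's own classpath (for example the `lib/` directory or `HADOOP_CLASSPATH`) rather than from the job jar, so bundling alone may not make the error go away.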