【Posted】: 2021-02-08 18:10:54
【Question】:
I just launched my first pipeline in Java and got the following error.
Exception in thread "main" java.lang.IllegalArgumentException: No filesystem found for scheme gs
The pipeline code is as follows.
pipeline.apply("ReadLines", TextIO.read().from(options.getInputFile()))
.apply(MapElements.via(new SampleFn()))
.apply("WriteLines", TextIO
.write()
.to(options.getOutputDir())
.withSuffix(".txt"));
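I started a scratch project from the examples at https://github.com/apache/beam/tree/master/examples/java, but it seems I may be missing some Maven dependencies.
The following pom.xml excerpt shows the dependencies related to Beam and GCP. What am I missing?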
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-sdks-java-core</artifactId>
<version>2.19.0</version>
</dependency>
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-sdks-java-io-google-cloud-platform</artifactId>
<version>2.19.0</version>
<exclusions>
<exclusion>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
</exclusion>
<exclusion>
<groupId>com.google.cloud.bigtable</groupId>
<artifactId>bigtable-client-core</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>${guava.version}</version>
</dependency>
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-vendor-guava-20_0</artifactId>
<version>${beam-vendor-guava.version}</version>
</dependency>
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-sdks-java-extensions-google-cloud-platform-core</artifactId>
<version>2.19.0</version>
</dependency>
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-sdks-java-extensions-protobuf</artifactId>
<version>2.19.0</version>
</dependency>
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-runners-google-cloud-dataflow-java</artifactId>
<version>2.19.0</version>
</dependency>
Edit: shading is already being performed.
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>${maven-shade-plugin.version}</version>
<executions>
<execution>
<id>sample-pipeline-build</id>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<finalName>sample-pipeline-bundled</finalName>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/LICENSE</exclude>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
<transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<mainClass>my.project.SamplePipeline</mainClass>
</transformer>
</transformers>
</configuration>
</execution>
</executions>
</plugin>
Edit 2: contents of META-INF/services/org.apache.beam.sdk.io.FileSystemRegistrar in the bundled jar.
org.apache.beam.sdk.io.LocalFileSystemRegistrar
org.apache.beam.sdk.extensions.gcp.storage.GcsFileSystemRegistrar
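To run the same check at runtime instead of unpacking the jar, here is a minimal sketch that prints every provider listed in that service file as seen by the running classpath. It uses only the JDK; the class name RegistrarCheck and the method listProviders are hypothetical names made up for this illustration, and only the service file path comes from the post above. It reports what the service files declare, not whether Beam actually registered those file systems.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper, not part of Beam: lists service-file entries on the classpath.
class RegistrarCheck {
    // Path taken from the post; the bundled jar should contain this resource.
    static final String SERVICE_FILE =
        "META-INF/services/org.apache.beam.sdk.io.FileSystemRegistrar";

    // Returns every provider class name declared in all copies of the
    // given service file that the class loader can see.
    static List<String> listProviders(ClassLoader loader, String resource) {
        List<String> providers = new ArrayList<>();
        try {
            Enumeration<URL> urls = loader.getResources(resource);
            while (urls.hasMoreElements()) {
                URL url = urls.nextElement();
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        // Service files allow '#' comments; drop them.
                        int hash = line.indexOf('#');
                        if (hash >= 0) {
                            line = line.substring(0, hash);
                        }
                        line = line.trim();
                        if (!line.isEmpty()) {
                            providers.add(line);
                        }
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return providers;
    }

    public static void main(String[] args) {
        ClassLoader loader = RegistrarCheck.class.getClassLoader();
        for (String provider : listProviders(loader, SERVICE_FILE)) {
            System.out.println(provider);
        }
    }
}
```

Running it with the bundled jar on the classpath should list the same registrars as the file contents above if the jar was merged as expected; an empty result would mean the service file is not visible to that class loader.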
【Discussion】:
Tags: google-cloud-dataflow apache-beam