【Question title】:Why "java.lang.ClassNotFoundException: Failed to find data source: kinesis" with spark-streaming-kinesis-asl dependency?
【Posted】:2018-11-29 08:09:01
【Question description】:

My setup:

  scala:2.11.8
  spark:2.3.0.cloudera4

I have added this to my pom.xml file:

<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming-kinesis-asl_2.11</artifactId>
  <version>2.3.0</version>
</dependency>

However, when I run my Spark streaming code to consume data from Kinesis, it fails with:

Exception in thread "main" java.lang.ClassNotFoundException: Failed to find data source: kinesis.

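I hit a similar error when consuming data from Kafka, and resolved it by specifying the dependency jar in the submit command. This time that does not seem to work: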

sudo -u hdfs spark2-submit --packages org.apache.spark:spark-streaming-kinesis-asl_2.11:2.3.0 --class com.package.newkinesis --master yarn  sparktest-1.0-SNAPSHOT.jar 

How can I fix this? Any help is appreciated.

My code:

val spark = SparkSession
  .builder.master("local[4]")
  .appName("SpeedTester")
  .config("spark.driver.memory", "3g")
  .getOrCreate()

val kinesis = spark.readStream
  .format("kinesis")
  .option("streamName", kinesisStreamName)
  .option("endpointUrl", kinesisEndpointUrl)
  .option("initialPosition", "TRIM_HORIZON")
  .option("awsAccessKey", awsAccessKeyId)
  .option("awsSecretKey", awsSecretKey)
  .load()

kinesis.writeStream.format("console").start().awaitTermination()

My full pom.xml file:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.netease</groupId>
  <artifactId>sparktest</artifactId>
  <version>1.0-SNAPSHOT</version>
  <inceptionYear>2008</inceptionYear>
  <properties>
    <scala.version>2.11.8</scala.version>
  </properties>
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>3.2.1</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <includes>
                                <include>org/apache/spark/*</include>
                            </includes>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

  <dependencies>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.11</artifactId>
        <scope>provided</scope>
      <version>2.3.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_2.11</artifactId>
        <scope>provided</scope>
      <version>2.3.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.11</artifactId>
        <scope>provided</scope>
      <version>2.3.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
      <version>2.3.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka-clients</artifactId>
      <version>2.1.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-kinesis-asl_2.11</artifactId>
      <version>2.3.0</version>
    </dependency>
  </dependencies>
</project>

【Question comments】:

Tags: scala apache-spark amazon-kinesis spark-structured-streaming


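【Solution 1】:

tl;dr This won't work.

You are using the spark-streaming-kinesis-asl_2.11 dependency, which targets the legacy Spark Streaming (DStream) API, together with the new Spark Structured Streaming API (spark.readStream). That module does not register a Structured Streaming data source named "kinesis", hence the exception.

You have to find a compatible Spark Structured Streaming data source for AWS Kinesis; the Apache Spark project does not officially provide one.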
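
If falling back to the legacy DStream API is acceptable, the dependency already in the pom.xml can be used directly. The following is only a minimal sketch under stated assumptions, not a tested fix for this exact cluster: the object name, stream name, endpoint, region, and the environment variables used for credentials are placeholders that do not come from the question, and `initialPosition` with `KinesisInitialPositions` is assumed to be available as in upstream Spark 2.3.0 (a vendor build may differ).

```scala
import java.nio.charset.StandardCharsets

import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kinesis.{KinesisInitialPositions, KinesisInputDStream, SparkAWSCredentials}

object KinesisDStreamSketch {
  def main(args: Array[String]): Unit = {
    // Placeholder values (assumptions): replace with your own settings.
    val kinesisStreamName  = "my-stream"
    val kinesisEndpointUrl = "https://kinesis.us-east-1.amazonaws.com"
    val kinesisRegion      = "us-east-1"
    val awsAccessKeyId     = sys.env("AWS_ACCESS_KEY_ID")
    val awsSecretKey       = sys.env("AWS_SECRET_ACCESS_KEY")

    // The master is left to spark2-submit (--master yarn) rather than hard-coded.
    val conf = new SparkConf().setAppName("SpeedTester")
    val ssc  = new StreamingContext(conf, Seconds(10))

    // Explicit credentials, mirroring the awsAccessKey/awsSecretKey options in the question.
    val credentials = SparkAWSCredentials.builder
      .basicCredentials(awsAccessKeyId, awsSecretKey)
      .build()

    // DStream-based receiver from spark-streaming-kinesis-asl; yields DStream[Array[Byte]].
    val records = KinesisInputDStream.builder
      .streamingContext(ssc)
      .streamName(kinesisStreamName)
      .endpointUrl(kinesisEndpointUrl)
      .regionName(kinesisRegion)
      .initialPosition(new KinesisInitialPositions.TrimHorizon())
      .checkpointAppName("SpeedTester") // also used as the KCL DynamoDB table name
      .checkpointInterval(Seconds(10))
      .storageLevel(StorageLevel.MEMORY_AND_DISK_2)
      .kinesisCredentials(credentials)
      .build()

    // Decode each record as UTF-8 text and print a sample of every batch.
    records.map(bytes => new String(bytes, StandardCharsets.UTF_8)).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

This trades the DataFrame-based `readStream`/`writeStream` pipeline for micro-batch DStreams, so any downstream logic written against Structured Streaming would need to be adapted.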

【Comments】:
