【Question title】: java.lang.ClassNotFoundException: com.mysql.jdbc.Driver in Jupyter Notebook on Amazon EMR
【Posted】: 2020-08-06 19:34:59
【Question】:

When trying to connect from an EMR Jupyter Notebook to a MySQL database in RDS, I get the error shown below.

Code used:

from pyspark.sql import SparkSession
hostname="hostname"
dbname = "mysql"
jdbcPort = 3306
username = "user"
password = "password"
jdbc_url = "jdbc:mysql://{0}:{1}/{2}?user={3}&password={4}".format(hostname,jdbcPort, dbname,username,password)
query = "(select * from framework.File_Columns) as table1"
df1 = spark.read.format('jdbc').options(driver = 'com.mysql.jdbc.Driver',url=jdbc_url, dbtable=query ).load()
df1.show()

Error message:

An error occurred while calling o89.showString. : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, ip-172-31-37-50.us-west-2.compute.internal, executor 1): java.lang.ClassNotFoundException: com.mysql.jdbc.Driver

I have downloaded the required mysql-connector-java-5.1.47.jar to /home/hadoop/mysql-connector-java-5.1.47.jar and updated the Spark configuration file as follows:

spark.master                     yarn

spark.driver.extraClassPath      :/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/home/hadoop/extrajars/*:/home/hadoop/extrajars/mysql-connector-java-5.1.47.jar

spark.driver.extraLibraryPath    /usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/home/hadoop/extrajars/*:/home/hadoop/extrajars/mysql-connector-java-5.1.47.jar

spark.executor.extraClassPath    :/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/home/hadoop/extrajars/*:/home/hadoop/extrajars/mysql-connector-java-5.1.47.jar

spark.executor.extraLibraryPath  /usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/home/hadoop/extrajars/*:/home/hadoop/extrajars/mysql-connector-java-5.1.47.jar

Is there anything else I need to do to connect to the MySQL DB from the Jupyter Notebook?

【Comments】:

    Tags: python pyspark jupyter-notebook amazon-emr mysql-connector


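    【Solution 1】:

    The driver class cannot be found when the code runs from the Jupyter Notebook. To work around this, you can try copying mysql-connector-java-5.1.47.jar into the $SPARK_HOME/jars folder. In my experience, this resolves the driver problem.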
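A quick way to confirm that the copy took effect is to look for the connector JAR under $SPARK_HOME/jars. The sketch below is only an illustration: `find_connector_jars` is a hypothetical helper, and `/usr/lib/spark` is an assumed fallback used when SPARK_HOME is not set. Because the stack trace in the question reports the failure on an executor, the same check would probably need to pass on the worker nodes as well, not only on the node that hosts the notebook.

```python
import glob
import os


def find_connector_jars(spark_home, pattern="mysql-connector-java-*.jar"):
    """Return the sorted paths of connector JARs found in <spark_home>/jars."""
    jars_dir = os.path.join(spark_home, "jars")
    return sorted(glob.glob(os.path.join(jars_dir, pattern)))


if __name__ == "__main__":
    # /usr/lib/spark is an assumed default, used only when SPARK_HOME is unset.
    spark_home = os.environ.get("SPARK_HOME", "/usr/lib/spark")
    matches = find_connector_jars(spark_home)
    if matches:
        print("Connector JAR(s) found:", matches)
    else:
        print("No connector JAR under", os.path.join(spark_home, "jars"))
```

If the list comes back empty on any node, the ClassNotFoundException is likely to persist for tasks scheduled there.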

    【Comments】:

      【Solution 2】:

      You can also do this:

      spark.conf.set("jars", "s3://bucket-name/folder-name/mysql-connector-java-5.1.38-bin.jar")
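A note on the line above: the property documented by Spark is `spark.jars`, and it is normally read when the session is created, so setting it on a session that is already running may have no effect. Assuming the notebook is backed by sparkmagic and Livy (common for EMR Notebooks, but an assumption here), the JAR can instead be declared before the session starts with a `%%configure -f` cell. The sketch below only builds the JSON body for such a cell; `build_session_config` is a hypothetical helper, and the S3 path is the placeholder from the answer.

```python
import json


def build_session_config(jar_uri):
    """Build a session config that lists the connector JAR under spark.jars."""
    return {"conf": {"spark.jars": jar_uri}}


if __name__ == "__main__":
    # Placeholder path taken from the answer above, not a real location.
    jar_uri = "s3://bucket-name/folder-name/mysql-connector-java-5.1.38-bin.jar"
    # Paste the printed JSON after `%%configure -f` in the first notebook cell.
    print(json.dumps(build_session_config(jar_uri), indent=2))
```

Declaring the JAR at session start lets Spark distribute it to the driver and the executors, which is where the class was reported missing.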

      【Comments】:
