【Question Title】: Running simple example using Spark and Hive throws exception
【Posted】: 2021-10-02 00:45:07
【Question Description】:

Following these instructions, I am trying to run a simple program that uses both Spark and Hive:

from pyspark.sql import SparkSession
appName = "PySpark Hive Example"
master = "local[*]"
spark = SparkSession.builder \
             .appName(appName) \
             .master(master) \
             .enableHiveSupport() \
             .getOrCreate()
# Read data using Spark
df = spark.sql("show databases")
df.show()

I get this exception:

2021-07-25 20:46:21,477 WARN DataNucleus.Query: Query for candidates of org.apache.hadoop.hive.metastore.model.MConstraint and subclasses resulted in no possible candidates
Class "org.apache.hadoop.hive.metastore.model.MDatabase" field "org.apache.hadoop.hive.metastore.model.MDatabase.catalogName" : declared in MetaData, but this field doesnt exist in the class!
org.datanucleus.metadata.InvalidClassMetaDataException: Class "org.apache.hadoop.hive.metastore.model.MDatabase" field "org.apache.hadoop.hive.metastore.model.MDatabase.catalogName" : declared in MetaData, but this field doesnt exist in the class!
        at org.datanucleus.metadata.ClassMetaData.populateMemberMetaData(ClassMetaData.java:846)
        at org.datanucleus.metadata.ClassMetaData.populate(ClassMetaData.java:219)
        at org.datanucleus.metadata.MetaDataManagerImpl$1.run(MetaDataManagerImpl.java:2896)
        at java.security.AccessController.doPrivileged(Native Method)
        at org.datanucleus.metadata.MetaDataManagerImpl.populateAbstractClassMetaData(MetaDataManagerImpl.java:2890)
        at org.datanucleus.metadata.MetaDataManagerImpl.populateFileMetaData(MetaDataManagerImpl.java:2689)
        at org.datanucleus.api.jdo.metadata.JDOMetaDataManager.loadXMLMetaDataForClass(JDOMetaDataManager.java:806)
        at org.datanucleus.api.jdo.metadata.JDOMetaDataManager.getMetaDataForClassInternal(JDOMetaDataManager.java:406)
        at org.datanucleus.metadata.MetaDataManagerImpl.getMetaDataForClass(MetaDataManagerImpl.java:1660)
        at org.datanucleus.metadata.MetaDataManagerImpl.getMetaDataForClass(MetaDataManagerImpl.java:1607)
        at org.datanucleus.metadata.AbstractClassMetaData.getReferencedClassMetaData(AbstractClassMetaData.java:1634)
        at org.datanucleus.metadata.AbstractClassMetaData.getReferencedClassMetaData(AbstractClassMetaData.java:1592)
        at org.datanucleus.metadata.MetaDataManagerImpl.getReferencedClassMetaData(MetaDataManagerImpl.java:3108)
        at org.datanucleus.metadata.MetaDataManagerImpl.getReferencedClasses(MetaDataManagerImpl.java:3078)
        at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.addClassTables(RDBMSStoreManager.java:2998)
        at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:2886)
        at org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:119)
        at org.datanucleus.store.rdbms.RDBMSStoreManager.manageClasses(RDBMSStoreManager.java:1627)
        at org.datanucleus.store.rdbms.RDBMSStoreManager.getDatastoreClass(RDBMSStoreManager.java:672)
        at org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getStatementForCandidates(RDBMSQueryUtils.java:425)
        at org.datanucleus.store.rdbms.query.JDOQLQuery.compileQueryFull(JDOQLQuery.java:865)
        at org.datanucleus.store.rdbms.query.JDOQLQuery.compileInternal(JDOQLQuery.java:347)
        at org.datanucleus.store.query.Query.executeQuery(Query.java:1816)
        at org.datanucleus.store.query.Query.executeWithArray(Query.java:1744)
        at org.datanucleus.store.query.Query.execute(Query.java:1726)
        at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:374)
        at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:216)
        at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.ensureDbInit(MetaStoreDirectSql.java:190)
        at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.<init>(MetaStoreDirectSql.java:144)
        at org.apache.hadoop.hive.metastore.ObjectStore.initializeHelper(ObjectStore.java:410)
        at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:342)
        at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:303)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:77)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:137)
        at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:58)
        at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:67)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStoreForConf(HiveMetaStore.java:628)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMSForConf(HiveMetaStore.java:594)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:588)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:655)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:431)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:79)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:92)
        at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6902)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:164)
        at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:70)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1707)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:83)
        ...

NestedThrowablesStackTrace:
Identifier principalName is unresolved (not a static field)
org.datanucleus.exceptions.NucleusUserException: Identifier principalName is unresolved (not a static field)

I don't know how to fix this.

【Comments】:

    Tags: apache-spark pyspark hive


    【Solution 1】:

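    I ran into the same problem when setting up a new Hive/Hadoop cluster. I was using the Hive Metastore embedded in the Hive server, and I was able to get Spark working by setting up a separate Hive Metastore (as described in the wiki) and then setting hive.metastore.uris in $SPARK_CONF_DIR/hive-site.xml to the Thrift URL of that new external Metastore.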
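The configuration change described in this answer could look roughly like the sketch below, which renders a minimal hive-site.xml containing only the hive.metastore.uris property. The host name `metastore-host` is a placeholder, and port 9083 is only the conventional Metastore Thrift port; the answer does not state either, so substitute the address of your own external Metastore. The helper name `render_hive_site` is made up for illustration.

```python
from xml.sax.saxutils import escape

# Minimal hive-site.xml holding only the property the answer mentions.
HIVE_SITE_TEMPLATE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>{uri}</value>
  </property>
</configuration>
"""


def render_hive_site(metastore_uri: str) -> str:
    """Return hive-site.xml text pointing Spark at an external Metastore.

    The URI is XML-escaped so characters such as '&' stay well-formed.
    """
    return HIVE_SITE_TEMPLATE.format(uri=escape(metastore_uri))


if __name__ == "__main__":
    # Assumption: placeholder host and the conventional Thrift port 9083.
    print(render_hive_site("thrift://metastore-host:9083"))
```

Saving the printed text as $SPARK_CONF_DIR/hive-site.xml and restarting the PySpark program should make the session created with enableHiveSupport() connect to the external Metastore instead of starting an embedded one.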

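    【Comments】:

    • When sharing external links, please quote the relevant part in your answer. Links can break or become outdated over time, so it is best practice to include a "snapshot" of the relevant information in the answer itself.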