【Question title】: ODBC configuration to connect to Spark Thrift Server
【Posted】: 2017-09-15 17:40:56
【Question】:

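This question may look like a duplicate. I have seen several related questions, but none with exactly the same error, so I would like to know whether anyone has a clue.

I have set up a Spark Thrift Server running with the default settings. The Spark version is 2.1 and it runs on YARN (Hadoop 2.7.3).

The problem is that I cannot configure either the Simba Hive ODBC driver or the Microsoft one in such a way that the test in the ODBC setup dialog succeeds.

This is the configuration I use for the Microsoft Hive ODBC driver:

When I click the Test button, the following error message is shown:

The Spark Thrift Server log shows the following: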

17/09/15 17:31:36 INFO ThriftCLIService: Client protocol version: HIVE_CLI_SERVICE_PROTOCOL_V1
17/09/15 17:31:36 INFO SessionState: Created local directory: /tmp/00abf145-2928-4995-81f2-fea578280c42_resources
17/09/15 17:31:36 INFO SessionState: Created HDFS directory: /tmp/hive/test/00abf145-2928-4995-81f2-fea578280c42
17/09/15 17:31:36 INFO SessionState: Created local directory: /tmp/vagrant/00abf145-2928-4995-81f2-fea578280c42
17/09/15 17:31:36 INFO SessionState: Created HDFS directory: /tmp/hive/test/00abf145-2928-4995-81f2-fea578280c42/_tmp_space.db
17/09/15 17:31:36 INFO HiveSessionImpl: Operation log session directory is created: /tmp/vagrant/operation_logs/00abf145-2928-4995-81f2-fea578280c42
17/09/15 17:31:36 INFO SparkExecuteStatementOperation: Running query 'set -v' with 82d7f9a6-f2a6-4ebd-93bb-5c8da1611f84
17/09/15 17:31:36 INFO SparkSqlParser: Parsing command: set -v
17/09/15 17:31:36 INFO SparkExecuteStatementOperation: Result Schema: StructType(StructField(key,StringType,false), StructField(value,StringType,false), StructField(meaning,StringType,false))

If I connect with Beeline using the JDBC driver (which works fine), these are the logs:

17/09/15 17:04:24 INFO ThriftCLIService: Client protocol version: HIVE_CLI_SERVICE_PROTOCOL_V8
17/09/15 17:04:24 INFO SessionState: Created HDFS directory: /tmp/hive/test
17/09/15 17:04:24 INFO SessionState: Created local directory: /tmp/c0681d6f-cc0f-40ae-970d-e3ea366aa414_resources
17/09/15 17:04:24 INFO SessionState: Created HDFS directory: /tmp/hive/test/c0681d6f-cc0f-40ae-970d-e3ea366aa414
17/09/15 17:04:24 INFO SessionState: Created local directory: /tmp/vagrant/c0681d6f-cc0f-40ae-970d-e3ea366aa414
17/09/15 17:04:24 INFO SessionState: Created HDFS directory: /tmp/hive/test/c0681d6f-cc0f-40ae-970d-e3ea366aa414/_tmp_space.db
17/09/15 17:04:24 INFO HiveSessionImpl: Operation log session directory is created: /tmp/vagrant/operation_logs/c0681d6f-cc0f-40ae-970d-e3ea366aa414
17/09/15 17:04:24 INFO SparkSqlParser: Parsing command: use default
17/09/15 17:04:25 INFO HiveMetaStore: 1: get_database: default
17/09/15 17:04:25 INFO audit: ugi=vagrant   ip=unknown-ip-addr  cmd=get_database: default   
17/09/15 17:04:25 INFO HiveMetaStore: 1: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
17/09/15 17:04:25 INFO ObjectStore: ObjectStore, initialize called
17/09/15 17:04:25 INFO Query: Reading in results for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is closing
17/09/15 17:04:25 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
17/09/15 17:04:25 INFO ObjectStore: Initialized ObjectStore
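
One visible difference between the two excerpts is the client protocol version the server reports: V1 for the ODBC attempt and V8 for Beeline. Whether that difference causes the failure is not established here, but it is easy to pull out of a log for comparison. The snippet below is a minimal sketch in Python; the function name `client_protocol_versions` is made up for illustration, and the two sample lines are copied from the logs above.

```python
import re

# Matches the line ThriftCLIService logs when a client opens a session.
PROTOCOL_RE = re.compile(
    r"Client protocol version: HIVE_CLI_SERVICE_PROTOCOL_V(\d+)"
)


def client_protocol_versions(log_text):
    """Return the numeric client protocol versions found in a log, in order.

    Hypothetical helper, not part of Spark or Hive.
    """
    return [int(m.group(1)) for m in PROTOCOL_RE.finditer(log_text)]


# Sample lines taken from the two log excerpts in the question.
odbc_log = (
    "17/09/15 17:31:36 INFO ThriftCLIService: "
    "Client protocol version: HIVE_CLI_SERVICE_PROTOCOL_V1"
)
jdbc_log = (
    "17/09/15 17:04:24 INFO ThriftCLIService: "
    "Client protocol version: HIVE_CLI_SERVICE_PROTOCOL_V8"
)

print(client_protocol_versions(odbc_log))  # [1]
print(client_protocol_versions(jdbc_log))  # [8]
```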

【Comments】:

    Tags: apache-spark hive spark-thriftserver


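    【Solution 1】:

    I managed to connect successfully by installing the Microsoft Spark ODBC driver instead of the Hive one. The problem appears to be that the Hive driver refuses to connect to Spark Thrift Server once it discovers, from some server property, that it is not talking to a Hive2 server. I doubt there is any real difference at the wire level between Hive2 and Spark Thrift Server, since the latter is a port of the former with no changes at the protocol (Thrift) level. In any case, the solution is to switch to this driver and configure it the same way as for Hive2: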

    Microsoft® Spark ODBC Driver
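
For readers who prefer a DSN-less connection over the ODBC setup dialog, the same settings can be sketched as a connection string. This is only an illustration under stated assumptions: the key names (`SparkServerType`, `AuthMech`, `ThriftTransport`) and their values follow the Simba-based Spark ODBC driver documentation as far as I recall it and should be checked against the installed driver; the driver name, host name, and the helper `spark_odbc_connection_string` are hypothetical; port 10000 is the usual Thrift server default.

```python
def spark_odbc_connection_string(host, port=10000, user=None,
                                 driver="Microsoft Spark ODBC Driver"):
    """Build a DSN-less ODBC connection string for Spark Thrift Server.

    Hypothetical helper. Key names and values are assumptions based on the
    Simba-based Spark ODBC driver and must be verified for your driver version.
    """
    parts = [
        ("Driver", "{" + driver + "}"),  # assumed registered driver name
        ("Host", host),
        ("Port", str(port)),
        ("SparkServerType", "3"),        # assumed: 3 = Spark Thrift Server
        ("AuthMech", "2"),               # assumed: 2 = user name only
        ("ThriftTransport", "1"),        # assumed: 1 = SASL
    ]
    if user:
        parts.append(("UID", user))
    return ";".join(f"{key}={value}" for key, value in parts)


# The host below is a placeholder, not a value from the question.
conn_str = spark_odbc_connection_string("sparkthrift.example.com", user="test")
print(conn_str)
```

With the driver installed, a string like this could be passed to `pyodbc.connect(conn_str, autocommit=True)`; that call is not part of the snippet, so the snippet runs without any ODBC driver present.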

    【Comments】:

    • Have you run into any performance issues with the setup above, compared with connecting a BI tool (Power BI/Tableau) through the Thrift server's HTTP option?