Hive with Tez: Unable to load AWS credentials from any provider in the chain

【Question title】: Hive with Tez: Unable to load AWS credentials from any provider in the chain
【Posted】: 2017-03-14 04:23:13
【Question】:

Environment: Hadoop 2.7.3, hive-2.2.0-SNAPSHOT, Tez 0.8.4

My core-site.xml:

<property>
  <name>fs.s3a.aws.credentials.provider</name>
  <value>
    org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider,
    org.apache.hadoop.fs.s3a.BasicAWSCredentialsProvider,
    com.amazonaws.auth.EnvironmentVariableCredentialsProvider
  </value>
</property>
<property>
  <name>fs.s3a.impl</name>
  <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
  <description></description>
</property>
<property>
  <name>fs.s3a.access.key</name>
  <value>GOODKEYVALUE</value>
  <description>AWS access key ID. Omit for Role-based authentication.</description>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>SECRETKEYVALUE</value>
  <description>AWS secret key. Omit for Role-based authentication.</description>
</property>

I can access s3a URIs correctly from the hadoop command line. I can create external tables, and commands such as:

create external table mytable(a string, b string) location 's3a://mybucket/myfolder/';  
select * from mytable limit 20;

execute correctly, but

select count(*) from mytable;

fails with:

Error: org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1489267689011_0001_1_00, diagnostics=[Vertex vertex_1489267689011_0001_1_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: url_sum_master initializer failed, vertex=vertex_1489267689011_0001_1_00 [Map 1], com.amazonaws.SdkClientException: Unable to load AWS credentials from any provider in the chain
        at com.amazonaws.auth.AWSCredentialsProviderChain.getCredentials(AWSCredentialsProviderChain.java:131)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.getCredentialsFromContext(AmazonHttpClient.java:1110)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.runBeforeRequestHandlers(AmazonHttpClient.java:759)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:723)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:716)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4194)
        at com.amazonaws.services.s3.AmazonS3Client.getBucketRegionViaHeadRequest(AmazonS3Client.java:4949)
        at com.amazonaws.services.s3.AmazonS3Client.fetchRegionFromCache(AmazonS3Client.java:4923)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4178)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4141)
        at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1313)
        at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:1270)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:297)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
        at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:258)
        at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:229)
        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:365)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:483)
        at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:196)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
]Vertex killed, vertexName=Reducer 2, vertexId=vertex_1489267689011_0001_1_01, diagnostics=[Vertex received Kill in INITED state., Vertex vertex_1489267689011_0001_1_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
        at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:393)
        at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:250)
        at org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
        at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:340)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
        at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:353)

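The only way I can get it to work is to put accesskey:secretkey in the URI itself, which is not an option for production code.

Thanks.

【Comments】:

I should mention: within beeline, `set fs.s3a.aws.credentials.provider;` returns the three credential providers listed in core-site.xml, so hiveserver2 is aware of the settings, yet the SdkClientException is still thrown. My Hadoop was compiled with aws-java-sdk-1.11.93. The jackson jars are 2.6.6 and the httpclient jar is 4.5.2.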

Tags: hadoop amazon-s3 hive

【Answer 1】:

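You are right, you don't want secrets in the URI. Soon Hadoop will warn you against doing that, and at some point it may block it altogether.

Look at the "Troubleshooting S3A" section of the latest s3a docs.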

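If you are building Hadoop yourself (which your choice of SDK version implies), then build Hadoop 2.8/2.9 and turn on debug logging in the s3a package. There is more logging of the security setup there, though it still deliberately logs less than you might want, in order to keep those keys secret.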
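As a rough sketch of what "turn on debug logging in the s3a package" can look like with Hadoop's log4j 1.x configuration: the logger name `org.apache.hadoop.fs.s3a` is the package from the stack trace above, while the helper name `enable_s3a_debug` and the choice of properties file are assumptions for illustration. Which log4j file actually governs the failing process (here the Tez application master started on behalf of hiveserver2) depends on the deployment.

```shell
# Hypothetical helper: append a DEBUG logger for the s3a package to a
# log4j 1.x properties file, unless that exact line is already present.
enable_s3a_debug() {
  props_file="$1"   # path to the log4j.properties file to edit (assumed)
  line='log4j.logger.org.apache.hadoop.fs.s3a=DEBUG'
  # grep -x matches the whole line, -F treats the pattern as a fixed string.
  grep -qxF "$line" "$props_file" 2>/dev/null || printf '%s\n' "$line" >> "$props_file"
}

# Example call, only if HADOOP_CONF_DIR is set in this shell:
if [ -n "${HADOOP_CONF_DIR:-}" ] && [ -w "${HADOOP_CONF_DIR}/log4j.properties" ]; then
  enable_s3a_debug "${HADOOP_CONF_DIR}/log4j.properties"
fi
```

The helper is idempotent, so running it again does not duplicate the line.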

You could also try setting the AWS environment variables on the target machine. That won't fix the problem, but it may help isolate it.
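A minimal sketch of that isolation step, assuming the standard variable names read by `com.amazonaws.auth.EnvironmentVariableCredentialsProvider` (`AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`). The values are the placeholders from the question, and `check_aws_env` is a hypothetical helper that reports whether the variables are exported without printing them.

```shell
# Placeholder values from the question; substitute real credentials.
export AWS_ACCESS_KEY_ID="GOODKEYVALUE"
export AWS_SECRET_ACCESS_KEY="SECRETKEYVALUE"

# Hypothetical helper: report whether both variables are exported,
# without echoing the secret values. Returns non-zero if any is missing.
check_aws_env() {
  missing=0
  for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
    # printenv only sees exported variables, which is what a child JVM inherits.
    if [ -z "$(printenv "$name")" ]; then
      echo "$name is not set"
      missing=1
    else
      echo "$name is set"
    fi
  done
  return "$missing"
}

check_aws_env
```

The variables have to be visible to the JVM that initialises S3A. In the stack trace above that is the Tez application master (the `RootInputInitializerManager` frames), so exporting them only in an interactive shell on the client is not necessarily enough.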

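【Comments】:

Thank you, Steve: I spent the weekend reading your answers, and I really appreciate what you do. However, I no longer think this is a Hadoop problem. From the Hadoop command line I can read and write s3a. I can also run the Tez jars against S3, so I think the problem is in hiveserver2. My first set of errors in Hive was related to Jackson jar versions. Once that was fixed, the next set concerned the httpclient jars (Hive 2.2-SNAPSHOT uses 4.5.2, Hadoop uses 4.2.5). This is not a Hadoop problem: if Tez can run a word count and put the results in an S3 bucket, then it can reach one of the credential providers.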

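【Answer 2】:

I solved this by reverting to Hive 2.1.1.

I think the problem was incompatible jar versions. My hadoop-aws-2.7.3.jar was compiled against aws-java-sdk-1.11.93, whereas Hive was pulling in a version compiled against aws 1.7.4.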
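One way to spot this kind of mismatch is to list the AWS-related jars under both installs and compare the file names. This is only a sketch: the function names are hypothetical, and `HADOOP_HOME` and `HIVE_HOME` are assumed to point at the two install directories, which may not match a given layout.

```shell
# Hypothetical helper: list AWS SDK and hadoop-aws jars under the given
# directories (full paths, sorted).
list_aws_jars() {
  find "$@" -type f \( -name 'aws-java-sdk*.jar' -o -name 'hadoop-aws*.jar' \) 2>/dev/null \
    | sort
}

# Hypothetical helper: print only the distinct jar file names, so that two
# different aws-java-sdk versions across the installs stand out.
distinct_aws_jar_names() {
  list_aws_jars "$@" | sed 's|.*/||' | sort -u
}

# Example call, only if both variables are set in this shell:
if [ -n "${HADOOP_HOME:-}" ] && [ -n "${HIVE_HOME:-}" ]; then
  distinct_aws_jar_names "$HADOOP_HOME" "$HIVE_HOME"
fi
```

If the output shows more than one aws-java-sdk version, as with 1.11.93 and 1.7.4 in this case, the classpath seen by hiveserver2 is a likely place to look.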

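【Comments】:

Yes, you can't mix AWS SDK versions. I should add that to the troubleshooting section of the Hadoop docs, or you could, if you want to file a bug report... first-hand accounts like this are always useful to readers.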