【Question title】:Does Amazon S3 in Scality support S3AFileSystem to interact with Hadoop?
【Posted】:2018-11-14 19:36:42
【Question】:

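I wrote a Spark job that copies a folder into a standalone Amazon S3 bucket. The process works fine, but now I am trying to use the same process against an Amazon S3 bucket that runs on Scality. This is my configuration: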

spark-submit --name "Backup S3 Test" --master yarn-cluster --executor-memory 2048m --num-executors 6 --executor-cores 2 --driver-memory 1024m --keytab /home/bigdata/userbcks3.keytab \
--principal XXXXXXX@XXXXXXXX \
--deploy-mode cluster \
--conf spark.file.replicate.exclusion.regexps="" \
--conf spark.hadoop.fs.s3a.access.key=XXXXXXXXXX \
--conf spark.hadoop.fs.s3a.secret.key=XXXXXXXXXX \
--class com.keedio.hadoop.FileReplicator hdfs-file-processors-1.1.6-SNAPSHOT.jar /pre/mydata/ s3a://mybucket/

And this is the exception:

com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4221)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4168)
        at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1306)
        at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:1263)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:323)
        ... 20 more
Caused by: com.amazonaws.SdkClientException: The requested metadata is not found at http://169.254.169.254/latest/meta-data/iam/security-credentials/
        at com.amazonaws.internal.EC2CredentialsUtils.readResource(EC2CredentialsUtils.java:115)
        at com.amazonaws.internal.EC2CredentialsUtils.readResource(EC2CredentialsUtils.java:77)
        at com.amazonaws.auth.InstanceProfileCredentialsProvider$InstanceMetadataCredentialsEndpointProvider.getCredentialsEndpoint(InstanceProfileCredentialsProvider.java:156)
        at com.amazonaws.auth.EC2CredentialsFetcher.fetchCredentials(EC2CredentialsFetcher.java:121)
        at com.amazonaws.auth.EC2CredentialsFetcher.getCredentials(EC2CredentialsFetcher.java:82)
        at com.amazonaws.auth.InstanceProfileCredentialsProvider.getCredentials(InstanceProfileCredentialsProvider.java:141)
        at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:129)

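To make the copy I am just using Apache FileUtils, which lets me move files between DistributedFileSystem and S3AFileSystem. Is there any way to make the same process work here? Maybe I am missing a configuration parameter?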

【Comments】:

    Tags: apache-spark hadoop amazon-s3


    【Solution 1】:

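    Whatever you are running is not picking up the fs.s3a.access.key/secret.key values. It is falling through to the other authentication options (environment variables, then the EC2 metadata server) and failing there. You have not even got as far as talking to the far end yet.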
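The fall-through described above can be pictured with a small shell sketch. This is a simplified model of the order in which credential sources are tried, not the actual S3A or AWS SDK code. The function name `describe_credential_source` is made up for illustration; `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are the environment variables the AWS SDK conventionally reads.

```shell
#!/bin/sh
# Simplified model of the credential fall-through (illustrative only).
# $1 = access key as seen by the job's configuration, $2 = secret key.
describe_credential_source() {
  if [ -n "$1" ] && [ -n "$2" ]; then
    # Keys reached the configuration: no fall-through happens.
    echo "configuration"
  elif [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    # Next option: credentials exported in the process environment.
    echo "environment"
  else
    # Last option: the EC2 metadata server at 169.254.169.254,
    # which does not answer outside EC2 and produces the error above.
    echo "instance-metadata"
  fi
}

# Empty arguments model a job where the keys never reached the configuration.
describe_credential_source "" ""
```

In the stack trace, the failing frame is `InstanceProfileCredentialsProvider`, which corresponds to the last branch of this model.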

    If your code worked before and it was running in EC2, then it was probably the metadata server that was logging you in all along...
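When the job runs outside EC2 against an S3-compatible store, one thing to try is to point S3A at that store explicitly and restrict it to the key/secret provider, so that a missing key fails immediately instead of falling through to the metadata server. A sketch, assuming Hadoop 2.8 or later (where `fs.s3a.aws.credentials.provider` and `fs.s3a.path.style.access` are available); the endpoint URL and the `S3A_OPTS` variable are placeholders, not values from the question:

```shell
#!/bin/sh
# Hypothetical endpoint: replace with the address of your own S3 connector.
S3_ENDPOINT="http://scality.example.internal:8000"

# Extra flags for spark-submit. The spark.hadoop. prefix passes each
# property through to the Hadoop configuration used by S3AFileSystem.
S3A_OPTS="--conf spark.hadoop.fs.s3a.endpoint=${S3_ENDPOINT} \
--conf spark.hadoop.fs.s3a.path.style.access=true \
--conf spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider"

# Print the flags so they can be reviewed before use.
echo "$S3A_OPTS"
```

These flags would go into the existing `spark-submit` command next to the `fs.s3a.access.key` and `fs.s3a.secret.key` settings, for example `spark-submit ... $S3A_OPTS --class ...`. Whether path-style access is needed depends on how the endpoint is set up, and this does not by itself explain why the keys were not picked up.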
