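【Posted】: 2020-05-11 04:42:00
【Question】:
AWS EMR MetaException with message: Insufficient Lake Formation permission(s) on global_temp
I am trying to use a Spark session to query data through the Hive metastore. The tables are cataloged in Glue. I can read the S3 Parquet files directly with spark.read.parquet. However, I want to query the catalog from EMR using the spark.sql method.
When I run this line: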
spark.sql("show tables from gluedatabase").show()
I get an error like this:
'org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Insufficient Lake Formation permission(s) on global_temp (Service: AWSGlue; Status Code: 400; Error Code: AccessDeniedException));'
Traceback (most recent call last):
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/session.py", line 767, in sql
    return DataFrame(self._jsparkSession.sql(sqlQuery), self._wrapped)
  File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 69, in deco
    raise AnalysisException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.AnalysisException: 'org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Insufficient Lake Formation permission(s) on global_temp (Service: AWSGlue; Status Code: 400; Error Code: AccessDeniedException));'
【Comments】:
-
It looks like you need to grant Lake Formation permissions to your IAM user.
-
Can you try this: github.com/aws-samples/aws-glue-samples/tree/master/utilities/… ? It should help.
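The first comment's suggestion (granting Lake Formation permissions to the principal that runs the query) could look roughly like the sketch below. It uses boto3's Lake Formation client and its grant_permissions call. The principal ARN shown is a made-up placeholder, the helper names build_grant_request and grant_database_access are hypothetical, and choosing DESCRIBE on the database as the permission to grant is an assumption; the thread does not confirm which grant resolves the error, or whether the principal should be an IAM user or the role the EMR cluster runs as.

```python
def build_grant_request(principal_arn, database, permissions=("DESCRIBE",)):
    """Build the keyword arguments for Lake Formation's GrantPermissions call.

    principal_arn: ARN of the IAM user or role that runs the Spark job.
    database:      name of the Glue database to grant access on.
    permissions:   database-level permissions to grant (assumption: DESCRIBE).
    """
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {"Database": {"Name": database}},
        "Permissions": list(permissions),
    }


def grant_database_access(principal_arn, database, permissions=("DESCRIBE",)):
    """Send the grant to Lake Formation; needs boto3 and admin credentials."""
    import boto3  # imported here so the request builder works without boto3

    client = boto3.client("lakeformation")
    request = build_grant_request(principal_arn, database, permissions)
    return client.grant_permissions(**request)


# Example call (placeholder ARN; "gluedatabase" is the name used in the question):
# grant_database_access(
#     "arn:aws:iam::123456789012:role/example-emr-role",
#     "gluedatabase",
# )
```

The request is built separately from the API call so it can be inspected before anything is sent. The call has to be made by a principal that is a Lake Formation administrator (or otherwise allowed to grant), and the same grant can be made from the Lake Formation console instead.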
Tags: pyspark amazon-emr aws-glue aws-glue-data-catalog hive-metastore