【Posted】: 2020-05-28 17:40:22
【Question】:
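I am using AWS Glue, where you cannot read or write multiple dynamic frames without iterating. I wrote the code below, but I am stuck on two things:
- Whether "tableName", i.e. the filtered list of tables, is correct (all the tables I want to iterate over start with client_historical_*).
- I don't know how to populate the Redshift table name dynamically using the mapping below.
Redshift mapping: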
client_historical_ks --> table_01_a
client_historical_kg --> table_01_b
client_historical_kt --> table_01_c
client_historical_kf --> table_01_d
Code:
client = boto3.client('glue',region_name='us-east-1')
databaseName = 'incomingdata'
tables = client.get_tables(DatabaseName = databaseName)
tableList = tables['TableList']
for table in tableList:
    start_prefix = client_historical_
    tableName = list(filter(lambda x: x.startswith(start_prefix), table['Name']))
    datasource0 = glueContext.create_dynamic_frame.from_catalog(database = "incomingdata", table_name = tableName, transformation_ctx = "datasource0")
    datasink4 = glueContext.write_dynamic_frame.from_jdbc_conf(frame = datasource0, catalog_connection = "Redshift", connection_options = {"dbtable": "nameoftablehere", "database": "metadata"}, redshift_tmp_dir = args["TempDir"], transformation_ctx = "datasink4")
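On the first point: even with the prefix quoted, `filter(..., table['Name'])` walks the characters of a single name string, so `tableName` would come out as an empty list rather than a table name. Below is a minimal sketch of one alternative, assuming `tables['TableList']` is a list of dicts with a `Name` key as the code above implies. It tests each name with `startswith` and looks up the Redshift target in a dict built from the mapping in the question. `plan_copies`, `REDSHIFT_TABLES`, the skip-if-unmapped policy, and the sample data are hypothetical names introduced for illustration, not part of Glue.

```python
PREFIX = "client_historical_"

# Mapping copied from the question: Glue catalog table -> Redshift table.
REDSHIFT_TABLES = {
    "client_historical_ks": "table_01_a",
    "client_historical_kg": "table_01_b",
    "client_historical_kt": "table_01_c",
    "client_historical_kf": "table_01_d",
}


def plan_copies(table_list, prefix=PREFIX, mapping=REDSHIFT_TABLES):
    """Return (glue_table, redshift_table) pairs for tables matching prefix."""
    pairs = []
    for table in table_list:
        name = table["Name"]
        if not name.startswith(prefix):
            continue  # not one of the client_historical_* tables
        if name not in mapping:
            continue  # assumed policy: skip tables with no Redshift target
        pairs.append((name, mapping[name]))
    return pairs


# Stand-in for tables["TableList"]; "other_table" is invented to show the filter.
sample = [
    {"Name": "client_historical_ks"},
    {"Name": "client_historical_kt"},
    {"Name": "other_table"},
]
print(plan_copies(sample))
# [('client_historical_ks', 'table_01_a'), ('client_historical_kt', 'table_01_c')]
```

Each pair then supplies `table_name` for the read and `dbtable` for the write. Note that `get_tables` is paginated, so for a large database a single call may not return every table.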
【Comments】:
-
start_prefix = client_historical_ ... can you put it in quotes (start_prefix = 'client_historical_') and try again? By the way, does this code work or not? If it does not, what error are you getting? Please add more information.
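Beyond quoting the prefix, the read and write calls would need to run once per table, with a different `dbtable` each time. The following is a rough sketch under stated assumptions, not a tested Glue job: it reuses the two `glueContext` calls exactly as they appear in the question and takes the context as a parameter so the functions can be exercised outside Glue. `copy_table`, `copy_all`, and the `pairs` argument (a list of `(glue_table, redshift_table)` tuples) are hypothetical names. Giving each table its own `transformation_ctx` is an assumption made so that per-table state, such as job bookmarks, is not shared between iterations.

```python
def copy_table(glue_context, database, source, target, tmp_dir):
    """Read one catalog table and write it to one Redshift table."""
    frame = glue_context.create_dynamic_frame.from_catalog(
        database=database,
        table_name=source,  # a single table name (str), not a list
        transformation_ctx="datasource_" + source,
    )
    glue_context.write_dynamic_frame.from_jdbc_conf(
        frame=frame,
        catalog_connection="Redshift",
        connection_options={"dbtable": target, "database": "metadata"},
        redshift_tmp_dir=tmp_dir,
        transformation_ctx="datasink_" + source,
    )


def copy_all(glue_context, database, pairs, tmp_dir):
    """Run copy_table for every (glue_table, redshift_table) pair."""
    for source, target in pairs:
        copy_table(glue_context, database, source, target, tmp_dir)
```

Inside the job this might be called as `copy_all(glueContext, "incomingdata", [("client_historical_ks", "table_01_a")], args["TempDir"])`, with the remaining pairs taken from the mapping in the question.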
Tags: python-3.x pyspark aws-glue