【发布时间】:2021-12-06 12:56:07
【问题描述】:
我在 s3 存储桶文件夹中有多个 JSON 文件,每个文件都具有与以下示例 JSON 对象数组/列表相同的模式。
file1
[{"coinRank":1,"coinId":"bitcoin","coinName":"Bitcoin","coinSymbol":"BTC","coinLoc":"bitcoin","coinPrice":53501.08,"coin1hrChange":-0.6,"coin24hrChange":-6.0,"coin7dChange":-9.2,"coin24hrVol":38266934579,"coinMarketCap":1012650219321,"fetchTime":"2021-12-03 23:55:42.654921","rankDate":"2021-12-03","rate":409.98,"coinPriceNaira":21934372.7784000002},{"coinRank":2,"coinId":"ethereum","coinName":"Ethereum","coinSymbol":"ETH","coinLoc":"ethereum","coinPrice":4225.28,"coin1hrChange":-0.3,"coin24hrChange":-7.2,"coin7dChange":-6.4,"coin24hrVol":27395766224,"coinMarketCap":502376237337,"fetchTime":"2021-12-03 23:55:42.655698","rankDate":"2021-12-03","rate":409.98,"coinPriceNaira":1732280.2944},{"coinRank":3,"coinId":"binancecoin","coinName":"Binance Coin","coinSymbol":"BNB","coinLoc":"binance-coin","coinPrice":593.95,"coin1hrChange":-0.7,"coin24hrChange":-4.9,"coin7dChange":-6.9,"coin24hrVol":2379210538,"coinMarketCap":100022794436,"fetchTime":"2021-12-03 23:55:42.656393","rankDate":"2021-12-03","rate":409.98,"coinPriceNaira":243507.621}]
file2
[{"coinRank":1,"coinId":"bitcoin","coinName":"Bitcoin","coinSymbol":"BTC","coinLoc":"bitcoin","coinPrice":52936.1,"coin1hrChange":-1.5,"coin24hrChange":-6.5,"coin7dChange":-1.7,"coin24hrVol":38241025550,"coinMarketCap":998999157967,"fetchTime":"2021-12-04 02:33:23.182164","rankDate":"2021-12-04","rate":409.98,"coinPriceNaira":21702742.2780000009},{"coinRank":2,"coinId":"ethereum","coinName":"Ethereum","coinSymbol":"ETH","coinLoc":"ethereum","coinPrice":4159.85,"coin1hrChange":-1.4,"coin24hrChange":-8.1,"coin7dChange":2.8,"coin24hrVol":28661534477,"coinMarketCap":493429600914,"fetchTime":"2021-12-04 02:33:23.182785","rankDate":"2021-12-04","rate":409.98,"coinPriceNaira":1705455.3030000003},{"coinRank":3,"coinId":"binancecoin","coinName":"Binance Coin","coinSymbol":"BNB","coinLoc":"binance-coin","coinPrice":582.32,"coin1hrChange":-1.9,"coin24hrChange":-5.4,"coin7dChange":-0.6,"coin24hrVol":1059743631,"coinMarketCap":97824378011,"fetchTime":"2021-12-04 02:33:23.183415","rankDate":"2021-12-04","rate":409.98,"coinPriceNaira":238739.5536}]
file3
[{"coinRank":1,"coinId":"bitcoin","coinName":"Bitcoin","coinSymbol":"BTC","coinLoc":"bitcoin","coinPrice":49375.27,"coin1hrChange":-0.7,"coin24hrChange":4.3,"coin7dChange":-9.5,"coin24hrVol":35860857801.0,"coinMarketCap":932932346783,"fetchTime":"2021-12-05 14:34:49.339803","rankDate":"2021-12-05","rate":410.764648,"coinPriceNaira":20281615.4014549591},{"coinRank":2,"coinId":"ethereum","coinName":"Ethereum","coinSymbol":"ETH","coinLoc":"ethereum","coinPrice":4218.99,"coin1hrChange":-0.7,"coin24hrChange":7.1,"coin7dChange":3.3,"coin24hrVol":27778808883.0,"coinMarketCap":500688046117,"fetchTime":"2021-12-05 14:34:49.340495","rankDate":"2021-12-05","rate":410.764648,"coinPriceNaira":1733011.9422655201},{"coinRank":3,"coinId":"binancecoin","coinName":"Binance Coin","coinSymbol":"BNB","coinLoc":"binance-coin","coinPrice":574.23,"coin1hrChange":-0.5,"coin24hrChange":5.2,"coin7dChange":-4.0,"coin24hrVol":2265817636.0,"coinMarketCap":96576091895,"fetchTime":"2021-12-05 14:34:49.341177","rankDate":"2021-12-05","rate":410.764648,"coinPriceNaira":235873.38382104}]
使用 AWS Glue 爬虫和分类器分离 JSON 对象$[*] 我已经拆分了记录,并且我可以确认数据目录中的记录数与文件中的记录数匹配。
但是,当我将数据推送到红移时,我有一些列显示为空。如有必要,我还可以分享我的胶水脚本。
【问题讨论】:
标签: amazon-redshift aws-glue-data-catalog