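【Posted】:2021-03-22 06:25:58
【Question】:
I am using an external table that reads JSON files from s3. I used this query to copy the data from the external table into an internal table: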
insert into jatinanalysis (title,url,author,published_date,category)
select b.title,b.link,b.author,b.published_date,category
FROM jatinspectrum.extable a, a.enteries b,b.category category
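However, every time I run this query it inserts duplicates. I only want the new rows to be inserted; rows that already exist should be skipped.
Update: I tried https://stackoverflow.com/a/656027/13126651 but had no luck: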
insert into jatinanalysis (title,url,author,published_date,category)
select distinct b.title,b.link,b.author,b.published_date,category
FROM jatinspectrum.extable a, a.enteries b,b.category category
WHERE NOT EXISTS(SELECT *
FROM jatinanalysis
WHERE (jatinspectrum.extable.b.title=jatinanalysis.title and
jatinspectrum.extable.b.link=jatinanalysis.url and
jatinspectrum.extable.b.author=jatinanalysis.author and
jatinspectrum.extable.b.b.published_date=jatinanalysis.published_date and
jatinspectrum.extable.category=jatinanalysis.category)
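
For comparison, here is a hedged sketch of the same NOT EXISTS idea, not a confirmed fix. The attempt above refers to columns as `jatinspectrum.extable.b.title` instead of through the alias `b`, and its parentheses are unbalanced. One way to sidestep correlating a subquery against the nested aliases is to flatten the external data into a temporary table first and then insert only the rows missing from the target. The name `stage_entries` and the aliases `s` and `t` are hypothetical, and the sketch assumes that Redshift accepts a CREATE TABLE AS over this nested external query.

```sql
-- Step 1: flatten the nested external data once.
-- stage_entries is a hypothetical staging table name, not from the original post.
CREATE TEMP TABLE stage_entries AS
SELECT DISTINCT b.title,
       b.link AS url,
       b.author,
       b.published_date,
       category
FROM jatinspectrum.extable a, a.enteries b, b.category category;

-- Step 2: insert only rows that have no exact match in the target table.
INSERT INTO jatinanalysis (title, url, author, published_date, category)
SELECT s.title, s.url, s.author, s.published_date, s.category
FROM stage_entries s
WHERE NOT EXISTS (
    SELECT 1
    FROM jatinanalysis t
    WHERE t.title = s.title
      AND t.url = s.url
      AND t.author = s.author
      AND t.published_date = s.published_date
      AND t.category = s.category
);
```

Note that `=` never matches NULL, so a row with a NULL in any of the five columns would be treated as new on every run; such columns would need explicit NULL handling in the comparison.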
【Comments】:
Tags: sql amazon-web-services amazon-redshift amazon-redshift-spectrum