【Question title】: TSQL Keep valid duplicates and remove invalid duplicates
【Posted】: 2013-09-29 10:49:04
【Question description】:

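I have been struggling with this for a while and am not making much progress. The data has to stay at the row level.

I want to keep the data that arrived earliest; duplicates within that earliest batch are valid. Load1 represents a batch ID. Not all values have duplicates.

What I want returned: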

Code1   Code2   Code3   Load1   LoadTime
a1      a1      a1      1       2013-09-10
a1      a1      a1      1       2013-09-10
a1      a1      a1      1       2013-09-10
a2      a1      a1      2       2013-09-12
a1      a2      a1      3       2013-09-13
a1      a2      a1      3       2013-09-13

Any suggestions?

 CREATE TABLE #Test (
 Code1  varchar(10),
 Code2  varchar(10),
 Code3  varchar(10),
 Load1  varchar(10),
 LoadTime DATE
 )


  INSERT INTO #Test
  VALUES ('a1','a1','a1','1','2013-09-10') --Keep

  INSERT INTO #Test
  VALUES ('a1','a1','a1','1','2013-09-10') --Keep

  INSERT INTO #Test
  VALUES ('a1','a1','a1','1','2013-09-10') --Keep

  INSERT INTO #Test
  VALUES ('a1','a1','a1','2','2013-09-11') --Delete

  INSERT INTO #Test
  VALUES ('a2','a1','a1','2','2013-09-12') --Keep

  INSERT INTO #Test
  VALUES ('a2','a1','a1','3','2013-09-13') --Delete

  INSERT INTO #Test
  VALUES ('a1','a2','a1','3','2013-09-13') --Keep

  INSERT INTO #Test
  VALUES ('a1','a2','a1','3','2013-09-13') --Keep

  INSERT INTO #Test
  VALUES ('a1','a2','a1','4','2013-09-13')-- Delete

  INSERT INTO #Test
  VALUES ('a1','a2','a1','4','2013-09-13')-- Delete

【Comments】:

  • What is an invalid duplicate?
  • I realize my question was poorly asked. I will have to rewrite it. Thanks.

Tags: sql sql-server tsql sql-server-2012 duplicate-removal


【Solution 1】:

You can use a SQL Server common table expression (CTE):

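-- Table1 stands in for the table to clean up (#Test in the question)
with cte as (
    select
        -- rows tied on the earliest (LoadTime, Load1) within a code combination all get rn = 1
        dense_rank() over(partition by Code1, Code2, Code3 order by LoadTime, Load1 asc) as rn
    from Table1
)
delete from cte where rn > 1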


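This query is actually simple in SQL Server, because SQL Server treats a simple common table expression as an updatable view: you do not have to join the CTE back to your original table, you can just delete from cte.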
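Before deleting through the CTE, it can help to look at the ranks it assigns. The following is a minimal sketch against the question's #Test table (not the Table1 used in the answer), assuming #Test has been created and populated exactly as in the question. The transaction wrapper is an addition for safety, not part of the original answer.

```sql
-- Assumes #Test exists and holds the ten sample rows from the question.

-- Step 1: preview. Rows with rn = 1 are kept; rows with rn > 1 would be deleted.
WITH cte AS (
    SELECT
        Code1, Code2, Code3, Load1, LoadTime,
        DENSE_RANK() OVER (
            PARTITION BY Code1, Code2, Code3
            ORDER BY LoadTime, Load1
        ) AS rn
    FROM #Test
)
SELECT Code1, Code2, Code3, Load1, LoadTime, rn
FROM cte
ORDER BY Code1, Code2, Code3, rn;

-- Step 2: delete inside a transaction so the result can be checked first.
BEGIN TRANSACTION;

WITH cte AS (
    SELECT
        DENSE_RANK() OVER (
            PARTITION BY Code1, Code2, Code3
            ORDER BY LoadTime, Load1
        ) AS rn
    FROM #Test
)
DELETE FROM cte
WHERE rn > 1;

-- Inspect what is left.
SELECT Code1, Code2, Code3, Load1, LoadTime
FROM #Test
ORDER BY LoadTime, Load1;

ROLLBACK TRANSACTION;  -- switch to COMMIT TRANSACTION once the output looks right
```

With the sample data, the final SELECT should return the six rows listed in the question as the desired result. Note that Load1 is declared as varchar(10), so ORDER BY compares it as text and '10' would sort before '2'. If batch IDs are numeric and can exceed one digit, the ordering would need a numeric cast, which is an assumption about the real data.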

【Comments】:

    【Solution 2】:

    You may want to look at row_number() or dense_rank().
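The difference between the two functions matters for this question, so here is a small side-by-side sketch. It assumes the #Test table from the question; the column aliases row_num and dense_rnk are made up for illustration.

```sql
-- Assumes #Test exists and holds the sample rows from the question.
SELECT
    Code1, Code2, Code3, Load1, LoadTime,
    -- ROW_NUMBER gives every row a distinct number, even when rows tie on the ordering
    ROW_NUMBER() OVER (
        PARTITION BY Code1, Code2, Code3
        ORDER BY LoadTime, Load1
    ) AS row_num,
    -- DENSE_RANK gives tied rows the same rank, with no gaps between ranks
    DENSE_RANK() OVER (
        PARTITION BY Code1, Code2, Code3
        ORDER BY LoadTime, Load1
    ) AS dense_rnk
FROM #Test
ORDER BY Code1, Code2, Code3, row_num;
```

For the ('a1','a1','a1') group, row_num runs 1, 2, 3, 4 while dense_rnk is 1, 1, 1, 2. Filtering on row_num > 1 would therefore remove two of the three valid duplicates from load 1, whereas filtering on dense_rnk > 1 removes only the row from load 2.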

    It is hard to tell the keep-or-delete logic from the sample data, but something like:

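    ;with cte as (
          select *, 
          -- load1 is included in the ordering because loads 3 and 4 share the same loadtime
          dense_rank() over (partition by code1,code2,code3 order by loadtime, load1) rn 
          from #test)
        -- delete via the alias t, since #Test is aliased in the FROM clause
        delete t
        from #Test t
            inner join cte
                on t.Code1 = cte.Code1
                and t.Code2 = cte.Code2
                and t.Code3 = cte.Code3
                and t.Load1 = cte.Load1
                and t.LoadTime = cte.LoadTime
            where cte.rn>1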
    

    (If your data had a unique ID, the join would be easier.)
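To illustrate the remark about a unique ID, here is a sketch that copies the question's rows into a table with a surrogate key and joins on that key alone instead of on every column. The table name #TestWithId and the Id column are hypothetical; neither appears in the question.

```sql
-- Hypothetical copy of the question's table with an added surrogate key (Id is an assumption).
CREATE TABLE #TestWithId (
    Id       INT IDENTITY(1,1) PRIMARY KEY,
    Code1    VARCHAR(10),
    Code2    VARCHAR(10),
    Code3    VARCHAR(10),
    Load1    VARCHAR(10),
    LoadTime DATE
);

-- Assumes #Test exists and holds the sample rows from the question.
INSERT INTO #TestWithId (Code1, Code2, Code3, Load1, LoadTime)
SELECT Code1, Code2, Code3, Load1, LoadTime
FROM #Test;

-- Rank the rows, then delete by joining on the single key column.
WITH cte AS (
    SELECT
        Id,
        DENSE_RANK() OVER (
            PARTITION BY Code1, Code2, Code3
            ORDER BY LoadTime, Load1
        ) AS rn
    FROM #TestWithId
)
DELETE t
FROM #TestWithId AS t
INNER JOIN cte
    ON cte.Id = t.Id
WHERE cte.rn > 1;
```

Joining on Id matches each row exactly once, so the delete no longer depends on comparing all five data columns.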

    【Comments】:
