【Question Title】: Identifying repeated fields in SQL query
【Posted】: 2014-01-03 16:07:26
【Question Description】:

I have a SQL query that returns a column like this:

foo
-----------
1200
1200
1201
1200
1200
1202
1202
1202

It is already sorted in a specific way, and I would like to run another query over this result set to label the repeated data, like this:

foo    ID
----   ----
1200   1
1200   1
1201   2
1200   3
1200   3
1202   4
1202   4
1202   4

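It is important that the second group of 1200 stays separate from the first. Every variation of OVER/PARTITION I have tried seems to want to lump the two groups together. Is there a way to window the partition to just these repeated groups?

Edit: this is for Microsoft SQL Server 2012.

【Comments】:

  • Which DBMS are you using? Postgres? Oracle?
  • How large is your result set? Have you considered looping over the result set and creating the ID?
  • The result set is small, a few hundred rows at most. I hadn't considered looping; I'll google it.
  • Yes, try a cursor in that case.
  • What is the SQL query that returns that result set? For the sake of argument, can we assume it is SELECT col1 FROM YourTable ORDER BY col2? And can you adjust that query as well?

Tags: sql sql-server-2012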


【Solution 1】:

Not sure this is the fastest approach...

select main.num, main.id from
(select x.num,row_number() 
over (order by (select 0)) as id 
from (select distinct num from num) x) main
join 
(select num, row_number() over(order by (select 0)) as ordering
 from num) x2 on 
x2.num=main.num
order by x2.ordering

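This assumes a table "num" with a column "num" that contains your data, in order. Of course, num could instead be a view or a "with" (CTE) over the original query.

See the accompanying sqlfiddle.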
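
Note that the query above takes its ids from `select distinct`, so both runs of 1200 end up with the same id, and `order by (select 0)` does not guarantee any particular row order. Below is a minimal sketch of one way to keep the runs apart on SQL Server 2012. It assumes the rows carry an explicit ordering column; the table variable `@src` and its `seq` column are hypothetical stand-ins, not part of the original question. The idea is to flag each row whose value differs from the previous row using LAG, then take a running SUM of those flags.

```sql
-- Hypothetical sample data; seq stands in for whatever column defines the order.
DECLARE @src TABLE (seq INT PRIMARY KEY, foo INT);
INSERT INTO @src (seq, foo) VALUES
    (1, 1200), (2, 1200), (3, 1201), (4, 1200),
    (5, 1200), (6, 1202), (7, 1202), (8, 1202);

WITH flagged AS (
    SELECT seq, foo,
           -- 1 when a new run starts: the first row, or a value that differs
           -- from the previous row (LAG returns NULL on the first row).
           CASE WHEN foo = LAG(foo) OVER (ORDER BY seq) THEN 0 ELSE 1 END AS is_start
    FROM @src
)
SELECT foo,
       -- The running count of run starts is the run number.
       SUM(is_start) OVER (ORDER BY seq ROWS UNBOUNDED PRECEDING) AS id
FROM flagged
ORDER BY seq;
```

For the sample data this yields the ids 1, 1, 2, 3, 3, 4, 4, 4, which matches the desired output in the question. If `foo` can be NULL, consecutive NULLs would each be treated as the start of a new run, so the comparison would need adjusting.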

【Comments】:

    【Solution 2】:

    Here is a way to do it without a CURSOR.

    -- Create a table variable to hold the data
    DECLARE @table TABLE 
    (
        SeqID INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
        foo INT,    
        id int null 
    )
    
    
    DECLARE @i INT
    DECLARE @j INT
    DECLARE @k INT
    declare @tFoo INT
    declare @oldFoo INT
    
    SET @k = 0
    set @oldFoo = 0
    
    
    
    -- Insert data into the temporary table
    INSERT INTO @table(foo) 
    SELECT 1200
    INSERT INTO @table(foo) 
    SELECT 1200
    INSERT INTO @table(foo) 
    SELECT 1201
    INSERT INTO @table(foo) 
    SELECT 1200
    INSERT INTO @table(foo) 
    SELECT 1200
    INSERT INTO @table(foo) 
    SELECT 1202
    INSERT INTO @table(foo) 
    SELECT 1202
    INSERT INTO @table(foo) 
    SELECT 1202
    
    -- Get the max and min SeqIDs to loop through
    SELECT @i = MIN(SeqID) FROM @table
    SELECT @j = MAX(SeqID) FROM @table
    
    
    -- Loop through the table using the SeqID identity column
    WHILE (@i <= @j)
    BEGIN
      SELECT @tFoo = foo FROM @table WHERE SeqID = @i
    
      if @oldFoo <> @tFoo
        set @k = @k + 1 
    
      update @table set id = @k where SeqID = @i    
    
      SET @oldFoo = @tFoo
      -- Increment the counter
      SET @i = @i + 1
    END
    
    SELECT * from @table
    

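    【Comments】:

      【Solution 3】:

      You can do this without a cursor, but it doesn't look pretty (at least not what I came up with). So 1) I assume you have a PK column that orders your main values. Then 2) I assume you have an ID column that you want to set.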

          create table tbl(foo int, pk int, id int);
      
          insert into tbl(foo, pk) values (1100, 5);
          insert into tbl(foo, pk) values (1200, 10);
          insert into tbl(foo, pk) values (1200, 20);
          insert into tbl(foo, pk) values (1201, 30);
          insert into tbl(foo, pk) values (1200, 40);
          insert into tbl(foo, pk) values (1200, 50);
      
          insert into tbl(foo, pk) values (1202, 60);
          insert into tbl(foo, pk) values (1202, 70);
          insert into tbl(foo, pk) values (1202, 80);
          insert into tbl(foo, pk) values (1202, 90);
      

      SQL Fiddle here: http://sqlfiddle.com/#!6/fdaaa/2

          -- Start every row in group 1; rows after the first run are renumbered below.
          update tbl
          set ID = 1;

          update t
          set t.ID = m.RN2
          from tbl t
          join
          (
            select
              y1.RN as RN1, y1.PK as PK1,
              y2.RN as RN2, y2.PK as PK2
            FROM
            (
              -- y1: the last row of each run, numbered in pk order.
              -- A row ends a run when the row right after it does not have the same foo.
              SELECT
                ROW_NUMBER() OVER(ORDER BY x.pk1 ASC) AS rn,
                x.pk1 AS pk
              FROM
              (
                SELECT t1.pk AS pk1, t2.pk AS pk2
                FROM tbl t1
                LEFT JOIN tbl t2 ON
                (
                  (t1.pk < t2.pk AND t1.foo = t2.foo)
                  AND
                  (
                    NOT EXISTS
                    (
                      SELECT tMid.pk FROM tbl tMid
                      WHERE tMid.pk < t2.pk AND tMid.pk > t1.pk
                    )
                  )
                )
              ) x
              WHERE x.pk2 IS NULL
            ) y1
            left join
            (
              -- y2: the same list of run ends, paired with y1 so that y2 is the next run end.
              SELECT
                ROW_NUMBER() OVER(ORDER BY x.pk1 ASC) AS rn,
                x.pk1 AS pk
              FROM
              (
                SELECT t1.pk AS pk1, t2.pk AS pk2
                FROM tbl t1
                LEFT JOIN tbl t2 ON
                (
                  (t1.pk < t2.pk AND t1.foo = t2.foo)
                  AND
                  (
                    NOT EXISTS
                    (
                      SELECT tMid.pk FROM tbl tMid
                      WHERE tMid.pk < t2.pk AND tMid.pk > t1.pk
                    )
                  )
                )
              ) x
              WHERE x.pk2 IS NULL
            ) y2 on y1.RN = y2.RN - 1
          ) m on
          (
            -- Rows after one run end, up to and including the next run end, get that run's number.
            (t.pk > m.pk1 and ((m.pk2 is not null) and (t.pk <= m.pk2)))
            -- or
            -- (t.pk<=m.pk1)
          )
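
The same update can probably be written more compactly on SQL Server 2012. The following is only a sketch under the assumptions stated above: `tbl` has a unique `pk` that defines the order and an `id` column to fill. The CTE names `flagged` and `numbered` are made up for illustration.

```sql
-- Assumes tbl(foo, pk, id) as created above, with pk unique.
WITH flagged AS (
    SELECT pk,
           -- 1 on the first row and on every row whose foo differs from the previous row
           CASE WHEN foo = LAG(foo) OVER (ORDER BY pk) THEN 0 ELSE 1 END AS is_start
    FROM tbl
),
numbered AS (
    SELECT pk,
           -- running count of run starts = run number
           SUM(is_start) OVER (ORDER BY pk ROWS UNBOUNDED PRECEDING) AS grp
    FROM flagged
)
UPDATE t
SET t.id = n.grp
FROM tbl AS t
JOIN numbered AS n ON n.pk = t.pk;
```

With the sample rows inserted above, this should set `id` to 1, 2, 2, 3, 4, 4, 5, 5, 5, 5 in `pk` order. It reads the table once per CTE rather than self-joining it several times, which is the main reason to prefer it when LAG is available.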
      

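      【Comments】:

      • I would appreciate it if some SQL guru here could try this and let me know whether it is logically correct. It looks fine to me, and I tested it quite carefully.
      【Solution 4】:

      Here is my solution, which uses a cursor and a temporary table to hold the results.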

      DECLARE @foo INT
      DECLARE @previousfoo INT = -1
      DECLARE @id INT = 0
      DECLARE @getid CURSOR
      
      DECLARE @resultstable TABLE 
      (
          primaryId INT IDENTITY(1, 1) NOT NULL PRIMARY KEY,
          foo INT,    
          id int null 
      )
      
      SET @getid = CURSOR FOR
      SELECT originaltable.foo
      FROM   originaltable
      -- Add an ORDER BY on your ordering column here; without one, row order is not guaranteed.
      
      OPEN @getid
      FETCH NEXT
      FROM @getid INTO @foo
      WHILE @@FETCH_STATUS = 0
      BEGIN
          IF (@foo <> @previousfoo)
          BEGIN
              SET @id = @id + 1
          END
      
          INSERT INTO @resultstable VALUES (@foo, @id)
          SET @previousfoo = @foo
      
          FETCH NEXT
          FROM @getid INTO @foo
      END
      
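      CLOSE @getid
      DEALLOCATE @getid

      -- Return the labelled rows in the order they were read
      SELECT foo, id FROM @resultstable ORDER BY primaryId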
      

      【Comments】:
