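This does look like a gaps-and-islands problem.
Here is one approach. It may well be faster than your variant.
The standard idea for gaps-and-islands is to generate two sets of row numbers, partitioned in two different ways. The difference between these row numbers (rn1-rn2) stays constant within each consecutive block. Run the query below CTE-by-CTE and examine the intermediate results to see what is going on.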
WITH
CTE_RN
AS
(
    SELECT
        [ValueId]
        ,[ListId]
        ,[ValueDelta]
        ,[ValueCreated]
        -- rn1 numbers all rows of a list; rn2 numbers rows per (list, delta) pair
        ,ROW_NUMBER() OVER (PARTITION BY ListId ORDER BY ValueCreated) AS rn1
        ,ROW_NUMBER() OVER (PARTITION BY ListId, ValueDelta ORDER BY ValueCreated) AS rn2
    FROM [Value]
)
SELECT
    ListId
    ,MIN(ValueId) AS FirstID
    ,MAX(ValueId) AS LastID
    ,MIN(ValueCreated) AS FirstCreated
    ,MAX(ValueCreated) AS LastCreated
    ,ValueDelta
    ,COUNT(*) AS ValueCount
FROM CTE_RN
GROUP BY
    ListId
    ,ValueDelta
    ,rn1-rn2 -- constant within each consecutive block
ORDER BY
    FirstCreated
;
This query produces the same result as yours on your sample data set.
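To see why rn1-rn2 stays constant inside a block, here is a minimal sketch on a handful of made-up rows. The inline `VALUES` list, the `Seq` column and the `Diff` alias are assumptions for illustration only; they stand in for one list of the real table ordered by ValueCreated.

```sql
-- Hypothetical inline rows: Seq plays the role of ValueCreated within one list.
SELECT
    Seq
    ,ValueDelta
    ,ROW_NUMBER() OVER (ORDER BY Seq) AS rn1
    ,ROW_NUMBER() OVER (PARTITION BY ValueDelta ORDER BY Seq) AS rn2
    ,ROW_NUMBER() OVER (ORDER BY Seq)
     - ROW_NUMBER() OVER (PARTITION BY ValueDelta ORDER BY Seq) AS Diff
FROM
    (VALUES
        (1,  1),
        (2,  0),
        (3,  0),
        (4,  0),
        (5, -1),
        (6, -1),
        (7,  1)
    ) AS v(Seq, ValueDelta)
ORDER BY Seq;

-- Seq  ValueDelta  rn1  rn2  Diff
-- 1     1          1    1    0
-- 2     0          2    1    1
-- 3     0          3    2    1
-- 4     0          4    3    1
-- 5    -1          5    1    4
-- 6    -1          6    2    4
-- 7     1          7    2    5
```

The two blocks with ValueDelta = 1 get different Diff values (0 and 5), so they end up in separate groups. Diff on its own is not guaranteed to be unique across different ValueDelta values, which is why the GROUP BY uses ValueDelta together with rn1-rn2.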
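It is not clear whether FirstID and LastID can simply be the MIN and MAX, or whether they really have to come from the first and last rows (when ordered by ValueCreated). If you really need the first and last, the query becomes a bit more complicated.
In your original sample data set, "first" and "min" are the same for FirstID. Let's change the sample data set a little to highlight the difference: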
insert into [Value]
    ([ListId], [ValueDelta], [ValueCreated])
values
    (1, 1, '2019-01-01 01:01:02'), -- 1.1
    (1, 0, '2019-01-01 01:02:01'), -- 2.1
    (1, 0, '2019-01-01 01:03:01'), -- 2.2
    (1, 0, '2019-01-01 01:04:01'), -- 2.3
    (1, -1, '2019-01-01 01:05:01'), -- 3.1
    (1, -1, '2019-01-01 01:06:01'), -- 3.2
    (1, 1, '2019-01-01 01:01:01'), -- 1.2
    (1, 1, '2019-01-01 01:08:01'), -- 4.2
    (2, 1, '2019-01-01 01:08:01') -- 5.1
;
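All I did was swap ValueCreated between the first and the seventh rows, so now the FirstID of the first group is 7 and its LastID is 1. Your query returns the correct result. My simple query above does not, because MIN(ValueId) and MAX(ValueId) ignore the ValueCreated order.
Here is a variant that produces the correct result. I decided to use the FIRST_VALUE and LAST_VALUE functions to get the appropriate IDs. Again, run the query CTE-by-CTE and examine the intermediate results to see what is going on.
This variant produces the same result as your query even with the adjusted sample data set.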
WITH
CTE_RN
AS
(
    -- Same two row numbers as in the simple query
    SELECT
        [ValueId]
        ,[ListId]
        ,[ValueDelta]
        ,[ValueCreated]
        ,ROW_NUMBER() OVER (PARTITION BY ListId ORDER BY ValueCreated) AS rn1
        ,ROW_NUMBER() OVER (PARTITION BY ListId, ValueDelta ORDER BY ValueCreated) AS rn2
    FROM [Value]
)
,CTE2
AS
(
    SELECT
        ValueId
        ,ListId
        ,ValueDelta
        ,ValueCreated
        ,rn1
        ,rn2
        ,rn1-rn2 AS Diff
        -- The explicit frame makes both functions look at the whole block
        ,FIRST_VALUE(ValueId) OVER(
            PARTITION BY ListId, ValueDelta, rn1-rn2 ORDER BY ValueCreated
            ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS FirstID
        ,LAST_VALUE(ValueId) OVER(
            PARTITION BY ListId, ValueDelta, rn1-rn2 ORDER BY ValueCreated
            ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS LastID
    FROM CTE_RN
)
SELECT
    ListId
    ,FirstID
    ,LastID
    ,MIN(ValueCreated) AS FirstCreated
    ,MAX(ValueCreated) AS LastCreated
    ,ValueDelta
    ,COUNT(*) AS ValueCount
FROM CTE2
GROUP BY
    ListId
    ,ValueDelta
    ,rn1-rn2
    ,FirstID
    ,LastID
ORDER BY FirstCreated;
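A note on the `ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING` frame used above: when a window has an ORDER BY but no explicit frame, the default frame is `RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`, so LAST_VALUE would only see rows up to the current one. The sketch below shows the difference on three made-up values; the inline `VALUES` list and the column aliases are assumptions for illustration only.

```sql
-- Hypothetical inline rows, not the original table.
SELECT
    Seq
    -- Default frame: ends at the current row, so the "last" value is the current row itself
    ,LAST_VALUE(Seq) OVER (ORDER BY Seq) AS LastDefaultFrame
    -- Explicit frame: covers the whole partition
    ,LAST_VALUE(Seq) OVER (
        ORDER BY Seq
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS LastWholePartition
FROM
    (VALUES (10), (20), (30)) AS v(Seq)
ORDER BY Seq;

-- Seq  LastDefaultFrame  LastWholePartition
-- 10   10                30
-- 20   20                30
-- 30   30                30
```

FIRST_VALUE would return the same result with the default frame, since the first row of the partition is always inside it; the explicit frame is spelled out on both functions mainly for symmetry.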