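This is an older post, but it is relevant to something I am currently working on (2013). Once you have a larger dataset (typical of most databases), the performance of the various queries (look at the execution plans) tells you a lot. First we build a "tally table" to generate a sequence of numbers, then use an arbitrary formula to create data for #myTable: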
CREATE TABLE #myTable(
[id] [int] NOT NULL,
[business_key] [int] NOT NULL,
[result] [int] NOT NULL,
PRIMARY KEY (Id)
) ON [PRIMARY];
; WITH
-- Tally table: each CTE cross-joins the previous one with itself, squaring its row count
t1 AS (SELECT 1 N UNION ALL SELECT 1 N), -- 2 rows
t2 AS (SELECT 1 N FROM t1 x, t1 y), -- 4 rows
t3 AS (SELECT 1 N FROM t2 x, t2 y), -- 16 rows
t4 AS (SELECT 1 N FROM t3 x, t3 y), -- 256 rows
t5 AS (SELECT 1 N FROM t4 x, t4 y), -- 65,536 rows
Tally AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) N
FROM t5 x, t5 y) -- up to 4,294,967,296 rows
INSERT INTO #myTable
SELECT N, CAST(N/RAND(N/8) AS bigINT)/5 , N%2
FROM Tally
WHERE N < 500000
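Before timing anything, it can help to confirm what the generator actually produced. The following is a minimal sketch, assuming the #myTable definition above; the column aliases are my own naming and not part of the original post:

```sql
-- Sanity check on the generated data (aliases are illustrative only)
SELECT COUNT(*)                     AS total_rows,
       COUNT(DISTINCT business_key) AS distinct_keys,
       MIN(id)                      AS min_id,
       MAX(id)                      AS max_id
FROM #myTable;
```

With the filter N < 500000, total_rows should come out at 499,999. The number of distinct keys depends on what RAND returns for each seed, so it is best read off the result rather than assumed.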
Next we run three different types of queries to compare performance (if you are using SQL Server Management Studio, turn on "Include Actual Execution Plan"):
SET STATISTICS IO ON
SET STATISTICS TIME ON
----- Try #1
select 'T1' AS Qry, id, business_key,
result
from #myTable
where id in
(select max(id)
from #myTable
group by business_key)
---- Try #2
select 'T2' AS Qry, id, business_key,
result
from
(select id,
business_key,
result,
max(id) over (partition by business_key) as max_id
from #myTable) x
where id = max_id
---- Try #3
;with cteRowNumber as (
select id,
business_key,
result,
row_number() over(partition by business_key order by id desc) as RowNum
from #myTable
)
SELECT 'T3' AS Qry, id, business_key,
result
FROM cteRowNumber
WHERE RowNum = 1
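Since the comparison only means something if the queries return the same rows, a quick equivalence check may be worth running before the cleanup. This is a sketch under the assumption that #myTable still exists; it compares the logic of Try #1 against Try #3 using EXCEPT:

```sql
-- Rows returned by the MAX(id) subquery (Try #1) but not by ROW_NUMBER (Try #3).
-- An empty result means Try #3 returns every row that Try #1 returns.
SELECT id, business_key, result
FROM #myTable
WHERE id IN (SELECT MAX(id) FROM #myTable GROUP BY business_key)
EXCEPT
SELECT id, business_key, result
FROM (SELECT id, business_key, result,
             ROW_NUMBER() OVER (PARTITION BY business_key ORDER BY id DESC) AS RowNum
      FROM #myTable) AS x
WHERE RowNum = 1;
```

Swapping the two halves around the EXCEPT checks the other direction. If both directions come back empty, the two queries agree on this data.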
Cleanup:
IF OBJECT_ID(N'TempDB..#myTable',N'U') IS NOT NULL
DROP TABLE #myTable;
SET STATISTICS IO OFF
SET STATISTICS TIME OFF
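Looking at the execution plans, you will find that "Try #1" has the best query cost and the lowest CPU time, while "Try #3" has the fewest reads and a CPU time that is not much worse. I would recommend the CTE approach to reduce the number of reads.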