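We recently ran into a case where we wanted to do something similar, but slowly, over several days (each run updates only a fixed number of records, and only at certain times). Recent data was fine, but millions of rows of older data needed to be updated. Our table looked like this: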
CREATE TABLE FileContent
(
    FileContent    VARCHAR(MAX),
    File_PK        BIGINT,
    NewFileContent VARCHAR(MAX)
);
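We only needed to update certain rows, but there were millions of them. We created a table to track our progress, so that a scheduled job could iterate over the main table and update it, and then populated this table with the primary keys of the main-table records that needed updating: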
CREATE TABLE FilesToUpdate
(
    File_PK   BIGINT,
    IsUpdated BIT NOT NULL DEFAULT 0
);
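Populating the tracking table could look something like the sketch below. The filter is only a guess on our part for illustration: @CutoffPK and the NewFileContent IS NULL test are hypothetical stand-ins for whatever rule identifies the rows that need fixing in your system.

```sql
-- Hypothetical cutoff separating old rows from recent ones; replace with your own rule.
DECLARE @CutoffPK BIGINT = 5000000;

-- IsUpdated is omitted on purpose: it falls back to its DEFAULT of 0 (not yet updated).
INSERT INTO FilesToUpdate (File_PK)
SELECT fc.File_PK
FROM FileContent AS fc
WHERE fc.File_PK <= @CutoffPK          -- assumption: only older rows need fixing
  AND fc.NewFileContent IS NULL        -- assumption: unfixed rows have no new content yet
  AND NOT EXISTS (                     -- skip PKs already queued, so a re-run adds no duplicates
        SELECT 1
        FROM FilesToUpdate AS ftu
        WHERE ftu.File_PK = fc.File_PK
      );
```

Because every queued row starts with IsUpdated = 0, the scheduled job can tell pending rows from finished ones without any extra bookkeeping.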
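We then scheduled the following script to perform the update (if you use it yourself, tune the batch size and the schedule to suit your system).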
/***
Script to update and fix records.
***/
DECLARE @Rowcount INT = 1 -- seeded with 1 so the loop body runs at least once
    , @BatchSize INT = 100 -- width of the File_PK range covered on each iteration of the loop
    , @BatchesToRun INT = 25 -- the max number of times the loop will iterate
    , @StartingRecord BIGINT = 1;

-- Get the highest File_PK not already fixed as a starting point.
-- MAX returns NULL when nothing is left; the first UPDATE then matches no rows and the loop ends.
SELECT @StartingRecord = MAX(File_PK) FROM FilesToUpdate WHERE IsUpdated = 0;

-- While there are still rows to update and we haven't hit our limit on iterations...
WHILE (@Rowcount > 0 AND @BatchesToRun > 0)
BEGIN
    PRINT CONCAT('StartingRecord (Start of Loop): ', @StartingRecord);

    -- Update only the rows in this PK range that are still flagged in FilesToUpdate.
    UPDATE fc
    SET NewFileContent = 'New value here'
    FROM FileContent AS fc
    WHERE fc.File_PK BETWEEN (@StartingRecord - @BatchSize + 1) AND @StartingRecord
      AND EXISTS (SELECT 1 FROM FilesToUpdate AS ftu
                  WHERE ftu.File_PK = fc.File_PK AND ftu.IsUpdated = 0);

    -- @@ROWCOUNT is the number of rows affected by the last statement, so capture it right away.
    -- A batch that touches no rows ends the loop (this can also happen when a PK range holds no
    -- flagged rows); the next scheduled run resumes from the highest File_PK still marked IsUpdated = 0.
    SET @Rowcount = @@ROWCOUNT;
    PRINT CONCAT('RowCount: ', @Rowcount);

    -- Record which PKs were updated so we know where to start next time around.
    UPDATE FilesToUpdate SET IsUpdated = 1
    WHERE File_PK BETWEEN (@StartingRecord - @BatchSize + 1) AND @StartingRecord;

    -- The loop will stop after at most @BatchSize*@BatchesToRun records are updated.
    -- If there aren't that many records left to update, the @Rowcount check will stop it.
    SET @BatchesToRun = @BatchesToRun - 1;
    PRINT CONCAT('Batches Remaining: ', @BatchesToRun);

    -- Set the starting record for the next time through the loop.
    SET @StartingRecord -= @BatchSize;
    PRINT CONCAT('StartingRecord (End of Loop): ', @StartingRecord);
END
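Between scheduled runs it can be handy to check how far the job has got. The query below is a minimal sketch against the FilesToUpdate table above; the column aliases RowsDone, RowsRemaining and NextStartingRecord are names we made up for illustration.

```sql
-- Summarize progress from the tracking table.
SELECT
    SUM(CASE WHEN IsUpdated = 1 THEN 1 ELSE 0 END) AS RowsDone,           -- PKs already processed
    SUM(CASE WHEN IsUpdated = 0 THEN 1 ELSE 0 END) AS RowsRemaining,      -- PKs still pending
    MAX(CASE WHEN IsUpdated = 0 THEN File_PK END)  AS NextStartingRecord  -- where the next run will begin
FROM FilesToUpdate;
```

NextStartingRecord comes back NULL once every queued row has been processed, which is the same condition that makes the update script exit after its first pass through the loop.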