【Question title】: How to optimize this query for inserting/updating millions of records in SQL Server
【Posted】: 2017-01-11 21:15:57
【Question】:

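I have 4 tables in SQL Server: AspNetUsers, CustomerFile, CustomerOption, and LastPullRecords. The application uploads customer records from an Excel file. The Excel file is converted to a DataTable, and this stored procedure is then called once for each row of the DataTable.

There is a trigger on the CustomerFile table. In the stored procedure, we first check whether FirstName, LastName, StreetAddress, City, State, and Zip are unchanged. If they are, only the officer details are updated; otherwise all details are updated and Action is set to update ('U'), which causes the record to be sent to a third party the next day. If the customer does not exist yet, it is added and Action is set to add ('A'). After that, two other tables are updated based on the customer record, if the relevant data is available.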

ALTER PROC [dbo].[InsertUpdateRecords]
(
    @FullName NVARCHAR(50) =NULL,
    @FirstName NVARCHAR(50) =NULL,
    @LastName NVARCHAR(50) =NULL,
    @StreetAddress NVARCHAR(50) =NULL,
    @City NVARCHAR(50) =NULL,
    @State NVARCHAR(50) =NULL,
    @Zip INT =NULL,
    @SSN NVARCHAR(50) =NULL,
    @Email NVARCHAR(150) =NULL,
    @OfficerEmail NVARCHAR(50) =NULL,
    @OfficerId NVARCHAR(50)=NULL,
    @OfficerName NVARCHAR(50) =NULL,
    @Option NVARCHAR(50) =NULL,
    @DownloadedFromFTP BIT =NULL,
    @LastPullDate DATETIME=NULL
)
AS
BEGIN
DECLARE @IsActive BIT
DECLARE @FileID INT
DECLARE @CompanyId INT

SET @IsActive=1

--Get Company ID based on the officer's email
Select @CompanyId=CompanyId from AspNetUsers where Email=@OfficerEmail

select top (1) @FileID=cf.fileId from CustomerFile cf inner join AspNetUsers usr on usr.Id=cf.OfficerId  where cf.SSN = @SSN and usr.CompanyId=@CompanyId order by cf.FileReceivedDate, cf.FileId desc

if ((@FileID<>'') or(@FileID is not null))
    Begin
        -- COMPARE IF ONLY OFFICER IS CHANGED
        If EXISTS(select 1 from CustomerFile where FirstName=@FirstName and LastName=@LastName and StreetAddress=@StreetAddress and City=@City and State=@State and Zip=@Zip
         and FileId=@FileID         
            )
            BEGIN
                UPDATE top (1) CustomerFile SET OfficerEmail=@OfficerEmail,
                OfficerName=@OfficerName,Email=@Email,
                ----FileModifiedDate=GETDATE(),
                DownloadedFromFTP=@DownloadedFromFTP,IsActive=@IsActive,OfficerId=@OfficerId
                WHERE FileId=@FileID                
            END
        Else
            BEGIN
                Update top (1) CustomerFile set FullName=@FullName, FirstName=@FirstName, LastName=@LastName, StreetAddress=@StreetAddress, City=@City,State=@State,Zip=@Zip,
                OfficerEmail=@OfficerEmail,OfficerName=@OfficerName,Email=@Email,
                --FileReceivedDate=GETDATE(),
                FileModifiedDate=GETDATE(),DownloadedFromFTP=@DownloadedFromFTP,IsActive=@IsActive,Action='U'
                where FileId=@FileID
            END
    End
Else
    BEGIN
        declare @IdentityOutput table ( ID int )

        INSERT INTO CustomerFile(FullName,FirstName,LastName,StreetAddress,City,State,Zip,SSN,OfficerEmail,OfficerId,OfficerName,
        FileReceivedDate,DownloadedFromFTP,IsActive,Action,Email
        ) 
        output inserted.FileId into @IdentityOutput
        VALUES(@FullName,@FirstName,@LastName,@StreetAddress,@City,
        @State,@Zip,@SSN,@OfficerEmail,@OfficerId,@OfficerName,
        GETDATE(),@DownloadedFromFTP,@IsActive,'A',@Email)

        select @FileID = (select ID from @IdentityOutput)
    END
---------------------------------------------------------------------------
-- Set Option
---------------------------------------------------------------------------


if ((@Option<>'') or(@Option is not null))
Begin
    if exists(select 1 from CustomerOption where CustomerFileID=@FileID)
        Begin
            -- OPTION is a reserved keyword in T-SQL, so the column name is bracketed
            Update CustomerOption Set [Option]=@Option where CustomerFileID=@FileID
        End
    else
        Begin
            Insert into CustomerOption (CustomerFileID, [Option]) values (@FileID, @Option)
        End
End

---------------------------------------------------------------------------
-- Insert Last Pull if exist
---------------------------------------------------------------------------

IF ((@LastPullDate<>'') OR (@LastPullDate IS NOT NULL) OR CONVERT(varchar(10),@LastPullDate,101)!='01/01/1900')
BEGIN
    IF ((@FileID<>'') OR (@FileID<>0))
    BEGIN
        IF EXISTS (SELECT * FROM LastPullRecords WHERE CustomerId=@FileID AND CompanyId=@CompanyId)
        BEGIN
            UPDATE LastPullRecords
            SET LastPullDate=@LastPullDate,
                IsSelfPull=1,
                ModifiedDateTime=GETDATE()
            WHERE CustomerId=@FileID AND CompanyId=@CompanyId
        END
        ELSE
        BEGIN
            INSERT INTO LastPullRecords
                (CompanyId,CustomerId,LastPullDate,IsSelfPull,IsRTS,CreatedDateTime)
            VALUES
                (@CompanyId,@FileID,@LastPullDate,1,0,GETDATE())
        END
    END
END

END

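The problem is that there may be thousands of records, and this query takes a long time to upload all of them. As a test, I uploaded just 10K records, and it took 13 minutes.

I tried sending the DataTable as a parameter by defining a custom user-defined table type for it, then using a WHILE loop, and then a cursor, but none of these experiments made any difference.

Please suggest an optimized way to upload these records so that it takes less time.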

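【Comments】:

  • Actually I don't; it was a typo. Let me edit it.
  • "and then for each row of the DataTable this stored procedure is called." Row-by-row processing is slow. Use a set-based approach instead: read all the data from Excel and process it as one batch. For more information, read about RBAR: 'Row By Agonizing Row'.

Tags: sql-server excel stored-procedures datatable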


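【Answer 1】:

If you want to keep using .NET for this task, I think you can choose between two options:

  • Use SqlBulkCopy to copy the data into an intermediate (staging) table (you may be able to write the insert without an intermediate table), then write one command that inserts all the new data and a second command that updates all the changed data.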
  • Use table-valued parameters instead of SqlBulkCopy and an intermediate table (table-valued parameters require SQL Server 2008 or later).

Note that for small data sets SqlBulkCopy may be slower than table-valued parameters, but for large data sets it is faster.
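
As a rough illustration of the set-based idea, the sketch below takes the whole upload as one table-valued parameter and replaces the per-row calls with two UPDATE statements and one INSERT. The type name `dbo.CustomerUploadRow`, the procedure name `dbo.UpsertCustomerBatch`, and the temp table `#Matched` are hypothetical and not from the question. It assumes that AspNetUsers.Email is unique and that each SSN appears at most once per batch, and it leaves out the CustomerOption and LastPullRecords steps, which could be handled the same way. With the SqlBulkCopy option, a staging table would take the place of `@Rows`.

```sql
-- Hypothetical table type mirroring the customer columns of the original procedure.
CREATE TYPE dbo.CustomerUploadRow AS TABLE
(
    FullName NVARCHAR(50), FirstName NVARCHAR(50), LastName NVARCHAR(50),
    StreetAddress NVARCHAR(50), City NVARCHAR(50), [State] NVARCHAR(50), Zip INT,
    SSN NVARCHAR(50), Email NVARCHAR(150),
    OfficerEmail NVARCHAR(50), OfficerId NVARCHAR(50), OfficerName NVARCHAR(50),
    DownloadedFromFTP BIT
);
GO  -- batch separator understood by SSMS and sqlcmd

CREATE PROCEDURE dbo.UpsertCustomerBatch
    @Rows dbo.CustomerUploadRow READONLY
AS
BEGIN
    SET NOCOUNT ON;

    -- Resolve every uploaded row to an existing FileId (or NULL) in one pass,
    -- and record whether the name/address columns differ from the stored row.
    -- A NULL on either side counts as "changed", as in the original EXISTS check.
    SELECT r.*, m.FileId,
           CASE WHEN cf.FirstName = r.FirstName AND cf.LastName = r.LastName
                 AND cf.StreetAddress = r.StreetAddress AND cf.City = r.City
                 AND cf.[State] = r.[State] AND cf.Zip = r.Zip
                THEN 0 ELSE 1 END AS IsChanged
    INTO #Matched
    FROM @Rows r
    LEFT JOIN AspNetUsers officer ON officer.Email = r.OfficerEmail  -- assumes unique Email
    OUTER APPLY (
        SELECT TOP (1) f.FileId
        FROM CustomerFile f
        INNER JOIN AspNetUsers usr ON usr.Id = f.OfficerId
        WHERE f.SSN = r.SSN AND usr.CompanyId = officer.CompanyId
        ORDER BY f.FileReceivedDate, f.FileId DESC
    ) m
    LEFT JOIN CustomerFile cf ON cf.FileId = m.FileId;

    -- Name/address unchanged: refresh the officer details only.
    UPDATE cf
    SET OfficerEmail = m.OfficerEmail, OfficerName = m.OfficerName, Email = m.Email,
        DownloadedFromFTP = m.DownloadedFromFTP, IsActive = 1, OfficerId = m.OfficerId
    FROM CustomerFile cf
    INNER JOIN #Matched m ON m.FileId = cf.FileId
    WHERE m.IsChanged = 0;

    -- Name/address changed: update all details and mark the row with Action = 'U'.
    UPDATE cf
    SET FullName = m.FullName, FirstName = m.FirstName, LastName = m.LastName,
        StreetAddress = m.StreetAddress, City = m.City, [State] = m.[State], Zip = m.Zip,
        OfficerEmail = m.OfficerEmail, OfficerName = m.OfficerName, Email = m.Email,
        FileModifiedDate = GETDATE(), DownloadedFromFTP = m.DownloadedFromFTP,
        IsActive = 1, [Action] = 'U'
    FROM CustomerFile cf
    INNER JOIN #Matched m ON m.FileId = cf.FileId
    WHERE m.IsChanged = 1;

    -- No existing file: insert all new customers at once with Action = 'A'.
    INSERT INTO CustomerFile
        (FullName, FirstName, LastName, StreetAddress, City, [State], Zip, SSN,
         OfficerEmail, OfficerId, OfficerName, FileReceivedDate,
         DownloadedFromFTP, IsActive, [Action], Email)
    SELECT m.FullName, m.FirstName, m.LastName, m.StreetAddress, m.City, m.[State],
           m.Zip, m.SSN, m.OfficerEmail, m.OfficerId, m.OfficerName, GETDATE(),
           m.DownloadedFromFTP, 1, 'A', m.Email
    FROM #Matched m
    WHERE m.FileId IS NULL;
END
```

From .NET, the DataTable would be passed as a single parameter of type SqlDbType.Structured, so the procedure runs once per upload rather than once per row. One thing to check before switching: a trigger fires once per statement, not once per row, so the trigger on CustomerFile has to handle multi-row `inserted` and `deleted` tables correctly.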
