【Question title】:Using CTE with paging to fetch records from 3 tables
【Posted】:2019-05-20 06:59:32
【Question description】:

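I have three tables, TableA, TableB, and TableC. TableA holds several million records.

TableA has AccountId; TableB has AccountId, ClientId, and the clients' CertificateId; TableC holds the certificates.

The catch is that a single AccountId in TableB can have multiple clients with multiple certificates.

When I fetch data from TableA by joining TableB and TableC, I get duplicate records, because an AccountId in TableB has multiple clients with multiple certificates.

You can use this script to create and populate the tables and reproduce the situation: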

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[TableA]
(
    [AccountId] [int] NOT NULL,
    [Name] [nvarchar](50) NULL,
    [Mobile] [nchar](10) NULL,
 CONSTRAINT [PK_Accounts] PRIMARY KEY CLUSTERED 
(
    [AccountId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[TableB]
(
    [Id] [int] NOT NULL,
    [ClientId] [int] NOT NULL,
    [CertificateId] [int] NOT NULL,
    [AccountId] [int] NOT NULL,
 CONSTRAINT [PK_TableB] PRIMARY KEY CLUSTERED 
(
    [Id] ASC,
    [ClientId] ASC,
    [CertificateId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[TableC]
(
    [CertificateId] [int] NOT NULL,
    [Status] [bit] NOT NULL,
    [Description] [nvarchar](50) NULL,
 CONSTRAINT [PK_TableC] PRIMARY KEY CLUSTERED 
(
    [CertificateId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

INSERT [dbo].[TableA] ([AccountId], [Name], [Mobile]) VALUES (1, N'John', N'98        ')
INSERT [dbo].[TableA] ([AccountId], [Name], [Mobile]) VALUES (2, N'Henry', N'9808      ')
INSERT [dbo].[TableA] ([AccountId], [Name], [Mobile]) VALUES (3, N'Paine', N'9045      ')
INSERT [dbo].[TableA] ([AccountId], [Name], [Mobile]) VALUES (4, N'Andrew', N'887       ')
INSERT [dbo].[TableA] ([AccountId], [Name], [Mobile]) VALUES (5, N'Stocks', N'78        ')
INSERT [dbo].[TableB] ([Id], [ClientId], [CertificateId], [AccountId]) VALUES (1, 5, 34, 1)
INSERT [dbo].[TableB] ([Id], [ClientId], [CertificateId], [AccountId]) VALUES (2, 8, 34, 1)
INSERT [dbo].[TableB] ([Id], [ClientId], [CertificateId], [AccountId]) VALUES (3, 7, 36, 2)
INSERT [dbo].[TableB] ([Id], [ClientId], [CertificateId], [AccountId]) VALUES (4, 9, 37, 3)
INSERT [dbo].[TableB] ([Id], [ClientId], [CertificateId], [AccountId]) VALUES (5, 10, 37, 4)
INSERT [dbo].[TableB] ([Id], [ClientId], [CertificateId], [AccountId]) VALUES (6, 4, 37, 4)
INSERT [dbo].[TableB] ([Id], [ClientId], [CertificateId], [AccountId]) VALUES (7, 61, 37, 4)
INSERT [dbo].[TableB] ([Id], [ClientId], [CertificateId], [AccountId]) VALUES (8, 45, 35, 5)
INSERT [dbo].[TableC] ([CertificateId], [Status], [Description]) VALUES (34, 1, N'Certificate 1')
INSERT [dbo].[TableC] ([CertificateId], [Status], [Description]) VALUES (35, 1, N'Certificate 2')
INSERT [dbo].[TableC] ([CertificateId], [Status], [Description]) VALUES (36, 1, N'Certificate 3')
INSERT [dbo].[TableC] ([CertificateId], [Status], [Description]) VALUES (37, 0, N'Certificate 4')
ALTER TABLE [dbo].[TableB]  WITH CHECK ADD  CONSTRAINT [FK_TableB_TableA] FOREIGN KEY([AccountId])
REFERENCES [dbo].[TableA] ([AccountId])
GO
ALTER TABLE [dbo].[TableB] CHECK CONSTRAINT [FK_TableB_TableA]
GO
ALTER TABLE [dbo].[TableB]  WITH CHECK ADD  CONSTRAINT [FK_TableB_TableC] FOREIGN KEY([CertificateId])
REFERENCES [dbo].[TableC] ([CertificateId])
GO
ALTER TABLE [dbo].[TableB] CHECK CONSTRAINT [FK_TableB_TableC]
GO

My query:

DECLARE @From int=1
DECLARE @To int=5
; WITH CTE_Data_WITH_PAGING  AS 
  (SELECT ROW_NUMBER() OVER( ORDER BY A.AccountId) AS [ROW_NUMBERS], 
        A.AccountId, A.Name, A.Mobile FROM TableA A
        LEFT JOIN  TableB B  
        ON B.AccountId = A.AccountId
        INNER JOIN TableC C
        ON C.CertificateId=B.CertificateId 
        AND C.CertificateId<>01

        )
        SELECT  * FROM CTE_Data_WITH_PAGING WHERE ROW_NUMBERS BETWEEN @From AND @To;

I also tried using DISTINCT like this, but then the paging breaks:

SELECT DISTINCT(AccountId), Name, Mobile 
FROM CTE_Data_WITH_PAGING 
WHERE ROW_NUMBERS BETWEEN @From AND @To;

The paging problem: try @From=1 and @To=4 and look at the output. It should return AccountId 1, 2, 3, 4.
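To make the symptom concrete, the sketch below reuses the joins from the query above but lists every numbered row, so you can see which accounts occupy which row numbers. It assumes the sample data from the script and nothing else.

```sql
-- Same joins as the original CTE, but showing each row number next to its account.
SELECT ROW_NUMBER() OVER (ORDER BY A.AccountId) AS ROW_NUMBERS,
       A.AccountId
FROM TableA A
LEFT JOIN TableB B
       ON B.AccountId = A.AccountId
INNER JOIN TableC C
       ON C.CertificateId = B.CertificateId
      AND C.CertificateId <> 01
ORDER BY ROW_NUMBERS;
-- With the sample data, AccountId by row number is: 1, 1, 2, 3, 4, 4, 4, 5
```

Rows 1 to 4 therefore hold accounts 1, 1, 2, 3. Applying DISTINCT after the BETWEEN filter leaves only 1, 2, 3, because the duplicate of account 1 has already used up one slot of the page.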

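【Question comments】:

  • It's not clear to me what the problem is. Do you want only one row per customer in the result?
  • What is the question? What is your expected result? Can you post sample data?
  • Based on the query you provided, it's not clear why the joins to TableB and TableC are even necessary. TableA LEFT JOINs to B, which means you get all rows back from A regardless of what is in B or C.
  • @StuartAinsworth Yes, I only fetch records from TableA. I use TableC to check the certificate status, and TableB to map CertificateId to AccountId.
  • So should the LEFT JOIN be an INNER JOIN? Because right now TableB and TableC don't affect the output (other than multiplying the matches). If there is no match in TableB, you still get all rows from TableA.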

Tags: sql-server tsql join common-table-expression


【Solution 1】:

A quick fix is to use dense_rank combined with distinct. That ensures each account is counted exactly once. I also use an inner join, as suggested in the comments:

DECLARE @From int=1
DECLARE @To int=5
; WITH CTE_Data_WITH_PAGING  AS 
  (SELECT distinct dense_rank() OVER( ORDER BY A.AccountId) AS [ROW_NUMBERS], 
    A.AccountId, A.Name, A.Mobile FROM TableA A
    inner JOIN  TableB B  
    ON B.AccountId = A.AccountId
    INNER JOIN TableC C
    ON C.CertificateId=B.CertificateId 
    AND C.CertificateId<>01

    )
    SELECT  * FROM CTE_Data_WITH_PAGING WHERE ROW_NUMBERS BETWEEN @From AND @To;
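To illustrate why dense_rank behaves differently from row_number when the join repeats an account, here is a small standalone sketch. The inline VALUES list is made up for illustration and stands in for the joined rows; it does not touch the real tables.

```sql
-- Hypothetical inline data: account 1 appears twice, as it does after the join.
SELECT v.AccountId,
       ROW_NUMBER() OVER (ORDER BY v.AccountId) AS RowNum,
       DENSE_RANK() OVER (ORDER BY v.AccountId) AS DenseRank
FROM (VALUES (1), (1), (2), (3)) AS v(AccountId)
ORDER BY RowNum;
-- AccountId  RowNum  DenseRank
-- 1          1       1
-- 1          2       1
-- 2          3       2
-- 3          4       3
```

Because both copies of account 1 share the same dense rank, DISTINCT can collapse them into one row, and the rank values stay gap-free for the BETWEEN filter.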

I tend to prefer group by over distinct, especially in combination with analytic functions, so another approach, this time using row_number, could be:

DECLARE @From int=1
DECLARE @To int=5
; WITH CTE_Data_WITH_PAGING  AS 
  (SELECT  row_number() OVER( ORDER BY A.AccountId) AS [ROW_NUMBERS], 
    A.AccountId, A.Name, A.Mobile FROM TableA A
    inner JOIN  TableB B  
    ON B.AccountId = A.AccountId
    INNER JOIN TableC C
    ON C.CertificateId=B.CertificateId 
    AND C.CertificateId<>01
    group by  A.AccountId, A.Name, A.Mobile
    )
    SELECT  * FROM CTE_Data_WITH_PAGING WHERE ROW_NUMBERS BETWEEN @From AND @To;

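I group here to get unique values; row_number() is evaluated after the grouping, so each account gets exactly one number.

Finally, if all you need is paging and you don't need the row number itself, just use OFFSET and FETCH, like so: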

DECLARE @From int=1
DECLARE @To int=5

 SELECT  
  A.AccountId, A.Name, A.Mobile FROM TableA A
  inner JOIN  TableB B  
  ON B.AccountId = A.AccountId
  INNER JOIN TableC C
  ON C.CertificateId=B.CertificateId 
  AND C.CertificateId<>01
 group by  A.AccountId, A.Name, A.Mobile
 order by  A.AccountId
 OFFSET (@FROM-1) ROWS FETCH NEXT (@TO-(@FROM-1)) ROWS ONLY;
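If the caller thinks in pages rather than row ranges, the same OFFSET/FETCH arithmetic can be written in terms of a page number and a page size. This is only a sketch: @PageNumber and @PageSize are hypothetical names that do not appear in the question, and the query pages over TableA alone to keep the arithmetic visible. OFFSET/FETCH requires SQL Server 2012 or later.

```sql
-- Hypothetical parameters, not part of the original question.
DECLARE @PageNumber int = 2;  -- 1-based page index
DECLARE @PageSize   int = 2;  -- rows per page

SELECT A.AccountId, A.Name, A.Mobile
FROM TableA A
ORDER BY A.AccountId
OFFSET ((@PageNumber - 1) * @PageSize) ROWS  -- skip the earlier pages
FETCH NEXT @PageSize ROWS ONLY;              -- return one page
-- With the sample data this returns AccountId 3 and 4.
```

The @From/@To form used above is equivalent, with an offset of @From - 1 and a fetch count of @To - @From + 1.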

【Discussion】:

  • I used your row_number() with group by and your dense_rank solutions. But both have performance problems on large record sets. Can you suggest how to deal with that?
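One direction that might be worth trying for the performance question, offered as an untested sketch rather than a measured fix: since only columns from TableA are returned, the joins can be expressed as an EXISTS check. That avoids multiplying rows in the first place, so no DISTINCT or GROUP BY is needed before paging. The sketch assumes the intent is "accounts that have at least one matching certificate", and the index name is hypothetical; whether either change helps depends on the real data and execution plans.

```sql
DECLARE @From int = 1;
DECLARE @To   int = 4;

SELECT A.AccountId, A.Name, A.Mobile
FROM dbo.TableA AS A
WHERE EXISTS (
        -- At most one "hit" per account, however many clients or certificates it has.
        SELECT 1
        FROM dbo.TableB AS B
        INNER JOIN dbo.TableC AS C
                ON C.CertificateId = B.CertificateId
        WHERE B.AccountId = A.AccountId
          AND C.CertificateId <> 1
      )
ORDER BY A.AccountId
OFFSET (@From - 1) ROWS
FETCH NEXT (@To - @From + 1) ROWS ONLY;
-- With the sample data this returns AccountId 1, 2, 3, 4.

-- Assumption: an index like this may help the EXISTS lookup, because TableB's
-- clustered key starts with Id rather than AccountId. The name is made up.
CREATE NONCLUSTERED INDEX IX_TableB_AccountId_CertificateId
    ON dbo.TableB (AccountId, CertificateId);
```

Comparing the actual execution plans of this query and the GROUP BY version on a realistic data volume would show whether the rewrite is worthwhile.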