【Question title】: Optimizing "starts with" comparisons in SQL Server
【Posted】: 2014-06-15 14:35:01
【Question】:

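I have a table with 450,000 rows containing a variable-length varchar column (between 6 and 13 characters, unevenly distributed). I need to join it to another table, with the criterion that the column in the target table starts with the value of the column in the first table.

In my current test sample I know that all matches are 6 characters long, so I join on t1.Digits = left(t2.Number, 6), which is very fast (the large query runs in just a few seconds). My test sample is 10,000 records, but in production the query will need to operate on hundreds of thousands of records.

I also know that the vast majority of records will always be 6-character matches, but I need to support longer matches, otherwise duplicate records are sometimes returned. The problem is that I have tried all of the following, and each one is dramatically slower than my simple join on the left six characters. I never let them run for more than five minutes, but they showed no sign of finishing:

  1. t1.Digits = left(t2.Number, datalength(t1.Digits))
  2. charindex(t1.Digits, t2.Number) = 1
  3. Adding a precomputed DigitLength int column to t1, then using t1.Digits = left(t2.Number, t1.DigitLength)
  4. t2.Number like t1.Digits + '%'

Each of the four solutions above does what I want in theory, but runs far too slowly for my purposes.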
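One way to keep the speed of an equality join while still supporting longer prefixes is to test each candidate length from 6 to 13 with its own equality comparison and keep the longest hit. The following is a minimal sketch rather than a tested solution for the production schema: #Prefixes and #Numbers are hypothetical stand-ins for t1 and t2, the Id column is an assumption, and it presumes that the longest matching prefix is the one wanted.

```sql
-- Hypothetical stand-ins for t1 (prefixes) and t2 (incoming numbers).
create table #Prefixes (Digits varchar(13) not null primary key);
create table #Numbers  (Id int identity(1,1) primary key, Number varchar(20) not null);

insert #Prefixes (Digits)
select '123456' union all select '1234567';

insert #Numbers (Number)
select '1234567890' union all select '1234560000' union all select '9999990000';

-- Candidate prefix lengths 6..13, taken from the length range stated in the question.
with Lengths (n) as (
    select 6 union all select 7 union all select 8 union all select 9
    union all select 10 union all select 11 union all select 12 union all select 13
),
Matches as (
    select nb.Id, nb.Number, p.Digits,
           row_number() over (partition by nb.Id order by l.n desc) as rn
    from #Numbers nb
    cross join Lengths l
    join #Prefixes p
      on p.Digits = left(nb.Number, l.n)   -- plain equality, so an index on Digits can be used
    where len(nb.Number) >= l.n            -- skip lengths longer than the number itself
)
select Id, Number, Digits
from Matches
where rn = 1;                              -- keep only the longest matching prefix per number

drop table #Prefixes;
drop table #Numbers;
```

With these sample rows, 1234567890 pairs with 1234567, 1234560000 pairs with 123456, and 9999990000 returns no row. Whether this beats the LIKE join depends on the data and on the linked-server hop, so it would need to be measured.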

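Even though the values in these columns are numeric, I am sticking with varchar because leading zeros need to be preserved in many cases. In any case, there should be a fast solution even when the data is alphanumeric.

Does anyone know of a really fast "starts with" logic that is comparable in performance to my overly simplistic join?

I do have a clustered index on the t1.Digits column.

Here is the execution plan from a run using method #4 above: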

    <?xml version="1.0" encoding="utf-16"?>
<ShowPlanXML xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Version="1.0" Build="9.00.5000.00" xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">
  <BatchSequence>
    <Batch>
      <Statements>
        <StmtSimple StatementCompId="1" StatementEstRows="10720" StatementId="1" StatementOptmLevel="FULL" StatementSubTreeCost="7471.7" StatementText="select c.FromNumber, c.ToNumber, d.Destination, d.Digits&#xD;&#xA;from Converting c&#xD;&#xA;--join CASH.CASH.dbo.DestinationLookup d on d.Digits = left(c.FromNumber, 6) &#xD;&#xA;join CASH.CASH.dbo.DestinationLookup d on c.FromNumber like d.Digits + '%' &#xD;&#xA;" StatementType="SELECT">
          <StatementSetOptions ANSI_NULLS="false" ANSI_PADDING="false" ANSI_WARNINGS="false" ARITHABORT="true" CONCAT_NULL_YIELDS_NULL="false" NUMERIC_ROUNDABORT="false" QUOTED_IDENTIFIER="false" />
          <QueryPlan DegreeOfParallelism="1" MemoryGrant="114" CachedPlanSize="99" CompileTime="36" CompileCPU="35" CompileMemory="312">
            <RelOp AvgRowSize="77" EstimateCPU="174.861" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="10720" LogicalOp="Inner Join" NodeId="0" Parallel="false" PhysicalOp="Nested Loops" EstimatedTotalSubtreeCost="7471.7">
              <OutputList>
                <ColumnReference Database="[CASH]" Schema="[dbo]" Table="[Converting]" Alias="[c]" Column="FromNumber" />
                <ColumnReference Database="[CASH]" Schema="[dbo]" Table="[Converting]" Alias="[c]" Column="ToNumber" />
                <ColumnReference Server="[CASH]" Database="[CASH]" Schema="[dbo]" Table="[DestinationLookup]" Alias="[d]" Column="Digits" />
                <ColumnReference Server="[CASH]" Database="[CASH]" Schema="[dbo]" Table="[DestinationLookup]" Alias="[d]" Column="Destination" />
              </OutputList>
              <RunTimeInformation>
                <RunTimeCountersPerThread Thread="0" ActualRows="10720" ActualEndOfScans="1" ActualExecutions="1" />
              </RunTimeInformation>
              <NestedLoops Optimized="false">
                <OuterReferences>
                  <ColumnReference Database="[CASH]" Schema="[dbo]" Table="[Converting]" Alias="[c]" Column="FromNumber" />
                </OuterReferences>
                <RelOp AvgRowSize="38" EstimateCPU="0.164714" EstimateIO="0.00281532" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="10720" LogicalOp="Sort" NodeId="1" Parallel="false" PhysicalOp="Sort" EstimatedTotalSubtreeCost="0.340338">
                  <OutputList>
                    <ColumnReference Database="[CASH]" Schema="[dbo]" Table="[Converting]" Alias="[c]" Column="FromNumber" />
                    <ColumnReference Database="[CASH]" Schema="[dbo]" Table="[Converting]" Alias="[c]" Column="ToNumber" />
                  </OutputList>
                  <MemoryFractions Input="1" Output="1" />
                  <RunTimeInformation>
                    <RunTimeCountersPerThread Thread="0" ActualRebinds="1" ActualRewinds="0" ActualRows="10720" ActualEndOfScans="1" ActualExecutions="1" />
                  </RunTimeInformation>
                  <Sort Distinct="false">
                    <OrderBy>
                      <OrderByColumn Ascending="true">
                        <ColumnReference Database="[CASH]" Schema="[dbo]" Table="[Converting]" Alias="[c]" Column="FromNumber" />
                      </OrderByColumn>
                    </OrderBy>
                    <RelOp AvgRowSize="38" EstimateCPU="0.00296763" EstimateIO="0.126907" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="10720" LogicalOp="Table Scan" NodeId="2" Parallel="false" PhysicalOp="Table Scan" EstimatedTotalSubtreeCost="0.129875">
                      <OutputList>
                        <ColumnReference Database="[CASH]" Schema="[dbo]" Table="[Converting]" Alias="[c]" Column="FromNumber" />
                        <ColumnReference Database="[CASH]" Schema="[dbo]" Table="[Converting]" Alias="[c]" Column="ToNumber" />
                      </OutputList>
                      <RunTimeInformation>
                        <RunTimeCountersPerThread Thread="0" ActualRows="10720" ActualEndOfScans="1" ActualExecutions="1" />
                      </RunTimeInformation>
                      <TableScan Ordered="false" ForcedIndex="false" NoExpandHint="false">
                        <DefinedValues>
                          <DefinedValue>
                            <ColumnReference Database="[CASH]" Schema="[dbo]" Table="[Converting]" Alias="[c]" Column="FromNumber" />
                          </DefinedValue>
                          <DefinedValue>
                            <ColumnReference Database="[CASH]" Schema="[dbo]" Table="[Converting]" Alias="[c]" Column="ToNumber" />
                          </DefinedValue>
                        </DefinedValues>
                        <Object Database="[CASH]" Schema="[dbo]" Table="[Converting]" Alias="[c]" />
                      </TableScan>
                    </RelOp>
                  </Sort>
                </RelOp>
                <RelOp AvgRowSize="48" EstimateCPU="0.00290986" EstimateIO="0.01" EstimateRebinds="1390" EstimateRewinds="9329" EstimateRows="15609.2" LogicalOp="Lazy Spool" NodeId="3" Parallel="false" PhysicalOp="Table Spool" EstimatedTotalSubtreeCost="7296.5">
                  <OutputList>
                    <ColumnReference Server="[CASH]" Database="[CASH]" Schema="[dbo]" Table="[DestinationLookup]" Alias="[d]" Column="Digits" />
                    <ColumnReference Server="[CASH]" Database="[CASH]" Schema="[dbo]" Table="[DestinationLookup]" Alias="[d]" Column="Destination" />
                  </OutputList>
                  <RunTimeInformation>
                    <RunTimeCountersPerThread Thread="0" ActualRebinds="1391" ActualRewinds="9329" ActualRows="10720" ActualEndOfScans="10720" ActualExecutions="10720" />
                  </RunTimeInformation>
                  <Spool>
                    <RelOp AvgRowSize="48" EstimateCPU="5.21308" EstimateIO="0" EstimateRebinds="1390" EstimateRewinds="0" EstimateRows="15609.2" LogicalOp="Compute Scalar" NodeId="4" Parallel="false" PhysicalOp="Compute Scalar" EstimatedTotalSubtreeCost="7251.4">
                      <OutputList>
                        <ColumnReference Server="[CASH]" Database="[CASH]" Schema="[dbo]" Table="[DestinationLookup]" Alias="[d]" Column="Digits" />
                        <ColumnReference Server="[CASH]" Database="[CASH]" Schema="[dbo]" Table="[DestinationLookup]" Alias="[d]" Column="Destination" />
                      </OutputList>
                      <ComputeScalar>
                        <DefinedValues>
                          <DefinedValue>
                            <ColumnReference Server="[CASH]" Database="[CASH]" Schema="[dbo]" Table="[DestinationLookup]" Alias="[d]" Column="Digits" />
                            <ScalarOperator ScalarString="[CASH].[CASH].[dbo].[DestinationLookup].[Digits] as [d].[Digits]">
                              <Identifier>
                                <ColumnReference Server="[CASH]" Database="[CASH]" Schema="[dbo]" Table="[DestinationLookup]" Alias="[d]" Column="Digits" />
                              </Identifier>
                            </ScalarOperator>
                          </DefinedValue>
                          <DefinedValue>
                            <ColumnReference Server="[CASH]" Database="[CASH]" Schema="[dbo]" Table="[DestinationLookup]" Alias="[d]" Column="Destination" />
                            <ScalarOperator ScalarString="[CASH].[CASH].[dbo].[DestinationLookup].[Destination] as [d].[Destination]">
                              <Identifier>
                                <ColumnReference Server="[CASH]" Database="[CASH]" Schema="[dbo]" Table="[DestinationLookup]" Alias="[d]" Column="Destination" />
                              </Identifier>
                            </ScalarOperator>
                          </DefinedValue>
                        </DefinedValues>
                        <RelOp AvgRowSize="48" EstimateCPU="5.21308" EstimateIO="0" EstimateRebinds="1390" EstimateRewinds="0" EstimateRows="15609.2" LogicalOp="Remote Query" NodeId="5" Parallel="false" PhysicalOp="Remote Query" EstimatedTotalSubtreeCost="7251.4">
                          <OutputList>
                            <ColumnReference Server="[CASH]" Database="[CASH]" Schema="[dbo]" Table="[DestinationLookup]" Alias="[d]" Column="Digits" />
                            <ColumnReference Server="[CASH]" Database="[CASH]" Schema="[dbo]" Table="[DestinationLookup]" Alias="[d]" Column="Destination" />
                          </OutputList>
                          <RunTimeInformation>
                            <RunTimeCountersPerThread Thread="0" ActualRebinds="1391" ActualRewinds="0" ActualRows="1391" ActualEndOfScans="1391" ActualExecutions="1391" />
                          </RunTimeInformation>
                          <RemoteQuery RemoteSource="CASH" RemoteQuery="SELECT &quot;Tbl1004&quot;.&quot;Digits&quot; &quot;Col1021&quot;,&quot;Tbl1004&quot;.&quot;Destination&quot; &quot;Col1022&quot; FROM &quot;CASH&quot;.&quot;dbo&quot;.&quot;DestinationLookup&quot; &quot;Tbl1004&quot; WHERE ? like &quot;Tbl1004&quot;.&quot;Digits&quot;+'%'" />
                        </RelOp>
                      </ComputeScalar>
                    </RelOp>
                  </Spool>
                </RelOp>
              </NestedLoops>
            </RelOp>
          </QueryPlan>
        </StmtSimple>
      </Statements>
    </Batch>
  </BatchSequence>
</ShowPlanXML>

Here is the plan when joining on the simple left(t2.Number, 6):

    <?xml version="1.0" encoding="utf-16"?>
<ShowPlanXML xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Version="1.0" Build="9.00.5000.00" xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">
  <BatchSequence>
    <Batch>
      <Statements>
        <StmtSimple StatementCompId="1" StatementEstRows="10720" StatementId="1" StatementOptmLevel="FULL" StatementSubTreeCost="15.1845" StatementText="select c.FromNumber, c.ToNumber, d.Destination, d.Digits&#xD;&#xA;from Converting c&#xD;&#xA;join CASH.CASH.dbo.DestinationLookup d on d.Digits = left(c.FromNumber, 6) " StatementType="SELECT">
          <StatementSetOptions ANSI_NULLS="false" ANSI_PADDING="false" ANSI_WARNINGS="false" ARITHABORT="true" CONCAT_NULL_YIELDS_NULL="false" NUMERIC_ROUNDABORT="false" QUOTED_IDENTIFIER="false" />
          <QueryPlan DegreeOfParallelism="1" CachedPlanSize="105" CompileTime="60" CompileCPU="58" CompileMemory="360">
            <RelOp AvgRowSize="77" EstimateCPU="0.0448096" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="10720" LogicalOp="Inner Join" NodeId="0" Parallel="false" PhysicalOp="Nested Loops" EstimatedTotalSubtreeCost="15.1845">
              <OutputList>
                <ColumnReference Database="[CASH]" Schema="[dbo]" Table="[Converting]" Alias="[c]" Column="FromNumber" />
                <ColumnReference Database="[CASH]" Schema="[dbo]" Table="[Converting]" Alias="[c]" Column="ToNumber" />
                <ColumnReference Server="[CASH]" Database="[CASH]" Schema="[dbo]" Table="[DestinationLookup]" Alias="[d]" Column="Digits" />
                <ColumnReference Server="[CASH]" Database="[CASH]" Schema="[dbo]" Table="[DestinationLookup]" Alias="[d]" Column="Destination" />
              </OutputList>
              <RunTimeInformation>
                <RunTimeCountersPerThread Thread="0" ActualRows="10720" ActualEndOfScans="1" ActualExecutions="1" />
              </RunTimeInformation>
              <NestedLoops Optimized="false">
                <OuterReferences>
                  <ColumnReference Column="Expr1005" />
                </OuterReferences>
                <RelOp AvgRowSize="43" EstimateCPU="0.001072" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="10720" LogicalOp="Compute Scalar" NodeId="1" Parallel="false" PhysicalOp="Compute Scalar" EstimatedTotalSubtreeCost="0.13985">
                  <OutputList>
                    <ColumnReference Database="[CASH]" Schema="[dbo]" Table="[Converting]" Alias="[c]" Column="FromNumber" />
                    <ColumnReference Database="[CASH]" Schema="[dbo]" Table="[Converting]" Alias="[c]" Column="ToNumber" />
                    <ColumnReference Column="Expr1005" />
                  </OutputList>
                  <ComputeScalar>
                    <DefinedValues>
                      <DefinedValue>
                        <ColumnReference Column="Expr1005" />
                        <ScalarOperator ScalarString="substring([CASH].[dbo].[Converting].[FromNumber] as [c].[FromNumber],(1),(6))">
                          <Intrinsic FunctionName="substring">
                            <ScalarOperator>
                              <Identifier>
                                <ColumnReference Database="[CASH]" Schema="[dbo]" Table="[Converting]" Alias="[c]" Column="FromNumber" />
                              </Identifier>
                            </ScalarOperator>
                            <ScalarOperator>
                              <Const ConstValue="(1)" />
                            </ScalarOperator>
                            <ScalarOperator>
                              <Const ConstValue="(6)" />
                            </ScalarOperator>
                          </Intrinsic>
                        </ScalarOperator>
                      </DefinedValue>
                    </DefinedValues>
                    <RelOp AvgRowSize="38" EstimateCPU="0.011949" EstimateIO="0.126829" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="10720" LogicalOp="Table Scan" NodeId="2" Parallel="false" PhysicalOp="Table Scan" EstimatedTotalSubtreeCost="0.138778">
                      <OutputList>
                        <ColumnReference Database="[CASH]" Schema="[dbo]" Table="[Converting]" Alias="[c]" Column="FromNumber" />
                        <ColumnReference Database="[CASH]" Schema="[dbo]" Table="[Converting]" Alias="[c]" Column="ToNumber" />
                      </OutputList>
                      <RunTimeInformation>
                        <RunTimeCountersPerThread Thread="0" ActualRows="10720" ActualEndOfScans="1" ActualExecutions="1" />
                      </RunTimeInformation>
                      <TableScan Ordered="false" ForcedIndex="false" NoExpandHint="false">
                        <DefinedValues>
                          <DefinedValue>
                            <ColumnReference Database="[CASH]" Schema="[dbo]" Table="[Converting]" Alias="[c]" Column="FromNumber" />
                          </DefinedValue>
                          <DefinedValue>
                            <ColumnReference Database="[CASH]" Schema="[dbo]" Table="[Converting]" Alias="[c]" Column="ToNumber" />
                          </DefinedValue>
                        </DefinedValues>
                        <Object Database="[CASH]" Schema="[dbo]" Table="[Converting]" Alias="[c]" />
                      </TableScan>
                    </RelOp>
                  </ComputeScalar>
                </RelOp>
                <RelOp AvgRowSize="48" EstimateCPU="0.000258212" EstimateIO="0.003125" EstimateRebinds="10580.9" EstimateRewinds="138.124" EstimateRows="1" LogicalOp="Lazy Spool" NodeId="6" Parallel="false" PhysicalOp="Index Spool" EstimatedTotalSubtreeCost="14.9998">
                  <OutputList>
                    <ColumnReference Server="[CASH]" Database="[CASH]" Schema="[dbo]" Table="[DestinationLookup]" Alias="[d]" Column="Digits" />
                    <ColumnReference Server="[CASH]" Database="[CASH]" Schema="[dbo]" Table="[DestinationLookup]" Alias="[d]" Column="Destination" />
                  </OutputList>
                  <RunTimeInformation>
                    <RunTimeCountersPerThread Thread="0" ActualRebinds="830" ActualRewinds="9890" ActualRows="10720" ActualEndOfScans="0" ActualExecutions="10720" />
                  </RunTimeInformation>
                  <Spool>
                    <SeekPredicate>
                      <Prefix ScanType="EQ">
                        <RangeColumns>
                          <ColumnReference Column="Expr1005" />
                        </RangeColumns>
                        <RangeExpressions>
                          <ScalarOperator ScalarString="[Expr1005]">
                            <Identifier>
                              <ColumnReference Column="Expr1005" />
                            </Identifier>
                          </ScalarOperator>
                        </RangeExpressions>
                      </Prefix>
                    </SeekPredicate>
                    <RelOp AvgRowSize="48" EstimateCPU="0.0103333" EstimateIO="0" EstimateRebinds="1180" EstimateRewinds="0" EstimateRows="1" LogicalOp="Compute Scalar" NodeId="7" Parallel="false" PhysicalOp="Compute Scalar" EstimatedTotalSubtreeCost="12.2037">
                      <OutputList>
                        <ColumnReference Server="[CASH]" Database="[CASH]" Schema="[dbo]" Table="[DestinationLookup]" Alias="[d]" Column="Digits" />
                        <ColumnReference Server="[CASH]" Database="[CASH]" Schema="[dbo]" Table="[DestinationLookup]" Alias="[d]" Column="Destination" />
                      </OutputList>
                      <ComputeScalar>
                        <DefinedValues>
                          <DefinedValue>
                            <ColumnReference Server="[CASH]" Database="[CASH]" Schema="[dbo]" Table="[DestinationLookup]" Alias="[d]" Column="Digits" />
                            <ScalarOperator ScalarString="[CASH].[CASH].[dbo].[DestinationLookup].[Digits] as [d].[Digits]">
                              <Identifier>
                                <ColumnReference Server="[CASH]" Database="[CASH]" Schema="[dbo]" Table="[DestinationLookup]" Alias="[d]" Column="Digits" />
                              </Identifier>
                            </ScalarOperator>
                          </DefinedValue>
                          <DefinedValue>
                            <ColumnReference Server="[CASH]" Database="[CASH]" Schema="[dbo]" Table="[DestinationLookup]" Alias="[d]" Column="Destination" />
                            <ScalarOperator ScalarString="[CASH].[CASH].[dbo].[DestinationLookup].[Destination] as [d].[Destination]">
                              <Identifier>
                                <ColumnReference Server="[CASH]" Database="[CASH]" Schema="[dbo]" Table="[DestinationLookup]" Alias="[d]" Column="Destination" />
                              </Identifier>
                            </ScalarOperator>
                          </DefinedValue>
                        </DefinedValues>
                        <RelOp AvgRowSize="48" EstimateCPU="0.0103333" EstimateIO="0" EstimateRebinds="1180" EstimateRewinds="0" EstimateRows="1" LogicalOp="Remote Query" NodeId="8" Parallel="false" PhysicalOp="Remote Query" EstimatedTotalSubtreeCost="12.2037">
                          <OutputList>
                            <ColumnReference Server="[CASH]" Database="[CASH]" Schema="[dbo]" Table="[DestinationLookup]" Alias="[d]" Column="Digits" />
                            <ColumnReference Server="[CASH]" Database="[CASH]" Schema="[dbo]" Table="[DestinationLookup]" Alias="[d]" Column="Destination" />
                          </OutputList>
                          <RunTimeInformation>
                            <RunTimeCountersPerThread Thread="0" ActualRebinds="456" ActualRewinds="0" ActualRows="456" ActualEndOfScans="0" ActualExecutions="456" />
                          </RunTimeInformation>
                          <RemoteQuery RemoteSource="CASH" RemoteQuery="SELECT &quot;Tbl1004&quot;.&quot;Digits&quot; &quot;Col1015&quot;,&quot;Tbl1004&quot;.&quot;Destination&quot; &quot;Col1016&quot; FROM &quot;CASH&quot;.&quot;dbo&quot;.&quot;DestinationLookup&quot; &quot;Tbl1004&quot; WHERE &quot;Tbl1004&quot;.&quot;Digits&quot;=?" />
                        </RelOp>
                      </ComputeScalar>
                    </RelOp>
                  </Spool>
                </RelOp>
              </NestedLoops>
            </RelOp>
          </QueryPlan>
        </StmtSimple>
      </Statements>
    </Batch>
  </BatchSequence>
</ShowPlanXML>

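Update: I have not been able to find an ideal solution, but I found the next best thing. It turns out that a very simple query joining just these two tables with "like" completes in about five seconds. So instead of trying to cram that join into my monster query, where it never finishes, I use it to create a temporary lookup table, which my monster query then uses. All told, the big query now completes in 9 seconds, and my varchar join supports variable-length strings.

Another thing that helped speed this up was changing the fill factor for the column in t1 from 80 to 100. This fill factor suits the table well, since it is a static reference table that changes only once a year.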
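The workaround described in the update might look roughly like the sketch below. The temp tables #Converting and #DestinationLookup are stand-ins for the tables named in the execution plans, #DestinationMatch and the index names are hypothetical, and the real lookup table lives on a linked server, so its index would have to be built there.

```sql
-- Stand-ins for the two tables in the plans; the real DestinationLookup is remote.
create table #DestinationLookup (Digits varchar(13) not null, Destination varchar(50) not null);
create table #Converting (FromNumber varchar(20) not null, ToNumber varchar(20) not null);

insert #DestinationLookup (Digits, Destination)
select '123456', 'Dest A' union all select '1234567', 'Dest B';

insert #Converting (FromNumber, ToNumber)
select '1234567890', '5550001' union all select '1234560000', '5550002';

-- Static reference data: a fill factor of 100 packs the leaf pages fully when the index is built.
create clustered index ix_DestinationLookup_Digits
    on #DestinationLookup (Digits) with (fillfactor = 100);

-- Step 1: run the simple two-table LIKE join once and materialize the result.
select c.FromNumber, d.Digits, d.Destination
into #DestinationMatch
from #Converting c
join #DestinationLookup d
  on c.FromNumber like d.Digits + '%';

-- Step 2: index the temporary lookup so the large query can join on plain equality.
create clustered index ix_DestinationMatch_FromNumber
    on #DestinationMatch (FromNumber);

-- Step 3: the large query joins to the temp table instead of repeating the LIKE.
select c.FromNumber, c.ToNumber, m.Destination, m.Digits
from #Converting c
join #DestinationMatch m
  on m.FromNumber = c.FromNumber;

drop table #DestinationMatch;
drop table #Converting;
drop table #DestinationLookup;
```

If a number matches several prefixes, the temp table holds one row per match, so the final join can still return more than one row per number unless the matches are reduced first.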

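【Comments】:

  • Functions like LIKE are not SARGable, which means that ordinary indexes will be ignored. I really don't think there is a good way to do what you are attempting. Why can't you just store the appropriate value in the t2 table?
  • Who told you that? Only a leading wildcard kills index usage, not LIKE itself. For example, WHERE col LIKE 'x%' will use an index on col (if one exists).
  • Thanks for the suggestion. Unfortunately, my t2 table does not hold static content. It is used to process new incoming records, of which there will be millions per month. Putting the appropriate value into it is essentially what I am trying to do with this join.
  • @dean, thanks for demonstrating that "like" is the fastest of the four options. I still need something faster, but I am glad to know that anyway.
  • Then show us the actual execution plan, and thanks for the support :)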
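The point about leading wildcards can be checked with a small experiment. This is only a sketch using a hypothetical temp table #Demo; with so few rows the optimizer may choose a scan regardless, so the actual plan should be inspected on realistically sized data.

```sql
create table #Demo (col varchar(20) not null);
create index ix_Demo_col on #Demo (col);

insert #Demo (col)
select 'x1' union all select 'x2' union all select 'y1' union all select 'zx';

-- Trailing wildcard: the fixed prefix 'x' defines a range of the index, so a seek is possible.
select col from #Demo where col like 'x%';

-- Leading wildcard: there is no fixed prefix, so every index entry has to be examined.
select col from #Demo where col like '%x';

drop table #Demo;
```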

Tags: sql sql-server join query-optimization


【Answer 1】:

The best-performing of the four solutions is the fourth.

Let's set up the test environment:

create table #t1 (digits varchar(10), filler char(5000) default(''))
create table #t2 (number varchar(10), filler char(5000) default(''))
go

insert #t1 (digits) values
('123'),('234'),('345'),('456'),('567')

insert #t2 (number) values
('1234'),('234'),('345689'),('45'),('567890')
go

create index ix_t2 on #t2(number);
go

Now let's run the four semantically equivalent queries, with Query --> Include Actual Execution Plan enabled as well as SET STATISTICS IO ON:

-- 1
select *
from #t1
inner join #t2
on #t1.digits = left(#t2.number, datalength(#t1.digits))

-- 2
select *
from #t1
inner join #t2
on charindex(#t1.Digits, #t2.Number) = 1

-- 3 (as posted, identical to query 2; option 3 in the question used a precomputed length column)
select *
from #t1
inner join #t2
on charindex(#t1.digits, #t2.number) = 1

-- 4
select *
from #t1
inner join #t2
on #t2.number like #t1.digits + '%'

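As you can see, the execution plans for 1, 2 and 3 include table scan operators on both tables (plus an additional Compute Scalar operator for the first table), but the fourth query performs an index seek on the index on #t2. Also, if you check the STATISTICS IO output, you will see that logical reads on #t2 (the table with the index) measure 25 for queries 1, 2 and 3, but only 14 for the fourth (of course, the more rows there are, the higher the numbers).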
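For reference, the measurement for the fourth query can be repeated as sketched below. It assumes the #t1 and #t2 temp tables from the setup script still exist in the current session; the logical-read counts appear in the Messages output and will vary with the data.

```sql
set statistics io on;

-- Query 4: the LIKE predicate with a trailing wildcard, reported above to seek on ix_t2.
select *
from #t1
inner join #t2
    on #t2.number like #t1.digits + '%';

set statistics io off;

-- Clean up the test tables once the comparison is done.
drop table #t1;
drop table #t2;
```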

【Comments】:

    【Answer 2】:

    Create an index on table1.digits. Then try the following:

    select t2.*, t1.<whatever>
    from table2 t2 cross apply
         (select top 1 <whatever>
          from table1 t1
          where t1.digits <= t2.number
          order by t1.digits desc
         ) t1;
    

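    SQL Server is sometimes better at optimizing "apply" queries than regular joins. In this case, it may figure out that the index is useful for both the where and the order by, and proceed efficiently. (I also think the same holds for a correlated subquery.)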
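As written, the top 1 row with digits <= number is not guaranteed to be a prefix of number, so a variant that checks the prefix explicitly may be safer. The following sketch is an adaptation, not the answer's original query: #table1 and #table2 are hypothetical stand-ins, and it assumes that the longest matching prefix is wanted.

```sql
-- Hypothetical stand-ins for table1 (prefixes) and table2 (numbers).
create table #table1 (digits varchar(13) not null primary key);
create table #table2 (number varchar(20) not null);

insert #table1 (digits)
select '123456' union all select '1234567' union all select '654321';

insert #table2 (number)
select '1234567890' union all select '1234569999'
union all select '6543210000' union all select '0000001111';

-- For each number, pick the longest prefix that it actually starts with.
select t2.number, t1.digits
from #table2 t2
cross apply
     (select top 1 p.digits
      from #table1 p
      where t2.number like p.digits + '%'   -- explicit "starts with" check
      order by len(p.digits) desc           -- longest prefix first
     ) t1;

drop table #table1;
drop table #table2;
```

With these sample rows, 1234567890 pairs with 1234567, 1234569999 with 123456, and 6543210000 with 654321, while 0000001111 is dropped because cross apply omits outer rows with no match; outer apply would keep it with a NULL.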

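    【Comments】:

    • CROSS APPLY (CA) and a correlated subquery (CS) are completely different things. For one, you cannot use a CS as a query source. CA causes the subquery to be executed once for every row of the outer query, which can be devastating for performance; it is effectively a cursor by another name. Joins and CA are optimized very differently, because they are two semantically different expressions. Your statement that "SQL Server is sometimes better at optimizing 'apply' queries than regular joins" is true in the cases where a cursor is faster than a set-based solution.
    • @dean . . . I am not sure what your point is. There are cases where cross apply performs better than the alternatives. Here is an example: explainextended.com/2009/07/16/inner-join-vs-cross-apply. In this case, one execution per row of what is essentially an index scan may be more efficient than other approaches.
    • Of course there are cases where a cursor is faster than a set-based solution; I don't disagree. But my point is that you are comparing apples and oranges with CA vs. CS.
    • @Dean . . . There are comparisons of CROSS APPLY vs. cursor performance (shannonlowder.com/2012/01/…). You seem to know something about databases, but cursors and set-based operations are implemented very differently in the database. In some cases they may be applied to similar problems, but that does not mean they are doing the same thing. For example, SQL Server can use parallel query plans for apply operations (where appropriate), whereas cursors inherently introduce serial processing.
    • We are talking about RBAR here, one way or another. Is there a way to take this discussion out of the comments? I'm still a noob :)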