【Question title】: Efficient way to query for the existence of data in a partitioned table
【Posted】: 2017-04-24 07:57:27
【Question description】:

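I am on Oracle 11g Enterprise Edition 11.2.0.4.0.

I have a table with about 12M rows per partition. The partition key is SnapshotDate.

I need to determine, for each of the last 15 days of snapshots, whether any data exists.

The answer I found most often online tells me to use Row_Number() Over (Partition By SnapshotDate Order By SnapshotDate). This is the code I came up with (it only returns the dates that have values so far, so I would of course still need a left join against my calendar table):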

With AllDates As
(
  /* 
      partition by snapshot date so that numbering starts over again
      i have to use an order by - it gave me an error without one
  */
  Select SnapshotDate, 1 HasData, Row_Number() Over (Partition By SnapshotDate Order By SnapshotDate) RowNumber
  From FactTable
  Where SnapshotDate IN
  (
    /* any mechanism that gives me the last 15 calendar days will do */
    Select CalendarTimeId
    From DimCalendar
    Where CalendarDate Between To_Date ('20161208', 'yyyymmdd') - 15 And To_Date ('20161208', 'yyyymmdd')
  )
)
Select *
From AllDates
Where RowNumber = 1;

However, sorting 12 million rows for each of 15 days is very expensive: I am sorting 180 million rows to get 15 rows back. This is the output I want:

Date          HasData
===========   =======
12/08/2016    1
12/07/2016    1
12/06/2016    0
12/05/2016    0
12/04/2016    1
12/03/2016    0
12/02/2016    1
12/01/2016    0    
etc etc

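Is there a more efficient way to write a query like this?

【Question discussion】:

  • What is the partition interval?
  • Daily partitions, 12M rows per day.

Tags: sql performance plsql oracle11g partitioning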


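【Solution 1】:

I don't think there is a clean way to combine partition pruning with Top-N reporting. The code below is ugly and repetitive, but it gets the job done quickly.

It reads from each of the last 15 daily partitions, but rownum = 1 lets each read stop as soon as one row is found. The date can be replaced with a bind variable, but the numbers 0 to 15 must be hard-coded. If you need a variable number of days, you could hard-code dozens or hundreds of subqueries and then filter them out later with another bind variable. Running hundreds of subqueries is not ideal, but it would still be much faster than reading 180 million rows.

Query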

select CalendarDate, nvl(has_data, 0) has_data
from
(
    --The last 15 days.
    Select CalendarDate
    From DimCalendar
    Where CalendarDate Between To_Date ('20161208', 'yyyymmdd') - 15 And To_Date ('20161208', 'yyyymmdd')
) last_15_days
left join
(
    --The last 15 days of data, if any.
    select date '2016-12-08' - 0 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 0) and rownum = 1 union all
    select date '2016-12-08' - 1 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 1) and rownum = 1 union all
    select date '2016-12-08' - 2 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 2) and rownum = 1 union all
    select date '2016-12-08' - 3 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 3) and rownum = 1 union all
    select date '2016-12-08' - 4 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 4) and rownum = 1 union all
    select date '2016-12-08' - 5 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 5) and rownum = 1 union all
    select date '2016-12-08' - 6 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 6) and rownum = 1 union all
    select date '2016-12-08' - 7 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 7) and rownum = 1 union all
    select date '2016-12-08' - 8 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 8) and rownum = 1 union all
    select date '2016-12-08' - 9 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 9) and rownum = 1 union all
    select date '2016-12-08' - 10 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 10) and rownum = 1 union all
    select date '2016-12-08' - 11 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 11) and rownum = 1 union all
    select date '2016-12-08' - 12 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 12) and rownum = 1 union all
    select date '2016-12-08' - 13 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 13) and rownum = 1 union all
    select date '2016-12-08' - 14 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 14) and rownum = 1 union all
    select date '2016-12-08' - 15 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 15) and rownum = 1
) data_from_last_15_days
    on last_15_days.CalendarDate = data_from_last_15_days.the_date
order by CalendarDate desc;
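
The variable-day-count idea mentioned above could look roughly like the following sketch. The bind variables :end_date and :num_days are assumptions introduced here for illustration, and only three branches are shown; a real query would repeat the pattern up to the largest window it has to support. The sketch also compares SnapshotDate directly with the date instead of going through DimCalendar, which assumes SnapshotDate is a DATE with no time component, as in the test schema. Each branch carries an extra offset check so that branches outside the requested window return no rows; whether Oracle also skips the partition read for those branches depends on the execution plan and is worth verifying.

```sql
-- Sketch only: :end_date (bound as a DATE) and :num_days (bound as a NUMBER)
-- are hypothetical bind variables, not part of the original query.
select CalendarDate, nvl(has_data, 0) has_data
from
(
    --The calendar days in the requested window.
    select CalendarDate
    from DimCalendar
    where CalendarDate between :end_date - :num_days and :end_date
) last_n_days
left join
(
    --One hard-coded branch per possible day offset; repeat up to the maximum window size.
    select :end_date - 0 the_date, 1 has_data from FactTable where SnapshotDate = :end_date - 0 and 0 <= :num_days and rownum = 1 union all
    select :end_date - 1 the_date, 1 has_data from FactTable where SnapshotDate = :end_date - 1 and 1 <= :num_days and rownum = 1 union all
    select :end_date - 2 the_date, 1 has_data from FactTable where SnapshotDate = :end_date - 2 and 2 <= :num_days and rownum = 1
) data_from_last_n_days
    on last_n_days.CalendarDate = data_from_last_n_days.the_date
order by CalendarDate desc;
```

Because the outer query left joins from the calendar window, any branch beyond :num_days would not match a calendar row anyway; the extra predicate is only there to give the optimizer a chance to avoid the work.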

Test schema

create table FactTable
(
    id number,
    SnapshotDate date
) nologging
partition by range (SnapshotDate)
interval (interval '1' day)
(
    partition p1 values less than (date '2000-01-01')
);

create table DimCalendar
(
    CalendarDate date
);

--Add last year into calendar.
insert into DimCalendar
select date '2016-01-01' + (level - 1)
from dual
connect by level <= 365;


--Insert 1.2 million rows per day.
begin
    for i in 1 .. 15 loop
        insert /*+ append */ into facttable select level, date '2016-12-01' + i from dual connect by level <= 1200000;
        commit;
    end loop;
end;
/
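
To check against the test schema that a single branch behaves as described, its execution plan could be inspected. This is a sketch that uses a simplified branch with a literal date rather than the DimCalendar subquery, so the plan of the full query may differ. The things to look for are a COUNT STOPKEY step, which comes from rownum = 1, and Pstart/Pstop values that identify a single partition rather than a range.

```sql
-- Sketch only: a simplified single branch, used to inspect pruning and early termination.
explain plan for
select 1 has_data
from FactTable
where SnapshotDate = date '2016-12-08'
  and rownum = 1;

-- Display the plan that was just explained, including the Pstart/Pstop columns.
select * from table(dbms_xplan.display);
```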

