Temporal data grouping by date ranges in SQL Server: answers

【Question title】: Temporal data grouping by date ranges SQL Server
【Posted】: 2016-04-07 16:26:21
【Question】:

CREATE TABLE [dbo].[rx](
            -- assumed definition: the original post uses clmid as the key but does not show the column
            [clmid] [int] NOT NULL,
            [pat_id] [int] NOT NULL,
            [fill_Date] [date] NOT NULL,
            -- PERSISTED so that the CHECK constraint below may reference this computed column
            [script_End_Date]  AS (dateadd(day,[days_Sup],[fill_Date])) PERSISTED,
            [drug_Name] [varchar](50) NULL,
            [days_Sup] [int] NOT NULL,
            [quantity] [float] NOT NULL,
            [drug_Class] [char](3) NOT  NULL,
            CHECK([fill_Date] <= [script_End_Date]),
PRIMARY KEY CLUSTERED 
(
            [clmid] ASC
)
);
CREATE TABLE [dbo].[Calendar](
            [cal_date] [date] NOT NULL,
            [julian_seq] [int] IDENTITY(1,1) NOT NULL,
--unsure if the above line is an acceptable way of adding a 'julianized date number', the data in this database ranges from 1-1-2007 to 12-31-2009
PRIMARY KEY CLUSTERED 
(
            [cal_date] ASC
)
);
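On the comment about the 'julianized date number': an IDENTITY column only works as a day sequence if the calendar rows are inserted in date order with no gaps. The following is a minimal sketch of one way to fill the table under that assumption. The CTE name dates and its column d are illustrative and do not come from the original post.

```sql
-- Sketch: fill dbo.Calendar with every day from 2007-01-01 to 2009-12-31.
-- The ORDER BY on the INSERT ... SELECT makes the IDENTITY values follow date order,
-- so julian_seq grows by exactly 1 per calendar day.
WITH dates (d) AS (
    SELECT CAST('20070101' AS date)
    UNION ALL
    SELECT DATEADD(day, 1, d)
    FROM dates
    WHERE d < '20091231'
)
INSERT INTO dbo.Calendar (cal_date)
SELECT d
FROM dates
ORDER BY d
OPTION (MAXRECURSION 0);  -- the default limit of 100 recursion levels is too low for three years of days
```

If the table is loaded this way, julian_seq should equal DATEDIFF(day, '20070101', cal_date) + 1 for every row, which gives a quick consistency check.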

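I have my table of interest and a calendar table with the structures above. For a given set of drug families, identified by the drug_class field, I am trying to find the maximum number of different drugs a person was taking at the same time (that is, on overlapping dates).

With help from the community I have had success with a similar problem, but at the moment I am doing something incorrectly and getting wildly inaccurate results. If possible, I would like the returned result set to look like this: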

create table DesiredResults
(pat_id int, min_overlap date, max_overlap date, drug_class char(3),drug_name varchar(50))
insert into DesiredResults(pat_id, min_overlap, max_overlap, drug_class,drug_name)
values (1111,'2008-11-28', '2008-12-18','h3a','drug X')
      ,(1111,'2008-11-28','2008-12-18','h3a','drug Y')

This would mean that patient 1111 was prescribed both drug X and drug Y during the time frame above.

My query is:

;with Overlaps (pat_id,cal_date,drug_class)
as
(
select
mdo.pat_id
,c1.cal_date
,mdo.drug_class
from
(
--this gives a table of all the scripts a person had within the classes restricted in the where rx.drug_class IN clause and their fill_date and script_end_dates
SELECT DISTINCT
 rx.pat_id
,rx.drug_class
,rx.drug_name
,rx.fill_date
,rx.script_end_date
FROM   rx
WHERE  rx.drug_class IN( 'h3a', 'h6h', 'h4b', 'h2f', 'h2s', 'j7c', 'h2e' )
--
) as mdo,Calendar as c1
where c1.cal_date between mdo.fill_date and mdo.script_end_date
group by mdo.pat_id,c1.cal_date,mdo.drug_class
having count(*) > 1--overlaps
)
,
Groupings(pat_id,cal_date,drug_class,grp_nbr)
as
(
select
o.pat_id
,o.cal_date
,o.drug_class
,c2.julian_seq
 - row_number() over(partition by o.pat_id,o.drug_class order by o.cal_date) as grp_nbr --julianized date minus row number
from Overlaps as o,calendar as c2
where c2.cal_date = o.cal_date
)
,x

as
(

--i think this is what's causing the problem

select pat_id,min(cal_date) as min_overlap,max(cal_date) as max_overlap,drug_class
from groupings
group by pat_id,grp_nbr,drug_class

)

select 
 x.pat_id
,x.min_overlap
,x.max_overlap
,y.drug_class
,y.drug_name
from x
inner join
(
select distinct
 rx.pat_id
,rx.drug_name
,rx.drug_class
,rx.fill_date
from rx
) as y on x.pat_id = y.pat_id and x.drug_class=y.drug_class
          and y.fill_date between x.min_overlap and x.max_overlap
order by datediff(day,min_overlap,max_overlap) desc

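I am looking for the days on which the most drugs in a given class were prescribed. However, right now this gives me date ranges that are longer than any single datediff(day,fill_date,script_end_date).

This artificially inflates the numbers, because some overlap ranges span years, whereas they should be at most the number of days the doctor wrote the script for. If five drugs in the class 'h3a' were prescribed on the same day, I would like to capture pat_id, fill_date, end_date, h3a for that period, repeated five times, once for each drug in the class.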

【Question comments】:

Tags: sql sql-server sql-server-2008 tsql

【Solution 1】:

I am not sure whether this solves your problem. It returns the date or dates with the largest number of drug prescriptions:

select crx.cal_date
from (select c.cal_date, count(*) as NumDrugs,
             dense_rank() over (order by count(*) desc) as seqnum
      from Calendar c join
           rx
           on c.cal_date between rx.fill_date and rx.script_end_date and
              rx.drug_class IN( 'h3a', 'h6h', 'h4b', 'h2f', 'h2s', 'j7c', 'h2e' )
      group by c.cal_date
    ) crx
where seqnum = 1

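It is much simpler than your query, so I wonder whether I am missing something.

If you need to turn this into periods (date ranges), that is also possible, although the SQL is more cumbersome.

Also, this SQL is untested, so there may be syntax errors.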
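For illustration only, here is an untested sketch of how the conversion into periods might look. It is not part of the original answer. It assumes that julian_seq increases by exactly 1 per calendar day. The names daily, ranked, islands, num_drugs and max_drugs are made up for this example. It counts distinct drug_name values per patient, class and day, keeps the days on which that count reaches the patient's maximum for the class (and is greater than 1), and collapses runs of consecutive days into ranges.

```sql
WITH daily AS (
    -- one row per patient, class and calendar day, with the number of distinct drugs active that day
    SELECT rx.pat_id, rx.drug_class, c.cal_date, c.julian_seq,
           COUNT(DISTINCT rx.drug_name) AS num_drugs
    FROM dbo.rx AS rx
    JOIN dbo.Calendar AS c
      ON c.cal_date BETWEEN rx.fill_date AND rx.script_end_date
    WHERE rx.drug_class IN ('h3a', 'h6h', 'h4b', 'h2f', 'h2s', 'j7c', 'h2e')
    GROUP BY rx.pat_id, rx.drug_class, c.cal_date, c.julian_seq
),
ranked AS (
    -- attach the per-patient, per-class maximum to every day
    SELECT d.*,
           MAX(d.num_drugs) OVER (PARTITION BY d.pat_id, d.drug_class) AS max_drugs
    FROM daily AS d
),
islands AS (
    -- consecutive days share the same (julian_seq - row_number) value
    SELECT r.pat_id, r.drug_class, r.cal_date, r.num_drugs,
           r.julian_seq
             - ROW_NUMBER() OVER (PARTITION BY r.pat_id, r.drug_class
                                  ORDER BY r.cal_date) AS grp_nbr
    FROM ranked AS r
    WHERE r.num_drugs = r.max_drugs
      AND r.num_drugs > 1
)
SELECT pat_id, drug_class,
       MIN(cal_date)  AS min_overlap,
       MAX(cal_date)  AS max_overlap,
       MAX(num_drugs) AS num_drugs
FROM islands
GROUP BY pat_id, drug_class, grp_nbr
ORDER BY pat_id, drug_class, min_overlap;
```

One limitation of this sketch: if the set of drugs changes from one day to the next while the count stays at the maximum, the two stretches are merged into a single range. Listing the drug names per range would need an additional join back to rx on the range's dates.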

【Comments】: