【Question title】: Flag observations that fulfill two conditions together
【Posted】: 2018-08-09 01:52:46
【Question description】:

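I am very new to both SAS and SQL and would appreciate some help.

I have a dataset containing student ID, term, and audit type. Each term has 2 audit_types, and a student can appear in either one or in both.

I need to create a flag, for each student id in each term, for each of these 3 scenarios: 1) the student is in audit_type_1 only, 2) the student is in audit_type_2 only, and 3) the student appears in both audit_type_1 and audit_type_2 within that term. I'm not sure how best to post my data, but here it is:

Sample data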

| Id    | Term          | Audit_type    |
|----   |-------------  |-----------:   |
| 1     | Fall 2016     | 1             |
| 1     | Fall 2016     | 2             |
| 2     | Winter 2017   | 1             |
| 3     | Winter 2017   | 2             |
| 4     | Spring 2017   | 1             |
| 4     | Spring 2017   | 2             |

As shown below, I was able to create a flag for the first 2 scenarios using case:

proc sql;
create table test as
select id, term, audit_type,
case
when audit_type in ('audit_type_1') then 1
when audit_type in ('audit_type_2 ') then 2
end as audit_type_flag
from have;

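I don't know how to flag the third scenario. Any help would be greatly appreciated. Thank you in advance for your help and support. In short, I want something like the following: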

| Id    | Term          | Audit_type    | Flag  |
|----   |-------------  |-----------:   |------ |
| 1     | Fall 2016     | 1             | 3     |
| 1     | Fall 2016     | 2             | 3     |
| 2     | Winter 2017   | 1             | 1     |
| 3     | Winter 2017   | 2             | 2     |
| 4     | Spring 2017   | 1             | 3     |
| 4     | Spring 2017   | 2             | 3     |

【Comments】:

  • You need to share your code and a data sample as well

Tags: sql sas


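【Solution 1】:

I am reading between the lines here and assuming your logic is: Audit_Type = 1 gives Flag = 1, Audit_Type = 2 gives Flag = 2, and both together give 3, so I simply add the flags together to get 3. This may not be what you are after (you may also need flags 4, 5, 6 and 7); it is only an assumption based on a small amount of data, made without knowing the exact use case. So I'll provide a solution; let me know if it is correct and I will add comments to explain the syntax. I don't want to spend time explaining the code if it isn't what you want.

Regards, Scott

Update

I have added comments to the code, along with links to pages that may help you better understand what I am describing.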

/* SETUP SOME DUMMY DATA */

   DATA HAVE;
        LENGTH ID 3. TERM $11. AUDIT_TYPE 3.; 
        INFILE DATALINES DSD DELIMITER = "," missover;
        INPUT ID TERM AUDIT_TYPE; 

    DATALINES;
    1,Fall 2016,1
    1,Fall 2016,2
    1,Summer 2016,1
    1,Summer 2016,2
    2,Winter 2017,1
    3,Winter 2017,2
    4,Spring 2017,1
    4,Spring 2017,2
    ;

    RUN;


/* PERFORM A SORT SO THAT WE CAN MAKE USE OF BY STATEMENT PROCESSING IN THE 
   SUBSEQUENT DATA STEP */

    PROC SORT DATA = HAVE;
        BY ID TERM;
    RUN;


    DATA WANT;

/* DOW LOOP */
/* THIS LOOP EXECUTES FOR EACH ROW UNTIL THE LAST.TERM IS ENCOUNTERED. */
/* COMMENTING OUT THE SECOND DOW LOOP WILL SHOW YOU THAT THIS LOOP IS 
   BASICALLY SUMMARISING THE RESULT OF EACH ID, TERM GROUP AFTER THE CONDITIONAL 
   LOGIC IS APPLIED TO THE FLAG VARIABLE.*/

        DO UNTIL (LAST.TERM);

/* EACH TIME THE DO LOOP EXECUTES A NEW ROW IS READ INTO THE PDV (PROGRAM DATA VECTOR).*/

            SET HAVE;

/* THE BY STATEMENT IS IN PLACE TO FACILITATE BY STATEMENT PROCESSING. */

            BY ID TERM;

/* INITIALISE THE FLAG VARIABLE EACH TIME A NEW TERM IS ENCOUNTERED */

            IF FIRST.TERM THEN FLAG = 0;

/* USING THE SYNTAX FLAG + 1 REPLICATES USING THE RETAIN STATEMENT WITH A SUM FUNCTION.
   IT IS REFERRED TO AS THE SUM STATEMENT.  IF YOU ARE INTERESTED IN LEARNING MORE ABOUT 
   THIS THEN SEE THE LINK TO THE DOCUMENTATION BELOW.*/

            IF AUDIT_TYPE = 1 THEN FLAG + 1;
            ELSE IF AUDIT_TYPE = 2 THEN FLAG + 2;

        END;

/* ONCE THE PREVIOUS DOW LOOP EXITS (BECAUSE THE LAST.TERM HAS BEEN REACHED) THE SECOND 
   DOW LOOP EXECUTES*/

        DO UNTIL (LAST.TERM);

/* AS PER DOW LOOP 1, EACH LOOP RESULTS IN A SINGLE OBSERVATION BEING READ */

            SET HAVE;

/* THE BY STATEMENT IS IN PLACE TO FACILITATE BY STATEMENT PROCESSING. */

            BY ID TERM;             

/* THE EXPLICIT OUTPUT STATEMENT EXECUTES TO OUTPUT THE VALUES CONTAINED WITHIN THE
   PDV */

            OUTPUT; 

        END;

    RUN;
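
Because the code above adds 1 or 2 for every row, a term in which the same audit type appears more than once for a student would produce a total that no longer maps onto the 1/2/3 scheme (for example, two type 1 rows sum to 2). The following is a minimal sketch of the same double DOW loop that records only whether each type occurred. It assumes AUDIT_TYPE is numeric with values 1 and 2, as in the dummy data; the names HAVE_SORTED, WANT_ROBUST, HAS1 and HAS2 are made up for illustration.

```sas
/* Sort into a separate (hypothetical) dataset so BY-group processing works */
proc sort data=have out=have_sorted;
    by id term;
run;

data want_robust;
    /* Reset the presence indicators at the start of each ID/TERM group */
    has1 = 0;
    has2 = 0;

    /* First pass: scan the whole group and record which types occur */
    do until (last.term);
        set have_sorted;
        by id term;
        if audit_type = 1 then has1 = 1;
        else if audit_type = 2 then has2 = 1;
    end;

    /* 1 = type 1 only, 2 = type 2 only, 3 = both, 0 = neither value seen */
    flag = has1 + 2 * has2;

    /* Second pass: re-read the same group and write each row with the flag */
    do until (last.term);
        set have_sorted;
        by id term;
        output;
    end;

    drop has1 has2;
run;
```

For data with at most one row per ID, TERM and AUDIT_TYPE, this should give the same FLAG values as the summing version.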

Further reading:

The DOW-loop: a Smarter Approach to your Existing Code

The Sum Statement

The Power of the BY Statement

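【Comments】:

  • Works great, thank you very much Scott. I look forward to your explanation.
  • @PDevi I have added comments and a few links to help you.
  • @Scott, thank you for your time and patience in explaining everything step by step. I will definitely use the links you provided.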
【Solution 2】:

Just add an extra else and that should solve the problem

select id, term, audit_type,
case
when audit_type in ('audit_type_1') then 1
when audit_type in ('audit_type_2 ') then 2
else 3
end as audit_type_flag
from have;
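
A case expression with no aggregate functions is evaluated one row at a time, so its else branch is reached only by rows whose audit_type matches neither literal. Whether a student appears in both audit types is a property of the whole id/term group. As a rough sketch, assuming the same have table, the groups that contain both types can be listed with a grouped count (the column name n_types is made up here):

```sas
proc sql;
    /* One row per id/term group that holds more than one distinct audit type */
    select id, term, count(distinct audit_type) as n_types
    from have
    group by id, term
    having count(distinct audit_type) > 1;
quit;
```

For the sample data in the question, this should list id 1 in Fall 2016 and id 4 in Spring 2017.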

【Comments】:

【Solution 3】:

I think you need aggregation:

    proc sql;
    create table test as
        /* min <> max within an id/term group means both audit types occur */
        select id, term, audit_type,
               (case when min(audit_type) <> max(audit_type) then 3
                     when min(audit_type) = 1 then 1
                     when min(audit_type) = 2 then 2
                end) as audit_type_flag
        from have
        group by id, term;
    quit;
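
Selecting audit_type next to min() and max() while grouping only by id and term relies on PROC SQL remerging the group statistics back onto every detail row, which SAS reports with a note in the log. Below is a sketch of the same idea with the merge written out as an explicit join, which also carries over to SQL dialects that do not remerge. It assumes audit_type is numeric with values 1 and 2, and uses the flag values the question asks for (1 = type 1 only, 2 = type 2 only, 3 = both); test_join, min_type and max_type are made-up names.

```sas
proc sql;
    create table test_join as
    select a.id, a.term, a.audit_type,
           case
               when g.min_type <> g.max_type then 3  /* both types present */
               else g.min_type                       /* only type 1 or only type 2 */
           end as audit_type_flag
    from have as a
    inner join
         /* One summary row per id/term group */
         (select id, term,
                 min(audit_type) as min_type,
                 max(audit_type) as max_type
          from have
          group by id, term) as g
    on a.id = g.id and a.term = g.term;
quit;
```

Every row of have is kept, and each row gets the flag of its id/term group.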
    

【Comments】:
