【Question title】: How to join two or more tables into one in R using SQL [duplicate]
【Posted】: 2013-08-17 11:00:18
【Question】:

I have written the following code, and I want to combine these tables into one big table. How can I do that in R using SQL?

user_lessthan10per  <- sqldf("select count(uid) as count_of_students
                       from adopted_user_point
                        where points_scored between 0 and (1469*0.1)")

followed by

user_lessthan20per  <- sqldf("select count(uid) as count_of_students
                         from adopted_user_point
                         where points_scored >(1469*0.1) and points_scored <= (1469*0.2)")

and

user_lessthan30per  <- sqldf("select count(uid) as count_of_students
                         from adopted_user_point
                         where points_scored >(1469*0.2) and points_scored <= (1469*0.3)")

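Now I want to combine these into a single table that contains the count_of_students column from all three tables.

How can I do this in R? I tried the UNION command, but it gives an error.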

【Comments】:

    Tags: sql r join


    【Solution 1】:

    You can use conditional aggregation. This returns one row with three columns:

    select sum(case when points_scored between 0 and (1469*0.1) then 1 else 0
               end) as cnt1,
           sum(case when points_scored >(1469*0.1) and points_scored <= (1469*0.2) then 1 else 0 
               end) as cnt2,
           sum(case when points_scored >(1469*0.2) and points_scored <= (1469*0.3) then 1 else 0
               end) as cnt3
    from adopted_user_point;
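To try the query above from R, here is a minimal sketch using sqldf. The `adopted_user_point` data frame below is made-up sample data (the real table is not shown in the question), and the values deliberately avoid the bucket boundaries.

```r
library(sqldf)  # assumes the sqldf package is installed

# Hypothetical sample data standing in for the real adopted_user_point table
adopted_user_point <- data.frame(
  uid = 1:8,
  points_scored = c(10, 100, 150, 200, 300, 400, 440, 500)
)

# One pass over the table; each sum() counts the rows that fall in one bucket
counts <- sqldf("
  select sum(case when points_scored between 0 and (1469*0.1)
                  then 1 else 0 end) as cnt1,
         sum(case when points_scored > (1469*0.1) and points_scored <= (1469*0.2)
                  then 1 else 0 end) as cnt2,
         sum(case when points_scored > (1469*0.2) and points_scored <= (1469*0.3)
                  then 1 else 0 end) as cnt3
  from adopted_user_point")

print(counts)
```

The result is an ordinary one-row data frame, so the individual counts are available as `counts$cnt1`, `counts$cnt2`, and `counts$cnt3`.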
    

    If you want three rows instead, you can aggregate with group by:

    select (case when points_scored between 0 and (1469*0.1) then 'Group1'
                 when points_scored >(1469*0.1) and points_scored <= (1469*0.2) then 'Group2'
                 when points_scored >(1469*0.2) and points_scored <= (1469*0.3) then 'Group3'
                 else 'Other'
            end) as score_group, count(*) as count_of_students
    from adopted_user_point
    group by (case when points_scored between 0 and (1469*0.1) then 'Group1'
                   when points_scored >(1469*0.1) and points_scored <= (1469*0.2) then 'Group2'
                   when points_scored >(1469*0.2) and points_scored <= (1469*0.3) then 'Group3'
                   else 'Other'
              end);
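The grouped form can be run from R in the same way. This is a sketch on made-up sample data. The alias `score_group` is an illustrative name, and grouping by the alias rather than repeating the whole case expression relies on SQLite (sqldf's default backend) accepting it; other databases may require the full expression.

```r
library(sqldf)  # assumes the sqldf package is installed

# Hypothetical sample data standing in for the real adopted_user_point table
adopted_user_point <- data.frame(
  uid = 1:8,
  points_scored = c(10, 100, 150, 200, 300, 400, 440, 500)
)

# One row per bucket; rows outside all three ranges land in 'Other'
by_group <- sqldf("
  select (case when points_scored between 0 and (1469*0.1) then 'Group1'
               when points_scored > (1469*0.1) and points_scored <= (1469*0.2) then 'Group2'
               when points_scored > (1469*0.2) and points_scored <= (1469*0.3) then 'Group3'
               else 'Other'
          end) as score_group,
         count(*) as count_of_students
  from adopted_user_point
  group by score_group
  order by score_group")

print(by_group)
```

Note that a bucket with no rows does not appear in the output at all, whereas the one-row form reports it as 0.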
    

    【Comments】:

      【Solution 2】:

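      I would name the original selects differently, perhaps "u_0_10", "u_10_20", "u_20_30", to make it clear that "user_lessthan30per" is really "user_btwn20_30". But they are now R data frames in the global environment, so you don't really need sqldf to put them together: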

      user_under30per <- rbind(user_lessthan10per,
                              user_lessthan20per,
                              user_lessthan30per)
      

      The sqldf function does support UNION, though:

       one_and_two <- sqldf("select * from user_lessthan10per union all
                             select * from user_lessthan20per")
       all_three <- sqldf("select * from one_and_two union all
                           select * from user_lessthan30per")
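The two-step union can probably be collapsed into a single statement, since sqldf can see several data frames in one query. In this sketch the three one-row data frames hold made-up counts, and the `bucket` label column is an addition for illustration, so each row stays identifiable after the tables are stacked.

```r
library(sqldf)  # assumes the sqldf package is installed

# Hypothetical stand-ins for the three one-row results from the question
user_lessthan10per <- data.frame(count_of_students = 2L)
user_lessthan20per <- data.frame(count_of_students = 2L)
user_lessthan30per <- data.frame(count_of_students = 3L)

# Stack all three in one query; the literal label records which table each row came from
all_three <- sqldf("
  select '0-10%'  as bucket, count_of_students from user_lessthan10per
  union all
  select '10-20%' as bucket, count_of_students from user_lessthan20per
  union all
  select '20-30%' as bucket, count_of_students from user_lessthan30per")

print(all_three)
```

`union all` keeps duplicate rows, which matters here because two buckets can contain the same count; a plain `union` would merge identical rows.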
      

      【Comments】:
