【Question title】:R: calculating margins or row & col sums for a data frame
【Posted】:2011-05-02 23:09:19
【Question】:

I have a data frame that looks like this:

         Flag1             Flag2    Type1 Type2  Type3
1        A                 FIRST      2    0       0
2        A                SECOND      1    9       0
3        A                 THIRD      3    7       0
4        A                FOURTH      9   18       0
5        A                 FIFTH      1   22       0
6        A                 SIXTH      1   13       0
7        B                 FIRST      0    0       0
8        B                SECOND      3    9       0
9        B                 THIRD      5   85       0
10       B                FOURTH      4   96       0
11       B                 FIFTH      3   40       0
12       B                 SIXTH      0   17       0

I need to compute sums so that my data frame ends up looking like this:

         Flag1             Flag2    Type1 Type2  Type3   Sum
1        A                 FIRST      2    0       0      2
2        A                SECOND      1    9       0     10 
3        A                 THIRD      3    7       0     10
4        A                FOURTH      9   18       0     27
5        A                 FIFTH      1   22       0     23
6        A                 SIXTH      1   13       0     14
7        B                 FIRST      0    0       0      0
8        B                SECOND      3    9       0     12
9        B                 THIRD      5   85       0     90
10       B                FOURTH      4   96       0    100
11       B                 FIFTH      3   40       0     43
12       B                 SIXTH      0   17       0     17 
13      (all)              FIRST      2    0       0      2
14      (all)             SECOND      4   18       0     22
15      (all)              THIRD      8   92       0    100
16      (all)             FOURTH     13  114       0    127
17      (all)              FIFTH      4   62       0     66
18      (all)              SIXTH      1   30       0     31
19       A                 (all)     17   69       0     86
20       B                 (all)     15  247       0    262
21      (all)              (all)     32  316       0    348

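I have tried the add_margins function from the reshape2 package, but it did not help: it does not compute the sums the way I want. I have also tried aggregate, rowSums and colSums, with no success.

Any help here would be great.

Thanks

Edit: the Sum column also needs to add in the sums of the preceding Flag2 rows, i.e. a running total within each Flag1 group. Like this: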

        Flag1             Flag2    Type1 Type2  Type3   Sum
1        A                 FIRST      2    0       0      2
2        A                SECOND      1    9       0     12 
3        A                 THIRD      3    7       0     22
4        A                FOURTH      9   18       0     49
5        A                 FIFTH      1   22       0     72
6        A                 SIXTH      1   13       0     86
7        B                 FIRST      0    0       0      0
8        B                SECOND      3    9       0     12
9        B                 THIRD      5   85       0    102
10       B                FOURTH      4   96       0    202
11       B                 FIFTH      3   40       0    245
12       B                 SIXTH      0   17       0    262 
13      (all)              FIRST      2    0       0      2
14      (all)             SECOND      4   18       0     24
15      (all)              THIRD      8   92       0    124
16      (all)             FOURTH     13  114       0    251
17      (all)              FIFTH      4   62       0    317
18      (all)              SIXTH      1   30       0    348
19       A                 (all)     17   69       0     86
20       B                 (all)     15  247       0    262
21      (all)              (all)     32  316       0    348

【Question comments】:

    Tags: r dataframe margins


    【Solution 1】:

    Assuming you have a data.frame like this and it is named dtable:

    dt1 <- as.data.frame(addmargins(xtabs(Type1~Flag1+Flag2, data=dtable)))
    dt2 <- as.data.frame(addmargins(xtabs(Type2~Flag1+Flag2, data=dtable)))
    dt3 <- as.data.frame(addmargins(xtabs(Type3~Flag1+Flag2, data=dtable)))
    names(dt1)[3] <- "Type1"
    names(dt2)[3] <- "Type2"
    names(dt3)[3] <- "Type3"
    
    dt.all <- merge(merge(dt1,dt2), dt3)
    dt.all$Sum <- with(dt.all, Type1+Type2+Type3)
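To make the snippet above reproducible, here is a minimal sketch that rebuilds the question's data and shows what `addmargins()` contributes for one column. The name `dtable` follows the answer; the helper names `ord` and `tab1` are only illustrative assumptions.

```r
# Rebuild the question's data (values copied from the table in the question).
ord <- c("FIRST", "SECOND", "THIRD", "FOURTH", "FIFTH", "SIXTH")
dtable <- data.frame(
  Flag1 = rep(c("A", "B"), each = 6),
  Flag2 = rep(ord, times = 2),
  Type1 = c(2, 1, 3, 9, 1, 1, 0, 3, 5, 4, 3, 0),
  Type2 = c(0, 9, 7, 18, 22, 13, 0, 9, 85, 96, 40, 17),
  Type3 = 0,
  stringsAsFactors = FALSE
)

# xtabs() sums Type1 for every Flag1 x Flag2 cell;
# addmargins() then appends a "Sum" row and a "Sum" column.
tab1 <- addmargins(xtabs(Type1 ~ Flag1 + Flag2, data = dtable))

tab1["A", "Sum"]    # total of Type1 within Flag1 == "A"
tab1["Sum", "Sum"]  # grand total of Type1
```

Note that `xtabs()` orders the Flag2 levels alphabetically, which is why the merged result needs an explicit reordering step afterwards.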
    

    I couldn't get the exact sort order you want, but this is close:

    # Reorder the factor levels. Assigning to levels() directly would only
    # relabel the existing (alphabetically sorted) levels and mislabel the rows.
    dt.all$Flag2 <- factor(dt.all$Flag2,
                           levels = c("FIRST", "SECOND", "THIRD", "FOURTH", "FIFTH", "SIXTH", "Sum"))
    dt.all[order(dt.all$Flag1, dt.all$Flag2), ]
    
       Flag1  Flag2 Type1 Type2 Type3 Sum
    2      A  FIRST     2     0     0   2
    4      A SECOND     1     9     0  10
    7      A  THIRD     3     7     0  10
    3      A FOURTH     9    18     0  27
    1      A  FIFTH     1    22     0  23
    5      A  SIXTH     1    13     0  14
    6      A    Sum    17    69     0  86
    9      B  FIRST     0     0     0   0
    11     B SECOND     3     9     0  12
    14     B  THIRD     5    85     0  90
    10     B FOURTH     4    96     0 100
    8      B  FIFTH     3    40     0  43
    12     B  SIXTH     0    17     0  17
    13     B    Sum    15   247     0 262
    16   Sum  FIRST     2     0     0   2
    18   Sum SECOND     4    18     0  22
    21   Sum  THIRD     8    92     0 100
    17   Sum FOURTH    13   114     0 127
    15   Sum  FIFTH     4    62     0  66
    19   Sum  SIXTH     1    30     0  31
    20   Sum    Sum    32   316     0 348
    

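    【Comments】:

    • Wow. That is a perfect solution. Thanks a lot, DWin!
    • DWin, I edited my original question to use a different sum function. Any idea whether this can be done?
    【Solution 2】: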
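The running total asked about in the edited question is not addressed in the thread. One possible sketch, assuming the rows are already in the desired Flag2 order within each Flag1 group, uses base R's `ave()` with `cumsum`. The column name `CumSum` and the data frame name `dtable` are illustrative assumptions.

```r
# Rebuild the question's data (values copied from the table in the question).
ord <- c("FIRST", "SECOND", "THIRD", "FOURTH", "FIFTH", "SIXTH")
dtable <- data.frame(
  Flag1 = rep(c("A", "B"), each = 6),
  Flag2 = rep(ord, times = 2),
  Type1 = c(2, 1, 3, 9, 1, 1, 0, 3, 5, 4, 3, 0),
  Type2 = c(0, 9, 7, 18, 22, 13, 0, 9, 85, 96, 40, 17),
  Type3 = 0,
  stringsAsFactors = FALSE
)

# Plain row total across the three Type columns.
dtable$Sum <- rowSums(dtable[, c("Type1", "Type2", "Type3")])

# Running total of Sum within each Flag1 group, in the existing row order.
dtable$CumSum <- ave(dtable$Sum, dtable$Flag1, FUN = cumsum)

dtable
```

`ave()` applies the function per group and returns a vector aligned with the original rows, so no re-sorting or merging is needed. The same idea should apply to the (all) rows once they have been computed.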

    rowSums works for me (or am I missing something?).

    > my.df <- read.table(textConnection("         Flag1             Flag2    Type1 Type2  Type3
    + 1        A                 FIRST      2    0       0
    + 2        A                SECOND      1    9       0
    + 3        A                 THIRD      3    7       0
    + 4        A                FOURTH      9   18       0
    + 5        A                 FIFTH      1   22       0
    + 6        A                 SIXTH      1   13       0
    + 7        B                 FIRST      0    0       0
    + 8        B                SECOND      3    9       0
    + 9        B                 THIRD      5   85       0
    + 10       B                FOURTH      4   96       0
    + 11       B                 FIFTH      3   40       0
    + 12       B                 SIXTH      0   17       0
    + "))
    > my.df
       Flag1  Flag2 Type1 Type2 Type3
    1      A  FIRST     2     0     0
    2      A SECOND     1     9     0
    3      A  THIRD     3     7     0
    4      A FOURTH     9    18     0
    5      A  FIFTH     1    22     0
    6      A  SIXTH     1    13     0
    7      B  FIRST     0     0     0
    8      B SECOND     3     9     0
    9      B  THIRD     5    85     0
    10     B FOURTH     4    96     0
    11     B  FIFTH     3    40     0
    12     B  SIXTH     0    17     0
    > rowSums(my.df[3:5])
      1   2   3   4   5   6   7   8   9  10  11  12 
      2  10  10  27  23  14   0  12  90 100  43  17 
    > my.df$Sum <- rowSums(my.df[3:5])
    > my.df
       Flag1  Flag2 Type1 Type2 Type3 Sum
    1      A  FIRST     2     0     0   2
    2      A SECOND     1     9     0  10
    3      A  THIRD     3     7     0  10
    4      A FOURTH     9    18     0  27
    5      A  FIFTH     1    22     0  23
    6      A  SIXTH     1    13     0  14
    7      B  FIRST     0     0     0   0
    8      B SECOND     3     9     0  12
    9      B  THIRD     5    85     0  90
    10     B FOURTH     4    96     0 100
    11     B  FIFTH     3    40     0  43
    12     B  SIXTH     0    17     0  17
    

    【Comments】:

    • The original poster also wanted some (all) rows that aggregate across the categories...
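As the comment points out, rowSums alone does not produce the (all) rows. A rough sketch of one way to extend this approach with base R's `aggregate()` and `rbind()` follows. The intermediate names `by_flag2`, `by_flag1`, `grand` and `result` are illustrative assumptions, not part of the original answer.

```r
# Rebuild the question's data (values copied from the table in the question).
ord <- c("FIRST", "SECOND", "THIRD", "FOURTH", "FIFTH", "SIXTH")
my.df <- data.frame(
  Flag1 = rep(c("A", "B"), each = 6),
  Flag2 = rep(ord, times = 2),
  Type1 = c(2, 1, 3, 9, 1, 1, 0, 3, 5, 4, 3, 0),
  Type2 = c(0, 9, 7, 18, 22, 13, 0, 9, 85, 96, 40, 17),
  Type3 = 0,
  stringsAsFactors = FALSE
)
my.df$Sum <- rowSums(my.df[3:5])

# Totals per Flag2 across both Flag1 groups, restored to the original order.
by_flag2 <- aggregate(cbind(Type1, Type2, Type3, Sum) ~ Flag2, data = my.df, FUN = sum)
by_flag2 <- by_flag2[match(ord, by_flag2$Flag2), ]
by_flag2$Flag1 <- "(all)"

# Totals per Flag1 across all Flag2 values.
by_flag1 <- aggregate(cbind(Type1, Type2, Type3, Sum) ~ Flag1, data = my.df, FUN = sum)
by_flag1$Flag2 <- "(all)"

# Grand total row.
grand <- data.frame(Flag1 = "(all)", Flag2 = "(all)",
                    t(colSums(my.df[3:6])), stringsAsFactors = FALSE)

# Stack everything in the order shown in the question.
cols <- names(my.df)
result <- rbind(my.df, by_flag2[cols], by_flag1[cols], grand[cols])
rownames(result) <- NULL
result
```

Keeping Flag1 and Flag2 as character columns (rather than factors) avoids problems when the new "(all)" label is bound onto the original rows.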