【Question title】: Running a conditional through several dataframes stored in a list R
【Posted】: 2016-05-19 21:59:07
【Question description】:

I have a list of data frames in the following format, and I want to run a conditional through it:

IDn = c("ChrM", "ChrM" ,"ChrM" ,"ChrM" ,"ChrM")   
posn = c(2,5,7,8,9)
met = c(2,0,4,1,0)
nmet = c(2,1,0,2,0)
bd = c(3,3,0,8,10)
dfp = data.frame(IDn,posn,met,nmet,bd)

      IDn     posn met  nmet bd
    1 ChrM    2    2    2    3
    2 ChrM    5    0    1    3
    3 ChrM    7    4    0    0
    4 ChrM    8    1    2    8
    5 ChrM    9    0    0    10

dfp[crit] <- (dfp[met]+dfp[nmet]>=4) & (dfp[met]>=dfp[bd])

The problem is that each df in the list has a different name, stored in names2

names2[crit] <- as.numeric((names2[met]+names2[nmet]>=4) & (names2[met]>=names2[bd]))

[crit] is a new column that stores 0 or 1 values. I tried to run this with lapply, but have had no luck so far. Any suggestions?

【Question discussion】:

    Tags: r list dataframe conditional lapply


    【Solution 1】:

    Not sure what went wrong with your lapply code (it is always good to include the code you tried in your question), but the following should work:

    # creating a list
    dflist <- list(d1=dfp, d2=dfp)
    
    # updating the dataframes in your list
    dflist <- lapply(dflist, function(x) {x$crit <- (x$met + x$nmet >= 4) & (x$met>=x$bd); x})
    
    # or:
    dflist <- lapply(dflist, function(x) {cbind(x, crit = (x$met + x$nmet >= 4) & (x$met>=x$bd))})
    

    The result is:

    > dflist
    $d1
       IDn posn met nmet bd  crit
    1 ChrM    2   2    2  3 FALSE
    2 ChrM    5   0    1  3 FALSE
    3 ChrM    7   4    0  0  TRUE
    4 ChrM    8   1    2  8 FALSE
    5 ChrM    9   0    0 10 FALSE
    
    $d2
       IDn posn met nmet bd  crit
    1 ChrM    2   2    2  3 FALSE
    2 ChrM    5   0    1  3 FALSE
    3 ChrM    7   4    0  0  TRUE
    4 ChrM    8   1    2  8 FALSE
    5 ChrM    9   0    0 10 FALSE
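
The question asks for crit to hold 0 or 1 rather than TRUE/FALSE. A minimal sketch of that variant, assuming the same example data as in the question, wraps the logical condition in as.integer (as.numeric, as used in the question, would work the same way but yield doubles):

```r
# Rebuild the example data from the question
dfp <- data.frame(
  IDn  = c("ChrM", "ChrM", "ChrM", "ChrM", "ChrM"),
  posn = c(2, 5, 7, 8, 9),
  met  = c(2, 0, 4, 1, 0),
  nmet = c(2, 1, 0, 2, 0),
  bd   = c(3, 3, 0, 8, 10)
)
dflist <- list(d1 = dfp, d2 = dfp)

# Add crit as 0/1 instead of FALSE/TRUE to every data frame in the list
dflist <- lapply(dflist, function(x) {
  x$crit <- as.integer((x$met + x$nmet >= 4) & (x$met >= x$bd))
  x
})

dflist$d1$crit
```

Only the third row satisfies both conditions, so it is the only row that gets a 1.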
    

    In response to your comment:

    When you are using data.table, you can also use:

    dflist <- lapply(dflist, function(x) x[, crit := (met + nmet >= 4) & (met>=bd)])
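
A point worth keeping in mind with the data.table version: := adds the column by reference, so the tables stored in the list are modified in place. The sketch below is one way to avoid touching the originals, by working on a copy() of each table. It uses a subset of the columns from the dput posted in the comments; the names dt and dtlist are illustrative only. As a possible explanation of the error reported in the comments (an assumption, since the data.table version in use is not stated): in data.table releases before 1.9.8, x[, "met"] evaluated to the string "met" rather than the column, which would produce a non-numeric argument error. Referring to columns by bare name inside j avoids that.

```r
library(data.table)

# Subset of the columns from the dput in the comments (illustrative)
dt <- data.table(
  ID   = c("ChrM", "ChrM", "ChrM", "ChrM", "ChrM"),
  pos  = c(5L, 6L, 7L, 10L, 11L),
  met  = c(0L, 2L, 0L, 2L, 2L),
  nmet = c(1L, 0L, 6L, 1L, 6L),
  bd   = c(2L, 0L, 7L, 2L, 2L)
)
dtlist <- list(d1 = dt, d2 = copy(dt))

# copy() keeps := from modifying the original tables by reference;
# columns are referenced by bare name inside j
dtlist <- lapply(dtlist, function(x) {
  copy(x)[, crit := as.integer((met + nmet >= 4) & (met >= bd))]
})

dtlist$d1$crit
```

If modifying the tables in place is acceptable, the copy() call can be dropped.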
    

    【Discussion】:

    • I get this error from it: Error in x[, "met"] + x[, "nmet"] : non-numeric argument to binary operator. However: > is.numeric(met) [1] TRUE > is.numeric(nmet) [1] TRUE. Any idea why this happens? Thanks!
    • @GabrielHernandez Strange, it works on the example data you provided. Could you check the updated code?
    • @GabrielHernandez Could you include the dput of part of your data in the question? For example: dput(dflist[[1]]) or dput(head(dflist[[1]],5))
    • dput(head(df[[1]],5)) structure(list(ID = c("ChrM", "ChrM", "ChrM", "ChrM", "ChrM"), pos = c(5L, 6L, 7L, 10L, 11L), ori = c("+", "+", "-", "+", "-"), cont = c("CCG", "CGT", "CGG", "CGA", "CGA"), met = c(0L, 2L, 0L, 2L, 2L), nmet = c(1L, 0L, 6L, 1L, 6L), bd = c(2L, 0L, 7L, 2L, 2L)), .Names = c("ID", "pos", "ori", "cont", "met", "nmet", "bd"), class = c("data.table", "data.frame"), row.names = c(NA, -5L), .internal.selfref = <pointer: 0x0000000005840788>)
    • @GabrielHernandez It works for me. I have updated my answer with data.table-specific code.
    【Solution 2】:

    We can use transform without any anonymous function

     lapply(dflist, transform, crit = (met + nmet)>=4 & (met >=bd))
    #  $d1
    #   IDn posn met nmet bd  crit
    #1 ChrM    2   2    2  3 FALSE
    #2 ChrM    5   0    1  3 FALSE
    #3 ChrM    7   4    0  0  TRUE
    #4 ChrM    8   1    2  8 FALSE
    #5 ChrM    9   0    0 10 FALSE
    
    #$d2
    #   IDn posn met nmet bd  crit
    #1 ChrM    2   2    2  3 FALSE
    #2 ChrM    5   0    1  3 FALSE
    #3 ChrM    7   4    0  0  TRUE
    #4 ChrM    8   1    2  8 FALSE
    #5 ChrM    9   0    0 10 FALSE
    

    Another option, using dplyr/purrr, is

    library(dplyr)
    library(purrr)
    dflist %>%
            map(~mutate(., crit=(met+nmet)>=4 & (met >=bd)))
    #$d1
    #   IDn posn met nmet bd  crit
    #1 ChrM    2   2    2  3 FALSE
    #2 ChrM    5   0    1  3 FALSE
    #3 ChrM    7   4    0  0  TRUE
    #4 ChrM    8   1    2  8 FALSE
    #5 ChrM    9   0    0 10 FALSE
    
    #$d2
    #   IDn posn met nmet bd  crit
    #1 ChrM    2   2    2  3 FALSE
    #2 ChrM    5   0    1  3 FALSE
    #3 ChrM    7   4    0  0  TRUE
    #4 ChrM    8   1    2  8 FALSE
    #5 ChrM    9   0    0 10 FALSE
    

    Data

    dflist <- list(d1=dfp, d2=dfp)
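
The question mentions that the data frames have different names, stored in names2. If they exist as separate objects in the workspace rather than already sitting in a list, base R's mget can collect them into a named list first. This is a sketch under that assumption; the object names sampleA and sampleB are hypothetical.

```r
# Example data from the question
dfp <- data.frame(
  IDn  = c("ChrM", "ChrM", "ChrM", "ChrM", "ChrM"),
  posn = c(2, 5, 7, 8, 9),
  met  = c(2, 0, 4, 1, 0),
  nmet = c(2, 1, 0, 2, 0),
  bd   = c(3, 3, 0, 8, 10)
)

# Hypothetical separate objects whose names are stored in names2
sampleA <- dfp
sampleB <- dfp
names2 <- c("sampleA", "sampleB")

# Collect the objects into a named list, then add crit to each one
dflist <- mget(names2, envir = environment())
dflist <- lapply(dflist, function(x) {
  x$crit <- as.integer((x$met + x$nmet >= 4) & (x$met >= x$bd))
  x
})

names(dflist)
```

The list keeps the original object names, so each result can be looked up as dflist[["sampleA"]] and so on.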
    

    【Discussion】:
