Answers: Select rows by the specific value

【Question title】: Select rows by the specific value
【Posted】: 2013-11-07 14:31:03
【Question description】:

How do I select the rows of a data frame where the value in a specific column is greater than 1?

This is what my data looks like:

> dput(head(tbl_comp[,-1]))
structure(list(Meve_mean = c(7774.44229552491, 43374.1166119026, 
585562.72426545, 3866.54724117546, 320338.197537275, 918368.01990607
), Mmor_mean = c(39113.5325249635, 119476.157216344, 1296530.34384725, 
23511.2980313616, 209092.538981888, 577355.581852083), Mtot_mean = c(23443.9874102442, 
81425.1369141232, 941046.53405635, 13688.9226362685, 264715.368259581, 
747861.800879077), tot_meanMe = c(258492586.999527, NA, NA, NA, 
NA, NA), tot_meanMm = c(246665241.110832, NA, NA, NA, NA, NA), 
    tot_sdMe = c(35569170.0311164, NA, NA, NA, NA, NA), tot_sdMm = c(30522099.9189256, 
    NA, NA, NA, NA, NA), Wteve_mean = c(10752.4381084666, 53658.8435672746, 
    715547.921685567, 3422.17220367207, 335384.199178456, 1013708.18845339
    ), Wtmor_mean = c(29254.6414790837, 98804.8007431987, 1001344.20496027, 
    11541.8862121394, 217110.411645861, 571826.157099177), Wttot_mean = c(18681.9538387311, 
    73007.110928385, 838032.04308901, 6902.04963587237, 284695.433093058, 
    824330.175015869), tot_meanwte = c(278901499.672313, NA, 
    NA, NA, NA, NA), tot_meanwtm = c(235415566.775308, NA, NA, 
    NA, NA, NA), tot_sdwte = c(16743477.4011497, NA, NA, NA, 
    NA, NA), tot_sdwtm = c(3922418.43271348, NA, NA, NA, NA, 
    NA), diff_eve = c(0.72303994843767, 0.808331185101342, 0.818341730189196, 
    1.12985174650959, 0.955138012828161, 0.905949098928778), 
    diff_mor = c(1.33700262752933, 1.20921408998001, 1.29478988086689, 
    2.03704122525771, 0.963070068343606, 1.00966976533735), diff_tot = c(1.25490018938172, 
    1.11530419268331, 1.12292428650774, 1.98331269093204, 0.929819510568173, 
    0.907235745512628)), .Names = c("Meve_mean", "Mmor_mean", 
"Mtot_mean", "tot_meanMe", "tot_meanMm", "tot_sdMe", "tot_sdMm", 
"Wteve_mean", "Wtmor_mean", "Wttot_mean", "tot_meanwte", "tot_meanwtm", 
"tot_sdwte", "tot_sdwtm", "diff_eve", "diff_mor", "diff_tot"), row.names = c(NA, 
6L), class = "data.frame")

In this data, I am most interested in three columns:

diff_eve  diff_mor  diff_tot
0.7230399 1.3370026 1.2549002
0.8083312 1.2092141 1.1153042
0.8183417 1.2947899 1.1229243
1.1298517 2.0370412 1.9833127
0.9551380 0.9630701 0.9298195
0.9059491 1.0096698 0.9072357

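I want to create new data frames selected by the values in each of these columns. There should be 6 new data frames. Rows with a value below 1 in the "diff_eve" column should go into one new data frame, rows with a value above 1 into the next data frame, and likewise for the other two columns. Of course, I want to keep all the columns of the data (tbl_comp).

Let me show an example of a new data frame, selected by values below 1 in the diff_eve column: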

>newdata.frame
diff_eve  diff_mor  diff_tot  .. ....... .....  rest of the columns from tbl_comp.
0.7230399 1.3370026 1.2549002
0.8083312 1.2092141 1.1153042 
0.8183417 1.2947899 1.1229243
0.9551380 0.9630701 0.9298195
0.9059491 1.0096698 0.9072357

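I hope some of you understand what I am trying to achieve.

【Comments】:

Are your six data sets the rows where diff_eve, diff_mor, and diff_tot are each less than one and greater than one, respectively?
Sorry for the late reply, but I had to go to the lab.

Tags: r dataframe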

【Solution 1】:

Here is a solution that gives you six data frames in the following order:

diff_eve <1
diff_mor <1
diff_tot <1
diff_eve >1
diff_mor >1
diff_tot >1

Code:

 cols <- c("diff_eve", "diff_mor", "diff_tot")
 # First the three "<1" subsets, then the three ">1" subsets
 c(lapply(cols, function(x) subset(tbl_comp, eval(parse(text = x)) < 1)),
   lapply(cols, function(x) subset(tbl_comp, eval(parse(text = x)) > 1)))

This gives you:

[[1]]
   Meve_mean  Mmor_mean Mtot_mean tot_meanMe tot_meanMm tot_sdMe tot_sdMm Wteve_mean Wtmor_mean Wttot_mean tot_meanwte tot_meanwtm tot_sdwte tot_sdwtm  diff_eve  diff_mor  diff_tot
1   7774.442   39113.53  23443.99  258492587  246665241 35569170 30522100   10752.44   29254.64   18681.95   278901500   235415567  16743477   3922418 0.7230399 1.3370026 1.2549002
2  43374.117  119476.16  81425.14         NA         NA       NA       NA   53658.84   98804.80   73007.11          NA          NA        NA        NA 0.8083312 1.2092141 1.1153042
3 585562.724 1296530.34 941046.53         NA         NA       NA       NA  715547.92 1001344.20  838032.04          NA          NA        NA        NA 0.8183417 1.2947899 1.1229243
5 320338.198  209092.54 264715.37         NA         NA       NA       NA  335384.20  217110.41  284695.43          NA          NA        NA        NA 0.9551380 0.9630701 0.9298195
6 918368.020  577355.58 747861.80         NA         NA       NA       NA 1013708.19  571826.16  824330.18          NA          NA        NA        NA 0.9059491 1.0096698 0.9072357

[[2]]
  Meve_mean Mmor_mean Mtot_mean tot_meanMe tot_meanMm tot_sdMe tot_sdMm Wteve_mean Wtmor_mean Wttot_mean tot_meanwte tot_meanwtm tot_sdwte tot_sdwtm diff_eve  diff_mor  diff_tot
5  320338.2  209092.5  264715.4         NA         NA       NA       NA   335384.2   217110.4   284695.4          NA          NA        NA        NA 0.955138 0.9630701 0.9298195

[[3]]
  Meve_mean Mmor_mean Mtot_mean tot_meanMe tot_meanMm tot_sdMe tot_sdMm Wteve_mean Wtmor_mean Wttot_mean tot_meanwte tot_meanwtm tot_sdwte tot_sdwtm  diff_eve  diff_mor  diff_tot
5  320338.2  209092.5  264715.4         NA         NA       NA       NA   335384.2   217110.4   284695.4          NA          NA        NA        NA 0.9551380 0.9630701 0.9298195
6  918368.0  577355.6  747861.8         NA         NA       NA       NA  1013708.2   571826.2   824330.2          NA          NA        NA        NA 0.9059491 1.0096698 0.9072357

[[4]]
  Meve_mean Mmor_mean Mtot_mean tot_meanMe tot_meanMm tot_sdMe tot_sdMm Wteve_mean Wtmor_mean Wttot_mean tot_meanwte tot_meanwtm tot_sdwte tot_sdwtm diff_eve diff_mor diff_tot
4  3866.547   23511.3  13688.92         NA         NA       NA       NA   3422.172   11541.89    6902.05          NA          NA        NA        NA 1.129852 2.037041 1.983313

[[5]]
   Meve_mean  Mmor_mean Mtot_mean tot_meanMe tot_meanMm tot_sdMe tot_sdMm  Wteve_mean Wtmor_mean Wttot_mean tot_meanwte tot_meanwtm tot_sdwte tot_sdwtm  diff_eve diff_mor  diff_tot
1   7774.442   39113.53  23443.99  258492587  246665241 35569170 30522100   10752.438   29254.64   18681.95   278901500   235415567  16743477   3922418 0.7230399 1.337003 1.2549002
2  43374.117  119476.16  81425.14         NA         NA       NA       NA   53658.844   98804.80   73007.11          NA          NA        NA        NA 0.8083312 1.209214 1.1153042
3 585562.724 1296530.34 941046.53         NA         NA       NA       NA  715547.922 1001344.20  838032.04          NA          NA        NA        NA 0.8183417 1.294790 1.1229243
4   3866.547   23511.30  13688.92         NA         NA       NA       NA    3422.172   11541.89    6902.05          NA          NA        NA        NA 1.1298517 2.037041 1.9833127
6 918368.020  577355.58 747861.80         NA         NA       NA       NA 1013708.188  571826.16  824330.18          NA          NA        NA        NA 0.9059491 1.009670 0.9072357

[[6]]
   Meve_mean  Mmor_mean Mtot_mean tot_meanMe tot_meanMm tot_sdMe tot_sdMm Wteve_mean Wtmor_mean Wttot_mean tot_meanwte tot_meanwtm tot_sdwte tot_sdwtm  diff_eve diff_mor diff_tot
1   7774.442   39113.53  23443.99  258492587  246665241 35569170 30522100  10752.438   29254.64   18681.95   278901500   235415567  16743477   3922418 0.7230399 1.337003 1.254900
2  43374.117  119476.16  81425.14         NA         NA       NA       NA  53658.844   98804.80   73007.11          NA          NA        NA        NA 0.8083312 1.209214 1.115304
3 585562.724 1296530.34 941046.53         NA         NA       NA       NA 715547.922 1001344.20  838032.04          NA          NA        NA        NA 0.8183417 1.294790 1.122924
4   3866.547   23511.30  13688.92         NA         NA       NA       NA   3422.172   11541.89    6902.05          NA          NA        NA        NA 1.1298517 2.037041 1.983313

If you want the selection-criterion columns at the front of the data frame, as in your example, you can use:

c(lapply(cols, function(x) subset(tbl_comp[, c(cols, setdiff(colnames(tbl_comp), cols))], eval(parse(text = x)) < 1)),
  lapply(cols, function(x) subset(tbl_comp[, c(cols, setdiff(colnames(tbl_comp), cols))], eval(parse(text = x)) > 1)))
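
The same six data frames can also be built without eval(parse(text = ...)), by looking the column up by name with [[. The following is only a minimal sketch: it assumes a cut-down stand-in for tbl_comp that holds just the three diff_ columns from the question (values as printed there) plus a hypothetical id column, and the list element names are my own choice, not part of the original answer.

```r
# Cut-down stand-in for tbl_comp (assumption: only the three diff_ columns
# from the question, plus a hypothetical id column).
tbl_comp <- data.frame(
  id       = 1:6,
  diff_eve = c(0.7230399, 0.8083312, 0.8183417, 1.1298517, 0.9551380, 0.9059491),
  diff_mor = c(1.3370026, 1.2092141, 1.2947899, 2.0370412, 0.9630701, 1.0096698),
  diff_tot = c(1.2549002, 1.1153042, 1.1229243, 1.9833127, 0.9298195, 0.9072357)
)

cols <- c("diff_eve", "diff_mor", "diff_tot")

# which() drops NA comparisons, matching what subset() does.
below <- lapply(cols, function(x) tbl_comp[which(tbl_comp[[x]] < 1), ])
above <- lapply(cols, function(x) tbl_comp[which(tbl_comp[[x]] > 1), ])

# Same order as in the answer: the three "<1" frames, then the three ">1" frames.
allframes <- c(below, above)
names(allframes) <- c(paste0(cols, "_below"), paste0(cols, "_above"))

# Number of rows in each of the six data frames
sapply(allframes, nrow)
```

Naming the list elements makes it possible to write allframes$diff_eve_below instead of remembering that it is element 1.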

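【Comments】:

Can I split them into separate data frames?
lapply gives you a list. You can store the output of the above command in a variable, e.g. allframes, and then access the individual elements with allframes[[1]] etc.
If you want a single data frame, you can do it as in DJack's solution: subset(tbl_comp, diff_eve<1) for the first data frame, etc.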
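
If separate variables are really wanted, a named list can be unpacked with list2env. This is a sketch under assumptions: the list below is a toy stand-in for the lapply result, and its element names and values are hypothetical.

```r
# Toy stand-in for the list returned by lapply (hypothetical names and data).
allframes <- list(
  diff_eve_below = data.frame(diff_eve = c(0.72, 0.81)),
  diff_eve_above = data.frame(diff_eve = 1.13)
)

# Access by position or by name.
allframes[[1]]
allframes[["diff_eve_above"]]

# Create one variable per list element in a dedicated environment
# (pass globalenv() instead to create them in the workspace).
frames_env <- new.env()
list2env(allframes, envir = frames_env)
ls(frames_env)
```

Keeping the data frames in a list is usually easier to work with afterwards, since the same operation can be applied to all of them with lapply.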

【Solution 2】:

I am not sure I understood your question correctly, but here is a possible solution:

diff_eve_below <- subset(tbl_comp, diff_eve < 1)
diff_eve_above <- subset(tbl_comp, diff_eve > 1)
diff_mor_below <- subset(tbl_comp, diff_mor < 1)
diff_mor_above <- subset(tbl_comp, diff_mor > 1)
diff_tot_below <- subset(tbl_comp, diff_tot < 1)
diff_tot_above <- subset(tbl_comp, diff_tot > 1)
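
One thing to keep in mind with the < 1 / > 1 split: rows where the value is exactly 1 fall into neither data frame, and subset() also drops rows where the condition evaluates to NA. A small sketch with a hypothetical toy data frame (not the asker's data) illustrates this:

```r
# Hypothetical toy data: one value exactly equal to 1 and one NA.
toy <- data.frame(id = 1:4, diff_eve = c(0.5, 1, 1.5, NA))

below <- subset(toy, diff_eve < 1)
above <- subset(toy, diff_eve > 1)

# ids of rows that end up in neither subset (the value 1 and the NA)
setdiff(toy$id, c(below$id, above$id))

# Use <= or >= on one side if rows equal to 1 should be kept.
at_most_one <- subset(toy, diff_eve <= 1)
```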

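【Comments】:

When I try to run all the code I get the error "Error in dput$diff_eve : object of type 'closure' is not subsettable"
@Rechlay replace dput by tbl_comp (my mistake, I have corrected it), but user1981275's answer is much more elegant ;)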
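
For context on that error: dput is a base R function, and $ cannot extract a column from a function, so R stops with an error. A minimal sketch that reproduces it safely by catching the error:

```r
# dput is a base R function (a closure), not a data frame,
# so extracting a column from it with $ raises an error.
msg <- tryCatch(dput$diff_eve, error = function(e) conditionMessage(e))
msg
```

Replacing dput with the actual data frame name, tbl_comp, resolves it, as noted in the reply.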