【Question title】: How to select columns with dplyr/tidyverse depending on minimal value in column in R
【Posted】: 2018-10-24 16:13:54
【Question description】:

I have a dataset with the number of land cover pixels for each point.

    species_distr <- data.frame(structure(list(Point = c(101, 102, 103, 104, 105, 106), `Herbaceous cover` = c(NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `Tree or shrub cover` = c(NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `Cropland, irrigated or post-flooding` = c(NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `Mosaic cropland (>50%) / natural vegetation (tree, shrub, herbaceous cover) (<50%)` = c(NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_
), `Mosaic natural vegetation (tree, shrub, herbaceous cover) (>50%) / cropland (<50%)` = c(NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `Tree cover, broadleaved, evergreen, closed to open (>15%)` = c(NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_
), `Tree cover, broadleaved, deciduous, closed to open (>15%)` = c(NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `Tree cover, broadleaved, deciduous, closed (>40%)` = c(NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_
), `Tree cover, broadleaved, deciduous, open (15-40%)` = c(NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_
), `Tree cover, needleleaved, evergreen, closed to open (>15%)` = c(NA, 
NA, 1.73725490196078, NA, NA, NA), `Tree cover, needleleaved, evergreen, closed (>40%)` = c(NA, 
NA, 0L, NA, NA, NA), `Tree cover, needleleaved, evergreen, open (15-40%)` = c(NA, 
NA, 0L, NA, NA, NA), `Tree cover, needleleaved, deciduous, closed to open (>15%)` = c(2059.57647058824, 
544, 2209.63529411765, 1226.7568627451, 1722.34901960784, 1359.10196078432
), `Tree cover, needleleaved, deciduous, closed (>40%)` = c(NA, 
NA, 0L, 0L, NA, NA), `Tree cover, needleleaved, deciduous, open (15-40%)` = c(NA, 
NA, 0L, 0L, NA, NA), `Tree cover, mixed leaf type (broadleaved and needleleaved)` = c(NA, 
NA, 1.96470588235294, 0, NA, NA), `Mosaic tree and shrub (>50%) / herbaceous cover (<50%)` = c(NA, 
NA, 0, 2, NA, NA), `Mosaic herbaceous cover (>50%) / tree and shrub (<50%)` = c(NA, 
NA, 0L, NA, NA, NA), Shrubland = c(NA, NA, 0, NA, NA, NA), `Shrubland evergreen` = c(NA, 
NA, 0L, NA, NA, NA), `Shrubland deciduous` = c(NA, NA, 0, NA, 
NA, NA), Grassland = c(NA, NA, 0L, NA, NA, NA), `Lichens and mosses` = c(NA, 
NA, 0L, NA, NA, NA), `Sparse vegetation (tree, shrub, herbaceous cover) (<15%)` = c(NA, 
NA, 0, NA, NA, NA), `Sparse tree (<15%)` = c(NA, NA, 0L, NA, 
NA, NA), `Sparse shrub (<15%)` = c(NA, NA, 0L, NA, NA, NA), `Sparse herbaceous cover (<15%)` = c(NA, 
NA, 0L, NA, NA, NA), `Tree cover, flooded, fresh or brakish water` = c(NA, 
NA, 0, NA, NA, NA), `Tree cover, flooded, saline water` = c(NA, 
NA, 0L, NA, NA, NA), `Shrub or herbaceous cover, flooded, fresh/saline/brakish water` = c(NA, 
NA, 0, NA, NA, NA), `Urban areas` = c(NA, NA, 0L, NA, NA, NA), 
    `Bare areas` = c(NA, NA, 0, NA, NA, NA), `Consolidated bare areas` = c(NA, 
    NA, 0L, NA, NA, NA), `Unconsolidated bare areas` = c(NA, 
    NA, 0L, NA, NA, NA), `Water bodies` = c(NA, NA, 4.73725490196078, 
    NA, NA, NA)), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame")))

I want to exclude all columns whose values never exceed a threshold, for example 50. My quick and dirty solution looks like this:

c <- NULL
for (i in 2:length(species_distr)) {
  if (max(na.omit(species_distr[,i])) > 50) {
    c <- c(c, i)
  }
}
species_distr_plot <- species_distr[,c(1,c)]

How can I achieve this with dplyr/tidyverse? So far I have tried:

  %>%
select_if(na.omit(max(.)) > 50)

【Comments】:

    Tags: r dplyr data-science


    【Solution 1】:

    We may need `any`:

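    library(dplyr)
    # Keep only columns with at least one non-missing value above 50.
    # na.rm = TRUE drops NAs before the check; max(.x) alone returns NA if any value is missing.
    species_distr %>% 
         select_if(~ any(.x > 50, na.rm = TRUE))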
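To see how such a predicate behaves, here is a minimal sketch on a small made-up data frame. The names `toy` and `exceeds` are hypothetical and not part of the question. The `where()` spelling assumes dplyr 1.0.0 or later; `select_if()` is the older equivalent.

```r
library(dplyr)

# Hypothetical toy data standing in for species_distr
toy <- data.frame(
  Point  = c(101, 102, 103),
  all_na = c(NA_real_, NA_real_, NA_real_),
  low    = c(NA, 1.7, 0),
  high   = c(2059.6, NA, 544)
)

# Hypothetical helper: TRUE if at least one non-missing value exceeds the threshold
exceeds <- function(x, threshold = 50) {
  is.numeric(x) && any(x > threshold, na.rm = TRUE)
}

# max() propagates missing values unless na.rm = TRUE is set,
# which is why NAs must be dropped before (not after) taking the maximum
max(toy$high)                # NA
max(toy$high, na.rm = TRUE)  # 2059.6

# Assumes dplyr >= 1.0.0
kept <- toy %>% select(where(exceeds))

# Older spelling with the same result
kept_old <- toy %>% select_if(exceeds)

names(kept)  # "Point" "high"
```

Note that `Point` is kept here only because its own values happen to exceed 50. If the identifier column could fall below the threshold, it would probably be safer to select it explicitly, for example `select(Point, where(exceeds))`.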
    

    【Discussion】:
