【Question title】: Swap values between columns using R
【Posted】: 2019-02-28 13:34:38
【Question】:

I have a dataset named cloud that looks like this:

"Rainfall, Treatment
274.7, Seeded
274.7, Seeded
Seeded, 255
242.5, Seeded
200.7, Seeded
198.6, Seeded
129.6, Seeded
119, Seeded
118.3, Seeded
115.3, Seeded
92.4, Seeded
40.6, Seeded
32.7, Seeded
31.4, Seded
17.5, Seeded"

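Can anyone help me:

  1. Swap the data where values are misplaced (i.e. in the row where Rainfall == "Seeded" and Treatment == 255, the two values should be swapped); and

  2. Correct the spelling of the value in rows where Treatment == "Seded" to "Seeded".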

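【Comments】:

  • Read up on subsetting data frames in R. In your case, if the row numbers are fixed, you can use cloud[38, "Rainfall"] <- 255; cloud[38, "Treatment"] <- "Seeded". Similarly, you can fix the spelling on row 49.
  • Thanks @RonakShah, very helpful.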
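
If the row numbers are not known in advance, the misplaced rows can be located by condition instead of by index. The following is a minimal base R sketch, assuming both columns were read as character; the three-row data frame is a made-up miniature of cloud, not the real data.

```r
# Hypothetical miniature of the cloud data; both columns are character
cloud <- data.frame(
  Rainfall  = c("274.7", "Seeded", "31.4"),
  Treatment = c("Seeded", "255", "Seded"),
  stringsAsFactors = FALSE
)

# Rows whose Rainfall does not parse as a number are treated as misplaced
bad <- which(is.na(suppressWarnings(as.numeric(cloud$Rainfall))))

# Swap the two columns on those rows only, via a temporary vector
tmp                  <- cloud$Rainfall[bad]
cloud$Rainfall[bad]  <- cloud$Treatment[bad]
cloud$Treatment[bad] <- tmp

# Fix the known misspelling wherever it occurs
cloud$Treatment[cloud$Treatment == "Seded"] <- "Seeded"
```

Detecting misplaced rows through a failed numeric parse is one possible rule; it would also flag genuinely missing or malformed Rainfall values, so it is worth inspecting bad before swapping.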

Tags: r normalization


【Solution 1】:

Overview

I store the misplaced values in two separate vectors. Then I use three dplyr::if_else() calls inside dplyr::mutate() to clean the variables as needed.

# load necessary packages -----
library(tidyverse)

# load necessary data --------
cloud <-
  read_csv("Rainfall, Treatment
274.7, Seeded
           274.7, Seeded
           Seeded, 255
           242.5, Seeded
           200.7, Seeded
           198.6, Seeded
           129.6, Seeded
           119, Seeded
           118.3, Seeded
           115.3, Seeded
           92.4, Seeded
           40.6, Seeded
           32.7, Seeded
           31.4, Seded
           17.5, Seeded")

# store the misplaced text value
misplaced.text <-
  cloud %>% pull(Rainfall) %>% str_subset("^\\D.*$")

# store the misplaced numeric value
misplaced.numeric <-
  cloud %>% pull(Treatment) %>% str_subset("^\\d.*$")

# update cloud so that misplaced values are swapped -----
# and clean Treatment for misspellings
cloud.clean <-
  cloud %>%
  mutate(Rainfall = if_else(Rainfall %in% misplaced.text &
                              Treatment %in%  misplaced.numeric
                            , misplaced.numeric
                            , Rainfall) %>% as.double()
         , Treatment = if_else(Treatment %in%  misplaced.numeric
                               , misplaced.text
                               , Treatment)
         , Treatment = if_else(Treatment %in% "Seded"
                               , "Seeded"
                               , Treatment))

# view results ----
# note: tibble is only rounding the printed output in console
cloud.clean$Rainfall[1] # [1] 274.7
cloud.clean
# A tibble: 15 x 2
#    Rainfall Treatment
#       <dbl> <chr>    
#  1    275.  Seeded   
#  2    275.  Seeded   
#  3    255   Seeded   
#  4    242.  Seeded   
#  5    201.  Seeded   
#  6    199.  Seeded   
#  7    130.  Seeded   
#  8    119   Seeded   
#  9    118.  Seeded   
# 10    115.  Seeded   
# 11     92.4 Seeded   
# 12     40.6 Seeded   
# 13     32.7 Seeded   
# 14     31.4 Seeded   
# 15     17.5 Seeded  

# end of script #
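
The script above passes the length-one vectors misplaced.numeric and misplaced.text to if_else(), so it relies on there being exactly one misplaced pair. A possible variant that swaps row by row, and so copes with several misplaced rows, is sketched below. The tibble cloud2, the helper columns swapped and rain.orig, and the extra "Unseeded" row are made up for illustration.

```r
library(dplyr)

# Hypothetical data with two misplaced rows; both columns are character
cloud2 <- tibble(
  Rainfall  = c("274.7", "Seeded", "Unseeded", "31.4"),
  Treatment = c("Seeded", "255", "147.8", "Seded")
)

cloud2.clean <-
  cloud2 %>%
  mutate(
    # TRUE where Rainfall holds text rather than a number
    swapped   = is.na(suppressWarnings(as.double(Rainfall))),
    # keep the original Rainfall so both swaps read unswapped values
    rain.orig = Rainfall,
    Rainfall  = as.double(if_else(swapped, Treatment, rain.orig)),
    Treatment = if_else(swapped, rain.orig, Treatment),
    # fix the known misspelling
    Treatment = if_else(Treatment == "Seded", "Seeded", Treatment)
  ) %>%
  select(Rainfall, Treatment)

cloud2.clean
```

Because each if_else() call receives full-length vectors for both branches, every flagged row takes its replacement from the same row of the other column.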

【Comments】:

    【Solution 2】:

    You need a temporary variable to do the swap:

    temp                <- cloud$Treatment[38]
    cloud$Treatment[38] <- cloud$Rainfall[38]
    cloud$Rainfall[38]  <- temp
    temp                <- NULL
    

    You can also use this approach to fix the spelling:

    cloud$Treatment[49] <- "Seeded"
    

    【Comments】:

    • Thanks. When I tried this I got the message invalid factor level, NA generated, and cloud$Rainfall[38] ended up as NA when it should be 255.
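
The warning invalid factor level, NA generated is what R reports when a value that is not an existing level is assigned into a factor column. The question does not show how cloud was loaded, so the sketch below assumes the columns are factors (the default for data.frame() and read.csv() before R 4.0.0) and converts them to character before swapping. The two-row data frame is a made-up miniature, with row 2 standing in for row 38.

```r
# Hypothetical miniature of cloud with factor columns
cloud <- data.frame(
  Rainfall  = c("274.7", "Seeded"),
  Treatment = c("Seeded", "255"),
  stringsAsFactors = TRUE
)

# Convert both factor columns to character so any string can be assigned
cloud$Rainfall  <- as.character(cloud$Rainfall)
cloud$Treatment <- as.character(cloud$Treatment)

# The temporary-variable swap now works without generating NA
temp               <- cloud$Treatment[2]
cloud$Treatment[2] <- cloud$Rainfall[2]
cloud$Rainfall[2]  <- temp
rm(temp)
```

Note that as.character() must be applied to the factor directly; calling as.numeric() on a factor returns the internal level codes, not the printed values.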
    【Solution 3】:

    Using a smaller example:

    df <- data.frame(Rainfall = c('Seeded', '31.4'),
                     Treatment = c('255', 'Seded'),
                     stringsAsFactors = FALSE)
    df
    
      Rainfall Treatment
    1   Seeded       255
    2     31.4     Seded
    

    One possible solution:

    # Swap the values of columns 1 and 2 on row 1 (by reversing the column order)
    df[1, c(1,2)] <- df[1, c(2,1)]
    # Rename Treatment value on row 2
    df[2, c("Treatment")] <- 'Seeded'
    
    df 
    
      Rainfall Treatment
    1      255    Seeded
    2     31.4    Seeded
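
After the swap, Rainfall is still a character column, since it held text when the data frame was created. A short follow-up sketch, assuming a numeric column is wanted for analysis; the data frame is recreated in its post-swap state so the snippet runs on its own.

```r
# Post-swap state of the small example, recreated for a self-contained run
df <- data.frame(Rainfall  = c("255", "31.4"),
                 Treatment = c("Seeded", "Seeded"),
                 stringsAsFactors = FALSE)

# Convert Rainfall from character to numeric
df$Rainfall <- as.numeric(df$Rainfall)

# Sanity checks: no NA introduced, and only the expected label remains
stopifnot(!anyNA(df$Rainfall))
unique(df$Treatment)
str(df)
```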
    

    【Comments】:
