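R: Automatically convert column types based on the values

【Question title】: R: Automatically convert column types based on the values
【Posted】: 2019-12-24 00:47:37
【Question】:

I have a tibble with a lot of columns, and I don't want to change them one by one. Suppose the tibble looks like this: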

library(tibble)

df <- tibble(
  x = c(1,0,1,1,'a'), 
  y = c('A', 'B', 1, 'D', 'A'), 
  z = c(1/3, 4, 5/7, 100, 3)
)

I want to convert the column types according to the values in another tibble:

df_map <- tibble(
  col = c('x','y','z'), 
  col_type = c('integer', 'string', 'float')
)

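What is the most appropriate solution?

【Comments】:

type.convert(df)
Your mapping does not contain valid R data types.
Possible duplicate of stackoverflow.com/questions/7680959/…
Possible duplicate of "Convert type of multiple columns of a dataframe at once"

Tags: r dataframe type-conversion tibble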

【Solution 1】:

Try the following:

library(purrr)
map2_dfc(df, df_map$col_type, type.convert, as.is = TRUE)

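This code assumes that df_map$col is in the same order as names(df) (thanks to @Moody_Mudskipper for pointing this out).

As @NelsonGon pointed out, the appropriate data types in R are "integer", "character", and "double".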
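One caveat worth checking: in base R, the second positional argument of type.convert is na.strings, so the call above may end up treating the type names as NA markers rather than as target types. A minimal base R sketch of a conversion that looks up columns by name (so the order of df_map no longer matters) could look like the following. The `converters` list and the `convert_by_map` helper are hypothetical names introduced here for illustration, and the example uses plain data frames so it runs without extra packages.

```r
# Hypothetical lookup table: maps the type names used in df_map
# to base R coercion functions.
converters <- list(
  integer = as.integer,
  string  = as.character,
  float   = as.double
)

# Hypothetical helper: convert each column named in df_map$col
# with the converter named in df_map$col_type.
convert_by_map <- function(df, df_map) {
  for (i in seq_len(nrow(df_map))) {
    col <- df_map$col[[i]]
    fun <- converters[[df_map$col_type[[i]]]]
    if (is.null(fun)) stop("Unknown type: ", df_map$col_type[[i]])
    # Values that cannot be parsed become NA (base R emits a warning).
    df[[col]] <- fun(df[[col]])
  }
  df
}

df <- data.frame(
  x = c("1", "0", "1", "1", "a"),
  y = c("A", "B", "1", "D", "A"),
  z = c(1/3, 4, 5/7, 100, 3),
  stringsAsFactors = FALSE
)

# Deliberately listed in a different order than names(df).
df_map <- data.frame(
  col      = c("z", "x", "y"),
  col_type = c("float", "integer", "string"),
  stringsAsFactors = FALSE
)

out <- suppressWarnings(convert_by_map(df, df_map))
str(out)
```

Because columns are addressed through df[[col]], the same helper should also work on a tibble. The unparseable 'a' in x becomes NA instead of blocking the conversion.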

Edited to include the earlier modification for boolean variables, as requested in the comments:

library(tidyverse)
df %>% 
  mutate_if(~identical(sort(unique(.)), c(1,2)), ~{. - 1}) %>% 
  map2_dfc(df_map$col_type, type.convert, as.is = TRUE)
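The 1/2 recoding step can also be sketched in base R. Note that identical() is type-strict, so a column stored as integer (1L, 2L) would not match c(1, 2). The hypothetical predicate `is_one_two` below uses setequal() instead, which compares values regardless of integer or double storage. The column names `flag` and `score` are made up for this example.

```r
# Hypothetical predicate: TRUE when a numeric column contains
# exactly the values 1 and 2 (stored as integer or double).
is_one_two <- function(x) is.numeric(x) && setequal(unique(x), c(1, 2))

df <- data.frame(
  flag  = c(1, 2, 2, 1),   # 1/2 coded boolean
  score = c(1, 2, 3, 4)    # ordinary numeric column, left untouched
)

for (nm in names(df)) {
  if (is_one_two(df[[nm]])) {
    # Shift 1/2 to 0/1, then coerce to logical.
    df[[nm]] <- as.logical(df[[nm]] - 1)
  }
}

str(df)
```

Columns containing NA do not satisfy the predicate as written, so they are left unchanged. Whether that is the desired behaviour depends on the data.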

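【Comments】:

Just want to point out that since map2_dfc comes from the purrr package, you can load only that instead of the whole tidyverse.
Nice, thanks! Is it possible to include a user-defined function here as well? I have some columns that hold the values 1 and 2 as booleans, and 1 should be subtracted from them before the conversion.
You can modify those variables beforehand: dplyr::mutate_if(df, ~identical(sort(unique(.)), c(1,2)), ~{. - 1})
df_map$col is not used, so this assumes that the columns of df are ordered as in df_map$col.
That's right, my code only works correctly when df_map$col is in the same order as names(df). I will edit to reflect this.

【Solution 2】:

For a task like this I would use the readr package, which is part of the tidyverse.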

suppressPackageStartupMessages(library(tidyverse))

# rework your col types to be compatible with ?readr::cols
df_map$col_type <- recode(df_map$col_type, integer = "i", float = "d" , string = "c")

# make a vector out of df_map
vec_map <- deframe(df_map)
vec_map
#>   x   y   z 
#> "i" "c" "d"

# convert according to your specs
type_convert(df,exec(cols, !!!vec_map))
#> Warning in type_convert_col(char_cols[[i]], specs$cols[[i]],
#> which(is_character)[i], : [4, 1]: expected an integer, but got 'a'
#> # A tibble: 5 x 3
#>       x y           z
#>   <int> <chr>   <dbl>
#> 1     1 A       0.333
#> 2     0 B       4    
#> 3     1 1       0.714
#> 4     1 D     100    
#> 5    NA A       3
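The recode and deframe steps above amount to building a named vector of one-letter type codes. As a rough base R sketch of the same idea, a named lookup vector can do the translation and flag type names it does not recognise. The `type_codes` name is hypothetical. The letters "i", "d" and "c" are the ones used in the answer above.

```r
# Hypothetical lookup from the type names in df_map to
# the one-letter codes used in the answer above.
type_codes <- c(integer = "i", float = "d", string = "c")

df_map <- data.frame(
  col      = c("x", "y", "z"),
  col_type = c("integer", "string", "float"),
  stringsAsFactors = FALSE
)

codes <- type_codes[df_map$col_type]

# Unknown type names index to NA, so fail early with a clear message.
if (anyNA(codes)) {
  stop("Unknown type(s): ",
       paste(df_map$col_type[is.na(codes)], collapse = ", "))
}

# Name the codes by column, mirroring the result of deframe(df_map).
vec_map <- setNames(codes, df_map$col)
vec_map
```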

【Comments】:

This is better than my solution.