[Posted]: 2022-01-05 09:35:23
[Question]:
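I have been trying to reshape some data from long to wide format. I want one row per unique ID. To mimic my requirement, I have created a sample input and the desired output, shown below:
Input: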
id date size category name type
124 3.1 1 fruit apple royalGala
327 1.1 0 veg chilli green
124 2.1 2 fruit apple green
124 1.2 1 fruit apple jazz
124 2.2 2 fruit apple eve
124 2.1 3 fruit apple pinkLady
327 1.2 1 veg chilli red
327 1.2 2 veg chilli Jalapeño
327 1.2 3 veg chilli bananaPepper
327 3.3 1 veg chilli fresnoPepper
Output:
id fruit_apple_royalGala_date fruit_apple_royalGala_size fruit_apple_green_date fruit_apple_green_size fruit_apple_jazz_date fruit_apple_jazz_size fruit_apple_eve_date fruit_apple_eve_size fruit_apple_pinkLady_date fruit_apple_pinkLady_size veg_chilli_green_date veg_chilli_green_size veg_chilli_red_date veg_chilli_red_size veg_chilli_Jalapeño_date veg_chilli_Jalapeño_size veg_chilli_bananaPepper_date veg_chilli_bananaPepper_size veg_chilli_fresnoPepper_date veg_chilli_fresnoPepper_size
124 3.1 1 2.1 2 1.2 1 2.2 2 2.1 3 NA NA NA NA NA NA NA NA NA NA
327 NA NA NA NA NA NA NA NA NA NA 1.1 0 1.2 1 1.2 2 1.2 3 3.3 1
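The mapping from the input to the output above can be sketched with tidyr's `pivot_wider()`. This is a minimal sketch rather than a confirmed solution: it assumes every id/category/name/type combination occurs at most once (true for the sample data), and the `names_glue` template is chosen here only to reproduce the column names in the desired output.

```r
library(tidyr)

# Sample input, copied from the data in the question
df <- data.frame(
  id = c(124L, 327L, 124L, 124L, 124L, 124L, 327L, 327L, 327L, 327L),
  date = c(3.1, 1.1, 2.1, 1.2, 2.2, 2.1, 1.2, 1.2, 1.2, 3.3),
  size = c(1L, 0L, 2L, 1L, 2L, 3L, 1L, 2L, 3L, 1L),
  category = c("fruit", "veg", "fruit", "fruit", "fruit", "fruit", "veg", "veg", "veg", "veg"),
  name = c("apple", "chilli", "apple", "apple", "apple", "apple", "chilli", "chilli", "chilli", "chilli"),
  type = c("royalGala", "green", "green", "jazz", "eve", "pinkLady", "red", "Jalapeño", "bananaPepper", "fresnoPepper")
)

# One row per id; one date column and one size column per
# category/name/type combination, named <category>_<name>_<type>_<measure>
df_wide <- pivot_wider(
  df,
  id_cols = id,
  names_from = c(category, name, type),
  values_from = c(date, size),
  names_glue = "{category}_{name}_{type}_{.value}"
)

print(dim(df_wide))
```

With the default settings, all `_date` columns come before all `_size` columns. In tidyr 1.2.0 or later, adding `names_vary = "slowest"` should interleave them as in the desired output.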
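I am not sure how to achieve the desired output. I looked at several related questions on StackOverflow, but none of them helped me solve this, for example: Convert data from long format to wide format with multiple measure columns, From long to wide data with multiple columns, and Reshape multiple value columns to wide format.
I have been working on this since yesterday, but with little experience of gather, spread and similar functions, I have not been able to solve it. I would appreciate any help.
Thanks!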
dput()
For convenience, I am also including the dput() output of the sample data:
structure(list(
id = c(124L, 327L, 124L, 124L, 124L, 124L, 327L, 327L, 327L, 327L),
date = c(3.1, 1.1, 2.1, 1.2, 2.2, 2.1, 1.2, 1.2, 1.2, 3.3),
size = c(1L, 0L, 2L, 1L, 2L, 3L, 1L, 2L, 3L, 1L),
category = c("fruit", "veg", "fruit", "fruit", "fruit", "fruit", "veg", "veg", "veg", "veg"),
name = c("apple", "chilli", "apple", "apple", "apple", "apple", "chilli", "chilli", "chilli", "chilli"),
type = c("royalGala", "green", "green", "jazz", "eve", "pinkLady", "red", "Jalapeño", "bananaPepper", "fresnoPepper")),
class = "data.frame", row.names = c(NA, -10L))
Partial solution
I have a workaround for this problem, but it fails when I run it on my original data.
# Load the packages used below (tidyr also re-exports %>%)
library(tidyr)
library(data.table)
# Read the CSV file
df <- read.csv("C:/Desktop/test.csv")
# Unite category, name and type into a single "info" column
df_unite <- df %>%
  unite("info", category:type, remove = TRUE)
# Convert from long to wide format
setDT(df_unite) # coerce to data.table by reference
df_wide <- dcast(df_unite, id ~ info,
                 value.var = c("date", "size"))
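The question does not say how this fails on the original data. One possible cause, offered here only as an assumption, is that some id/info pair occurs more than once: `dcast()` then has no single value per cell and falls back to counting rows with `length`. The sketch below uses hypothetical data (not from the question) to show a base R check for such repeated pairs before reshaping.

```r
# Hypothetical data with one repeated id/info pair (not from the question)
df_unite <- data.frame(
  id   = c(124L, 124L, 327L),
  info = c("fruit_apple_green", "fruit_apple_green", "veg_chilli_red"),
  date = c(2.1, 2.4, 1.2),
  size = c(2L, 3L, 1L)
)

# Count rows per id/info pair; a count above 1 means the wide cell
# for that pair would be ambiguous
key_counts <- aggregate(n ~ id + info,
                        data = transform(df_unite, n = 1L),
                        FUN = sum)
dupes <- key_counts[key_counts$n > 1, ]
print(dupes)
```

If `dupes` has any rows, the repeated pairs need to be resolved first, for example by aggregating them or by adding a sequence number to the key.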
[Comments]:
Tags: r multiple-columns reshape tidyr