How to Restructure R Data Frame in R [duplicate]: Answers

【Question title】: How to Restructure R Data Frame in R [duplicate]
【Posted】: 2017-07-13 22:00:14
【Question】:

I have data in this format:

   boss employee1 employee2
1   wil     james      andy
2 james      dean      bert
3 billy      herb    collin
4  tony      mike     david

I want it in this format:

   boss employee
1   wil    james
2   wil     andy
3 james     dean
4 james     bert
5 billy     herb
6 billy   collin
7  tony     mike
8  tony    david

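I have searched the forum but have not found anything useful yet. I have tried dplyr and a few other things, but I am still very new to R.

If this question has already been answered, I would appreciate a link to it.

Thanks,

Wil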

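【Question comments】:

Hey mods, can we get a permanent duplicate flag for wide-to-long reshaping? This comes up all the time.
I would say, after looking at the link you posted, that the answers provided by Scarabee and Nick Criswell are much clearer than the linked ones. As someone fairly new to R, I found the linked post dense and code-heavy, whereas the answers here are clearer and more concise.
Two base R alternatives. Using reshape, along the lines of tyler-rink's answer in the linked post: reshape(df, direction="long", idvar="boss", varying=2:3, sep="")[-2]. Then there is the brute-force approach with stack and cbind: cbind(boss=df[1], employees=stack(df[-1])$values). This returns a warning but runs fine.
@Tucktuckgoose I understand and appreciate your comment; I did not find the duplicate I had hoped for, one that explains this more clearly. However, one of this community's usage policies is to search for an answer first rather than simply post a duplicate question. New questions should add something to the knowledge base, not just complicate the search for a good answer... That is, the ideal state is a single, easy-to-find, high-quality answer, not many lower-quality, hard-to-find ones. Good luck as you continue learning R.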
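
For readers who want the stack idea from the comment above without relying on recycling inside cbind, here is a minimal sketch. The names n_emp_cols and df_long are illustrative and not from the thread, and the data frame is rebuilt inline so the snippet runs on its own. It assumes the employee columns are plain character vectors, as in the question.

```r
# Sample data from the question, rebuilt inline so the snippet is self-contained
df <- data.frame(
  boss      = c("wil", "james", "billy", "tony"),
  employee1 = c("james", "dean", "herb", "mike"),
  employee2 = c("andy", "bert", "collin", "david"),
  stringsAsFactors = FALSE
)

# Number of employee columns (everything except boss)
n_emp_cols <- ncol(df) - 1

# stack() concatenates the employee columns one after another,
# so boss is repeated once per employee column to line up with it
df_long <- data.frame(
  boss     = rep(df$boss, times = n_emp_cols),
  employee = stack(df[-1])$values,
  stringsAsFactors = FALSE
)

print(df_long)
```

The rows come out grouped by original column (all employee1 values, then all employee2 values), the same order as the rbind and gather answers below.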

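Tags: r

【Solution 1】:

Here is a solution using tidyr. Specifically, the gather function stacks the two employee columns into one. It also creates a column, named key here, that records which original column (employee1 or employee2) each value came from. We drop that column with select from dplyr.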

library(tidyr)
library(dplyr)

df <- read.table(
  text = "boss employee1 employee2
  1   wil     james      andy
  2 james      dean      bert
  3 billy      herb    collin
  4  tony      mike     david",
  header = TRUE,
  stringsAsFactors = FALSE
)

df2 <- df %>%
  gather(key, employee, -boss) %>%
  select(-key)

> df2
   boss employee
1   wil    james
2 james     dean
3 billy     herb
4  tony     mike
5   wil     andy
6 james     bert
7 billy   collin
8  tony    david

I would be shocked if there isn't a slicker base R solution, but this should work for you.

【Comments】:

【Solution 2】:

Using base R:
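
As a side note for newer tidyr versions: gather has been superseded by pivot_longer (available from tidyr 1.0.0). The sketch below is one possible equivalent, not the answerer's code; the name df_long is illustrative, and the data frame is rebuilt inline so the snippet runs on its own.

```r
library(tidyr)
library(dplyr)

# Sample data from the question, rebuilt inline so the snippet is self-contained
df <- data.frame(
  boss      = c("wil", "james", "billy", "tony"),
  employee1 = c("james", "dean", "herb", "mike"),
  employee2 = c("andy", "bert", "collin", "david"),
  stringsAsFactors = FALSE
)

# Pivot every column except boss into a single employee column,
# then drop the helper column that holds the original column names
df_long <- df %>%
  pivot_longer(cols = -boss, names_to = "key", values_to = "employee") %>%
  select(-key)

print(df_long)
```

Unlike gather, pivot_longer walks the input row by row, so each boss's employees end up on adjacent rows, which is the order shown in the question. The result is a tibble rather than a plain data frame.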

df1 <- df[, 1:2]
df2 <- df[, c(1, 3)]
names(df1)[2] <- names(df2)[2] <- "employee"
rbind(df1, df2)
#     boss employee
# 1    wil    james
# 2  james     dean
# 3  billy     herb
# 4   tony     mike
# 11   wil     andy
# 21 james     bert
# 31 billy   collin
# 41  tony    david
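
The rbind result above is grouped by original column, while the question lists each boss's employees together. If that order matters, one way to get it in base R is sketched below. The name long is illustrative, and the data frame is rebuilt inline (rather than read with read.table) so the snippet runs on its own.

```r
# Sample data from the question, rebuilt inline so the snippet is self-contained
df <- data.frame(
  boss      = c("wil", "james", "billy", "tony"),
  employee1 = c("james", "dean", "herb", "mike"),
  employee2 = c("andy", "bert", "collin", "david"),
  stringsAsFactors = FALSE
)

# Stack the two boss/employee pairs under a common column name
long <- rbind(
  setNames(df[, c("boss", "employee1")], c("boss", "employee")),
  setNames(df[, c("boss", "employee2")], c("boss", "employee"))
)

# Sort by the original row index so each boss's employees sit together;
# order() keeps ties in their existing order, so employee1 precedes employee2
long <- long[order(rep(seq_len(nrow(df)), times = 2)), ]

# Reset row names to 1..8
rownames(long) <- NULL

print(long)
```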

Using dplyr:

df %>% 
  select(boss, employee1) %>% 
  rename(employee = employee1) %>% 
  bind_rows(df %>% 
              select(boss, employee2) %>% 
              rename(employee = employee2))
#    boss employee
# 1   wil    james
# 2 james     dean
# 3 billy     herb
# 4  tony     mike
# 5   wil     andy
# 6 james     bert
# 7 billy   collin
# 8  tony    david

Data:

df <- read.table(text = "
   boss employee1 employee2
1   wil     james      andy
2 james      dean      bert
3 billy      herb    collin
4  tony      mike     david
", header = TRUE, stringsAsFactors = FALSE)

【Comments】: