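[Posted]: 2016-08-29 00:30:06
[Question]:
My data frame has a column that I want to split on the dashes, duplicating the row for each value to the left and right of a dash. I know how to split and duplicate, but not how to keep the rest of the string. That is a poor description, so I think it is easier to show the data frame and the desired output.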
tmp = structure(list(Unit.Types = c("10 - 12 Pack 11.2 - 14.9 oz Bottle or Can",
"8 - 12 Pack 11.5 - 16 oz Bottle or Can"), Row.Count = c("899",
"305"), Test = c("B", "A")), .Names = c("Unit.Types", "Row.Count",
"Test"), row.names = c(104L, 196L), class = "data.frame")
library(tidyr)
library(dplyr)
tmp2 = tmp %>% mutate(Unit.Types = strsplit(as.character(Unit.Types), "-")) %>% unnest(Unit.Types)
tmp2
  Row.Count Test            Unit.Types
1       899    B                    10
2       899    B          12 Pack 11.2
3       899    B 14.9 oz Bottle or Can
4       305    A                     8
5       305    A          12 Pack 11.5
6       305    A   16 oz Bottle or Can
My desired output would look like this:
                     Unit.Types Row.Count Test
1 10 Pack 11.2 oz Bottle or Can       899    B
2 10 Pack 14.9 oz Bottle or Can       899    B
3 12 Pack 11.2 oz Bottle or Can       899    B
4 12 Pack 14.9 oz Bottle or Can       899    B
5  8 Pack 11.5 oz Bottle or Can       305    A
6    8 Pack 16 oz Bottle or Can       305    A
7 12 Pack 11.5 oz Bottle or Can       305    A
8   12 Pack 16 oz Bottle or Can       305    A
Or at least like this, with only the "oz" range split on the dash:
                          Unit.Types Row.Count Test
1 10 - 12 Pack 11.2 oz Bottle or Can       899    B
2 10 - 12 Pack 14.9 oz Bottle or Can       899    B
3  8 - 12 Pack 11.5 oz Bottle or Can       305    A
4    8 - 12 Pack 16 oz Bottle or Can       305    A
Any help is greatly appreciated!
[Comments]:
-
Are all rows of the form "10 - 12 Pack 11.2 - 14.9 oz Bottle or Can"?
-
It can also be "10 Pack 14 - 16 oz Can"
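Given that both forms can occur, one possible approach is to treat every "number - number" span as a range, keep the text around it, and paste each end of the range back into that text. The following is a minimal sketch in base R (not the tidyr pipeline from the question). The helper name expand_ranges is hypothetical, and it assumes that every dash between two numbers denotes a range and that no other dashes between numbers appear in the column.

```r
# Sample data from the question, rebuilt with data.frame() for brevity.
tmp <- data.frame(
  Unit.Types = c("10 - 12 Pack 11.2 - 14.9 oz Bottle or Can",
                 "8 - 12 Pack 11.5 - 16 oz Bottle or Can"),
  Row.Count = c("899", "305"),
  Test = c("B", "A"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: expand one string into all combinations of its
# "number - number" ranges, keeping the surrounding text intact.
expand_ranges <- function(x) {
  m <- gregexpr("[0-9.]+ *- *[0-9.]+", x)
  ranges <- regmatches(x, m)[[1]]                 # e.g. "10 - 12", "11.2 - 14.9"
  pieces <- regmatches(x, m, invert = TRUE)[[1]]  # text before, between, after
  out <- pieces[1]
  for (i in seq_along(ranges)) {
    opts <- strsplit(ranges[i], " *- *")[[1]]     # the two ends of the range
    # Earlier ranges vary slowest, matching the row order asked for.
    out <- paste0(rep(out, each = length(opts)), opts, pieces[i + 1])
  }
  out
}

expanded <- lapply(tmp$Unit.Types, expand_ranges)

# Repeat each original row once per generated string, then overwrite the column.
idx <- rep(seq_len(nrow(tmp)), times = lengths(expanded))
result <- tmp[idx, ]
result$Unit.Types <- unlist(expanded)
rownames(result) <- NULL
result
```

A string with a single range, such as "10 Pack 14 - 16 oz Can", yields two rows, and a string with no range is returned unchanged. To get the second, less expanded output instead, the pattern would need to match only the range in front of "oz".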