【Question title】: Convert Excel subtotal rows to columns in R data frames
【Posted】: 2015-04-15 08:27:10
【Question】:

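I am trying to read into R an Excel spreadsheet that contains time-entry rows grouped by employee. When the groupings are collapsed it looks like this (commas are used here to separate the columns):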

Column A,Column B
Alice
2015-01-01,8
2015-01-02,7.5
2015-01-03,6
Bob
2015-01-02,6
2015-01-03,8

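I can read the spreadsheet into a data frame with the xlsx::read.xlsx2 function, but I can't figure out how to turn the subtotal rows into a column, so that the data frame looks like this: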

Alice   2015-01-01  8
Alice   2015-01-02  7.5
Alice   2015-01-03  6
Bob     2015-01-02  6
Bob     2015-01-03  8

I have looked at reshape and dplyr, but I can't tell whether they can help. Can someone point me in the right direction?

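【Comments】:

  • What do you mean by subtotal? The names?
  • Could you format the initial table with tabs so the columns can be told apart? I believe gather from tidyr would suit this.
  • Is the number of time entries the same for each of Alice, Bob, and so on?
  • Are the dates in the same column as Alice? If not, have a look at na.locf in the zoo package.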
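
The fill-down idea from the last comment can be sketched as follows. This is only an illustration under stated assumptions: the data frame `raw` and its column names `A` and `B` are hand-typed stand-ins for whatever read.xlsx2 returns, and a row is treated as a group header whenever its second column is empty. The name is copied into its own column on header rows and then carried forward with na.locf.

```r
library(zoo)

# Hand-typed stand-in for the imported sheet (column names are assumptions)
raw <- data.frame(
  A = c("Alice", "2015-01-01", "2015-01-02", "2015-01-03",
        "Bob", "2015-01-02", "2015-01-03"),
  B = c(NA, 8, 7.5, 6, NA, 6, 8),
  stringsAsFactors = FALSE
)

# A row is a group header when its value column is empty
is_header <- is.na(raw$B)

# Keep the name on header rows, NA elsewhere, then carry it forward
name_col <- ifelse(is_header, raw$A, NA)
name_col <- na.locf(name_col, na.rm = FALSE)

# Drop the header rows and assemble the long-format result
result <- data.frame(
  Name  = name_col[!is_header],
  Date  = as.Date(raw$A[!is_header]),
  Hours = raw$B[!is_header],
  stringsAsFactors = FALSE
)
result
```

If the names and dates sit in the same column, as assumed here, the header test has to come from another column; if the names already had a column of their own, na.locf alone would be enough.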

Tags: r excel


【Solution 1】:

This may help:

library(dplyr)
library(tidyr)
#read the file using `readLines`
lines <- readLines('file.csv')
#remove the empty elements
lines1 <- lines[lines!='']
#create a grouping index based on the occurrence of non-numeric elements 
indx <- cumsum(grepl('^[A-Za-z]', lines1))
#create another index for finding the position of non-numeric element 
indx1 <- grep('^[A-Za-z]', lines1)
#split the lines based on the grouping index
lst <- setNames(split(lines1[-indx1], indx[-indx1]), lines1[indx1])
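#turn the named list into a two-column data frame and unnest it
#(recent tidyr versions expect a data frame in `unnest`, not a bare list),
#then split the `x` column into two
tibble::enframe(lst, name='Name', value='x') %>%
           unnest(x) %>%
           extract(x, c('Date', 'val'), '(.*),(.*)', convert=TRUE) %>%
           as.data.frame()
#   Name       Date val
#1 Alice 2015-01-01 8.0
#2 Alice 2015-01-02 7.5
#3 Alice 2015-01-03 6.0
#4   Bob 2015-01-02 6.0
#5   Bob 2015-01-03 8.0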
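
To see what the two indices hold, here is a small base R sketch on inline data. The vector `lines1` is typed in by hand to mirror the sample rather than read from file.csv, and the final assembly with strsplit is a swapped-in alternative to the tidyr step, not the answer's own code.

```r
# Hand-typed stand-in for the non-empty lines of the file (an assumption)
lines1 <- c("Alice", "2015-01-01,8", "2015-01-02,7.5", "2015-01-03,6",
            "Bob", "2015-01-02,6", "2015-01-03,8")

# Running count of header lines seen so far: the group each line belongs to
indx <- cumsum(grepl("^[A-Za-z]", lines1))
indx
# [1] 1 1 1 1 2 2 2

# Positions of the header lines themselves
indx1 <- grep("^[A-Za-z]", lines1)
indx1
# [1] 1 5

# Split the data lines on the comma and look up each line's header by group
parts <- do.call(rbind, strsplit(lines1[-indx1], ",", fixed = TRUE))
result <- data.frame(
  Name = lines1[indx1][indx[-indx1]],
  Date = as.Date(parts[, 1]),
  val  = as.numeric(parts[, 2]),
  stringsAsFactors = FALSE
)
result
```

The pattern `^[A-Za-z]` treats any line starting with a letter as a header, so a column-title line such as `Column A,Column B` would have to be dropped before this step.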

Or you can use base R:

#read the data using `read.csv` or `read.xlsx2`.  Here `,` is the delimiter
d1 <- read.csv('file.csv', header=FALSE, stringsAsFactors=FALSE)
#second column `V2` will have `NAs` for corresponding words in `V1`
indx <- is.na(d1$V2)
#subset the dataset by removing the `NA` rows 
d2 <- d1[!indx,]
#repeat each name once per row of its group (header row included),
#then drop the first element of each group, which belongs to the header row
d2$names <-  unlist(tapply(rep(d1$V1[indx], tabulate(cumsum(indx))), 
             cumsum(indx), FUN=tail,-1), use.names=FALSE)
d2
#         V1  V2 names
#2 2015-01-01 8.0 Alice
#3 2015-01-02 7.5 Alice
#4 2015-01-03 6.0 Alice
#6 2015-01-02 6.0   Bob
#7 2015-01-03 8.0   Bob
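
The name lookup in the base R version can also be written as a plain index, which may be easier to follow. This is a sketch under assumptions: `d1` is built inline here instead of being read from file.csv, and the first row is assumed to be a header row so that every data row has a group number of at least 1.

```r
# Hand-typed stand-in for read.csv('file.csv', header=FALSE, ...)
d1 <- data.frame(
  V1 = c("Alice", "2015-01-01", "2015-01-02", "2015-01-03",
         "Bob", "2015-01-02", "2015-01-03"),
  V2 = c(NA, 8, 7.5, 6, NA, 6, 8),
  stringsAsFactors = FALSE
)

# Header rows are the ones with no value in V2
indx <- is.na(d1$V2)

# Group number of every row: 1 1 1 1 2 2 2
grp <- cumsum(indx)

# Keep the data rows and pick each row's name by its group number
d2 <- d1[!indx, ]
d2$names <- d1$V1[indx][grp[!indx]]
d2
```

Here `d1$V1[indx]` is the vector of names in order of appearance, and `grp[!indx]` says which of those names each remaining row belongs to.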
