【Question title】:Merge two rows in data.frame
【Posted】:2017-07-25 09:20:09
【Question】:

My problem is similar to the ones in "Merge rows in one data.frame" and "Merge two rows in one dataframe, when the rows are disjoint and contain nulls", but those posts did not quite solve it for me.

My data looks like this:

| Date     | Checkin | Origin | Checkout | Destination |
| 03-07-17 | 08:00   | A      |          |             |
| 03-07-17 |         | A      | 09:00    | B           |
| 03-07-17 | 17:00   | B      |          |             |
| 03-07-17 |         | B      | 18:00    | A           |
| 04-07-17 | 08:00   | A      |          |             |
| 04-07-17 |         | A      | 09:00    | B           |
| 04-07-17 | 17:00   | B      |          |             |
| 04-07-17 |         | B      | 18:00    | A           |

Now I would like to aggregate it into 4 rows, like this:

| Date     | Checkin | Origin | Checkout | Destination |
| 03-07-17 | 08:00   | A      | 09:00    | B           |
| 03-07-17 | 17:00   | B      | 18:00    | A           |
| 04-07-17 | 08:00   | A      | 09:00    | B           |
| 04-07-17 | 17:00   | B      | 18:00    | A           |

Any ideas? Thanks!

【Comments】:

  • Maybe you could try my_data$Checkout

Tags: r


【Solution 1】:

One idea using dplyr:

library(dplyr)

df %>% 
 group_by(Date, Origin) %>% 
 summarise_all(funs(trimws(paste(., collapse = ''))))
# A tibble: 4 x 5
# Groups:   Date [?]
        Date   Origin Checkin Checkout Destination
       <chr>    <chr>   <chr>    <chr>       <chr>
1  03-07-17   A         08:00    09:00           B
2  03-07-17   B         17:00    18:00           A
3  04-07-17   A         08:00    09:00           B
4  04-07-17   B         17:00    18:00           A
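
Note that funs() has been deprecated since dplyr 0.8.0, and summarise_all() is superseded by across() in dplyr 1.0.0 and later. A minimal sketch of the same idea with the newer interface follows, assuming dplyr >= 1.0.0; the small df below is a shortened stand-in for the question's data, and trimming before grouping is my own choice so that the group keys are clean as well.

```r
library(dplyr)

# Shortened stand-in for the question's data (padding spaces kept on purpose).
df <- data.frame(
  Date        = c(" 03-07-17 ", " 03-07-17 ", " 03-07-17 ", " 03-07-17 "),
  Checkin     = c(" 08:00   ", "         ", " 17:00   ", "         "),
  Origin      = c(" A      ", " A      ", " B      ", " B      "),
  Checkout    = c("          ", " 09:00    ", "          ", " 18:00    "),
  Destination = c("             ", " B           ", "             ", " A           "),
  stringsAsFactors = FALSE
)

res <- df %>%
  mutate(across(everything(), trimws)) %>%   # strip the padding from every column first
  group_by(Date, Origin) %>%
  # collapse each non-grouping column to a single string per group
  summarise(across(everything(), ~ paste(.x, collapse = "")), .groups = "drop")

res
```

Because blank cells become empty strings after trimming, pasting them onto the one filled value per group leaves that value unchanged.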

Data

dput(df)
structure(list(Date = c(" 03-07-17 ", " 03-07-17 ", " 03-07-17 ", 
" 03-07-17 ", " 04-07-17 ", " 04-07-17 ", " 04-07-17 ", " 04-07-17 "
), Checkin = c(" 08:00   ", "         ", " 17:00   ", "         ", 
" 08:00   ", "         ", " 17:00   ", "         "), Origin = c(" A      ", 
" A      ", " B      ", " B      ", " A      ", " A      ", " B      ", 
" B      "), Checkout = c("          ", " 09:00    ", "          ", 
" 18:00    ", "          ", " 09:00    ", "          ", " 18:00    "
), Destination = c("             ", " B           ", "             ", 
" A           ", "             ", " B           ", "             ", 
" A           ")), .Names = c("Date", "Checkin", "Origin", "Checkout", 
"Destination"), row.names = c(NA, -8L), class = "data.frame")

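【Comments】:

  • @Sotos: Just trying to decipher the magic you conjured up: after group_by(Date, Origin), summarise_all() is applied to every non-grouping column, and it trims the leading and trailing whitespace ("left" and "right") after pasting together the values from all rows of that column. So paste takes the two rows (in this case) as input? Just want to confirm that I have this right.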
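
To make the behaviour asked about in that comment concrete, here is a small sketch of what happens to a single column within one (Date, Origin) group. The vector checkin is a hypothetical stand-in for the two padded Checkin values of the first group.

```r
# Two Checkin values from one (Date, Origin) group, padded as in the dput() data.
checkin <- c(" 08:00   ", "         ")

# paste() with collapse glues all elements of the vector into ONE string.
pasted <- paste(checkin, collapse = "")
pasted

# trimws() then removes leading and trailing whitespace (which = "both" is the default).
trimmed <- trimws(pasted)
trimmed
```

This only gives a clean result because at most one row per group holds a non-blank value in each column; if two rows both had a time, the two values would be concatenated.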
【Solution 2】:

If your data is structured exactly as above, and you are highly confident of that, you can use the following in base R.

cbind(dat[c(TRUE,FALSE), 1:3], dat[c(FALSE, TRUE), 4:5])
        Date   Checkin   Origin   Checkout   Destination
1  03-07-17   08:00     A        09:00      B           
3  03-07-17   17:00     B        18:00      A           
5  04-07-17   08:00     A        09:00      B           
7  04-07-17   17:00     B        18:00      A 

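The idea is to take the odd rows (1, 3, 5, 7) for columns 1 to 3, and then append the even rows (2, 4, 6, 8) for columns 4 and 5.

This will not work if any row is out of order or is missing its partner.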
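
Since the approach depends entirely on the row order, it may be worth checking that assumption before combining. The following is only a sketch: dat is a shortened, already-trimmed stand-in for the question's data, and the particular checks are my own suggestion, not part of the original answer.

```r
# Shortened stand-in data in the same layout as the question (whitespace already trimmed).
dat <- data.frame(
  Date        = c("03-07-17", "03-07-17", "03-07-17", "03-07-17"),
  Checkin     = c("08:00", "", "17:00", ""),
  Origin      = c("A", "A", "B", "B"),
  Checkout    = c("", "09:00", "", "18:00"),
  Destination = c("", "B", "", "A"),
  stringsAsFactors = FALSE
)

odd  <- dat[c(TRUE, FALSE), ]   # rows 1, 3, ...: expected check-in rows
even <- dat[c(FALSE, TRUE), ]   # rows 2, 4, ...: expected check-out rows

# Hypothetical sanity checks; stopifnot() raises an error at the first one that fails.
stopifnot(
  nrow(dat) %% 2 == 0,            # every row has a partner
  all(odd$Checkin != ""),         # odd rows carry the check-in time
  all(even$Checkout != ""),       # even rows carry the check-out time
  all(odd$Date == even$Date),     # each pair shares a date
  all(odd$Origin == even$Origin)  # each pair shares an origin
)

merged <- cbind(odd[, 1:3], even[, 4:5])
merged
```

If any check fails, the rows are not in the assumed alternating order and one of the other solutions is the safer choice.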

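【Comments】:

【Solution 3】:

This is more of a roundabout way, although it does not require dplyr. I was not sure what classes your columns have based on your example, so I pasted the table into Excel, saved it as a .csv, and worked with what that gave me. In any case, if you make sure the "empty" cells are actually empty (NA), then you can use complete.cases.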

setwd("path/to/your/working/directory")  # placeholder: set your own path
data = read.csv("exampledata.csv")

data$Date<-as.Date(data$Date,format='%m/%d/%Y')
data$Checkin<-as.character(data$Checkin)
data$Checkin[data$Checkin==""]<-NA

data$Checkout<-as.character(data$Checkout)
data$Checkout[data$Checkout==""]<-NA

checkIns<-data[complete.cases(data$Checkin),]
checkIns$Destination[checkIns$Destination==""]<-NA

checkOuts<-data[complete.cases(data$Checkout),]

data2<-merge(checkIns,checkOuts,by=c("Date","Origin"))
# drop the columns that are entirely NA after the merge
data2 <- data2[,colSums(is.na(data2))<nrow(data2)]
# merge() puts the key columns (Date, Origin) first, so name the columns in that order
colnames(data2)<-c("Date","Origin","Checkin","Checkout","Destination")

data2


This produced:

> data2
      Date Origin Checkin Checkout Destination
1 3/7/2017      A    8:00     9:00           B
2 3/7/2017      B   17:00    18:00           A
3 4/7/2017      A    8:00     9:00           B
4 4/7/2017      B   17:00    18:00           A
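
The "make sure empty cells are really NA" step can also be done while reading the file. Below is a more compact sketch of the same merge idea, assuming the data arrives as CSV text: csv_text is a made-up, shortened stand-in for exampledata.csv, and the names check_ins, check_outs and merged are mine. The na.strings and strip.white arguments are standard read.csv() options.

```r
# Hypothetical CSV content standing in for exampledata.csv (first day only).
csv_text <- "Date,Checkin,Origin,Checkout,Destination
03-07-17,08:00,A,,
03-07-17,,A,09:00,B
03-07-17,17:00,B,,
03-07-17,,B,18:00,A"

# Treat empty fields as NA while reading, and strip padding around values.
d <- read.csv(text = csv_text, na.strings = c("", "NA"),
              strip.white = TRUE, stringsAsFactors = FALSE)

# Keep only the columns each half actually carries, so no all-NA columns survive the merge.
check_ins  <- d[!is.na(d$Checkin),  c("Date", "Origin", "Checkin")]
check_outs <- d[!is.na(d$Checkout), c("Date", "Origin", "Checkout", "Destination")]

merged <- merge(check_ins, check_outs, by = c("Date", "Origin"))
merged
```

Selecting the relevant columns before merging avoids the .x/.y suffixes and the later renaming step. Like the original, it assumes each (Date, Origin) pair occurs exactly once as a check-in and once as a check-out.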
    

【Comments】:
