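[Posted]: 2019-12-09 18:24:45
[Question]:
I am trying to build a time series that shows the value of a particular column at a particular time. I currently only have access to a table that logs every change, together with the current value of each column, the date of the change, and the name of the column that was changed. I would like to create a new column that tracks what that column's value was before the change. The change log references more than 63 different columns in "Column_name".
This is what I currently have: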
________________________________________________________
Name | date     | A   | B   | C   | NEW | Column_name |
bob  | 12302019 | 2   | 23  | 153 | 2   | a           |
bob  | 12102019 | 2   | 23  | 153 | 362 | a           |
bob  | 10242019 | 2   | 23  | 153 | 7   | a           |
john | 10062017 | 684 | 452 | 1   | 254 | c           |
john | 11052018 | 684 | 452 | 1   | 1   | c           |
________________________________________________________
This is what I would like help creating:
____________________________________________________________________________________________
Name | date     | A   | B   | C   | NEW | Column_name | a_at_Date | b_at_Date | c_at_Date |
bob  | 12302019 | 2   | 23  | 153 | 2   | a           | 2         | 23        | 153       |
bob  | 12102019 | 2   | 23  | 153 | 362 | a           | 362       | 23        | 153       |
bob  | 10242019 | 2   | 23  | 153 | 7   | a           | 7         | 23        | 153       |
john | 10062017 | 684 | 452 | 1   | 254 | c           | 684       | 452       | 254       |
john | 11052018 | 684 | 452 | 1   | 1   | c           | 684       | 452       | 1         |
____________________________________________________________________________________________
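To make the target concrete, here is a minimal base R sketch of one way the extra columns in the desired table could be derived. It assumes that, for each row, a tracked column takes the value of NEW when Column_name names that column (ignoring case) and otherwise keeps its current value, which is what the sample rows show. The `tracked` vector and the `_at_date` suffix are illustrative names, not part of the original data, and the sketch does not try to roll back later changes to other columns.

```r
# Sample data copied from the first table (dates kept as character strings).
df <- data.frame(
  Name        = c("bob", "bob", "bob", "john", "john"),
  date        = c("12302019", "12102019", "10242019", "10062017", "11052018"),
  A           = c(2, 2, 2, 684, 684),
  B           = c(23, 23, 23, 452, 452),
  C           = c(153, 153, 153, 1, 1),
  NEW         = c(2, 362, 7, 254, 1),
  Column_name = c("a", "a", "a", "c", "c"),
  stringsAsFactors = FALSE
)

# Assumption: the columns referenced by the change log. The real table has
# more than 63 of these, so the vector would be longer in practice.
tracked <- c("A", "B", "C")

for (col in tracked) {
  # Rows whose log entry refers to this column (case-insensitive match).
  changed <- tolower(df$Column_name) == tolower(col)
  # Hypothetical naming scheme: lower-case name plus an "_at_date" suffix.
  new_name <- paste0(tolower(col), "_at_date")
  # Use the logged value where this column changed, else the current value.
  df[[new_name]] <- ifelse(changed, df$NEW, df[[col]])
}

print(df)
```

Looping over a vector of column names keeps the code the same size whether there are 3 tracked columns or 63. If the value columns are factors, as in the test data frame in this question, they would need converting first (for example with `as.character()`), because `ifelse()` drops the factor attributes and returns the underlying integer codes.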
I have tested the solution on the following test data frame, in which Column_name takes only one value ("A") and several of the columns are factors:
'data.frame': 755 obs. of 5 variables:
$ name : int 606765182 83595892 538663788 779873188 957405600 522796409 41212559 145402647 304688204 83595892 ...
$ date : POSIXct, format: "2019-11-01" "2019-11-01" "2019-10-21" ...
$ A : Factor
$ B : Factor
$ C : Factor
$ Column_name: Factor w/ 1
$ NEW : Factor w/ 8
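[Comments]:
-
Why are the column names upper case while the changed column is lower case? Is there a difference?
-
No, what really matters is adding a suffix to distinguish the computed columns from the original ones.
Tags: r dplyr tidyr data-munging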