【Posted】: 2018-08-01 05:20:03
【Question】:
I have a data frame:
my_data <- read.table(text =
"ID Date1 T1 Date2 Val1
A-1 '2018-01-10 15:05:24' A 2018-01-15 10
A-2 '2018-01-05 14:15:22' B 2018-01-14 12
A-3 '2018-01-04 13:20:21' A 2018-01-13 15
A-4 '2018-01-01 18:35:45' B 2018-01-12 22
A-5 '2017-12-28 19:45:10' A 2018-01-11 18
A-6 '2017-12-10 08:03:29' A 2018-01-10 21
A-7 '2017-12-06 20:55:55' A 2018-01-09 28
A-8 '2018-01-10 10:02:12' A 2018-01-15 10
A-9 '2018-01-05 17:15:14' B 2018-01-14 12
A-10 '2018-01-04 18:35:58' A 2018-01-13 15
A-11 '2018-01-01 21:09:25' B 2018-01-12 22
A-12 '2017-12-28 02:12:22' A 2018-01-11 18
A-13 '2017-12-10 03:45:44' A 2018-01-10 21
A-14 '2017-12-06 07:15:25' A 2018-01-09 28
A-18 '2017-10-07 08:02:84' B 2017-11-05 20
A-21 '2017-10-01 06:04:04' A 2017-10-20 15
A-51 '2017-09-20 08:07:06' A 2017-09-28 10
A-35 '2017-09-14 08:02:45' A 2017-09-25 20
A-30 '2017-08-10 15:03:08' A 2017-08-30 25",
header = TRUE, stringsAsFactors = FALSE)
When I run the code below, I get the output shown further down:
library(dplyr)      # mutate, filter, group_by, summarise, lag, %>%
library(lubridate)  # ymd, ymd_hms, month, year

table_2 <- merge(
  # Monthly summary for rows with T1 == "A"
  my_data %>%
    mutate(Date2 = ymd(Date2)) %>%
    arrange(Date2) %>%
    mutate(Month = paste(month(ymd_hms(Date1), label = TRUE), year(Date1), sep = "-")) %>%
    filter(T1 == "A") %>%
    group_by(Month) %>%
    summarise("# of A" = n(),
              "sum of A" = sum(Val1)) %>%
    # Month-over-month growth in percent: (current - previous) / previous * 100
    mutate("MOM Growth # of A" = round(apply(cbind(`# of A`, lag(- `# of A`)),
                                             1, sum, na.rm = TRUE) / lag(`# of A`) * 100, 2),
           "MOM Growth sum of A" = round(apply(cbind(`sum of A`, lag(- `sum of A`)),
                                               1, sum, na.rm = TRUE) / lag(`sum of A`) * 100, 2)) %>%
    select(Month, `# of A`, `MOM Growth # of A`, `sum of A`, `MOM Growth sum of A`),
  # Monthly summary for rows with T1 == "B"
  my_data %>%
    mutate(Date2 = ymd(Date2)) %>%
    arrange(Date2) %>%
    mutate(Month = paste(month(ymd_hms(Date1), label = TRUE), year(Date1), sep = "-")) %>%
    filter(T1 == "B") %>%
    group_by(Month) %>%
    summarise("# of B" = n(),
              "sum of B" = sum(Val1)) %>%
    # "* 100" belongs outside lag(), and ", 2" belongs inside round()
    mutate("MOM Growth # of B" = round(apply(cbind(`# of B`, lag(- `# of B`)),
                                             1, sum, na.rm = TRUE) / lag(`# of B`) * 100, 2),
           "MOM Growth sum of B" = round(apply(cbind(`sum of B`, lag(- `sum of B`)),
                                               1, sum, na.rm = TRUE) / lag(`sum of B`) * 100, 2)) %>%
    select(Month, `# of B`, `MOM Growth # of B`, `sum of B`, `MOM Growth sum of B`),
  by = "Month",
  all = TRUE)

# Show missing values as empty strings (this turns the numeric columns into character)
table_2[is.na(table_2)] <- ""
Output (table_2):
Now I want to add two columns, Median of A and Avg Time of A, under the Status of A header, and likewise Median of B and Avg Time of B under the Status of B header, and then convert the output to htmlTable format.
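For the grouped headers, one possible approach is the `cgroup` / `n.cgroup` arguments of `htmlTable()`. The following is only a sketch: the `demo` data frame and its numbers are made up for illustration and are not the real output of the code above.

```r
library(htmlTable)

# Hypothetical slice of the final table (all numbers are invented for illustration)
demo <- data.frame(
  Month           = c("Dec-2017", "Jan-2018"),
  `Median of A`   = c(18, 13),
  `Avg Time of A` = c(13.5, 6.75),
  `Median of B`   = c(0, 12),
  `Avg Time of B` = c(0, 9),
  check.names = FALSE  # keep the spaces in the column names
)

# cgroup adds a header row spanning columns; n.cgroup says how many columns each spans.
# The empty first group covers the Month column.
html_out <- htmlTable(
  demo,
  rnames   = FALSE,
  cgroup   = c("", "Status of A", "Status of B"),
  n.cgroup = c(1, 2, 2)
)
```

Printing `html_out` in an interactive session should render the table in the viewer; the object itself holds the HTML markup.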
I am just wondering how to adjust the code inside summarise to compute these values for each month's data.
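A minimal sketch of what the extended `summarise` could look like for type A. The question does not define "Avg Time", so this assumes it means the average number of days from Date1 to Date2; `sample_data`, `days_elapsed` and `monthly_a` are hypothetical names, and the sample values are made up.

```r
library(dplyr)
library(lubridate)

# Small illustrative sample in the same shape as my_data (values are invented)
sample_data <- data.frame(
  ID    = c("X-1", "X-2", "X-3", "X-4"),
  Date1 = c("2018-01-10 12:00:00", "2018-01-04 00:00:00",
            "2017-12-28 12:00:00", "2018-01-05 00:00:00"),
  T1    = c("A", "A", "A", "B"),
  Date2 = c("2018-01-15", "2018-01-13", "2018-01-11", "2018-01-14"),
  Val1  = c(10, 16, 18, 12),
  stringsAsFactors = FALSE
)

monthly_a <- sample_data %>%
  mutate(Date1 = ymd_hms(Date1),
         Date2 = ymd(Date2, tz = "UTC"),
         # Keep Month as a real date (first day of the month) so it sorts chronologically
         Month = as_date(floor_date(Date1, "month")),
         # ASSUMPTION: "time" means days elapsed between Date1 and Date2
         days_elapsed = as.numeric(difftime(Date2, Date1, units = "days"))) %>%
  filter(T1 == "A") %>%
  group_by(Month) %>%
  summarise(`# of A`        = n(),
            `sum of A`      = sum(Val1),
            `Median of A`   = median(Val1),
            `Avg Time of A` = round(mean(days_elapsed), 2))
```

The same pipeline with `filter(T1 == "B")` and renamed columns would give the B side; the two results can then be merged by Month as in the original code.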
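Also, the months should appear in chronological order in the output, and if any month is missing between the max month and the min month, that month should have 0 for every value under Status of A and Status of B except MOM Growth, because that value should be greater than -100%.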
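One way to get chronological order and fill the gaps is to keep Month as a date and use `tidyr::complete()` over a monthly sequence. This is a sketch under assumptions: `monthly`, `n_A`, `sum_A` and `prev_n` are hypothetical names, the numbers are made up, and leaving the growth as NA after a zero month is just one possible choice for handling the division by zero.

```r
library(dplyr)
library(tidyr)

# Hypothetical monthly summary with a gap (Nov 2017 is missing); values are invented
monthly <- data.frame(
  Month = as.Date(c("2017-10-01", "2017-12-01", "2018-01-01")),
  n_A   = c(2, 4, 3),
  sum_A = c(30, 80, 40)
)

filled <- monthly %>%
  # Add one row per month between the first and last month; new rows get 0
  complete(Month = seq(min(Month), max(Month), by = "month"),
           fill = list(n_A = 0, sum_A = 0)) %>%
  arrange(Month) %>%
  mutate(prev_n = lag(n_A),
         # Growth is undefined for the first month and after a month with 0 rows
         mom_growth_n = if_else(is.na(prev_n) | prev_n == 0,
                                NA_real_,
                                round((n_A - prev_n) / prev_n * 100, 2)),
         # Build the display label last, so sorting uses the real date
         Month_label = format(Month, "%b-%Y"))
```

Because the sorting happens on the Date column, the label (for example "Jan-2018") can be created at the very end without affecting the order.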
【Comments】:
Tags: r dataframe ggplot2 html-table dplyr