[Posted]: 2020-02-05 18:17:43
[Question]:
I have the following toy dataset:
library(data.table)

# Dates are quoted so that "1/29/2020 6:51:19 AM" is read as a single field
dt <- as.data.table(read.table(text = "
Date Model Color Value Samples
'1/29/2020 6:51:19 AM' Gold Blue 0.5 500
'1/29/2020 7:57:47 AM' Gold Red 0.0 449
'1/29/2020 3:39:04 PM' Silver Blue 0.75 1320
'1/29/2020 5:04:32 PM' Silver Blue 1.5 103
'1/29/2020 10:32:39 AM' Gold Red 0.7 891
'1/30/2020 1:02:12 AM' Gold Blue 0.41 18103
'1/30/2020 4:30:00 AM' Copper Blue 0.83 564
'1/30/2020 9:09:45 AM' Silver Pink 1.17 173
'1/30/2020 2:19:30 PM' Platinum Brown 0.43 793
'1/30/2020 4:43:32 PM' Platinum Red 0.71 1763
'1/30/2020 7:19:00 PM' Gold Orange 1.92 503",
header = TRUE, stringsAsFactors = FALSE))
I then take this data.table and generate some percentile data, as follows:
qs <- dt[Value > 0,
         .(Samples = sum(Samples),
           '50th' = quantile(Value, probs = 0.50),
           '75th' = quantile(Value, probs = 0.75),
           '90th' = quantile(Value, probs = 0.90),
           '99th' = quantile(Value, probs = 0.99)),
         by = .(Model, Color)]
setkey(qs, 'Model')
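As a quick sanity check on what a single percentile cell holds, here is a minimal sketch using the two Silver/Blue values from the toy data. It relies on the default (type 7) linear interpolation of `quantile()`; the variable names are only illustrative.

```r
# Two Silver/Blue values taken from the toy data above
silver_blue <- c(0.75, 1.5)

# Default type 7: linear interpolation between the sorted values
q <- quantile(silver_blue, probs = c(0.50, 0.99))
print(q)
# 50% -> 1.125, 99% -> 1.4925
```

With only two observations in a group, every percentile is just a point on the line between them, so the higher percentiles approach the larger value.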
Finally, I write the results to a .csv file:
#outputs to csv file
write.csv(qs, file = "outfile.csv")
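A side note on the output step: `write.csv` adds a column of row names by default. If that is unwanted, data.table's `fwrite` might be an alternative. The sketch below uses a hypothetical two-row stand-in for `qs` and a temporary file rather than `outfile.csv`.

```r
library(data.table)

# Hypothetical stand-in for the qs table built above
qs_demo <- data.table(Model   = c("Gold", "Silver"),
                      Color   = c("Blue", "Blue"),
                      Samples = c(500, 1423))

# Temporary path, used only for this sketch
out <- tempfile(fileext = ".csv")

# fwrite does not write a row-name column by default
fwrite(qs_demo, out)

# Inspect the header line that was written
header_line <- readLines(out)[1]
print(header_line)
```

Passing `row.names = FALSE` to `write.csv` should have the same effect on the row-name column.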
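Question: how would I write the results so that:
a) the results are broken down by date (i.e. only the date part, e.g. 1/30/2020 and 1/31/2020, without the time)
b) the dates are written as rows
For example (note: the values below are toy data, not real calculations; I just want to show how the "Date" column should look):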
#       Model  Color Samples  50th   99th  99.9th  99.99th       Date
# 1:   Copper   Blue     564 0.830 0.8300 0.83000 0.830000 01/29/2020
# 2:     Gold   Blue   18603 0.455 0.4991 0.49991 0.499991 01/29/2020
# 3:     Gold    Red     891 0.700 0.7000 0.70000 0.700000 01/29/2020
# 4:     Gold Orange     503 1.920 1.9200 1.92000 1.920000 01/29/2020
# 5: Platinum  Brown     793 0.430 0.4300 0.43000 0.430000 01/29/2020
# 6: Platinum    Red    1763 0.710 0.7100 0.71000 0.710000 01/29/2020
# 7:   Silver   Blue    1423 1.125 1.4925 1.49925 1.499925 01/29/2020
# 8:   Silver   Pink     173 1.170 1.1700 1.17000 1.170000 01/29/2020
# 9:   Copper   Blue     564 0.830 0.8300 0.83000 0.830000 01/30/2020
#10:     Gold   Blue   18603 0.455 0.4991 0.49991 0.499991 01/30/2020
#11:     Gold    Red     891 0.700 0.7000 0.70000 0.700000 01/30/2020
#12:     Gold Orange     503 1.920 1.9200 1.92000 1.920000 01/30/2020
#13: Platinum  Brown     793 0.430 0.4300 0.43000 0.430000 01/30/2020
#14: Platinum    Red    1763 0.710 0.7100 0.71000 0.710000 01/30/2020
#15:   Silver   Blue    1423 1.125 1.4925 1.49925 1.499925 01/30/2020
#16:   Silver   Pink     173 1.170 1.1700 1.17000 1.170000 01/30/2020
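One way the per-date breakdown could be sketched is to derive a date-only column and add it to `by`. This is a sketch under assumptions, not a confirmed solution: the `Day` helper column and the `demo` table are hypothetical, only the 50th and 99th percentiles are computed, and the data is a five-row subset of the toy dataset.

```r
library(data.table)

# Five-row subset of the toy data, built directly to keep the sketch short
demo <- data.table(
  Date    = c("1/29/2020 6:51:19 AM", "1/29/2020 7:57:47 AM",
              "1/29/2020 3:39:04 PM", "1/29/2020 5:04:32 PM",
              "1/30/2020 1:02:12 AM"),
  Model   = c("Gold", "Gold", "Silver", "Silver", "Gold"),
  Color   = c("Blue", "Red", "Blue", "Blue", "Blue"),
  Value   = c(0.5, 0.0, 0.75, 1.5, 0.41),
  Samples = c(500, 449, 1320, 103, 18103)
)

# Keep only the date part; text after the matched format is ignored
demo[, Day := as.Date(Date, format = "%m/%d/%Y")]

# Same aggregation as before, with the day added as a grouping column
qs <- demo[Value > 0,
           .(Samples = sum(Samples),
             `50th` = quantile(Value, probs = 0.50),
             `99th` = quantile(Value, probs = 0.99)),
           by = .(Model, Color, Day)]

# Sort on the real Date class, then store it as text in the last column
setorder(qs, Day, Model, Color)
qs[, Date := format(Day, "%m/%d/%Y")]
qs[, Day := NULL]
print(qs)
```

Each (Model, Color) pair then appears once per day on which it has positive values, and the resulting table can be passed to `write.csv` as before.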
Thanks!
[Comments]:
Tags: r dataframe data.table