【Posted】: 2017-06-22 15:53:26
【Question】:
I want to compute a diagonal product for each group of dates.
Main dataset:
date Bucket D
1/31/2013 bkt 0 NA
1/31/2013 bkt 1(10-20) NA
1/31/2013 bkt 2(20-30) NA
1/31/2013 bkt 3(30-40) NA
1/31/2013 bkt 4(40+) NA
2/28/2013 bkt 0 NA
2/28/2013 bkt 1(10-20) 3.00
2/28/2013 bkt 2(20-30) 3.63
2/28/2013 bkt 3(30-40) 101
2/28/2013 bkt 4(40+) 102
3/30/2013 bkt 0 NA
3/30/2013 bkt 1(10-20) 0.55
3/30/2013 bkt 2(20-30) 0.40
3/30/2013 bkt 3(30-40) 103
3/30/2013 bkt 4(40+) 104
4/31/2013 bkt 0 NA
4/31/2013 bkt 1(10-20) 4.25
4/31/2013 bkt 2(20-30) 3.65
4/31/2013 bkt 3(30-40) 105
4/31/2013 bkt 4(40+) 106
5/30/2013 bkt 0 NA
5/30/2013 bkt 1(10-20) 2.34
5/30/2013 bkt 2(20-30) 4.10
5/30/2013 bkt 3(30-40) 107
5/30/2013 bkt 4(40+) 108
6/31/2013 bkt 0 NA
6/31/2013 bkt 1(10-20) 4
6/31/2013 bkt 2(20-30) 5
6/31/2013 bkt 3(30-40) 109
6/31/2013 bkt 4(40+) 110
7/30/2013 bkt 0 NA
7/30/2013 bkt 1(10-20) 8
7/30/2013 bkt 2(20-30) 7
7/30/2013 bkt 3(30-40) 111
7/30/2013 bkt 4(40+) 112
Diagonal multiplication is as follows:
1/31/2013 to 5/30/2013
2/28/2013 to 6/31/2013
3/30/2013 to 7/30/2013
Each time, the window moves forward by one date group to form the next diagonal product,
and so on. The dates range from 1/31/2013 to 12/31/2016.
Expected output:
date Bucket D DP
1/31/2013 bkt 0 NA
1/31/2013 bkt 1(10-20) NA
1/31/2013 bkt 2(20-30) NA
1/31/2013 bkt 3(30-40) NA
1/31/2013 bkt 4(40+) NA
2/28/2013 bkt 0 NA
2/28/2013 bkt 1(10-20) 3.00
2/28/2013 bkt 2(20-30) 3.63
2/28/2013 bkt 3(30-40) 101
2/28/2013 bkt 4(40+) 102
3/30/2013 bkt 0 NA
3/30/2013 bkt 1(10-20) 0.55
3/30/2013 bkt 2(20-30) 0.40
3/30/2013 bkt 3(30-40) 103
3/30/2013 bkt 4(40+) 104
4/31/2013 bkt 0 NA
4/31/2013 bkt 1(10-20) 4.25
4/31/2013 bkt 2(20-30) 3.65
4/31/2013 bkt 3(30-40) 105
4/31/2013 bkt 4(40+) 106
5/30/2013 bkt 0 NA
5/30/2013 bkt 1(10-20) 2.34 13608 (108 * 105 * 0.40 * 3.00)
5/30/2013 bkt 2(20-30) 4.10 4536 (108 * 105 * 0.40)
5/30/2013 bkt 3(30-40) 107 11340 (108 * 105)
5/30/2013 bkt 4(40+) 108 108 (108)
6/31/2013 bkt 0 NA
6/31/2013 bkt 1(10-20) 4 23628.275 (110 * 107 * 3.65 * 0.55)
6/31/2013 bkt 2(20-30) 5 42960.5 (110 * 107 * 3.65)
6/31/2013 bkt 3(30-40) 109 11770 (110 * 107)
6/31/2013 bkt 4(40+) 110 110 (110)
7/30/2013 bkt 0 NA
7/30/2013 bkt 1(10-20) 8 212724.4 (112 * 109 * 4.10 * 4.25)
7/30/2013 bkt 2(20-30) 7 50052.8 (112 * 109 * 4.10)
7/30/2013 bkt 3(30-40) 111 12208 (112 * 109)
7/30/2013 bkt 4(40+) 112 112 (112)
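The output only needs these columns: date, Bucket, D, and DP, where DP is the result of the multiplication. Anything in () is there only to explain how the result was obtained; it does not need to appear in the column.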
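As a rough illustration of the rule described above, here is a base R sketch that computes DP for the sample values. The names `dates`, `mat`, and `dp` are made up for this illustration, bkt 0 is left out because it is always NA, and the dates are kept as plain text labels. It is a sketch under those assumptions, not a tested solution for the full 2013-2016 data.

```r
# Sample values from the question: rows are buckets 1-4, columns are dates.
# Dates stay as text labels here (assumption made for this sketch).
dates <- c("1/31/2013", "2/28/2013", "3/30/2013", "4/31/2013",
           "5/30/2013", "6/31/2013", "7/30/2013")
mat <- rbind(
  bkt1 = c(NA, 3.00, 0.55, 4.25, 2.34, 4, 8),
  bkt2 = c(NA, 3.63, 0.40, 3.65, 4.10, 5, 7),
  bkt3 = c(NA, 101, 103, 105, 107, 109, 111),
  bkt4 = c(NA, 102, 104, 106, 108, 110, 112)
)
colnames(mat) <- dates

n  <- nrow(mat)
dp <- matrix(NA_real_, n, ncol(mat), dimnames = dimnames(mat))

# For each end column j, take the diagonal that ends at (n, j) and keep a
# running product from the last bucket upward.
for (j in seq_len(ncol(mat))) {
  cols <- j - (n - seq_len(n))        # column of each row on this diagonal
  if (any(cols < 1)) next             # diagonal would start before the first date
  diag_vals <- mat[cbind(seq_len(n), cols)]
  if (anyNA(diag_vals)) next          # incomplete diagonals stay NA
  dp[, j] <- rev(cumprod(rev(diag_vals)))
}

dp[, "5/30/2013"]
```

With the sample data, the 5/30/2013 column comes out as 13608, 4536, 11340, 108 for bkt 1 to bkt 4, which is what the expected output lists.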
The code I tried, which gives the wrong result:
> d <- read.csv("lossrate.csv", header=TRUE)
> d$date = as.Date(d$date, format="%m/%d/%Y")
> r <- reshape2::dcast(data=d, Bucket ~ date, value.var="D")[-1, -2]
Aggregation function missing: defaulting to length
> mat <- as.matrix(r[-1])
> myD <- col(mat) - row(mat)
> rg <- range(myD)
> out <- sapply(seq(rg[1], rg[2]), function(x)
+ `length<-`(rev(cumprod(rev(mat[myD==x]))), nrow(mat)))[,1:ncol(mat)]
> out[, colSums(is.na(out)) > 0] <- NA
> colnames(out) <- colnames(mat) # add dates as headers
> out <- reshape2::melt(cbind(r[1], out))
Using Bucket as id variables
> out <- merge(d, out, by.x=c("date", "Bucket"), by.y=c("variable", "Bucket"), all=TRUE)
Output:
date Bucket D value
1 2013-01-31 bkt 0 NA NA
2 2013-01-31 bkt 1(10-20) NA NA
3 2013-01-31 bkt 2(20-30) NA NA
4 2013-01-31 bkt 3(30-40) NA NA
5 2013-01-31 bkt 4(40+) NA NA
6 2013-02-28 bkt 0 NA NA
7 2013-02-28 bkt 1(10-20) 3.00 NA
8 2013-02-28 bkt 2(20-30) 3.63 NA
9 2013-02-28 bkt 3(30-40) 101.00 NA
10 2013-02-28 bkt 4(40+) 102.00 NA
11 2013-03-30 bkt 0 NA NA
12 2013-03-30 bkt 1(10-20) 0.55 NA
13 2013-03-30 bkt 2(20-30) 0.40 NA
14 2013-03-30 bkt 3(30-40) 103.00 NA
15 2013-03-30 bkt 4(40+) 104.00 NA
16 2013-05-30 bkt 0 NA NA
17 2013-05-30 bkt 1(10-20) 2.34 NA
18 2013-05-30 bkt 2(20-30) 4.10 NA
19 2013-05-30 bkt 3(30-40) 107.00 NA
20 2013-05-30 bkt 4(40+) 108.00 NA
21 2013-07-30 bkt 0 NA NA
22 2013-07-30 bkt 1(10-20) 8.00 1
23 2013-07-30 bkt 2(20-30) 7.00 1
24 2013-07-30 bkt 3(30-40) 111.00 1
25 2013-07-30 bkt 4(40+) 112.00 1
26 <NA> bkt 0 NA NA
27 <NA> bkt 0 NA NA
28 <NA> bkt 1(10-20) 4.25 2
29 <NA> bkt 1(10-20) 4.00 2
30 <NA> bkt 2(20-30) 5.00 2
31 <NA> bkt 2(20-30) 3.65 2
32 <NA> bkt 3(30-40) 109.00 2
33 <NA> bkt 3(30-40) 105.00 2
34 <NA> bkt 4(40+) 106.00 2
35 <NA> bkt 4(40+) 110.00 2
I only changed the name of the CSV file to match my dataset.
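One likely contributor to the `<NA>` dates in the output above: the sample contains 4/31/2013 and 6/31/2013, which are not calendar dates, and `as.Date` turns such strings into `NA` instead of raising an error. Both groups then share the same `NA` date, so `dcast` finds two rows per Bucket for that column, which would explain the "defaulting to length" message and the counts 1 and 2 in `value`. The short check below illustrates this; the variable names and the factor-based workaround are only assumptions for illustration.

```r
# Calendar-invalid strings parse to NA rather than raising an error.
x <- c("3/30/2013", "4/31/2013", "6/31/2013")
parsed <- as.Date(x, format = "%m/%d/%Y")
is.na(parsed)            # FALSE TRUE TRUE

# A hypothetical workaround for sample data like this: keep the date as an
# ordered label, so every group stays distinct.
date_label <- factor(x, levels = unique(x))
as.integer(date_label)   # 1 2 3
```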
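【Comments】:
-
April doesn't have 31 days ;P (neither does June)
-
It's just a sample dataset :P Good catch anyway :D Hope you can follow the logic behind it ;)
-
I tried it and it doesn't work. Please be specific: can you show me the code you tried?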
-
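You asked this question before...
What is the diagonal product of a 7x5 matrix? As I commented before, your matrix should be symmetric... MxM, not MxN
-
@ChiPak; it doesn't need to be symmetric. Try
mat = matrix(1:15, nc=5) ; col(mat) - row(mat)
The second matrix shows the diagonals.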
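To see what `col(mat) - row(mat)` gives for the non-square matrix in the comment above, here is a quick check in R. The name `idx` is just for illustration.

```r
mat <- matrix(1:15, ncol = 5)   # 3 rows, 5 columns
idx <- col(mat) - row(mat)      # constant along each diagonal
idx
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    0    1    2    3    4
# [2,]   -1    0    1    2    3
# [3,]   -2   -1    0    1    2

# Picking one value of idx selects one diagonal, square matrix or not.
mat[idx == 1]   # 4 8 12
```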
Tags: r dataframe data.table dplyr