【Posted】: 2020-07-08 03:22:10
【Question】:
data1=data.frame(Year=c(2010,2010,2010,2011,2011,2011,2010,2010,2010,2011,2011,2011),
Group=c(1,1,1,1,1,1,2,2,2,2,2,2),
Class=c('A','B','C','A','B','C','A','B','C','A','B','C'),
A=c(0.73,0.55,0.54,0.49,0.52,0.49,0.26,0.55,0.39,0.34,0.84,0.29),
B=c(0.12,0.08,0.14,0.21,0.33,0.98,0.33,0.99,0.02,0.59,0.27,0.72),
C=c(0.43,0.51,0.29,0.6,0.28,0.97,0.78,0.84,0.34,0.82,0.75,0.97))
##>data1
## Year Group Class A B C
## 1 2010 1 A 0.73 0.12 0.43
## 2 2010 1 B 0.55 0.08 0.51
## 3 2010 1 C 0.54 0.14 0.29
## 4 2011 1 A 0.49 0.21 0.60
## 5 2011 1 B 0.52 0.33 0.28
## 6 2011 1 C 0.49 0.98 0.97
## 7 2010 2 A 0.26 0.33 0.78
## 8 2010 2 B 0.55 0.99 0.84
## 9 2010 2 C 0.39 0.02 0.34
## 10 2011 2 A 0.34 0.59 0.82
## 11 2011 2 B 0.84 0.27 0.75
## 12 2011 2 C 0.29 0.72 0.97
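I have "data1" and want to create "data2". "data2" should have exactly the same dimensions as "data1", but with the following rules applied:
If Class = 'A', then column 'B' = (1-B)*0.05 and column 'C' = (1-C)*0.05; after updating columns 'B' and 'C', we compute column 'A' = 1-(B+C).
If Class = 'B', then column 'A' = (1-A)*0.05 and column 'C' = (1-C)*0.05; after updating columns 'A' and 'C', we compute column 'B' = 1-(A+C).
If Class = 'C', then column 'A' = (1-A)*0.05 and column 'B' = (1-B)*0.05; after updating columns 'A' and 'B', we compute column 'C' = 1-(A+B).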
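To make the rule concrete, here is the arithmetic for row 1 of data1 (Class = 'A', A = 0.73, B = 0.12, C = 0.43), written out in plain R. The variable names newA, newB and newC are chosen only for this illustration.

```r
# Row 1 of data1: Class = 'A', so B and C are rescaled first,
# then A is recomputed from the rescaled values.
B <- 0.12
C <- 0.43

newB <- (1 - B) * 0.05       # (1 - 0.12) * 0.05 = 0.044
newC <- (1 - C) * 0.05       # (1 - 0.43) * 0.05 = 0.0285
newA <- 1 - (newB + newC)    # 1 - 0.0725 = 0.9275

print(c(A = newA, B = newB, C = newC))
```

With this rule the three updated values in a row always sum to 1.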
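I am hoping for an efficient data.table solution, because I have a very large dataset with more than 3 distinct values of "Class".
Here is a slow solution that produces the desired updates: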
library(data.table)
setDT(data1)
data1[, newB := fifelse(Class == 'A', (1-B) * 0.05, NA_real_)]
data1[, newC := fifelse(Class == 'A', (1-C) * 0.05, NA_real_)]
data1[, newA := fifelse(Class == 'A', (1-(newB+newC)), NA_real_)]
data1[, newA := fifelse(Class == 'B', (1-A) * 0.05, newA)]
data1[, newC := fifelse(Class == 'B', (1-C) * 0.05, newC)]
data1[, newB := fifelse(Class == 'B', (1-(newA+newC)), newB)]
data1[, newA := fifelse(Class == 'C', (1-A) * 0.05, newA)]
data1[, newB := fifelse(Class == 'C', (1-B) * 0.05, newB)]
data1[, newC := fifelse(Class == 'C', (1-(newA+newB)), newC)]
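One possible way to generalize the steps above to more classes is sketched below. It is only a sketch and rests on an assumption that is not stated in the question: each value of Class has a value column with exactly the same name. It first rescales every value column with (1-x)*0.05 and then, per class, overwrites the matching column with 1 minus the row sum of the others. Unlike the code above, it writes the result to a copy named data2 instead of adding newA, newB and newC columns.

```r
library(data.table)

# Same data as in the question, rebuilt here so the block runs on its own.
data1 <- data.table(
  Year  = rep(rep(c(2010, 2011), each = 3), times = 2),
  Group = rep(c(1, 2), each = 6),
  Class = rep(c('A', 'B', 'C'), times = 4),
  A = c(0.73, 0.55, 0.54, 0.49, 0.52, 0.49, 0.26, 0.55, 0.39, 0.34, 0.84, 0.29),
  B = c(0.12, 0.08, 0.14, 0.21, 0.33, 0.98, 0.33, 0.99, 0.02, 0.59, 0.27, 0.72),
  C = c(0.43, 0.51, 0.29, 0.60, 0.28, 0.97, 0.78, 0.84, 0.34, 0.82, 0.75, 0.97)
)

# Assumption: one value column per class, named like the class itself.
cols <- c('A', 'B', 'C')

data2 <- copy(data1)  # leave data1 untouched

# Step 1: rescale every value column with (1 - x) * 0.05.
data2[, (cols) := lapply(.SD, function(x) (1 - x) * 0.05), .SDcols = cols]

# Step 2: for rows of class k, replace column k with
# 1 minus the sum of the other (already rescaled) columns.
for (k in cols) {
  others <- setdiff(cols, k)
  data2[Class == k, (k) := 1 - rowSums(.SD), .SDcols = others]
}

print(data2)
```

Because each row belongs to exactly one class, the loop iterations touch disjoint rows and do not interfere with one another. The loop runs once per class rather than once per class-column pair, so the number of passes over the data grows linearly with the number of classes.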
【Comments】:
- @akrun A perfect fit for a data.table solution.
Tags: r data.table