【Posted】: 2021-12-28 06:17:32
【Question】:
I am using the R programming language. Suppose I have the following data frame:
var_1 = rnorm(100,10,10)
var_2 = rnorm(100,10,10)
var_3 = rnorm(100,10,10)
d = data.frame(var_1, var_2, var_3)
head(d)
var_1 var_2 var_3
1 14.251923 14.877801 22.636207
2 7.325137 8.513718 21.021522
3 3.400001 -3.400397 11.274797
4 16.400597 8.623980 9.366115
5 7.065583 13.155570 17.891432
6 21.297912 4.341385 -11.337330
My question: for every element of every variable, I want to replace the element with the percentile it belongs to.
For example:
a = quantile(d$var_1, seq(0.05, 1, by = 0.05))
b = quantile(d$var_2, c(0.16, 0.23, 0.65, 0.71, 0.95))
c = quantile(d$var_3, c(0.15, 0.28, 0.7, 0.73, 0.87))
> a
5% 10% 15% 20% 25% 30% 35% 40% 45% 50% 55% 60% 65% 70% 75%
-0.8806901 0.3595086 1.1201300 3.0581928 5.0901641 7.0056228 7.6089831 8.9853805 9.9264540 10.2235212 11.5707533 13.2422940 15.1076889 16.5354881 17.9336020
80% 85% 90% 95% 100%
19.5312682 21.9264905 24.4511364 26.6820271 41.4419744
> b
16% 23% 65% 71% 95%
-2.795294 1.430715 11.070815 12.688064 25.270823
> c
15% 28% 70% 73% 87%
0.958404 5.767591 15.258532 16.013648 20.467892
For example:
- If d$var_2 < -2.795294, then d$var_2 = 16th percentile
- If d$var_3 is between (5.767591, 15.258532), then d$var_3 = 70th percentile
I could write many "if statements" by hand, but is there a faster way?
Thanks!
【Discussion】:
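One possible approach, offered as a sketch and not taken from the original thread: compute the cutoffs with `quantile()` and let `cut()` assign each value to the bin whose upper edge is the matching percentile, which avoids writing the "if statements" by hand. `to_percentile` is a hypothetical helper name. Appending the 100% cutoff (so that values above the largest requested percentile still get a label) is an assumption about the desired behavior, and the sketch also assumes the cutoffs are all distinct, since `cut()` rejects duplicated breaks.

```r
# Hypothetical helper (not from the original post): replace each value with the
# label of the smallest chosen percentile whose cutoff is >= that value.
to_percentile <- function(x, probs) {
  # Assumption: add the 100% cutoff so the largest values are not left as NA.
  if (max(probs) < 1) probs <- c(probs, 1)
  q <- quantile(x, sort(probs))
  # Right-closed bins: (-Inf, q1], (q1, q2], ...; each bin is labeled "16%", "23%", ...
  cut(x, breaks = c(-Inf, q), labels = names(q))
}

set.seed(1)  # only to make the sketch reproducible
d <- data.frame(var_1 = rnorm(100, 10, 10),
                var_2 = rnorm(100, 10, 10),
                var_3 = rnorm(100, 10, 10))

# One set of percentile levels per column, mirroring the question's a, b and c.
probs <- list(
  var_1 = seq(5, 100, by = 5) / 100,
  var_2 = c(0.16, 0.23, 0.65, 0.71, 0.95),
  var_3 = c(0.15, 0.28, 0.70, 0.73, 0.87)
)

# Apply the helper column by column; the result has the same shape as d.
d_pct <- as.data.frame(Map(to_percentile, d, probs[names(d)]))
head(d_pct)
```

Each column of `d_pct` is a factor of percentile labels. If numeric values are preferred, something like `as.numeric(sub("%", "", as.character(d_pct$var_2)))` should convert the labels back to numbers.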
Tags: r data-manipulation