【发布时间】:2020-12-16 19:06:52
【问题描述】:
有谁知道如何根据 ID 随机删除和添加行?这是一个可重现的示例:
> y <- rnorm(20)
> x <- rnorm(20)
> z <- rep(1:5, 4)
> w <- rep(1:4, each=5)
> data.frame(id=z,cluster=w,x=x,y=y) #this is a balanced dataset
id cluster x y
1 1 1 0.30003855 0.65325768
2 2 1 -1.00563626 -0.12270866
3 3 1 0.01925927 -0.41367651
4 4 1 -1.07742065 -2.64314895
5 5 1 0.71270333 -0.09294102
6 1 2 1.08477509 0.43028470
7 2 2 -2.22498770 0.53539884
8 3 2 1.23569346 -0.55527835
9 4 2 -1.24104450 1.77950291
10 5 2 0.45476927 0.28642442
11 1 3 0.65990264 0.12631586
12 2 3 -0.19988983 1.27226678
13 3 3 -0.64511396 -0.71846622
14 4 3 0.16532102 -0.45033862
15 5 3 0.43881870 2.39745248
16 1 4 0.88330282 0.01112919
17 2 4 -2.05233698 1.63356842
18 3 4 -1.63637927 -1.43850664
19 4 4 1.43040234 -0.19051680
20 5 4 1.04662885 0.37842390
根据ID随机添加和删除一些数据后,数据集如下所示,观察总数应与上述一致:
id cluster x y
1 1 1 0.895 -0.659
2 2 1 -0.160 -0.366
3 1 2 -0.528 -0.294
4 2 2 -0.919 0.362
5 3 2 -0.901 -0.467
6 1 3 0.275 0.134
7 2 3 0.423 0.534
8 3 3 0.929 -0.953
9 4 3 1.67 0.668
10 5 3 0.286 0.0872
11 1 4 -0.373 -0.109
12 2 4 0.289 0.299
13 3 4 -1.43 -0.677
14 4 4 -0.884 1.70
15 5 4 1.12 0.386
16 1 5 -0.723 0.247
17 2 5 0.463 -2.59
18 3 5 0.234 0.893
19 4 5 -0.313 -1.96
20 5 5 0.848 -0.0613
【问题讨论】:
标签: data-manipulation data-generation