【Posted】: 2020-02-12 08:12:35
【Problem description】:
I want to one-hot encode the 5 most frequent values of the "Code" column in the following data frame:
ID | Code
1 | A
2 | A
3 | A
4 | F
5 | F
6 | C
7 | C
8 | E
9 | E
10 | D
10 | D
11 | B
12 | G
13 | H
The result should look like this:
ID | A | F | C | E | D |
1 | 1 | 0 | 0 | 0 | 0 |
2 | 1 | 0 | 0 | 0 | 0 |
3 | 1 | 0 | 0 | 0 | 0 |
4 | 0 | 1 | 0 | 0 | 0 |
5 | 0 | 1 | 0 | 0 | 0 |
6 | 0 | 0 | 1 | 0 | 0 |
7 | 0 | 0 | 1 | 0 | 0 |
8 | 0 | 0 | 0 | 1 | 0 |
9 | 0 | 0 | 0 | 1 | 0 |
10 | 0 | 0 | 0 | 0 | 1 |
11 | 0 | 0 | 0 | 0 | 0 |
12 | 0 | 0 | 0 | 0 | 0 |
13 | 0 | 0 | 0 | 0 | 0 |
How can I do this in R using the top_n and dcast functions? Or with dplyr?
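For reference, here is a minimal sketch of one way this might be done. It assumes recent versions of dplyr and tidyr are installed and uses `count()` and `pivot_wider()` rather than `top_n` and `dcast`. The names `df`, `top_codes`, and `encoded` are illustrative only, and the sample data is typed in from the table above.

```r
library(dplyr)
library(tidyr)

# Sample data from the question (ID 10 appears twice, as in the table above)
df <- data.frame(
  ID   = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 11, 12, 13),
  Code = c("A", "A", "A", "F", "F", "C", "C", "E", "E", "D", "D", "B", "G", "H"),
  stringsAsFactors = FALSE
)

# Count each code, sort by frequency, and keep the 5 most frequent.
# Assumption: ties at the cutoff are resolved by whatever order count() returns.
top_codes <- df %>%
  count(Code, sort = TRUE) %>%
  head(5) %>%
  pull(Code)

# Reorder the selected codes by first appearance in the data,
# so the columns come out in the order shown in the desired result
top_codes <- unique(df$Code[df$Code %in% top_codes])

# One row per ID, one 0/1 column per code, then keep only the top codes
encoded <- df %>%
  distinct(ID, Code) %>%
  mutate(value = 1L) %>%
  pivot_wider(names_from = Code, values_from = value, values_fill = 0L) %>%
  select(ID, all_of(top_codes))

print(encoded)
```

Pivoting all codes first and selecting afterwards keeps IDs 11 to 13 in the output with all zeros; filtering to the top codes before pivoting would drop those rows.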
【Discussion】: