【Posted】:2022-01-16 12:20:22
【Question】:
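I am cleaning some survey data that apparently allowed respondents to select more than one race category. I would like to know how to recode those responses as "Multiracial" for analysis.
So far I have been hand-coding this rather laboriously, without success. Here is my attempt to use recode() to convert each multi-entry response into a number, which could then be recoded with case_when():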
rawdat$race <- recode(rawdat$race, "White, non-Hispanic,Asian" = 1,
  "White, non-Hispanic,American Indian or Alaska Native" = 2,
  "White, non-Hispanic,Black or African American,Asian" = 3,
  "Black or African American,American Indian or Alaska Native" = 4,
  "White, non-Hispanic,Hispanic" = 5,
  "Asian,Native Hawaiian or Pacific Islander" = 6,
  "White, non-Hispanic,Black or African American" = 7,
  "Black or African American,American Indian or Alaska Native,Asian,Hispanic" = 8,
  "White, non-Hispanic,Black or African American,American Indian or Alaska Native,Asian,Native Hawaiian or Pacific Islander,Hispanic" = 9,
  "Black or African American,Hispanic" = 10,
  "Black or African American,Asian" = 11,
  "White, non-Hispanic,Native Hawaiian or Pacific Islander" = 12,
  # NOTE: the next entry has no replacement value (no `= <number>`),
  # so this call cannot work as written.
  "White, non-Hispanic,Black or African American,American Indian or Alaska Native,Asian,Hispanic",
  "American Indian or Alaska Native,Hispanic" = 13)
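This approach has many problems (I only tried it because I thought it might work as a brute-force short-term fix; it did not). I would much prefer to initialize a vector containing every possible value presented to respondents for this question, and then recode any cell that contains more than one of those values as "Multiracial". As far as I can tell, though, recode() will not accept such a vector as an argument. Any ideas on how to accomplish the latter approach?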
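For reference, the vector-based idea described above could be sketched roughly as follows, in base R. This is only a sketch under stated assumptions: the six option labels in race_levels are inferred from the strings in the recode() call, the column is assumed to be character rather than factor, and the raw responses are assumed not to contain ";". The helper count_selected() and the column race_recoded are hypothetical names, not part of dplyr. One wrinkle it has to handle is that "White, non-Hispanic" itself contains a comma, so a plain split on "," would miscount.

```r
# Options presented to respondents (assumed; inferred from the question).
race_levels <- c(
  "White, non-Hispanic",
  "Black or African American",
  "American Indian or Alaska Native",
  "Asian",
  "Native Hawaiian or Pacific Islander",
  "Hispanic"
)

# Hypothetical helper: count how many known options each response contains.
count_selected <- function(x, levels) {
  # Swap commas inside option labels for ";" so that they are not
  # mistaken for the separator between selected options.
  protected <- gsub(",", ";", levels, fixed = TRUE)
  for (i in seq_along(levels)) {
    x <- gsub(levels[i], protected[i], x, fixed = TRUE)
  }
  pieces <- strsplit(x, ",", fixed = TRUE)
  # Count the pieces that exactly match a known option.
  vapply(pieces, function(p) sum(p %in% protected), integer(1))
}

# Small made-up example standing in for the real survey data.
rawdat <- data.frame(
  race = c("Asian",
           "White, non-Hispanic",
           "White, non-Hispanic,Asian",
           "Black or African American,Hispanic",
           NA),
  stringsAsFactors = FALSE
)

n_selected <- count_selected(rawdat$race, race_levels)

# Responses with more than one option become "Multiracial";
# single selections and missing values are left unchanged.
rawdat$race_recoded <- ifelse(n_selected > 1, "Multiracial", rawdat$race)
```

The result is written to a new column so the original responses stay available for checking. The same count could also be used inside dplyr::mutate() with case_when() if that fits the rest of the pipeline better.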
【Discussion】:
Tags: r dplyr data-wrangling recode