【Posted】: 2020-12-03 05:15:29
【Question】:
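OK, so this one is a bit long. I have several huge data frames that I am trying to make wider and eventually merge. I want to merge and group by year and county.
I have several columns containing factors that I am trying to spread. Essentially, I want to take the factor levels x, y, z and turn them into columns x, y, and z, where each new column holds the number of rows in the group with that level. There is an example below. In addition, I have several numeric columns that I want to sum by group.
I have tried to provide an example and some reproducible code. Hopefully this is enough, but if there is anything I can do to make things easier or clearer, please let me know. Thank you very much for your help!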
YR<-as.factor( c(2019,2018,2019,2019,2018,2018,2019,2019,2018))
STATE<-as.factor( c("CA","MA","KY","KY","CA","MA","KY","KY","CA"))
COUNTY<-as.factor( c("C1","M1","K1","K2","C1","M2","K1","K2","C1"))
CANCER<-as.factor(c("Cervical","Lung","Prostate","Breast","Cervical","Breast","Prostate","Prostate","Lung"))
rand_fact<-as.factor(c("rf1","rf2","rf3","fr4","fr5","rf2","rf3","fr4","fr5"))
rand_num<-as.numeric(c(4,3,5,7,3,5,3,24,9))
rand_chr<-as.character(c("a","d","r","e","g","y","r","e","k"))
TEST_DR<-data.frame(YR,STATE,COUNTY,CANCER,rand_fact,rand_num,rand_chr)
rm(YR,STATE,COUNTY,CANCER,rand_chr,rand_num,rand_fact)
> print(TEST_DR)
    YR STATE COUNTY   CANCER rand_fact rand_num rand_chr
1 2018    CA     C1 Cervical       fr5        3        g
2 2018    CA     C1     Lung       fr5        9        k
3 2018    MA     M1     Lung       rf2        3        d
4 2018    MA     M2   Breast       rf2        5        y
5 2019    CA     C1 Cervical       rf1        4        a
6 2019    KY     K1 Prostate       rf3        5        r
7 2019    KY     K1 Prostate       rf3        3        r
8 2019    KY     K2   Breast       fr4        7        e
9 2019    KY     K2 Prostate       fr4       24        e
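# Ideally the output will look like the data frame below, with rows grouped by YR and then COUNTY.
library(dplyr) # provides arrange()
TEST_DR<-arrange(.data = TEST_DR,YR,COUNTY) # the printout above shows TEST_DR after this sort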
YR<-as.factor( c(2018,2018,2018,2019,2019,2019))
STATE<-as.factor( c("CA","MA","MA","CA","KY","KY"))
COUNTY<-as.factor( c("C1","M1","M2","C1","K1","K2"))
Cervical<-as.numeric(c(1,0,0,1,0,0))
Lung <-as.numeric(c(1,1,0,0,0,0))
Prostate<-as.numeric(c(0,0,0,0,2,1))
Breast<-as.numeric(c(0,0,1,0,0,1))
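rand_num<-as.numeric(c(12,3,5,4,8,31)) # rand_num summed within each YR/COUNTY group, as in the printout below
TEST_DR2 <-data.frame(YR,STATE,COUNTY,Cervical,Lung,Prostate,Breast,rand_num)
rm(YR,STATE,COUNTY,Cervical,Lung,Prostate,Breast,rand_num)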
> print(TEST_DR2)
    YR STATE COUNTY Cervical Lung Prostate Breast rand_num
1 2018    CA     C1        1    1        0      0       12
2 2018    MA     M1        0    1        0      0        3
3 2018    MA     M2        0    0        0      1        5
4 2019    CA     C1        1    0        0      0        4
5 2019    KY     K1        0    0        2      0        8
6 2019    KY     K2        0    0        1      1       31
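For reference, the reshaping described above (count each CANCER level per YR/STATE/COUNTY group, spread those counts into columns, and sum rand_num per group) could be sketched roughly as follows. This is only one possible approach, not necessarily the best one for very large data frames. It assumes the dplyr and tidyr packages (both version 1.0.0 or later) are installed. The intermediate names cancer_counts, num_sums, and result are made up for illustration, and rand_fact and rand_chr are left out of the sample data because they do not appear in the desired output.

```r
# Assumption: dplyr (>= 1.0.0) and tidyr (>= 1.0.0) are installed.
library(dplyr)
library(tidyr)

# Same sample data as in the question (rand_fact and rand_chr omitted).
TEST_DR <- data.frame(
  YR       = factor(c(2019, 2018, 2019, 2019, 2018, 2018, 2019, 2019, 2018)),
  STATE    = factor(c("CA", "MA", "KY", "KY", "CA", "MA", "KY", "KY", "CA")),
  COUNTY   = factor(c("C1", "M1", "K1", "K2", "C1", "M2", "K1", "K2", "C1")),
  CANCER   = factor(c("Cervical", "Lung", "Prostate", "Breast", "Cervical",
                      "Breast", "Prostate", "Prostate", "Lung")),
  rand_num = c(4, 3, 5, 7, 3, 5, 3, 24, 9)
)

# Step 1: count rows per group and CANCER level, then spread the counts
# into one column per level; combinations that never occur become 0.
cancer_counts <- TEST_DR %>%
  count(YR, STATE, COUNTY, CANCER) %>%
  pivot_wider(names_from = CANCER, values_from = n,
              values_fill = list(n = 0L))

# Step 2: sum the numeric column within each group.
num_sums <- TEST_DR %>%
  group_by(YR, STATE, COUNTY) %>%
  summarise(rand_num = sum(rand_num), .groups = "drop")

# Step 3: join the two summaries and order rows by YR, then COUNTY.
result <- cancer_counts %>%
  left_join(num_sums, by = c("YR", "STATE", "COUNTY")) %>%
  arrange(YR, COUNTY)

print(result)
```

The counting and the summing are kept as two separate summaries and joined afterwards, so additional numeric columns could be added to the summarise() call without touching the pivot step. The order of the spread columns may differ from the desired printout; select() can be used to reorder them if that matters.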
【Comments】: