【Posted】:2020-04-15 20:27:41
【Problem description】:
I have the data frame shown below.
Data
structure(list(associated_gene = c(NA, NA, "A4GALT", NA, NA,
"NOT FOUND"), chr_name = c("22", "22", "22", "22", "22", "NOT FOUND"
), chrom_start = c(42693910L, 42693843L, 42693321L, 42693665L,
42693653L, 0L), allele = c("G/A/T", "T/C", "G/C", "C/T", "G/A/T",
"NOT FOUND"), refsnp_id = c("rs778598915", "rs11541159", "rs397514502",
"rs762949801", "rs776304817", "NOT FOUND")), row.names = c("s3a",
"s3b", "s3c", "s3d", "s3e", "s3f"), class = "data.frame")
    associated_gene  chr_name chrom_start    allele   refsnp_id
s3a            <NA>        22    42693910     G/A/T rs778598915
s3b            <NA>        22    42693843       T/C  rs11541159
s3c          A4GALT        22    42693321       G/C rs397514502
s3d            <NA>        22    42693665       C/T rs762949801
s3e            <NA>        22    42693653     G/A/T rs776304817
s3f       NOT FOUND NOT FOUND           0 NOT FOUND   NOT FOUND
I want to split the allele column at the first "/" into two parts (Ref and Var) and insert them between $chrom_start and $refsnp_id.
The desired output is:
    associated_gene chr_name chrom_start Ref Var   refsnp_id
s3a            <NA>       22    42693910   G A/T rs778598915
s3b            <NA>       22    42693843   T   C  rs11541159
I don't know whether I can call awk from R, but in bash I would do:
cat alleles | awk -F"/" '{print $1 "\t" $2}'
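Note that an awk command splitting on every "/" and printing only $1 and $2 would turn G/A/T into G and A, dropping the T, which does not match the desired output. As a rough illustration of the transformation described above, here is a minimal base R sketch. It assumes the data frame is named `df` (a hypothetical name, the question does not give one) and that rows without a "/" (such as "NOT FOUND") should keep their value in Ref and get NA in Var; that last choice is an assumption, not something the question specifies.

```r
# Rebuild the example data frame from the question (the name `df` is assumed).
df <- data.frame(
  associated_gene = c(NA, NA, "A4GALT", NA, NA, "NOT FOUND"),
  chr_name = c("22", "22", "22", "22", "22", "NOT FOUND"),
  chrom_start = c(42693910L, 42693843L, 42693321L, 42693665L, 42693653L, 0L),
  allele = c("G/A/T", "T/C", "G/C", "C/T", "G/A/T", "NOT FOUND"),
  refsnp_id = c("rs778598915", "rs11541159", "rs397514502",
                "rs762949801", "rs776304817", "NOT FOUND"),
  row.names = c("s3a", "s3b", "s3c", "s3d", "s3e", "s3f"),
  stringsAsFactors = FALSE
)

# Ref: drop the first "/" and everything after it.
# Values without a "/" are returned unchanged.
ref <- sub("/.*$", "", df$allele)

# Var: drop everything up to and including the first "/".
# Assumption: rows without a "/" get NA.
has_slash <- grepl("/", df$allele, fixed = TRUE)
var <- ifelse(has_slash, sub("^[^/]*/", "", df$allele), NA_character_)

# Put Ref and Var where the allele column used to be.
pos <- match("allele", names(df))
result <- data.frame(
  df[seq_len(pos - 1)],   # columns before allele
  Ref = ref,
  Var = var,
  df[-seq_len(pos)],      # columns after allele
  stringsAsFactors = FALSE
)

print(result)
```

If tidyr is available, `tidyr::separate(df, allele, into = c("Ref", "Var"), sep = "/", extra = "merge", fill = "right")` should give a similar result in one call, since `extra = "merge"` keeps everything after the first separator together.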
【Discussion】:
Tags: r string dataframe dplyr stringr