【Posted】: 2015-09-05 10:39:34
【Question】:
I have a series of strings like the following:
x <- " 20 to 80% of the sward should be between 3 and 10cm tall,
with 20 to 80% of the sward between 10 and 30cm tall"
I want to extract the numeric values and keep their units. I tried the following:
x <- lapply(x, function(x){gsub("[^\\d |cm\\b |mm\\b |% ]", "", x, perl = T)})
This gives:
" 20 80% 3 10cm 20 80% 10 30cm "
What I need is:
"20 80%" "3 10cm" "20 80%" "10 30cm"
Thanks for reading.
【Discussion】:
- Is there always an "and" or a "to" between the two values of a range?
- Try:
library(stringr);do.call(rbind,lapply(str_extract_all(x, '\\d+(\\s+|cm\\b|%)'), function(x) {m1 <- matrix(x, ncol=2, byrow=TRUE); paste(m1[,1], m1[,2])}))
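
A note on why the attempt in the question returns a single string: inside a character class, `|` is a literal character and `\b` means backspace rather than a word boundary, so the `gsub` call only deletes individual characters and never groups the values into ranges. The following is a minimal base R sketch of one alternative, assuming that every range is written as a number, then "to" or "and", then a number with a unit (cm, mm or %). The names `pattern`, `ranges` and `result` are illustrative and do not come from the original post.

```r
x <- " 20 to 80% of the sward should be between 3 and 10cm tall,
with 20 to 80% of the sward between 10 and 30cm tall"

# One match per range: <number> (to|and) <number><unit>.
# Assumption: the unit is always cm, mm or %.
pattern <- "(\\d+)\\s+(?:to|and)\\s+(\\d+\\s*(?:cm\\b|mm\\b|%))"

# Pull out each full range, e.g. "20 to 80%" or "3 and 10cm".
ranges <- regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]

# Drop the connecting word, keeping the two numbers and the unit.
result <- sub(pattern, "\\1 \\2", ranges, perl = TRUE)
print(result)
```

Because each range is matched as a unit, the result is a character vector with one element per range, and no extra whitespace has to be trimmed afterwards. If some ranges use a different connector or unit, the alternations in `pattern` would need to be extended.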