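Updated: pattern matching based on the OP's new pattern shown in the comments. Here we use str_extract_all to extract either one or more digits that follow an opening parenthesis (a regex lookbehind, (?<=[(]) followed by \\d+) or (|) any single character that is not a parenthesis ([^()])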
library(stringr)
str_extract_all(str1, "(?<=[(])\\d+|[^()]")
[[1]]
[1] "2" "10" "1" "12"
[[2]]
[1] "2" "0" "6" "9"
[[3]]
[1] "2" "15"
[[4]]
[1] "2" "1" "3" "1"
- Testing on the OP's additional pattern
str_extract_all(str2, "(?<=[(])\\d+|[^()]")
[[1]]
[1] "2" "10" "1" "12"
[[2]]
[1] "2" "0" "6" "9"
[[3]]
[1] "2" "15"
[[4]]
[1] "2" "1" "3" "1"
[[5]]
[1] "10" "0" "2" "0" "1"
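If adding a package dependency is not desirable, the same pattern should also work in base R. The following is a minimal sketch, not part of the original answer: it assumes the same input as str2, and the names pat, res and res_int are only illustrative. perl = TRUE is needed because the pattern contains a lookbehind.

```r
# Same values as str2 in the Data section, repeated so the block runs on its own
str2 <- c("2(10)1(12)", "2069", "2(15)", "2131", "(10)0201")

# Digits right after "(" are taken as one token; any other
# non-parenthesis character is taken one at a time
pat <- "(?<=[(])\\d+|[^()]"

# gregexpr() finds every match, regmatches() pulls out the matched substrings
res <- regmatches(str2, gregexpr(pat, str2, perl = TRUE))

# Optional: convert the character tokens to integers
res_int <- lapply(res, as.integer)
res_int[[5]]
```

The result is a list with one vector per input string, in the same shape as the str_extract_all output above.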
- Earlier solution (based on the assumption that all numbers greater than 9 are enclosed in parentheses)
We can split on the parentheses in base R
unlist(strsplit(str1[1], "\\(|\\)"))
[1] "2" "10" "1" "12"
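If both cases occur (strings with and without parentheses), one option is to get the index of the elements that contain parentheses and handle the two groups separately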
i1 <- grepl("\\(|\\)", str1)
lst1 <- vector('list', length(str1))
lst1[i1] <- strsplit(str1[i1], "\\(|\\)")
lst1[!i1] <- strsplit(str1[!i1], "")
unlist(lst1)
[1] "2" "10" "1" "12" "2" "0" "6" "9" "2" "15" "2" "1" "3" "1"
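Another option is to use ifelse with grepl to create a single delimiter, and then use strsplit
# Note: the whitespace argument of trimws() needs R >= 3.6.0
lst1 <- strsplit(trimws(ifelse(grepl("\\(|\\)", str1),
      gsub("\\(|\\)", ",", str1),
      gsub("(?<=.)(?=.)", ",", str1, perl = TRUE)),
    whitespace = ","), ",")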
lst1
[[1]]
[1] "2" "10" "1" "12"
[[2]]
[1] "2" "0" "6" "9"
[[3]]
[1] "2" "15"
[[4]]
[1] "2" "1" "3" "1"
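The earlier approaches depend on the stated assumption about parentheses. As a quick check (a sketch, with x and early as illustrative names), splitting the OP's extra pattern on parentheses shows why the lookbehind version was needed:

```r
# The OP's extra pattern (the last element of str2)
x <- "(10)0201"

# Splitting on "(" or ")" as in the earlier solution
early <- strsplit(x, "\\(|\\)")[[1]]
early
# A leading "" appears because the string starts with a delimiter,
# and "0201" is kept as a single token instead of separate digits
```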
Data
str1 <- c("2(10)1(12)", "2069", "2(15)", "2131")
str2 <- c(str1, "(10)0201")