Extracting a number after a specific string: answers

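【Question title】: extract number after specific string
【Posted】: 2016-03-11 18:13:07
【Question】:

I need to find the number that follows the string "Count of". There may be a space or a symbol between "Count of" and the number. I have a pattern that works on www.regex101.com but not with the stringr str_extract function.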

library(stringr)

shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2", "monkey coconut 3oz count of 5", "monkey coconut count of 50", "chicken Count Of-10")
str_extract(shopping_list, "count of ([\\d]+)")
[1] NA NA NA NA "count of 5" "count of 50" NA

What I would like to get:

[1] NA NA NA NA "5" "50" "10"

【Comments】:

Tags: regex r stringr

【Solution 1】:

str_extract(shopping_list, "(?i)(?<=count of\\D)\\d+")
# [1] NA   NA   NA   NA   "5"  "50" "10"

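Here (?i) makes the pattern case-insensitive, \\D matches a single non-digit character, and (?<=...) is a positive lookbehind, so only the digits after "count of" plus one non-digit are returned.

【Discussion】:

I was thinking of a lookbehind too, but it fails as soon as the data varies slightly. Try "coconut count of - 5".
@PierreLafortune, yes, but in this case I know there can be only a single symbol between the "f" and the number.
This is the answer that gives me the values I need using a regular expression alone. Thanks!
@Julius, is it possible to make \\D match any number of non-digit, non-letter characters?
@MatthewCrews, unfortunately, as far as I know, a positive lookbehind only allows text of a predefined length, so no (not if we want to stay in the spirit of this answer). That is what Pierre Lafortune was getting at.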
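
A possible way around the lookbehind length limit discussed above is to drop the lookbehind and use a capture group instead. The following is only a sketch, not part of the original answer: it assumes stringr's str_match, which returns the full match in column 1 and the first capture group in column 2, and it adds the "coconut count of - 5" case from the comments to the test data.

```r
library(stringr)

shopping_list <- c("apples x4", "bag of flour",
                   "monkey coconut 3oz count of 5",
                   "monkey coconut count of 50",
                   "chicken Count Of-10",
                   "coconut count of - 5")

# (?i)    : case-insensitive
# \\D*    : any number of non-digits between "count of" and the number
# (\\d+)  : capture the digits themselves
m <- str_match(shopping_list, "(?i)count of\\D*(\\d+)")

# Column 2 holds the captured digits; rows without a match are NA
counts <- m[, 2]
print(counts)
# [1] NA   NA   "5"  "50" "10" "5"
```

Because the separator is matched outside the capture group, it can be of any length, which a lookbehind with an unbounded \\D* would not allow.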

【Solution 2】:

Lookahead and lookbehind assertions are what you are looking for in this kind of grep...

shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2", "monkey coconut 3oz count of 5", "monkey coconut count of 50", "chicken Count Of-10")
str_extract(shopping_list, "(?<=count of )[0-9]*")
[1] NA   NA   NA   NA   "5"  "50" NA
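
The last element comes back as NA above for two reasons: the pattern is case-sensitive, and the lookbehind accepts only a space before the number. As a hedged sketch (not the answerer's code), both can be relaxed with stringr's regex() helper and a character class; the choice of allowing only a space or a hyphen is an assumption made for illustration.

```r
library(stringr)

shopping_list <- c("bag of flour",
                   "monkey coconut 3oz count of 5",
                   "monkey coconut count of 50",
                   "chicken Count Of-10")

# ignore_case = TRUE lets "Count Of" match;
# [ -] accepts exactly one space or hyphen, so the lookbehind
# still has a fixed length
pattern <- regex("(?<=count of[ -])[0-9]+", ignore_case = TRUE)

counts <- str_extract(shopping_list, pattern)
print(counts)
# [1] NA   "5"  "50" "10"
```

This still handles only a single separator character, so an input such as "coconut count of - 5" would not match.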

【Discussion】:

【Solution 3】:

as.numeric(sub("(?i).*count of.*?(\\d+).*", "\\1", shopping_list))
[1] NA NA NA NA  5 50 10

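The regex pattern breaks down as:

(?i): ignore case
.*count of.*?: any characters up to and including "count of", then as few characters as possible before the digits
(\\d+): capture one or more digits
"\\1": return the capture group

The other answers so far will fail on something like "coconut count of - 5", because they allow only a single character between "count of" and the number.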
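
One detail of the sub() approach is that sub() returns the input unchanged when the pattern does not match, so as.numeric() then produces NA with a coercion warning for those elements. A minimal sketch of one way to make that explicit, assuming base R only; the grepl() guard and perl = TRUE are additions here, not part of the answer above:

```r
shopping_list <- c("apples x4", "milk x2",
                   "monkey coconut count of 50",
                   "chicken Count Of-10",
                   "coconut count of - 5")

pattern <- "(?i).*count of.*?(\\d+).*"

# Flag the elements that actually contain "count of" followed by digits
matched <- grepl(pattern, shopping_list, perl = TRUE)

# Replace the whole string with the captured digits where it matched,
# and use NA elsewhere so as.numeric() has nothing to warn about
digits <- ifelse(matched,
                 sub(pattern, "\\1", shopping_list, perl = TRUE),
                 NA)
counts <- as.numeric(digits)
print(counts)
# [1] NA NA 50 10  5
```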

【Discussion】: