【发布时间】:2014-05-08 07:18:06
【问题描述】:
我在使用正则表达式时收到“线程“主”java.lang.StackOverflowError 中的异常”:
(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)")
对于一个长字符串。实际上,我想根据 .csv 文件中的 ','(在 .csv 文件中的 "" 之外)拆分字符串。它对 450 列工作正常,但对更多列给出错误,如下所示---
Exception in thread "main" java.lang.StackOverflowError
at java.util.regex.Pattern$Curly.match0(Pattern.java:4148)
at java.util.regex.Pattern$Curly.match(Pattern.java:4132)
at java.util.regex.Pattern$GroupHead.match(Pattern.java:4556)
at java.util.regex.Pattern$Loop.match(Pattern.java:4683)
at java.util.regex.Pattern$GroupTail.match(Pattern.java:4615)
at java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3715)
at java.util.regex.Pattern$Curly.match0(Pattern.java:4170)
at java.util.regex.Pattern$Curly.match(Pattern.java:4132)
at java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3715)
at java.util.regex.Pattern$Curly.match0(Pattern.java:4170)
at java.util.regex.Pattern$Curly.match(Pattern.java:4132)
at java.util.regex.Pattern$GroupHead.match(Pattern.java:4556)
at java.util.regex.Pattern$Loop.match(Pattern.java:4683)
at java.util.regex.Pattern$GroupTail.match(Pattern.java:4615)
at java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3715)
at java.util.regex.Pattern$Curly.match0(Pattern.java:4170)
at java.util.regex.Pattern$Curly.match(Pattern.java:4132)
at java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3715)
at java.util.regex.Pattern$Curly.match0(Pattern.java:4170)
at java.util.regex.Pattern$Curly.match(Pattern.java:4132)
at java.util.regex.Pattern$GroupHead.match(Pattern.java:4556)
at java.util.regex.Pattern$Loop.match(Pattern.java:4683)
at java.util.regex.Pattern$GroupTail.match(Pattern.java:4615)
at java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3715)
at java.util.regex.Pattern$Curly.match0(Pattern.java:4170)
at java.util.regex.Pattern$Curly.match(Pattern.java:4132)
at java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3715)
at java.util.regex.Pattern$Curly.match0(Pattern.java:4170)
at java.util.regex.Pattern$Curly.match(Pattern.java:4132)
at java.util.regex.Pattern$GroupHead.match(Pattern.java:4556)
at java.util.regex.Pattern$Loop.match(Pattern.java:4683)
at java.util.regex.Pattern$GroupTail.match(Pattern.java:4615)
at java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3715)
at java.util.regex.Pattern$Curly.match0(Pattern.java:4170)
at java.util.regex.Pattern$Curly.match(Pattern.java:4132)
at java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3715)
at java.util.regex.Pattern$Curly.match0(Pattern.java:4170)
at java.util.regex.Pattern$Curly.match(Pattern.java:4132)
at java.util.regex.Pattern$GroupHead.match(Pattern.java:4556)
at java.util.regex.Pattern$Loop.match(Pattern.java:4683)
at java.util.regex.Pattern$GroupTail.match(Pattern.java:4615)
at java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3715)
at java.util.regex.Pattern$Curly.match0(Pattern.java:4170)
at java.util.regex.Pattern$Curly.match(Pattern.java:4132)
at java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3715)
at java.util.regex.Pattern$Curly.match0(Pattern.java:4170)
at java.util.regex.Pattern$Curly.match(Pattern.java:4132)
at java.util.regex.Pattern$GroupHead.match(Pattern.java:4556)
at java.util.regex.Pattern$Loop.match(Pattern.java:4683)
at java.util.regex.Pattern$GroupTail.match(Pattern.java:4615)
at java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3715)
at java.util.regex.Pattern$Curly.match0(Pattern.java:4170)
at java.util.regex.Pattern$Curly.match(Pattern.java:4132)
at java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3715)
at java.util.regex.Pattern$Curly.match0(Pattern.java:4170)
at java.util.regex.Pattern$Curly.match(Pattern.java:4132)
at java.util.regex.Pattern$GroupHead.match(Pattern.java:4556)
at java.util.regex.Pattern$Loop.match(Pattern.java:4683)
at java.util.regex.Pattern$GroupTail.match(Pattern.java:4615)
at java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3715)
at java.util.regex.Pattern$Curly.match0(Pattern.java:4170)
at java.util.regex.Pattern$Curly.match(Pattern.java:4132)
at java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3715)
at java.util.regex.Pattern$Curly.match0(Pattern.java:4170)
at java.util.regex.Pattern$Curly.match(Pattern.java:4132)
at java.util.regex.Pattern$GroupHead.match(Pattern.java:4556)
at java.util.regex.Pattern$Loop.match(Pattern.java:4683)
at java.util.regex.Pattern$GroupTail.match(Pattern.java:4615)
at java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3715)
at java.util.regex.Pattern$Curly.match0(Pattern.java:4170)
at java.util.regex.Pattern$Curly.match(Pattern.java:4132)
at java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3715)
at java.util.regex.Pattern$Curly.match0(Pattern.java:4170)
at java.util.regex.Pattern$Curly.match(Pattern.java:4132)
at java.util.regex.Pattern$GroupHead.match(Pattern.java:4556)
【问题讨论】:
-
您可能是灾难性回溯的受害者。请参阅here 了解匹配“双引号、\”转义\“字符串”的好方法
-
另外,既然这似乎是 CSV,为什么不使用 OpenCSV?
-
很遗憾没有一个好的 CSV 库可用(据我所知)。这个问题源于人们应该如何解释 CSV 文件。首先,您可以指定分隔符
','和引号字符'"'。引号字符围绕字段,例如"Hello, my", name, " is ", jared将创建以下字段:{"Hello, my", "name", " is ", "jared"}。您可以遍历行的字符以生成这些字段(逐行、逐字符)。当引号字符需要成为字段的一部分时,它就会成为一个问题!