The reason your list is not getting parsed is this expression:
element = word | obj | list
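Because you check for word before list (which is a really bad
variable name when working in Python, by the way, since it shadows the
built-in list), the leading "foo" of "foo,bar" gets processed as a word.
That happens because '|' is an eager operator: it takes the first
alternative that matches.
You can fix this by changing the order of the expressions in element: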
element = list | word | obj
Or by using '^' instead of '|'. '^' is a patient operator: it evaluates
all of the alternative expressions and selects the longest match.
element = word ^ obj ^ list
With either of these changes, your output now becomes:
word
list
word
list
obj
word
word
list
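To see the two operators side by side, here is a minimal sketch outside of your grammar. The names word_list, eager and longest are made up for this illustration; '|' builds a MatchFirst expression and '^' builds an Or expression.

```python
from pyparsing import Group, Word, alphas, delimitedList

word = Word(alphas)
# Group the list so it is easy to tell apart from a bare word in the results.
word_list = Group(delimitedList(word))

# '|' (MatchFirst): the first alternative that matches wins.
eager = word | word_list
# '^' (Or): every alternative is tried and the longest match wins.
longest = word ^ word_list

eager_result = eager.parseString("foo,bar").asList()
longest_result = longest.parseString("foo,bar").asList()

print(eager_result)    # ['foo']
print(longest_result)  # [['foo', 'bar']]
```

With '|', the parser is satisfied as soon as word matches "foo", so the rest of the input is never looked at by the list alternative.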
Why are all the lists matching? Because delimitedList will match a single item:
>>> wd = Word(alphas)
>>> wdlist = delimitedList(wd)
>>> print(wdlist.parseString('xyz'))
['xyz']
If you want to enforce that a list must contain more than one item, you
can add a condition parse action:
>>> wdlist.addCondition(lambda t: len(t)>1)
>>> print(wdlist.parseString('xyz'))
... raises exception ...
Also, delimitedLists do not automatically group their results:
>>> print((wd + wdlist).parseString('xyz abc,def'))
['xyz', 'abc', 'def']
If you want the list contents to be kept as a list in the results, wrap
the list expression in a Group:
>>> print((wd + Group(wdlist)).parseString('xyz abc,def'))
['xyz', ['abc', 'def']]
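The condition and the Group can be combined in one expression. The following is only a sketch as a standalone script; the names multi, two_items and single_rejected are invented for the example. Note that addCondition modifies the expression it is called on and returns it, so the sketch builds a fresh list expression rather than reusing wdlist.

```python
from pyparsing import Group, ParseException, Word, alphas, delimitedList

wd = Word(alphas)

# A grouped list that only matches when it holds more than one item.
multi = Group(delimitedList(wd).addCondition(lambda t: len(t) > 1))

# Two items: the condition passes and the list is kept as a nested list.
two_items = (wd + multi).parseString("xyz abc,def").asList()
print(two_items)

# One item: the condition fails, which surfaces as a ParseException.
try:
    multi.parseString("xyz")
    single_rejected = False
except ParseException as err:
    single_rejected = True
    print("rejected:", err)
```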
Here is my updated version of your process() method:
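import pyparsing
from pyparsing import Group, Literal, Word, alphas

# found_word, found_obj and found_list are the callbacks from your question.
def process(string):
    print(string)
    # word: lowercase letters, with a negative lookahead for 'OBJ'
    word = ~Literal('OBJ') + Word(alphas.lower())
    word.addParseAction(lambda s,l,t: found_word(s, l, t))
    word.setName("word")
    obj = Literal('OBJ') + Word(alphas.lower())
    obj.setName("obj")
    obj.addParseAction(lambda s,l,t: found_obj(s, l, t))
    item = word | obj
    # a list must hold more than one item, and is grouped in the results
    list = Group(pyparsing.delimitedList(item, delim=',')
                 .addCondition(lambda t: len(t)>1))
    list.setName("list")
    list.addParseAction(lambda s,l,t: found_list(s, l, t))
    # try list before word, so that "foo,bar" is not consumed as a word
    element = obj | list | word
    parser = pyparsing.OneOrMore(element)
    parser.searchString(string).pprint()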
Which gives this output:
foo bar OBJ baz foo,bar
word
word
word
word
obj
word
word
list
[['foo', 'bar', 'OBJ', 'baz', ['foo', 'bar']]]
You'll notice that I added setName() calls to each of your expressions.
That way I could add setDebug() to get pyparsing's debugging output. By adding:
word.setDebug()
obj.setDebug()
list.setDebug()
before running the parser, you get this debugging output. It may help explain
why you get the duplicated "word" lines in your sample output.
foo bar OBJ baz foo,bar
Match obj at loc 0(1,1)
Exception raised:Expected "OBJ", found 'f' (at char 0), (line:1, col:1)
Match list at loc 0(1,1)
Match word at loc 0(1,1)
word
Matched word -> ['foo']
Exception raised:failed user-defined condition, found 'f' (at char 0), (line:1, col:1)
Match word at loc 0(1,1)
word
Matched word -> ['foo']
Match obj at loc 3(1,4)
Exception raised:Expected "OBJ", found 'b' (at char 4), (line:1, col:5)
Match list at loc 3(1,4)
Match word at loc 4(1,5)
word
Matched word -> ['bar']
Exception raised:failed user-defined condition, found 'b' (at char 4), (line:1, col:5)
Match word at loc 3(1,4)
word
Matched word -> ['bar']
Match obj at loc 7(1,8)
obj
Matched obj -> ['OBJ', 'baz']
Match obj at loc 15(1,16)
Exception raised:Expected "OBJ", found 'f' (at char 16), (line:1, col:17)
Match list at loc 15(1,16)
Match word at loc 16(1,17)
word
Matched word -> ['foo']
Match word at loc 20(1,21)
word
Matched word -> ['bar']
list
Matched list -> [['foo', 'bar']]
Match obj at loc 23(1,24)
Exception raised:Expected "OBJ", found end of text (at char 23), (line:1, col:24)
Match list at loc 23(1,24)
Match word at loc 23(1,24)
Exception raised:Expected W:(abcd...), found end of text (at char 23), (line:1, col:24)
Match obj at loc 23(1,24)
Exception raised:Expected "OBJ", found end of text (at char 23), (line:1, col:24)
Exception raised:Expected {word | obj}, found end of text (at char 23), (line:1, col:24)
Match word at loc 23(1,24)
Exception raised:Expected W:(abcd...), found end of text (at char 23), (line:1, col:24)
Match obj at loc 23(1,24)
Exception raised:Expected "OBJ", found end of text (at char 23), (line:1, col:24)
Match list at loc 23(1,24)
Match word at loc 23(1,24)
Exception raised:Expected W:(abcd...), found end of text (at char 23), (line:1, col:24)
Match obj at loc 23(1,24)
Exception raised:Expected "OBJ", found end of text (at char 23), (line:1, col:24)
Exception raised:Expected {word | obj}, found end of text (at char 23), (line:1, col:24)
Match word at loc 23(1,24)
Exception raised:Expected W:(abcd...), found end of text (at char 23), (line:1, col:24)
[['foo', 'bar', 'OBJ', 'baz', ['foo', 'bar']]]
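The trace shows where the duplicated "word" lines come from: the list alternative is tried first, its inner word matches and runs the word parse action, and only then does the condition reject the one-item list, after which word is matched again on its own. Here is a stripped-down sketch of that effect; the calls list and the simplified grammar are assumptions made for the illustration, not part of your code.

```python
from pyparsing import Group, Word, alphas, delimitedList

calls = []  # records every time the word parse action fires

word = Word(alphas)
word.addParseAction(lambda t: calls.append(t[0]))

# A list needs at least two items, so it fails on a lone word,
# but only after its inner word has already matched.
word_list = Group(delimitedList(word).addCondition(lambda t: len(t) > 1))

element = word_list | word

result = element.parseString("foo").asList()
print(result)  # ['foo']
print(calls)   # ['foo', 'foo']
```

The parsed result contains "foo" once, but the parse action ran twice. Side effects such as print calls in parse actions are therefore not a reliable count of what ends up in the final results.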