【Posted】:2018-04-25 05:31:02
【Question】:
I have a list "r" like this:
[["", 1], ["this is a text line", 2], ["this is a text line", 3], ["this is a text line", 4], ["", 5], ["", 6], ["this is a text line", 7],["this is a text line", 8], ["this is a text line", 9], ["this is a text line", 10], ["", 11], ["this is a text line", 12], ["this is a text line", 13], ["this is a text line", 14], ["", 15], ["this is a text line", 16], ["this is a text line", 17], ["this is a text line", 18], ["", 19]]
To find out where my empty lines and my lines with text are, I filtered the list:
empty = [x[1] for x in r if regex.search(r"^\s*$", x[0])]
text = [x[1] for x in r if regex.search(r"\S", x[0])]
Output:
empty = [1, 5, 6, 11, 15, 19]
text = [2, 3, 4, 7, 8, 9, 10, 12, 13, 14, 16, 17, 18]
What I want to do is group the numbers in text whenever they are consecutive, i.e. text[i+1] - text[i] == 1 (in order to define paragraphs):
finaltext = [[2, 3, 4], [7, 8, 9, 10], [12, 13, 14], [16, 17, 18]]
finaltext_including_empty = [[2, 3, 4, 5, 6], [7, 8, 9, 10, 11], [12, 13, 14, 15], [16, 17, 18, 19]]
How can I group the elements of a list based on a condition?
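For reference, the grouping described above can be sketched in plain Python as follows. This is only one possible approach, not an accepted answer: the helper names `group_consecutive` and `attach_trailing_empty` are hypothetical, and the sketch assumes that `text` is sorted in ascending order and that each paragraph should absorb only the empty lines that directly follow it (so a leading empty line such as 1 is dropped).

```python
def group_consecutive(numbers):
    """Split a sorted list of ints into runs of consecutive values."""
    groups = []
    for n in numbers:
        # Extend the current run if n directly follows its last element.
        if groups and n == groups[-1][-1] + 1:
            groups[-1].append(n)
        else:
            groups.append([n])
    return groups


def attach_trailing_empty(groups, empty):
    """Extend each run with the empty-line numbers that directly follow it."""
    empty_set = set(empty)
    result = []
    for group in groups:
        extended = list(group)
        nxt = extended[-1] + 1
        # Keep absorbing empty lines until the next text line (or the end).
        while nxt in empty_set:
            extended.append(nxt)
            nxt += 1
        result.append(extended)
    return result


empty = [1, 5, 6, 11, 15, 19]
text = [2, 3, 4, 7, 8, 9, 10, 12, 13, 14, 16, 17, 18]

finaltext = group_consecutive(text)
finaltext_including_empty = attach_trailing_empty(finaltext, empty)
print(finaltext)
print(finaltext_including_empty)
```

With the sample data from the question, this should reproduce both desired lists. A similar result could probably be obtained with `itertools.groupby` keyed on the difference between each value and its index, but the explicit loop is easier to adapt if the grouping condition changes.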
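【Comments】:
- Do you need both finaltext and finaltext_including_empty, or only the last one? Why doesn't the first sublist of finaltext_including_empty start with 1?
- I need both lists. The first sublist does not start with 1 because I don't take empty lines at the beginning of the text into account.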
Tags: python regex python-3.x list