How can I split a string before the second occurrence of a substring?

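【Question title】: How can I split a string before the second occurrence of a substring?
【Posted】: 2017-10-13 23:53:33
【Question】:

Suppose I have the string NYKMIANYKCLE and I want to split it into a list containing just NYKMIA and NYKCLE (that is, split before the second occurrence of 'NYK'). Is there a way to do this in Python?

【Question comments】:

Tags: python python-3.x

【Solution 1】:

You can use re.findall to find every substring that starts with NYK and runs up to, but does not include, the next NYK or the end of the string: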

>>> s = 'NYKMIANYKCLE'
>>> import re
>>> re.findall(r'NYK.+?(?=NYK|$)', s)
['NYKMIA', 'NYKCLE']

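The ? after .+ makes the match non-greedy, so each match stops at the earliest possible point and the string is consumed one substring at a time. The lookahead (?=NYK|$) asserts that each match is immediately followed by the next NYK or by the end of the string ($), without consuming it.

More tests: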

>>> s = 'NYKMIANYKCLENYKjahsja'
>>> re.findall(r'NYK.+?(?=NYK|$)', s)
['NYKMIA', 'NYKCLE', 'NYKjahsja']
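
The same pattern can be built for an arbitrary marker rather than a hard-coded NYK. The following is a minimal sketch, not part of the original answer: the function name split_on_marker is hypothetical, re.escape is used on the assumption that the marker should be matched literally, and .*? replaces .+? so that a marker followed directly by another marker still yields its own piece. Any text before the first marker is dropped, as in the answer above.

```python
import re


def split_on_marker(text, marker):
    """Return the pieces of `text` that each start with `marker`.

    Hypothetical helper for illustration only.
    """
    # Escape the marker so characters such as '.' or '+' are taken literally.
    escaped = re.escape(marker)
    # Lazily match up to (not including) the next marker or the end of the text.
    # re.DOTALL lets '.' cross newlines; \Z anchors at the very end of the text.
    pattern = re.compile(escaped + r'.*?(?=' + escaped + r'|\Z)', re.DOTALL)
    return pattern.findall(text)


print(split_on_marker('NYKMIANYKCLE', 'NYK'))
print(split_on_marker('NYKMIANYKCLENYKjahsja', 'NYK'))
```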

【Comments】:

【Solution 2】:

You could try something like this:

string = 'NYKMIANYKCLE'
substring = 'NYK'

first_index = string.index(substring)
second_index = string.index(substring, first_index + len(substring))
print(string[:second_index], string[second_index:])
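
Note that str.index raises ValueError when the substring is missing, so the snippet above fails if the marker occurs fewer than two times. A possible variant, sketched here under the assumption that the string should come back unsplit in that case, uses str.find instead. The function name split_before_second is hypothetical.

```python
def split_before_second(text, marker):
    """Split `text` just before the second occurrence of `marker`.

    Hypothetical helper for illustration only. Returns [text] unchanged
    when `marker` occurs fewer than two times.
    """
    # str.find returns -1 instead of raising when the marker is absent.
    first = text.find(marker)
    if first == -1:
        return [text]
    # Start the second search just past the first occurrence.
    second = text.find(marker, first + len(marker))
    if second == -1:
        return [text]
    return [text[:second], text[second:]]


print(split_before_second('NYKMIANYKCLE', 'NYK'))
```

Unlike the re.findall approach, this splits only once: any third or later occurrence of the marker stays in the second piece.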

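【Comments】:

【Solution 3】:

Since the question is about splitting, this can also be done with the newer regex module, which allows splitting on zero-width matches: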

import regex
s='NYKMIANYKCLE'
print(regex.split('(?V1)(?=NYK)',s))

Output

['', 'NYKMIA', 'NYKCLE']

Update

To avoid splitting at the start of the string:

print(regex.split('(?V1)(?!^)(?=NYK)',s))

Output

['NYKMIA', 'NYKCLE']

Explanation

(?V1)      # enables the regex module's version 1 behaviour, which allows splitting on zero-width matches
(?!^)      # do not split at the very start of the string
(?=NYK)    # split at any position that is followed by NYK
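
On Python 3.7 or later, the standard re module can also split on a pattern that matches the empty string, so the same idea may work without the third-party regex module. The sketch below assumes such a Python version. The function name split_before_each is hypothetical and not part of the original answer.

```python
import re


def split_before_each(text, marker):
    """Split `text` before every occurrence of `marker` except one at index 0.

    Hypothetical helper for illustration only; assumes Python 3.7+, where
    re.split accepts patterns that can match an empty string.
    """
    # (?!^) rejects the start of the string; (?=...) looks ahead for the
    # marker without consuming it, so no characters are lost at the split.
    pattern = r'(?!^)(?=' + re.escape(marker) + r')'
    return re.split(pattern, text)


print(split_before_each('NYKMIANYKCLE', 'NYK'))
```

Because both parts of the pattern are zero-width, every character of the input ends up in exactly one of the returned pieces.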

【Comments】: