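I would not return 0, since it could be a valid starting index; use -1, None, or some other impossible value instead. You can simply use try/except and return the index: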
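def get_ind(s, targ):
    s = s.lower()  # compare case-insensitively
    for t in targ:
        try:
            # return as soon as one target (in list order) is found
            return s.index(t.lower())
        except ValueError:
            pass  # this target is not in s, try the next one
    return None  # or -1, False ...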
If you also want to ignore the case of the input string, set s = s.lower() before the loop.
You could also do something like:
def get_ind_next(s, targ):
    s = s.lower()
    return next((s.index(t) for t in map(str.lower, targ) if t in s), None)
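But in the worst case that does two lookups per substring (the in test, then index), rather than the single lookup you get with try/except. It does at least also short-circuit on the first match.
If you actually want the minimum over all matches, change it to: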
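To make the short-circuit behaviour concrete, here is a minimal sketch (the name first_match_index is hypothetical and only used for illustration). It shows that returning on the first hit gives the index of the first target, in list order, that occurs in the string, which is not necessarily the smallest index:

```python
def first_match_index(s, targ):
    """Return the index of the first target (in list order) found in s, else None."""
    s = s.lower()
    for t in targ:
        try:
            return s.index(t.lower())  # stop at the first target that is present
        except ValueError:
            continue  # target not present, try the next one
    return None


s = 'Iamfoothegreat'
print(first_match_index(s, ['grea', 'foo']))   # 9: 'grea' is checked first
print(first_match_index(s, ['foo', 'grea']))   # 3: 'foo' is checked first
print(first_match_index(s, ['bar', 'other']))  # None: nothing matches
```

So the result depends on the order of the targets, which is fine if any match will do.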
def get_ind(s, targ):
    s = s.lower()
    mn = float("inf")  # stays float("inf") if no target is found
    for t in targ:
        try:
            i = s.index(t.lower())
            if i < mn:
                mn = i  # keep the smallest index seen so far
        except ValueError:
            pass
    return mn

def get_ind_next(s, targ):
    s = s.lower()
    # returns None if no target is found
    return min((s.index(t) for t in map(str.lower, targ) if t in s), default=None)
default=None is only available in python >= 3.4, so if you are using python2 you will have to change the logic a bit.
Timings, python3:
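One possible way to change the logic so it does not need the default argument is to collect the indices first and only call min when something was found. This is a sketch under that assumption, and get_ind_min_compat is a hypothetical name, not part of the code above:

```python
def get_ind_min_compat(s, targ):
    """Smallest index of any target in s (case-insensitive), or None if none match."""
    s = s.lower()
    # Build the list of indices for the targets that actually occur in s.
    found = [s.index(t) for t in (x.lower() for x in targ) if t in s]
    # min() on an empty list raises ValueError, so guard against it explicitly.
    return min(found) if found else None


print(get_ind_min_compat('Iamfoothegreat', ['bar', 'grea', 'other', 'foo']))  # 3
print(get_ind_min_compat('Iamfoothegreat', ['bar', 'other']))                 # None
```

It keeps the two lookups per substring of get_ind_next, so the try/except version is still the one to prefer if that matters.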
In [29]: s = "hello world" * 5000
In [30]: s += "grea" + s
In [25]: %%timeit
....: targ = [re.escape(x) for x in targets]
....: pattern = r"%(pattern)s" % {'pattern' : "|".join(targ)}
....: firstMatch = next(re.finditer(pattern, s, re.IGNORECASE),None)
....: if firstMatch:
....:     pass
....:
100 loops, best of 3: 5.11 ms per loop
In [18]: timeit get_ind_next(s, targets)
1000 loops, best of 3: 691 µs per loop
In [19]: timeit get_ind(s, targets)
1000 loops, best of 3: 627 µs per loop
In [20]: timeit min([s.lower().find(x.lower()) for x in targets if x.lower() in s.lower()] or [0])
1000 loops, best of 3: 1.03 ms per loop
In [21]: s = 'Iamfoothegreat'
In [22]: targets = ['bar', 'grea', 'other','foo']
In [23]: get_ind_next(s, targets) == get_ind(s, targets) == min([s.lower().find(x.lower()) for x in targets if x.lower() in s.lower()] or [0])
Out[24]: True
Python2:
In [13]: s = "hello world" * 5000
In [14]: s += "grea" + s
In [15]: targets = ['foo', 'bar', 'grea', 'other']
In [16]: timeit get_ind(s, targets)
1000 loops, best of 3: 322 µs per loop
In [17]: timeit min([s.lower().find(x.lower()) for x in targets if x.lower() in s.lower()] or [0])
1000 loops, best of 3: 710 µs per loop
In [18]: get_ind(s, targets) == min([s.lower().find(x.lower()) for x in targets if x.lower() in s.lower()] or [0])
Out[18]: True
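The timings above come from IPython's timeit magic. Outside IPython, a roughly comparable measurement can be sketched with the standard timeit module. The function name min_index and the repeat count of 100 are arbitrary choices for this illustration, and the numbers you get will differ from the ones shown above:

```python
import timeit


def min_index(s, targ):
    """Smallest index of any target in s (case-insensitive), or None."""
    s = s.lower()
    best = None
    for t in targ:
        try:
            i = s.index(t.lower())
        except ValueError:
            continue  # target not present
        if best is None or i < best:
            best = i
    return best


# Same test data as in the sessions above.
s = "hello world" * 5000
s += "grea" + s
targets = ['foo', 'bar', 'grea', 'other']

runs = 100  # arbitrary repeat count for this sketch
elapsed = timeit.timeit(lambda: min_index(s, targets), number=runs)
print(min_index(s, targets))
print("%.1f us per call" % (elapsed / runs * 1e6))
```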
You can also combine the first approach with min:
def get_ind(s, targ):
    s, mn = s.lower(), None
    for t in targ:
        try:
            mn = s.index(t.lower())
            yield mn
        except ValueError:
            pass
    yield mn
It does the same job, it is just a little nicer and maybe slightly faster:
In [45]: min(get_ind(s, targets))
Out[45]: 55000
In [46]: timeit min(get_ind(s, targets))
1000 loops, best of 3: 317 µs per loop