Try this:
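def group_by_sep(items, sep='|'):
    """Yield the runs of items found between occurrences of sep."""
    inner_list = []
    for item in items:
        if item == sep:
            # A separator closes the current group, even if it is empty.
            yield inner_list
            inner_list = []
        else:
            inner_list.append(item)
    # Emit whatever is left after the last separator, unless it is empty.
    if inner_list:
        yield inner_list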
Data = ['Label', 23, 'NORM', '|', 'RESP', 1.256, None, '|', '', '', '|', 'RELV', '', '', '|', '|', 'now', '|']
SubList = list(group_by_sep(Data, '|'))
print(SubList)
# [['Label', 23, 'NORM'], ['RESP', 1.256, None], ['', ''], ['RELV', '', ''], [], ['now']]
Note that itertools.groupby could be used here, but it is not equivalent to the approach above and gives less control over the exact behavior:
import itertools
def group_by_sep2(items, sep='|'):
    yield from (
        list(g)
        for k, g in itertools.groupby(items, key=lambda x: x == sep)
        if not k)
SubList2 = list(group_by_sep2(Data, '|'))
print(SubList2)
# [['Label', 23, 'NORM'], ['RESP', 1.256, None], ['', ''], ['RELV', '', ''], ['now']]
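The empty list between the two consecutive separators is missing, because groupby merges adjacent separators into a single group, which is then discarded.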
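To make the difference concrete, here is a small side-by-side sketch. Both functions are restated so the snippet runs on its own (the groupby version is written as a plain loop, which should behave the same as the generator expression above). The `cases` list is a hypothetical set of inputs picked only to exercise separator placement.

```python
import itertools


def group_by_sep(items, sep='|'):
    # Direct approach: every separator closes the current group,
    # even when that group is empty.
    inner_list = []
    for item in items:
        if item == sep:
            yield inner_list
            inner_list = []
        else:
            inner_list.append(item)
    if inner_list:
        yield inner_list


def group_by_sep2(items, sep='|'):
    # groupby approach: a run of adjacent separators collapses into one
    # group, which is skipped, so empty groups never appear.
    for is_sep, group in itertools.groupby(items, key=lambda x: x == sep):
        if not is_sep:
            yield list(group)


# Hypothetical inputs, chosen only to show where the two versions diverge.
cases = [
    ['a', '|', '|', 'b'],  # consecutive separators
    ['|', 'a'],            # leading separator
    ['a', '|'],            # trailing separator
]
for case in cases:
    print(case, list(group_by_sep(case)), list(group_by_sep2(case)))
```

With consecutive separators the direct version gives [['a'], [], ['b']] while the groupby version gives [['a'], ['b']]. With a leading separator the direct version gives [[], ['a']] and the groupby version gives [['a']]. With a trailing separator both give [['a']], since the direct version only emits a final group when it is non-empty.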
Also, it is less efficient than the direct approach above:
%timeit list(group_by_sep(Data))
# 1000 loops, best of 3: 1.47 µs per loop
%timeit list(group_by_sep2(Data))
# 100 loops, best of 3: 4.01 µs per loop
%timeit list(group_by_sep(Data * 1000))
# 1000 loops, best of 3: 1.33 ms per loop
%timeit list(group_by_sep2(Data * 1000))
# 100 loops, best of 3: 2.83 ms per loop
%timeit list(group_by_sep(Data * 1000000))
# 1000 loops, best of 3: 1.67 s per loop
%timeit list(group_by_sep2(Data * 1000000))
# 100 loops, best of 3: 3.22 s per loop
The benchmarks show that the direct approach is roughly 2 to 3 times faster.
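The %timeit lines above only work in IPython or Jupyter. For a plain Python script, a rough equivalent can be sketched with the standard timeit module. The helper name `best_time` and the chosen `number` and `repeat` values are assumptions made for this sketch, and absolute timings will vary by machine and Python version.

```python
import itertools
import timeit


def group_by_sep(items, sep='|'):
    inner_list = []
    for item in items:
        if item == sep:
            yield inner_list
            inner_list = []
        else:
            inner_list.append(item)
    if inner_list:
        yield inner_list


def group_by_sep2(items, sep='|'):
    yield from (
        list(g)
        for k, g in itertools.groupby(items, key=lambda x: x == sep)
        if not k)


data = ['Label', 23, 'NORM', '|', 'RESP', 1.256, None, '|', '', '', '|',
        'RELV', '', '', '|', '|', 'now', '|']


def best_time(func, items, number=1000, repeat=3):
    """Return the best per-call time in seconds (hypothetical helper)."""
    # list(...) forces the generator to run to completion on every call.
    timer = timeit.Timer(lambda: list(func(items)))
    return min(timer.repeat(repeat=repeat, number=number)) / number


for scale, number in ((1, 1000), (1000, 20)):
    items = data * scale
    direct = best_time(group_by_sep, items, number=number)
    grouped = best_time(group_by_sep2, items, number=number)
    print(f"x{scale}: direct {direct:.3e} s, groupby {grouped:.3e} s, "
          f"ratio {grouped / direct:.2f}")
```

Taking the minimum over the repeats mirrors the "best of 3" that %timeit reports in the output above.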
(Edited to write everything as generators and to cover more edge cases.)