Iterate over a string 2 (or n) characters at a time in Python: answers

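【Question title】: Iterate over a string 2 (or n) characters at a time in Python
【Posted】: 2010-11-12 20:27:07
【Question】:

Earlier today I needed to iterate over a string 2 characters at a time to parse a string formatted like "+c-R+D-E" (there are a few extra letters).

I ended up with this, which works, but it looks ugly. I ended up adding a comment about what it does because it felt non-obvious. It almost seems pythonic, but not quite.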

# Might not be exact, but you get the idea, use the step
# parameter of range() and slicing to grab 2 chars at a time
s = "+c-R+D-e"
for op, code in (s[i:i+2] for i in range(0, len(s), 2)):
  print op, code

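Is there a better/cleaner way to do this?

【Question discussion】:

@Richard, did you perhaps miss a ")" on line 2?
Possible duplicate of What is the most "pythonic" way to iterate over a list in chunks?

Tags: python iteration

【Solution 1】:

I don't know about cleaner, but there's another alternative: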

for (op, code) in zip(s[0::2], s[1::2]):
    print op, code

No-copy version:

from itertools import izip, islice
for (op, code) in izip(islice(s, 0, None, 2), islice(s, 1, None, 2)):
    print op, code

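【Discussion】:

I really like this one... I just wish it didn't make copies in order to iterate.
If the string has an odd number of characters, the zip approach skips the last character.
For Python 3, the "no-copy version" is not necessary (and in fact no longer works). See stackoverflow.com/questions/32659552/…
Slices are still copies in Python 3, so you still need islice.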
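
For Python 3 readers, the points raised in the comments above can be checked with a minimal sketch. It assumes Python 3, where the built-in zip is already lazy and itertools.izip no longer exists; the odd-length sample string is only an illustration.

```python
from itertools import islice

s = "+c-R+D-e"

# zip() is lazy in Python 3; islice() walks the string without building
# the two intermediate copies that s[0::2] and s[1::2] would create.
pairs = list(zip(islice(s, 0, None, 2), islice(s, 1, None, 2)))
print(pairs)

# With an odd number of characters the unpaired last character is dropped.
odd = "+c-R+"
odd_pairs = list(zip(islice(odd, 0, None, 2), islice(odd, 1, None, 2)))
print(odd_pairs)
```

If the dropped character matters, a padding approach such as zip_longest is one alternative.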

【Solution 2】:

Maybe this would be cleaner?

s = "+c-R+D-e"
for i in xrange(0, len(s), 2):
    op, code = s[i:i+2]
    print op, code

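You could perhaps write a generator to do what you want; maybe that would be more Pythonic :)

【Discussion】:

+1 Simple, and it works for any n (as long as you handle the ValueError exception when len(s) is not a multiple of n).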
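
The ValueError mentioned in the comment comes from unpacking a final slice that is too short. A minimal Python 3 sketch of one way to handle it follows; the function name split_pairs and the choice to return the leftover text are assumptions made for illustration, not part of the answer.

```python
def split_pairs(s):
    """Return (list of (op, code) pairs, leftover text) for the string s."""
    parsed = []
    leftover = ""
    for i in range(0, len(s), 2):
        try:
            # A 1-character slice cannot unpack into two names.
            op, code = s[i:i + 2]
        except ValueError:
            leftover = s[i:]
            break
        parsed.append((op, code))
    return parsed, leftover


print(split_pairs("+c-R+D-e"))
print(split_pairs("+c-R+"))
```

Returning the leftover lets the caller decide whether a dangling character is an error or can be ignored.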

【Solution 3】:

Triptych inspired this more general solution:

def slicen(s, n, truncate=False):
    assert n > 0
    while len(s) >= n:
        yield s[:n]
        s = s[n:]
    if len(s) and not truncate:
        yield s

for op, code in slicen("+c-R+D-e", 2):
    print op,code

【Discussion】:

【Solution 4】:

from itertools import izip_longest
def grouper(iterable, n, fillvalue=None):
    args = [iter(iterable)] * n
    return izip_longest(*args, fillvalue=fillvalue)
def main():
    s = "+c-R+D-e"
    for item in grouper(s, 2):
        print ' '.join(item)
if __name__ == "__main__":
    main()
##output
##+ c
##- R
##+ D
##- e

izip_longest requires Python 2.6 (or later). If you are on Python 2.4 or 2.5, use the definition of izip_longest from the documentation, or change the grouper function to:

from itertools import izip, chain, repeat
def grouper(iterable, n, padvalue=None):
    return izip(*[chain(iterable, repeat(padvalue, n-1))]*n)

【Discussion】:

Best answer, except that it is renamed to zip_longest in Python 3.
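
Following the comment above, a minimal sketch of the same grouper for Python 3, where the function is imported as itertools.zip_longest. The "?" fill value in the second call is only an illustration.

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    # The same iterator repeated n times, so each output tuple
    # pulls n consecutive items from the input.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


groups = list(grouper("+c-R+D-e", 2))
padded = list(grouper("+c-R+", 2, fillvalue="?"))
print(groups)
print(padded)
```

With an odd-length input the last tuple is padded with fillvalue instead of being dropped.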

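【Solution 5】:

A great opportunity for a generator. For larger lists, this will be much more efficient than zipping every other element. Note that this version also handles strings with dangling ops.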

def opcodes(s):
    while True:
        try:
            op   = s[0]
            code = s[1]
            s    = s[2:]
        except IndexError:
            return
        yield op,code


for op,code in opcodes("+c-R+D-e"):
    print op,code

Edit: minor rewrite to avoid ValueError exceptions.

【Discussion】:

Some edge cases - it always raises ValueError: try opcodes("a1")
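
The edge cases mentioned here can be checked directly. The sketch below is a Python 3 variant that steps an index instead of re-slicing the string on each pass; that change is an assumption made for illustration and is not the answer's code. It keeps the stated behavior of ignoring a dangling op.

```python
def opcodes(s):
    """Yield (op, code) pairs; a trailing op without a code is ignored."""
    # range stops before the last index, so s[i + 1] always exists.
    for i in range(0, len(s) - 1, 2):
        yield s[i], s[i + 1]


print(list(opcodes("a1")))
print(list(opcodes("+c-")))
print(list(opcodes("")))
```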

【Solution 6】:

The other answers work well for n = 2, but for the general case you could try this:

def slicen(s, n, truncate=False):
    nslices = len(s) / n
    if not truncate and (len(s) % n):
        nslices += 1
    return (s[i*n:n*(i+1)] for i in range(nslices))

>>> s = '+c-R+D-e'
>>> for op, code in slicen(s, 2):
...     print op, code
... 
+ c
- R
+ D
- e

>>> for a, b, c in slicen(s, 3):
...     print a, b, c
... 
+ c -
R + D
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: need more than 2 values to unpack

>>> for a, b, c in slicen(s,3,True):
...     print a, b, c
... 
+ c -
R + D
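
The session above is Python 2, where len(s) / n is integer division. A minimal sketch of the same function for Python 3 follows, assuming the only change needed is floor division, since "/" would produce a float that range() rejects.

```python
def slicen(s, n, truncate=False):
    # Floor division keeps nslices an int under Python 3.
    nslices = len(s) // n
    if not truncate and len(s) % n:
        nslices += 1  # include the shorter final slice
    return (s[i * n:(i + 1) * n] for i in range(nslices))


all_slices = list(slicen("+c-R+D-e", 3))
full_only = list(slicen("+c-R+D-e", 3, truncate=True))
print(all_slices)
print(full_only)
```

Collecting the slices first, rather than unpacking them in the for statement, avoids the ValueError shown in the session when the last slice is short.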

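【Discussion】:

【Solution 7】:

This approach supports an arbitrary number of elements per result, evaluates lazily, and accepts a generator as the input iterable (it does not attempt to index):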

import itertools

def groups_of_n(n, iterable):
    c = itertools.count()
    for _, gen in itertools.groupby(iterable, lambda x: c.next() / n):
        yield gen

Any left-over elements are returned in a shorter list.

Example usage:

for g in groups_of_n(4, xrange(21)):
    print list(g)

[0, 1, 2, 3]
[4, 5, 6, 7]
[8, 9, 10, 11]
[12, 13, 14, 15]
[16, 17, 18, 19]
[20]
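
A minimal sketch of the same idea for Python 3, assuming the only changes needed are next(c) in place of c.next(), floor division, and range in place of xrange:

```python
import itertools


def groups_of_n(n, iterable):
    counter = itertools.count()
    # Each item is keyed by its running index floor-divided by n,
    # so groupby starts a new group every n items.
    for _, group in itertools.groupby(iterable, lambda _item: next(counter) // n):
        yield group


result = [list(g) for g in groups_of_n(4, range(10))]
print(result)
```

Each group shares the underlying iterator with groupby, so it should be consumed before moving on to the next one, as the list(g) call does here.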

【Discussion】:

【Solution 8】:

Consider pip installing more_itertools, which already ships with a chunked implementation along with other helpful tools:

import more_itertools 

for op, code in more_itertools.chunked(s, 2):
    print(op, code)

Output:

+ c
- R
+ D
- e
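
For readers who cannot install the package, here is a hedged standard-library stand-in. It is a hypothetical helper that mimics the documented behavior of chunked (lists of up to n items, with a shorter final list); it is not more_itertools' actual implementation.

```python
from itertools import islice


def chunked(iterable, n):
    """Hypothetical stand-in: yield lists of up to n items from iterable."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, n))
        if not chunk:
            return
        yield chunk


chunks = list(chunked("+c-R+", 2))
print(chunks)
```

Because the last list can be shorter, unpacking with "for op, code in ..." raises ValueError on an odd-length string; iterating over whole chunks avoids that.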

【Discussion】:

【Solution 9】:

>>> s = "+c-R+D-e"
>>> s
'+c-R+D-e'
>>> s[::2]
'+-+-'
>>>

【Discussion】:

【Solution 10】:

Maybe not the most efficient, but if you like regexes...

import re
s = "+c-R+D-e"
for op, code in re.findall('(.)(.)', s):
    print op, code
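
Two behaviors of the regex approach are worth seeing in a small Python 3 sketch: the pattern only matches complete pairs, and a bounded repeat generalizes it to n characters. The value n = 3 and the odd-length sample string are illustrative assumptions.

```python
import re

s = "+c-R+D-e"

# '(.)(.)' only matches complete pairs, so an unpaired last character is skipped.
pairs = re.findall(r"(.)(.)", "+c-R+")

# For general n, a bounded repeat returns the chunks as strings. re.DOTALL lets
# '.' match newlines too, and {1,n} keeps a shorter final chunk.
n = 3
chunks = re.findall(r".{1,%d}" % n, s, flags=re.DOTALL)

print(pairs)
print(chunks)
```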

【Discussion】:

【Solution 11】:

Here's my answer, which looks a little cleaner to my eyes:

for i in range(0, len(string) - 1):
    if i % 2 == 0:
        print string[i:i+2]

【Discussion】:

range also supports a step ;) -- for i in range(0, len(str), 2): print str[i:i+2]

【Solution 12】:

I ran into a similar problem. I ended up with something like this:

ops = iter("+c-R+D-e")
for op in ops:
    code = ops.next()

    print op, code

I find it to be the most readable.
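
A minimal sketch of the same pattern for Python 3, where ops.next() is spelled next(ops). Wrapping it in a generator named op_code_pairs and using None as the default for a missing code are assumptions made for illustration.

```python
def op_code_pairs(s):
    """Yield (op, code) pairs; code is None if the string ends on an op."""
    ops = iter(s)
    for op in ops:
        # The default keeps next() from raising StopIteration on a dangling op.
        code = next(ops, None)
        yield op, code


print(list(op_code_pairs("+c-R+D-e")))
print(list(op_code_pairs("+c-")))
```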

【Discussion】:

【Solution 13】:

I made this simple generator:

def every_two(s):
    d = list(s)
    c = True
    for i in range(len(d)):
        if c:
            c = False
            yield d[i], d[i+1]
        else:
            c = True

If the length of the string is not divisible by 2, this raises an IndexError, but you can wrap the yield statement in a try block.
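
One way the suggested try block could look, as a Python 3 sketch. It steps the index by 2 instead of toggling a flag, and it stops silently on a dangling character; both choices are assumptions made for illustration.

```python
def every_two(s):
    d = list(s)
    for i in range(0, len(d), 2):
        try:
            pair = d[i], d[i + 1]
        except IndexError:
            # Odd length: the final character has no partner, so stop here.
            return
        yield pair


print(list(every_two("+c-R+D-e")))
print(list(every_two("+c-R+")))
```

Building the pair inside the try and yielding outside it keeps the except clause from catching exceptions raised by the consumer of the generator.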

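【Discussion】:

Thanks for sharing your answer on the platform. But for this question, an answer that works for all n would be helpful.
The answer is not correct.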