How to split a unicode string into a list [duplicate]: answers

【Question title】: How to split a unicode string into a list [duplicate]
【Posted】: 2013-09-13 17:14:26
【Question description】:

I have the following code:

stru = "۰۱۲۳۴۵۶۷۸۹"
strlist = stru.decode("utf-8").split()
print strlist[0]

My output is:

۰۱۲۳۴۵۶۷۸۹

But when I use:

print strlist[1]

I get the following traceback:

IndexError: list index out of range

My question is: how can I split my string? Keep in mind that I get the string from a function, so treat it as a variable.

【Question discussion】:

【Solution 1】:

You don't need to. A unicode string can already be indexed character by character.

>>> print u"۰۱۲۳۴۵۶۷۸۹"[1]
۱

If you still want a list...

>>> list(u"۰۱۲۳۴۵۶۷۸۹")
[u'\u06f0', u'\u06f1', u'\u06f2', u'\u06f3', u'\u06f4', u'\u06f5', u'\u06f6', u'\u06f7', u'\u06f8', u'\u06f9']
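To make the indexing point concrete, here is a minimal sketch. The names `raw` and `text` are illustrative assumptions and do not appear in the original post, and the digits are written as `\u` escapes so the snippet does not depend on the source file's encoding. It shows that indexing and slicing work per character once the string is decoded, whereas the UTF-8 byte string is twice as long because each of these digits occupies two bytes.

```python
# Illustrative names (raw, text) are assumptions, not from the original post.
# The digits are written as \u escapes so no encoding declaration is needed.
text = u"\u06f0\u06f1\u06f2\u06f3\u06f4\u06f5\u06f6\u06f7\u06f8\u06f9"
raw = text.encode("utf-8")      # UTF-8 byte string, like stru in the question

second = text[1]                # indexing works per character
first_three = text[:3]          # slicing works per character too

# Each of these digits takes two bytes in UTF-8, so the byte string is
# twice as long as the decoded text.
byte_length = len(raw)
char_length = len(text)
```

This is why the question's code has to decode first: positions in the byte string do not line up with characters.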

【Discussion】:

【Solution 2】:

You can do it like this:

list(stru.decode("utf-8"))

【Discussion】:

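【Solution 3】:

The split() method splits on whitespace by default. Your string contains no whitespace, so strlist is a list with a single element: strlist[0] holds the entire string, and strlist[1] does not exist.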
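The behaviour described above can be reproduced with a short sketch. The variable names are hypothetical, and the `\u` escapes stand in for the digits from the question. It compares `split()` on a string without whitespace against one that does contain spaces.

```python
# Hypothetical names; the escapes are the digits used in the question.
text = u"\u06f0\u06f1\u06f2\u06f3\u06f4\u06f5\u06f6\u06f7\u06f8\u06f9"

# No whitespace anywhere, so split() has nothing to split on and
# returns a one-element list containing the whole string.
parts = text.split()

# With spaces between the digits, split() yields one element per digit.
spaced_parts = u"\u06f0 \u06f1 \u06f2".split()
```

With only one element in `parts`, any index other than 0 (or -1) raises the IndexError shown in the question.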

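If you want a list with one element per unicode code point, you can convert the string to a list in several ways:

The list function: list(stru.decode("utf-8"))
A list comprehension: [item for item in stru.decode("utf-8")]
No conversion at all. Do you really need a list? You can iterate over a unicode string just as you would over any other sequence type (for character in stru.decode("utf-8"): ...)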
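The three options listed above can be put side by side in one sketch. The names `raw`, `text`, `as_list`, `as_comprehension`, and `collected` are illustrative assumptions, and `raw` stands in for the UTF-8 byte string returned by the asker's function.

```python
# Illustrative stand-in for the byte string returned by the asker's function.
raw = u"\u06f0\u06f1\u06f2\u06f3\u06f4\u06f5\u06f6\u06f7\u06f8\u06f9".encode("utf-8")
text = raw.decode("utf-8")

# Option 1: the list() function.
as_list = list(text)

# Option 2: a list comprehension.
as_comprehension = [item for item in text]

# Option 3: no conversion; iterate over the string directly.
collected = []
for character in text:
    collected.append(character)
```

All three visit the same characters in the same order, so the choice is mostly a matter of whether a list object is actually needed afterwards.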

【Discussion】: