Python curses prints two characters when adding a UTF-8 encoded string

【Question title】: Python curses prints two characters when adding a utf-8 encoded string
【Posted】: 2014-03-25 18:30:02
【Question description】:

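I'm running into a very strange problem when trying to print UTF-8 encoded strings to a curses window. Here is the code; below it I describe the exact problem and what I have tried.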

# coding=UTF-8
import curses
import locale
import time
locale.setlocale(locale.LC_ALL, '')
code = locale.getpreferredencoding()



class AddCharCommand(object):
    def __init__(self, window, line_start, y, x, character):
        """
        Command class for adding the specified character, to the specified
        window, at the specified coordinates.
        """
        self.window = window
        self.line_start = line_start
        self.x = x
        self.y = y
        self.character = character


    def write(self):
        if self.character > 127:
            # curses somehow returns a keycode that is 64 lower than what it
            # should be, this takes care of the problem.
            self.character += 64
            self.string = unichr(self.character).encode(code)
            self.window.addstr(self.y, self.x, self.string)
        else:
            self.window.addch(self.y, self.x, self.character)


    def delete(self):
        """
        Erase characters usually print two characters to the curses window.
        As such both the character at these coordinates and the one next to it
        (that is the one at self.x + 1) must be replaced with a blank space.
        Move the cursor to the original coordinates when done.
        """
        for i in xrange(2):
            self.window.addch(self.y, self.x + i, ord(' '))
        self.window.move(self.y, self.x)

def main(screen):
    maxy, maxx = screen.getmaxyx()
    q = 0
    commands = list()
    x = 0
    erase = ord(curses.erasechar())
    while q != 27:
        q = screen.getch()
        if q == erase:
            command = commands.pop(-1).delete()
            x -= 1
            continue
        command = AddCharCommand(screen, 0, maxy/2, x, q)
        commands.append(command)
        command.write()
        x += 1

curses.wrapper(main)

Here is a Gist link.

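The problem is that when I press the è key (ASCII code 232), it does not print just that character. Instead, the string ăè is printed at the given coordinates. I tried using self.window.addstr(self.x, self.y, self.string[1]), but that only results in garbage being printed.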

I then fired up a Python prompt to look at the return value of unichr(232).encode('utf-8'), and it is indeed a string of length 2.
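For reference, the length-2 result can be reproduced outside curses. This is a minimal sketch in Python 3 (the question uses Python 2, where unichr and byte strings play the roles of chr and bytes); the variable names are only illustrative.

```python
# One code point, U+00E8 (è), as a Python 3 text string.
text = "\u00e8"

# Encoding to UTF-8 produces two bytes for this single character.
encoded = text.encode("utf-8")

print(len(text))      # 1
print(len(encoded))   # 2
print(list(encoded))  # [195, 168]
```

So a length of 2 after encoding is expected for è; it counts bytes, not characters.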

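The very unexpected behaviour is that if I put screen.addstr(4, 4, unichr(232).encode(code)) in main, it displays the è character correctly, and only that character. The same holds if I make the write method of the AddCharCommand class print the è character no matter what.

Of course, the problem is not limited to è; it happens with pretty much every extended ASCII character.

I know extended ASCII with curses is a bit flaky, but I can't understand this behaviour at all. It makes no sense (to me) that the code works as expected if I hardcode the ASCII code, but adds another character if I don't.

I have looked around and read a lot about curses, but I have not been able to find a solution to this. I would greatly appreciate any help on the matter; it is driving me crazy.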

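Perhaps less important, but I would be glad if someone could explain to me why screen.getch() returns incorrect ASCII codes for characters above 127, and why the difference between the real ASCII code and the one returned by curses is 64.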
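One possible reading of both symptoms, offered as a hypothesis rather than something confirmed in this thread: on a UTF-8 terminal, getch() may deliver the two UTF-8 bytes of è (195 and 168) as two separate keycodes, and the "+64" correction is then applied to each of them. The Python 3 sketch below only simulates that arithmetic without curses; simulate_write is a hypothetical helper, not part of the posted code.

```python
def simulate_write(byte_values, offset=64):
    """Mimic the question's write(): add `offset` to every value above 127
    and turn each value into a character on its own."""
    shown = []
    for value in byte_values:
        if value > 127:
            shown.append(chr(value + offset))
        else:
            shown.append(chr(value))
    return "".join(shown)


# Assumption: these are the keycodes getch() would hand back for è.
keycodes = list("\u00e8".encode("utf-8"))  # [195, 168]

result = simulate_write(keycodes)
print(result)  # ăè
```

Under that assumption, 195 + 64 = 259 is ă and 168 + 64 = 232 is è, which matches the ăè seen on screen. The offset of 64 also falls out of UTF-8 itself: for code points U+00C0 through U+00FF the second byte equals the code point minus 0x40.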

Thank you very much.

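【Comments】:

As a side note: there is no ASCII 232. ASCII is 7 bits, from 0 to 127.
I suppose I meant extended ASCII.
There are 8 billion of those extensions. The basic issue is that unichr(232) is unquestionably u'\xe8', but when you call encode on it, it yields unichr(232).encode('utf-8') == '\xc3\xa8' (è), which is definitely (a) correct and (b) not what you want. Apparently the part that displays the string wants ASCII, since 'è' is not ASCII...
Sorry, but did you read the whole post? screen.addstr(1, 1, unichr(232).encode('utf-8')) adds the character correctly. I am not getting an error because it is not ASCII.
The Python bug (fixed) "curses implementation of Unicode is wrong in Python 3" might give you some ideas on how to handle è.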

Tags: python encoding character-encoding ncurses python-curses

【Solution 1】:

This works fine for me:

c=screen.get_wch()
screen.addch(c)
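A slightly fuller sketch of the same idea, assuming Python 3.3 or later (where window.get_wch() exists) and a curses build with wide-character support. The helper name echo_key, the ESC constant and the run function are illustrative and not from the answer, and addstr is used here in place of addch.

```python
import curses
import locale

ESC = "\x1b"  # what get_wch() returns for the Escape key


def echo_key(window, key):
    """Draw `key` if it is a printable character; report whether it was drawn."""
    # get_wch() returns a str for ordinary characters (already decoded,
    # so è arrives as one character) and an int for special keys.
    if isinstance(key, str) and key.isprintable():
        window.addstr(key)
        return True
    return False


def main(screen):
    # Read whole characters until Escape is pressed.
    while True:
        key = screen.get_wch()
        if key == ESC:
            break
        echo_key(screen, key)


def run():
    # Call this from a real terminal; the locale must be set first so that
    # curses knows which encoding the terminal uses.
    locale.setlocale(locale.LC_ALL, "")
    curses.wrapper(main)
```

Because get_wch() returns a complete character, no manual offset and no encode step should be needed before drawing it.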

【Comments】: