【Posted】: 2020-06-22 11:21:29
【Question】:
I am trying to use the following code, which I found at https://superuser.com/questions/876572/how-do-i-find-out-which-font-contains-a-certain-special-character/1452828, with MINGW64 Python3 on a Windows 10 machine:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import unicodedata
import os
from fontTools.ttLib import TTFont

fonts = []
for root,dirs,files in os.walk("c:/Windows/Fonts/"):
    for file in files:
        if file.endswith(".ttf"):
            tfile = os.path.join(root,file)
            fonts.append(tfile)

def char_in_font(unicode_char, font):
    for cmap in font['cmap'].tables:
        if cmap.isUnicode():
            if ord(unicode_char) in cmap.cmap:
                return True
    return False

def test(char):
    for fontpath in fonts:
        font = TTFont(fontpath) # specify the path to the font in question
        if char_in_font(char, font):
            #print(char + " "+ unicodedata.name(char) + " in " + fontpath) # UnicodeEncodeError: 'charmap' codec can't encode character '\U0001f63a' in position 0: character maps to <undefined>
            #print( "{} ({}) in {}".format(char, unicodedata.name(char), fontpath ) ) # UnicodeEncodeError: 'charmap' codec can't encode character '\U0001f63a' in position 0: character maps to <undefined>
            print( "({}) in {}".format( unicodedata.name(char), fontpath ) )

test(u"\U0001f63a")
test(u"????")
If you run the code as is, it works, and prints something like this:
$ python3 /tmp/test-font.py
(SMILING CAT FACE WITH OPEN MOUTH) in c:/Windows/Fonts/DejaVuSans-Bold.ttf
(SMILING CAT FACE WITH OPEN MOUTH) in c:/Windows/Fonts/DejaVuSans-BoldOblique.ttf
(SMILING CAT FACE WITH OPEN MOUTH) in c:/Windows/Fonts/DejaVuSans-Oblique.ttf
...
... However, if you uncomment either of the commented-out print statements, the code fails with:
$ python3 /tmp/test-font.py
Traceback (most recent call last):
  File "C:/msys64/tmp/test-font.py", line 31, in <module>
    test(u"\U0001f63a")
  File "C:/msys64/tmp/test-font.py", line 29, in test
    print( "{} ({}) in {}".format(char, unicodedata.name(char), fontpath ) )
  File "C:/msys64/mingw64/lib/python3.8/encodings/cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\U0001f63a' in position 0: character maps to <undefined>
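This seems really strange to me: char is the input variable, and it is clearly found correctly in the system fonts, yet it cannot be printed in the terminal?!
Does anyone know how to get char to print in the terminal in this case?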
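The `cp1252.py` frame in the traceback suggests that the failure happens while encoding the output, not during the font lookup. A minimal sketch that reproduces this without any fonts or terminal involved (the helper name `can_encode` is hypothetical, introduced only for illustration):

```python
import unicodedata

char = "\U0001f63a"  # the character named in the traceback

# The Unicode name is plain ASCII, so printing only the name
# never touches a character that cp1252 lacks.
name = unicodedata.name(char)

def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec (hypothetical helper)."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

print(name)
print("cp1252:", can_encode(char, "cp1252"))  # cp1252 has no mapping for U+1F63A
print("utf-8:", can_encode(char, "utf-8"))
```

This would explain why the third `print` (name only) works while the two that include `char` itself raise `UnicodeEncodeError`.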
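【Comments】:
- Your console is not UTF-8 compatible. Python tries to convert the string to the console encoding and finds that your console does not support some of the characters. LANG=en_US.UTF-8 should fix this in mingw.
- Does your script work if you run it from plain cmd (without mingw)? I mean something like python \tmp\test-font.py with the environment variable PYTHONIOENCODING=utf-8...
- Thanks @GiacomoCatenazzi and @JosefZ. LANG=en_US.UTF-8 python3 /tmp/test-font.py still raises the error; PYTHONIOENCODING=utf-8 python3 /tmp/test-font.py does not raise the error, but prints ?? for the character; and running C:\msys64\mingw64\bin\python3.exe C:\msys64\tmp\test-font.py directly from cmd.exe does not raise the error, but also prints ?? for the character.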
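As a rough illustration of the encoding behaviour discussed in these comments, the sketch below inspects the encoding Python uses for stdout and switches its error handler so that unencodable characters are escaped instead of raising. It assumes Python 3.7 or later (where text streams have `reconfigure()`); the helper name `escape_for` is hypothetical. It does not make the glyph appear: whether the character is actually drawn still depends on the terminal and its font.

```python
import sys

def escape_for(text, encoding):
    """Round-trip text through encoding, backslash-escaping what it cannot represent (hypothetical helper)."""
    return text.encode(encoding, errors="backslashreplace").decode(encoding)

# Show which encoding print() will use for stdout.
print("stdout encoding:", getattr(sys.stdout, "encoding", None))

# Keep the console encoding, but escape unencodable characters instead of raising.
# The hasattr guard covers environments where stdout has been replaced by another object.
if hasattr(sys.stdout, "reconfigure"):
    sys.stdout.reconfigure(errors="backslashreplace")

char = "\U0001f63a"
print(char)                        # the character itself, or an escape on a cp1252 console
print(escape_for(char, "cp1252"))  # always the ASCII escape sequence
```

Setting `PYTHONIOENCODING=utf-8` changes the encoding instead of the error handler, which matches the observation above that the error disappears but the terminal still shows `??`.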
Tags: python-3.x unicode character-encoding mingw