Python3: How to get a raw bytes string without encoding? (Answers)

【Question title】: Python3 How to get raw bytes string without encode?
【Posted】: 2020-09-18 04:46:18
【Question】:

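I want to get a string of raw bytes (assembly code) without encoding it into another encoding. The bytes are shellcode, so I don't need to encode them; I want to write them out directly as raw bytes. Put simply, I want to convert "b'\xb7\x00\x00\x00'" into "\xb7\x00\x00\x00", that is, get the string representation of the raw bytes. For example: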

>>> byte_code = b'\xb7\x00\x00\x00\x05\x00\x00\x00\x95\x00\x00\x00\x00\x00\x00\x00'
>>> uc_str = str(byte_code)[2:-1]
>>> print(byte_code, uc_str)
b'\xb7\x00\x00\x00\x05\x00\x00\x00\x95\x00\x00\x00\x00\x00\x00\x00' \xb7\x00\x00\x00\x05\x00\x00\x00\x95\x00\x00\x00\x00\x00\x00\x00

So far I only have two ugly approaches:

>>> uc_str = str(byte_code)[2:-1]
>>> uc_str = "".join('\\x{:02x}'.format(c) for c in byte_code)

How the raw bytes are used:

>>> my_template = "const char byte_code[] = 'TPL'"
>>> uc_str = str(byte_code)[2:-1]
>>> my_code = my_template.replace("TPL", uc_str)
# then write my_code to xx.h

Is there a Pythonic way to do this?

【Comments】:

Tags: python-3.x

【Solution 1】:

Your first approach is broken, because any byte that can be represented as printable ASCII will be, for example:

>>> str(b'\x00\x20\x41\x42\x43\x20\x00')[2:-1]
'\\x00 ABC \\x00'

The second approach is actually fine. Since this functionality seems to be missing from the stdlib, I published all-escapes, which provides it.

pip install all-escapes

Example usage:

>>> b"\xb7\x00\x00\x00".decode("all-escapes")
'\\xb7\\x00\\x00\\x00'
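For comparison, the question's second approach can be wrapped in a small helper that needs only the standard library. This is a minimal sketch: the names `to_hex_escapes` and `HEADER_TEMPLATE` are hypothetical, and the template uses double quotes on the assumption that the target is a C string literal.

```python
def to_hex_escapes(data: bytes) -> str:
    """Return every byte of `data` as a literal \\xNN escape, printable or not."""
    return "".join(f"\\x{b:02x}" for b in data)


# Hypothetical template for a C header; "TPL" is the placeholder from the question.
HEADER_TEMPLATE = 'const char byte_code[] = "TPL";'

byte_code = b"\x00\x20\x41\x42\x43\x20\x00"
escaped = to_hex_escapes(byte_code)
header_line = HEADER_TEMPLATE.replace("TPL", escaped)

print(escaped)
print(header_line)
```

Escaping every byte also sidesteps a C pitfall: a `\x` escape in C consumes all hex digits that follow it, so output such as `\x00ABC` from the `str()` approach would not be read as the intended bytes.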

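【Comments】:

That's because I want to insert byte_code into a string template and then write it to a .cpp file.
Then the wording in the question is a bit misleading: you don't want to write it directly as raw bytes, you want to write a string representation of the bytes.
Yes. I've revised the description in the question and added a simple example showing why I need this.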

【Solution 2】:

I ran into this problem while trying to do something similar with some SNMP code.

byte_code = b'\xb7\x00\x00\x00\x05\x00\x00\x00\x95\x00\x00\x00\x00\x00\x00\x00'
text = byte_code.decode('raw_unicode_escape')
writer_func(text)

This makes it possible to send an SNMP hex string as an OctetString when there is no helper support for hex.

See also: standard-encodings and bytes decode

For anyone looking at SNMP: Set Types
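As a hedged illustration of what this decode returns for the question's bytes (which contain no backslash): the result has one character per byte rather than `\xNN` text, so it suits passing binary data through a string-typed API more than filling the question's header template. The variable names below are illustrative only.

```python
byte_code = b"\xb7\x00\x00\x00\x05\x00\x00\x00\x95\x00\x00\x00\x00\x00\x00\x00"
text = byte_code.decode("raw_unicode_escape")

# One character per byte, each with the byte's value as its code point.
same_length = len(text) == len(byte_code)
same_values = [ord(ch) for ch in text] == list(byte_code)

# No backslash escapes are produced; the text is not a "\xNN" rendering.
has_backslash = "\\" in text

# Encoding with the same codec recovers the original bytes.
round_trip = text.encode("raw_unicode_escape") == byte_code

print(same_length, same_values, has_backslash, round_trip)
```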

【Comments】:

【Solution 3】:

The basics of converting between bytes and strings are:

>>> b"abc".decode()
'abc'
>>>

Or:

>>> sb = b"abc"
>>> s = sb.decode()
>>> s
'abc'
>>>

The reverse is:

>>> "abc".encode()
b'abc'
>>>

Or:

>>> s="abc"
>>> sb=s.encode()
>>> sb
b'abc'
>>>

In your case, you should use the errors parameter:

>>> b"\xb7".decode(errors="replace")
'�'
>>>
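To show how the choice of error handler affects the result, here is a small sketch comparing two of the standard handlers on sample bytes chosen for illustration. Neither one escapes every byte the way the question asks: `replace` discards the original value, and `backslashreplace` only escapes bytes that fail to decode.

```python
data = b"\xb7\x00AB"

# "replace" substitutes U+FFFD for undecodable bytes, so the original value is lost.
replaced = data.decode("utf-8", errors="replace")

# "backslashreplace" keeps undecodable bytes as \xNN text, but valid bytes
# (here NUL, "A" and "B") are decoded normally rather than escaped.
escaped = data.decode("utf-8", errors="backslashreplace")

print(repr(replaced))
print(repr(escaped))
```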

【Comments】: