How do I encode a Python 3 string using \u escape codes?

2022-02-21 00:00:00 python python-3.x unicode unicode-escapes

In Python 3, suppose I have

>>> thai_string = 'สีเ'

Using encode gives

>>> thai_string.encode('utf-8')
b'\xe0\xb8\xaa\xe0\xb8\xb5\xe0\xb9\x80'

My question: how can I get encode() to return a bytes sequence that uses \u instead of \x? And how can I decode it back to a Python 3 str?

I tried the ascii builtin, which gives

>>> ascii(thai_string)
"'\\u0e2a\\u0e35\\u0e40'"

But this doesn't seem quite right, because I can't decode it back to obtain thai_string.

The Python documentation says that \u is only used in string literals, but I'm not sure what that means. Does this suggest that my question rests on a flawed premise?

You can use unicode_escape:

>>> thai_string.encode('unicode_escape')
b'\\u0e2a\\u0e35\\u0e40'

Note that encode() always returns a byte string (bytes), and that the unicode_escape encoding is intended to:

Produce a string that is suitable as Unicode literal in Python source code
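To get back to a str, the same codec can be used in the other direction. The following is a minimal sketch of the round trip; the variable names `escaped` and `restored` are only illustrative, and it assumes the bytes were produced by unicode_escape in the first place.

```python
thai_string = 'สีเ'

# Encode: each non-ASCII character becomes a literal backslash-u sequence.
escaped = thai_string.encode('unicode_escape')
print(escaped)           # b'\\u0e2a\\u0e35\\u0e40'

# Each escape is six ASCII bytes: backslash, 'u', and four hex digits.
print(len(escaped))      # 18

# Decode: the same codec turns the escape sequences back into characters.
restored = escaped.decode('unicode_escape')
print(restored == thai_string)   # True
```

The doubled backslashes in the printed bytes are only how repr() displays a single backslash byte. Also keep in mind that decoding with unicode_escape treats any bytes that are not part of an escape sequence as Latin-1, so it is only a safe inverse for data that was produced by the matching encode call.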
