Question

I am fetching some data from Facebook using requests. Here is a sample of the data.

response = {'message': 'I have recommended your  name to all my family n friend
s. Thankyou!!!!\\ud83d\\ude0a\\ud83d\\ude0a\\ud83e\\udd17\\ud83e\\udd17\\ud83d\\udc4c\\ud83d\\udc4c\\ud83d\\udc4d\\ud83d\\udc4
}

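The last few characters are emoji, and I need to save this in a database.

So I first tried converting it to a dict, so that I can add keys and manipulate the data: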

response = json.loads(response.content, encoding='utf-8')

But when I do print(response), I get something like this:

       {
'message': 'I have recommended your  name to all my family n friend
        s. Thankyou!!!!__ __ __ __ __ __ __
        }

From the database I get this error:

Incorrect string value: '\xF0\x9F\x98\x8A\xF0\x9F...'

What encoding am I getting? How do I convert it so that it can be stored in the database (MySQL)?

Answer 1

You can use unicodedata:

>>> import unicodedata
>>> title = u"Klüft skräms inför på fédéral électoral große"
>>> unicodedata.normalize('NFKD', title).encode('ascii','ignore')
'Kluft skrams infor pa federal electoral groe'
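
A minimal Python 3 sketch of the same idea, applied to a message that ends in emoji. The helper name `to_ascii` and the shortened sample message are assumptions made for illustration, not part of the original answer. It suggests that this approach transliterates accented letters but drops emoji entirely, since emoji have no ASCII decomposition:

```python
import unicodedata

def to_ascii(text):
    """Decompose accented characters (NFKD), then drop everything non-ASCII."""
    decomposed = unicodedata.normalize('NFKD', text)
    return decomposed.encode('ascii', 'ignore').decode('ascii')

title = "Klüft skräms inför på fédéral électoral große"
# Hypothetical shortened version of the message in the question.
message = "Thankyou!!!!\U0001F60A\U0001F60A"

ascii_title = to_ascii(title)      # accents stripped, 'ß' dropped
ascii_message = to_ascii(message)  # emoji dropped, nothing left in their place

print(ascii_title)
print(ascii_message)
```

This is lossy: once the emoji are removed they cannot be recovered from the stored value.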

Or just replace the characters with ones you specify yourself, to be turned back into emoji later:

>>> a=u"aaaàçççñññ"
>>> type(a)
<type 'unicode'>
>>> a.encode('ascii','ignore')
'aaa'
>>> a.encode('ascii','replace')
'aaa???????'
>>>
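
If the emoji need to survive, one possible variation (an assumption, not something the answer spells out) is the `backslashreplace` error handler, which writes each non-ASCII character as an ASCII escape instead of `?`. A Python 3 sketch with a shortened sample message:

```python
# Hypothetical shortened version of the message in the question.
message = "Thankyou!!!!\U0001F60A"

# Replace every non-ASCII character with an ASCII-only escape sequence.
escaped = message.encode("ascii", "backslashreplace").decode("ascii")
print(escaped)

# Turn the escapes back into the original characters.
restored = escaped.encode("ascii").decode("unicode_escape")
print(restored == message)
```

The round trip is only reliable when the original text contains no literal backslashes, because `unicode_escape` would interpret those as escapes too.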

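Or first encode it into a specific representation that can be stored. There are several common Unicode encodings, such as UTF-16 (two bytes for most Unicode characters) or UTF-8 (1-4 bytes per code point, depending on the character). To convert the string to a specific encoding, you can use: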

>>> s = u'£10'
>>> s.encode('utf8')
'\xc2\xa310'
>>> s.encode('utf16')
'\xff\xfe\xa3\x001\x000\x00'
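
The same calls applied to one of the emoji from the question, as a Python 3 sketch. The variable names are illustrative. It shows that the emoji takes four bytes in UTF-8, and that those bytes are the ones quoted in the MySQL error:

```python
emoji = "\U0001F60A"  # the character written as \ud83d\ude0a in the sample data

utf8_bytes = emoji.encode("utf-8")
utf16_bytes = emoji.encode("utf-16-le")  # little-endian, no byte-order mark

print(utf8_bytes)          # b'\xf0\x9f\x98\x8a'
print(len(utf8_bytes))     # 4
print(utf16_bytes.hex())   # 3dd80ade, i.e. the surrogate pair D83D DE0A
```

If the target column uses MySQL's 3-byte `utf8` character set, which is a common cause of "Incorrect string value" for input like this, `utf8mb4` is the MySQL character set that accepts 4-byte UTF-8 sequences. Whether that applies here depends on the table definition, which the question does not show.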

Answer 2

This is Unicode. You have to encode the string when storing it and decode it again when reading it back to print.
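
A rough Python 3 sketch of that round trip, assuming the raw response body looks like the sample in the question (shortened here to a single emoji). `json.loads` turns the `\ud83d\ude0a` escape pair into one character; the text is then encoded to UTF-8 bytes for storage and decoded when read back:

```python
import json

# Raw JSON text as it might arrive over the wire (hypothetical, shortened).
# The emoji is written as two \u escapes forming a UTF-16 surrogate pair.
raw = '{"message": "Thankyou!!!!\\ud83d\\ude0a"}'

data = json.loads(raw)             # escapes become a real emoji character
message = data["message"]

stored = message.encode("utf-8")   # bytes, as they would be written to storage
restored = stored.decode("utf-8")  # text again, after reading the bytes back

print(len(message))                # 13: twelve ASCII characters plus one emoji
print(stored[-4:])                 # b'\xf0\x9f\x98\x8a'
print(restored == message)         # True
```

In practice a database driver usually performs the encode and decode steps itself, according to the connection's character set, so the explicit calls here only illustrate what happens to the data.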
