UnicodeEncodeError when extracting Twitter data

Asked: 2014-12-20 23:20:32

Tags: python-3.x twitter tweepy python-unicode

I am trying to extract tweets by hashtag. Here is my Python source code:

import tweepy

consumer_key = "##"
consumer_secret = "##"
access_key = "##"
access_secret = "##"

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)

def main():
    api = tweepy.API(auth)
    search_string = "#cancer"
    search_results = api.search(
        q=search_string, lang='en', count=100, until='2014-12-19')

    for i in search_results:
        print(i)

if __name__ == '__main__':
    main()

This program raises the following UnicodeEncodeError:

Traceback (most recent call last):
  File "C:\Users\Jaggy Paw\Workspace\TwitterDataExtraction\streaming.py", line 26, in <module>
    main()
  File "C:\Users\Jaggy Paw\Workspace\TwitterDataExtraction\streaming.py", line 23, in main
    print(i)
  File "C:\Users\Jaggy Paw\Anaconda3\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2708' in position 1229: character maps to <undefined> 

How can I fix this error?

1 answer:

Answer 0: (score: 1)

Yay! After working on this for several hours, I solved the problem myself.

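I am running the program on Windows 7, and the problem occurs when printing to the console. As the traceback shows, the console stream encodes output as cp1252, which has no mapping for characters such as '\u2708' that appear in the tweets.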
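The failure can be reproduced without Twitter or tweepy. The following is a minimal sketch, assuming the console encoding is cp1252 as the traceback indicates; `can_encode` is a hypothetical helper name used only for illustration.

```python
def can_encode(text, encoding):
    """Return True if every character in text can be represented in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


airplane = "\u2708"  # the character named in the traceback

# cp1252 has no mapping for U+2708, while UTF-8 can encode any code point.
print(can_encode(airplane, "cp1252"))
print(can_encode(airplane, "utf-8"))
```

print() performs the same encoding step with the console's encoding, which is why the error surfaces inside cp1252.py rather than in the tweepy call.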

So I added the following lines to my code, without modifying any other part of it (the code given in the question):

import codecs
import sys

sys.stdout = codecs.getwriter('utf8')(sys.stdout.buffer, 'strict')
sys.stderr = codecs.getwriter('utf8')(sys.stderr.buffer, 'strict')
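To see what this wrapping does, the same writer can be applied to an in-memory byte buffer instead of sys.stdout.buffer. This is only a sketch: the `io.BytesIO` buffer is a stand-in for the console, and the variable names are illustrative assumptions.

```python
import codecs
import io

airplane = "\u2708"  # the character named in the traceback

# Stand-in for sys.stdout.buffer: an in-memory byte stream.
raw = io.BytesIO()

# Same construction as above: a UTF-8 StreamWriter over the byte stream.
writer = codecs.getwriter("utf8")(raw, "strict")
writer.write(airplane)

# The character is written as its three-byte UTF-8 sequence instead of raising.
utf8_bytes = raw.getvalue()
print(utf8_bytes)
```

On Python 3.7 and later, `sys.stdout.reconfigure(encoding="utf-8")` is another way to change the stream encoding. Whether the characters then display correctly still depends on the console's code page and font.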