How do I generate a network matrix from data I collected from the Twitter API?

Time: 2013-05-01 01:06:58

Tags: matplotlib twitter redis networkx

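Something of a Python newbie here. I have the Python code from Matthew Russell's books "21 Recipes for Mining Twitter" and "Mining the Social Web", and I want to use it for a project that collects various kinds of data from the Twitter API. The code is on his GitHub page.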

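One thing I can't figure out is how to generate a network matrix/graph from the relationships between a user and his/her followers/friends. For example, here is the Python code he uses to collect a user's friends on Twitter (also available here: https://github.com/ptwobrussell):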

# -*- coding: utf-8 -*-

import sys
import twitter
from recipe__make_twitter_request import make_twitter_request
import functools

SCREEN_NAME = sys.argv[1]
MAX_IDS = int(sys.argv[2])

if __name__ == '__main__':

    # Not authenticating lowers your rate limit to 150 requests per hr. 
    # Authenticate to get 350 requests per hour.

    t = twitter.Twitter(domain='api.twitter.com', api_version='1')

    # You could call make_twitter_request(t, t.friends.ids, *args, **kw) or 
    # use functools to "partially bind" a new callable with these parameters

    get_friends_ids = functools.partial(make_twitter_request, t, t.friends.ids)

    # Ditto if you want to do the same thing to get followers...

    # getFollowerIds = functools.partial(make_twitter_request, t, t.followers.ids)

    cursor = -1
    ids = []
    while cursor != 0:

        # Use make_twitter_request via the partially bound callable...

        response = get_friends_ids(screen_name=SCREEN_NAME, cursor=cursor)
        ids += response['ids']
        cursor = response['next_cursor']

        print >> sys.stderr, 'Fetched %i total ids for %s' % (len(ids), SCREEN_NAME)

        # Consider storing the ids to disk during each iteration to provide
        # an additional layer of protection from exceptional circumstances

        if len(ids) >= MAX_IDS:
            break

    # Do something useful with the ids like store them to disk...

    print ids 

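So I managed to run this code successfully, passing the user of interest as a command-line argument. But how do I actually turn this data into a matrix that I can then analyze, run measures on (such as centrality), and so on? So far I figure I may need some combination of packages, possibly including NetworkX, Redis, and Matplotlib, but the actual steps for generating the matrix escape me.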

1 Answer:

Answer 0: (score: 0)

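You can store the data in a database or in a file. It's best to choose based on what the software you'll use to analyze the data supports.

Here is an example of a file in .gdf format, which can hold both node and edge data: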

nodedef> id VARCHAR, label VARCHAR, followerCount VARCHAR
1623,jchris,5610
13348,Scobleizer,319673
21213,tlg,1141
...
edgedef> user VARCHAR,friend VARCHAR
1623,13348
1623,621713
...
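
A file in this layout could be produced with something like the sketch below. The function name `write_gdf` and the in-memory buffer are my own choices for illustration; the column headers and sample values are copied from the excerpt above, and I'm assuming Python 3.

```python
import io


def write_gdf(nodes, edges, out):
    """Write nodes and edges to `out` in the GDF layout shown above.

    nodes: iterable of (id, label, follower_count) tuples
    edges: iterable of (user_id, friend_id) tuples
    out:   any text file-like object (hypothetical helper, not from the book)
    """
    out.write("nodedef> id VARCHAR, label VARCHAR, followerCount VARCHAR\n")
    for node_id, label, follower_count in nodes:
        out.write("%s,%s,%s\n" % (node_id, label, follower_count))

    out.write("edgedef> user VARCHAR,friend VARCHAR\n")
    for user_id, friend_id in edges:
        out.write("%s,%s\n" % (user_id, friend_id))


# Sample rows taken from the excerpt above.
nodes = [(1623, "jchris", 5610), (13348, "Scobleizer", 319673), (21213, "tlg", 1141)]
edges = [(1623, 13348), (1623, 621713)]

buf = io.StringIO()
write_gdf(nodes, edges, buf)
gdf_text = buf.getvalue()
print(gdf_text)
```

To write to disk instead, pass an open text file (for example `open("friends.gdf", "w")`) in place of the buffer.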

The code you quoted in your example does the part that extracts the edges; you still need another extraction step to get the nodes.
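
As for the matrix itself, here is a minimal sketch using only the standard library, assuming Python 3 and assuming you already have the list of friend ids that the script prints. The names `build_adjacency` and `in_degree_centrality` are mine, and the ids are the sample values from the .gdf excerpt. Each (user, friend) pair becomes a directed edge meaning "user follows friend".

```python
def build_adjacency(edges):
    """Return (ids, matrix) where matrix[i][j] == 1 if ids[i] follows ids[j]."""
    # Every id that appears on either side of an edge becomes a row/column.
    ids = sorted({node_id for edge in edges for node_id in edge})
    index = {node_id: i for i, node_id in enumerate(ids)}
    matrix = [[0] * len(ids) for _ in ids]
    for user_id, friend_id in edges:
        matrix[index[user_id]][index[friend_id]] = 1
    return ids, matrix


def in_degree_centrality(ids, matrix):
    """Fraction of the other nodes that point at each node."""
    n = len(ids)
    if n < 2:
        return {node_id: 0.0 for node_id in ids}
    return {ids[j]: sum(row[j] for row in matrix) / (n - 1) for j in range(n)}


# Hypothetical input: one user and the friend ids collected for that user.
user_id = 1623
friend_ids = [13348, 621713]
edges = [(user_id, friend_id) for friend_id in friend_ids]

ids, matrix = build_adjacency(edges)
centrality = in_degree_centrality(ids, matrix)

for node_id, row in zip(ids, matrix):
    print(node_id, row)
print(centrality)
```

With a single user the matrix is just one star, so the measures only become interesting once you repeat the friend collection for each friend and add those edges too. If you would rather not hand-roll this, NetworkX can take the same edge list via `nx.DiGraph()` and `add_edges_from`, and it provides `nx.in_degree_centrality` for the same measure.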