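Getting all tweets from a user's timeline with twitter4j and Java

Time: 2018-06-13 15:10:32

Tags: java twitter twitter4j

I have a problem and would appreciate any help. I am trying to fetch all the tweets of a specific user. Here is my code: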

    Paging pg = new Paging();
    String userName = "Obama";
    pg.setCount(200);
    ConfigurationBuilder cb = new ConfigurationBuilder();
    cb.setOAuthConsumerKey("");
    cb.setOAuthConsumerSecret("");
    cb.setOAuthAccessToken("");
    cb.setOAuthAccessTokenSecret("");

    Twitter twitter = new TwitterFactory(cb.build()).getInstance();
    int numberOfTweets = 1000000;
    long lastID = Long.MAX_VALUE;
    ArrayList<Status> tweets = new ArrayList<Status>();
    while (tweets.size() < numberOfTweets) {
        tweets.addAll(twitter.getUserTimeline(userName, pg));
        //System.out.println("Gathered " + tweets.size() + " tweets");
        for (Status t : tweets) {
            System.out.println(t.getUser().getName() + ": " + t.getText() + " ");
        }
        pg.setMaxId(lastID - 1);
    }
    System.out.println(tweets.size());
    } // end of the enclosing method (not shown)

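The problem is that I keep getting the same results: the loop fetches only the first few tweets from the timeline and repeats them X times, while the profile has millions of tweets. Can anyone tell me how to fix this? Thanks.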

1 answer:

Answer 0: (score: 1)

Here is one way to do it:

    // Assumes `twitter` is the configured Twitter instance from the question
    // and `screenName` holds the target account's screen name.
    ArrayList<Status> statuses = new ArrayList<>();
    int pageno = 1;
    while (true) {
        try {
            System.out.println("getting tweets");
            int size = statuses.size(); // number of tweets collected so far
            Paging page = new Paging(pageno, 200); // page number, 200 tweets per page
            statuses.addAll(twitter.getUserTimeline(screenName, page));
            System.out.println("total got : " + statuses.size());
            if (statuses.size() == size) { break; } // no new tweets, so we are done
            pageno++;
            sleep(1000); // 900 requests / 15 min <=> 1 request/s
        } catch (TwitterException e) {
            System.out.println(e.getErrorMessage());
            break; // stop rather than retry the same page forever without pausing
        }
    } // while(true)
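The loop above pages by page number. The question's loop attempted a different scheme, cursoring with `max_id`, but it never updates `lastID`, so every request asks for tweets below `Long.MAX_VALUE - 1` and the newest page comes back each time. Below is a minimal sketch of how that cursor is meant to advance. It does not call Twitter: `Tweet`, `fakeTimeline`, and `collectAll` are hypothetical stand-ins so the logic can run on its own. With twitter4j, the fetch step would presumably be `twitter.getUserTimeline(userName, pg)` after `pg.setMaxId(maxId)`, with IDs read through `Status.getId()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongFunction;

/** Sketch of max_id cursoring; the timeline here is a fake, not the Twitter API. */
class MaxIdPagingSketch {

    /** Hypothetical stand-in for a tweet: only the ID matters for paging. */
    static final class Tweet {
        final long id;
        Tweet(long id) { this.id = id; }
    }

    /**
     * Collects tweets page by page. fetchPage receives the current maxId and
     * must return tweets with id <= maxId, newest first (empty when exhausted).
     * The result may exceed limit by up to one page, as in the question's loop.
     */
    static List<Tweet> collectAll(LongFunction<List<Tweet>> fetchPage, int limit) {
        List<Tweet> all = new ArrayList<>();
        long maxId = Long.MAX_VALUE;
        while (all.size() < limit) {
            List<Tweet> page = fetchPage.apply(maxId);
            if (page.isEmpty()) {
                break; // nothing older is available
            }
            all.addAll(page);
            // The step missing from the question: move the cursor below the
            // lowest ID seen so far, so the next page holds only older tweets.
            long lowest = maxId;
            for (Tweet t : page) {
                lowest = Math.min(lowest, t.id);
            }
            maxId = lowest - 1;
        }
        return all;
    }

    /** Fake timeline holding IDs n..1, served pageSize tweets at a time. */
    static LongFunction<List<Tweet>> fakeTimeline(int n, int pageSize) {
        return maxId -> {
            List<Tweet> page = new ArrayList<>();
            for (long id = Math.min(n, maxId); id >= 1 && page.size() < pageSize; id--) {
                page.add(new Tweet(id));
            }
            return page;
        };
    }

    public static void main(String[] args) {
        List<Tweet> tweets = collectAll(fakeTimeline(450, 200), 1_000_000);
        System.out.println(tweets.size()); // prints 450
    }
}
```

Whichever paging scheme is used, the API reference for this endpoint states that it returns only up to the 3,200 most recent tweets of a user, so a target of 1,000,000 is not reachable this way; the empty-page check is what ends the loop.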

The `sleep(1000)` call above needs a helper function, which keeps the loop within the rate limit:

    static void sleep(long ms) {
        try { Thread.sleep(ms); }
        catch (InterruptedException ex) { Thread.currentThread().interrupt(); }
    }
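The one-second pause comes from the arithmetic in the comment: 900 requests per 15 minutes works out to one request per second. As a rough sketch of the same idea, the delay can be derived from the limit instead of being hard-coded. `RequestPacer` and its methods are hypothetical names, not part of twitter4j, and the limit values passed in are assumptions taken from the comment above.

```java
import java.time.Duration;

/** Sketch: spaces requests evenly so that a rate limit is not exceeded. */
class RequestPacer {
    private final long intervalMillis;
    private boolean started = false;
    private long nextAllowedAt; // monotonic milliseconds, valid once started

    RequestPacer(int maxRequests, Duration window) {
        this.intervalMillis = minIntervalMillis(maxRequests, window);
    }

    /** Smallest spacing in ms that keeps maxRequests within window (rounded up). */
    static long minIntervalMillis(int maxRequests, Duration window) {
        if (maxRequests <= 0) {
            throw new IllegalArgumentException("maxRequests must be positive");
        }
        return (window.toMillis() + maxRequests - 1) / maxRequests;
    }

    /** Returns how long to wait at time nowMillis and reserves the next slot. */
    long reserve(long nowMillis) {
        if (!started) {
            started = true;
            nextAllowedAt = nowMillis + intervalMillis;
            return 0L; // the first request may go out immediately
        }
        long wait = Math.max(0L, nextAllowedAt - nowMillis);
        nextAllowedAt = nowMillis + wait + intervalMillis;
        return wait;
    }

    /** Blocks until the next request is allowed. */
    void await() throws InterruptedException {
        long wait = reserve(System.nanoTime() / 1_000_000L);
        if (wait > 0) {
            Thread.sleep(wait);
        }
    }

    public static void main(String[] args) {
        // 900 requests per 15 minutes, the figure quoted in the answer.
        System.out.println(minIntervalMillis(900, Duration.ofMinutes(15))); // prints 1000
    }
}
```

In the answer's loop, `pacer.await()` would take the place of `sleep(1000)`. Time already spent on the network call then counts toward the interval, so the loop waits only for the remainder.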

Reference: https://developer.twitter.com/en/docs/tweets/timelines/api-reference/get-statuses-user_timeline.html