【Posted】: 2016-04-22 09:38:04
【Question】:
I have a list of Twitter screen names and want to collect 3200 tweets for each one. Below is the code I adapted from https://gist.github.com/yanofsky/5436496:
#initialize a list to hold all the tweepy Tweets
alltweets = []
#screen names
r=['user_a', 'user_b', 'user_c']
#saving tweets
writefile=open("tweets.csv", "wb")
w=csv.writer(writefile)
for i in r:
    #make initial request for most recent tweets (200 is the maximum allowed count)
    new_tweets = api.user_timeline(screen_name = i, count=200)
    #save most recent tweets
    alltweets.extend(new_tweets)
    #save the id of the oldest tweet less one
    oldest = alltweets[-1].id - 1
    #keep grabbing tweets until there are no tweets left to grab
    while len(new_tweets) > 0:
        print "getting tweets before %s" % (oldest)
        #all subsequent requests use the max_id param to prevent duplicates
        new_tweets = api.user_timeline(screen_name = i[0],count=200,max_id=oldest)
        #save most recent tweets
        alltweets.extend(new_tweets)
        #update the id of the oldest tweet less one
        oldest = alltweets[-1].id - 1
        print "...%s tweets downloaded so far" % (len(alltweets))
    #write the csv
    for tweet in alltweets:
        w.writerow([i, tweet.id_str, tweet.created_at, tweet.text.encode("utf-8")])
writefile.close()
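In the end, the resulting csv file contains 3200 tweets for user_a, about 6400 for user_b, and 9600 for user_c. Something in the code above is wrong: each user should have about 3200 tweets. Can anyone point out what is wrong with my code? Thanks.
【Discussion】: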
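One thing that stands out in the posted code: alltweets is created once, before the for loop, and is never emptied. Each user's write pass therefore writes everything collected so far, which would match the 3200 / 6400 / 9600 pattern. Separately, the request inside the while loop passes screen_name = i[0]; when i is a string such as 'user_a', that is only its first character. Below is a minimal sketch of one way to restructure it, written for Python 3 and not tested against the live API. The names fetch_all, write_timelines and make_fake_timeline are hypothetical, and the fake timeline is only a stand-in for api.user_timeline so the sketch runs offline. It assumes the real call accepts the same screen_name, count and max_id keywords used in the question.

```python
import csv
import io
from types import SimpleNamespace

PAGE_SIZE = 200  # page size used in the question


def fetch_all(fetch_page, screen_name):
    """Collect one user's tweets, paging backwards with max_id."""
    tweets = []  # fresh list for every user, so nothing carries over
    page = fetch_page(screen_name=screen_name, count=PAGE_SIZE)
    while page:
        tweets.extend(page)
        # ask only for tweets older than the oldest one seen so far
        oldest = tweets[-1].id - 1
        page = fetch_page(screen_name=screen_name, count=PAGE_SIZE,
                          max_id=oldest)
    return tweets


def write_timelines(fetch_page, screen_names, out):
    """Write one CSV row per tweet; return the tweet count per user."""
    writer = csv.writer(out)
    counts = {}
    for name in screen_names:
        tweets = fetch_all(fetch_page, name)
        for tweet in tweets:
            writer.writerow([name, tweet.id_str, tweet.created_at, tweet.text])
        counts[name] = len(tweets)
    return counts


def make_fake_timeline(tweets_per_user):
    """Hypothetical offline stand-in for api.user_timeline (assumption)."""
    def user_timeline(screen_name, count=200, max_id=None):
        ids = list(range(tweets_per_user, 0, -1))  # newest first
        if max_id is not None:
            ids = [t for t in ids if t <= max_id]
        return [
            SimpleNamespace(id=t, id_str=str(t), created_at="2016-04-22",
                            text="tweet %d from %s" % (t, screen_name))
            for t in ids[:count]
        ]
    return user_timeline


if __name__ == "__main__":
    buf = io.StringIO()
    names = ["user_a", "user_b", "user_c"]
    print(write_timelines(make_fake_timeline(450), names, buf))
```

With a real tweepy client, the idea would be to pass api.user_timeline as fetch_page and an open csv file as out. The key design choice is that the list of tweets lives inside fetch_all, so each user starts from an empty list.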