[Posted]: 2021-07-09 09:28:08
[Question]:
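I have a script that returns tweets matching a keyword query and appends them to a CSV. I can't see why the script only returns 200 tweets each time it runs. It isn't the count parameter, because as far as I know that sets the number of tweets returned per page, up to a maximum of 100.
Can anyone see what is going on?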
def twitter_search(twitter_api, q, max_results=3000, **kw):
    search_results = twitter_api.search.tweets(q=q, count=100, **kw, lang='en', tweet_mode='extended')
    statuses = search_results['statuses']
    # Iterate through batches of results until we get the number we want
    # Enforce a reasonable limit
    max_results = min(5000, max_results)
    for _ in range(100):
        try:
            next_results = search_results['search_metadata']['next_results']
        except KeyError:  # no more results when next_results doesn't exist
            break
        # Create a dictionary from next_results
        kwargs = dict([kv.split('=') for kv in next_results[1:].split("&")])
        search_results = twitter_api.search.tweets(**kwargs)
        statuses += search_results['statuses']
        if len(statuses) > max_results:
            break
    return statuses
I think this has something to do with the cursor iterating to the next batch of results, but I can't work out why it happens...
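One thing worth checking in the paging step is how next_results is turned into keyword arguments. Splitting on '=' and '&' leaves the values percent-encoded, so the query passed to the next call may not match the original one. The sketch below decodes the string with the standard library instead. The helper name parse_next_results and the sample string are hypothetical, and this is only one possible contributor to the 200-tweet limit, not a confirmed cause.

```python
from urllib.parse import parse_qsl


def parse_next_results(next_results):
    """Turn a next_results string such as '?max_id=1&q=%23python' into decoded kwargs."""
    # Drop the leading '?', then let parse_qsl split the pairs and undo percent-encoding.
    return dict(parse_qsl(next_results.lstrip('?')))


# Hypothetical next_results value, for illustration only.
example = '?max_id=123&q=%23python%20lang%3Aen&count=100&include_entities=1'
kwargs = parse_next_results(example)
print(kwargs['q'])  # prints: #python lang:en
```

With the plain split used in the question, the same q would stay as '%23python%20lang%3Aen'. Any parameter that is missing from next_results would still have to be passed again by the caller.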
[Discussion]:
-
I'm having a hard time understanding how to integrate the cursor functionality described on this page: developer.twitter.com/en/docs/pagination
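As a rough illustration of the paging pattern, here is a minimal sketch that keeps requesting pages until a result cap is reached or no next_results is returned. The fetch function is injected, so the loop can be tried without network access. The names collect_statuses, fetch_page, max_pages and fake_fetch are all hypothetical, and the fake pages only imitate the response shape used in the question (a 'statuses' list plus 'search_metadata').

```python
from urllib.parse import parse_qsl


def collect_statuses(fetch_page, max_results=3000, max_pages=100, **first_kwargs):
    """Follow next_results from page to page until the cap or the last page is reached."""
    page = fetch_page(**first_kwargs)
    statuses = list(page['statuses'])
    for _ in range(max_pages):
        if len(statuses) >= max_results:
            break
        next_results = page.get('search_metadata', {}).get('next_results')
        if not next_results:  # no further page is advertised
            break
        # Decode the query string into keyword arguments for the next request.
        kwargs = dict(parse_qsl(next_results.lstrip('?')))
        page = fetch_page(**kwargs)
        statuses.extend(page['statuses'])
    return statuses[:max_results]


# Stand-in for the real API: three pages of two fake tweets each.
def fake_fetch(q, max_id=None, **_):
    start = int(max_id) if max_id is not None else 0
    page = {'statuses': [{'id': start}, {'id': start + 1}], 'search_metadata': {}}
    if start < 4:
        page['search_metadata']['next_results'] = f'?max_id={start + 2}&q={q}'
    return page


tweets = collect_statuses(fake_fetch, max_results=5, q='python')
print(len(tweets))  # prints: 5
```

In a real script, fetch_page would presumably be twitter_api.search.tweets from the question. The stub exists only to make the loop's stopping conditions easy to see.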
-
Give it a try. Do you want to return values from the query?