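【Posted】: 2021-12-19 06:32:22
【Question】:
I want to extract tweets and then use them in a predictive analytics model.
I have a standard Twitter developer account. I created a project, created an app under that project, and I am using that app's token.
My code is as follows: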
import os
import requests

os.environ['TOKEN'] = 'I put my token here'

def auth():
    # Read the bearer token from the environment
    return os.getenv('TOKEN')

def create_headers(bearer_token):
    headers = {"Authorization": "Bearer {}".format(bearer_token)}
    return headers

def create_url(keyword, start_date, end_date, max_results = 10):
    search_url = "https://api.twitter.com/2/tweets/search/all"  # Change to the endpoint you want to collect data from
    # Change params based on the endpoint you are using
    query_params = {'query': keyword,
                    'start_time': start_date,
                    'end_time': end_date,
                    'max_results': max_results,
                    'expansions': 'author_id,in_reply_to_user_id,geo.place_id',
                    'tweet.fields': 'id,text,author_id,in_reply_to_user_id,geo,conversation_id,created_at,lang,public_metrics,referenced_tweets,reply_settings,source',
                    'user.fields': 'id,name,username,created_at,description,public_metrics,verified',
                    'place.fields': 'full_name,id,country,country_code,geo,name,place_type',
                    'next_token': {}}  # placeholder, overwritten in connect_to_endpoint
    return (search_url, query_params)

def connect_to_endpoint(url, headers, params, next_token = None):
    params['next_token'] = next_token  # params object received from create_url function
    response = requests.request("GET", url, headers = headers, params = params)
    print("Endpoint Response Code: " + str(response.status_code))
    if response.status_code != 200:
        raise Exception(response.status_code, response.text)
    return response.json()

# Inputs for the request
bearer_token = auth()
headers = create_headers(bearer_token)
keyword = "xbox lang:en"
start_time = "2021-03-01T00:00:00.000Z"
end_time = "2021-03-31T00:00:00.000Z"
max_results = 15

url = create_url(keyword, start_time, end_time, max_results)
json_response = connect_to_endpoint(url[0], headers, url[1])
When I run the last 2 lines, I get the following error:
(403, '{"client_id":"22361938","detail":"When authenticating requests to the Twitter API v2 endpoints, you must use keys and tokens from a Twitter developer App that is attached to a Project. You can create a project via the developer portal.","registration_url":"https://developer.twitter.com/en/docs/projects/overview","title":"Client Forbidden","required_enrollment":"Standard Basic","reason":"client-not-enrolled","type":"https://api.twitter.com/2/problems/client-forbidden"}')
or sometimes
(401, '{"errors":[{"message":"Invalid or expired token","code":89}]}\n')
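To make the two failures easier to compare, a small helper along these lines could pull the relevant fields out of either error body. It is only a sketch: `describe_error` is a hypothetical name, and it assumes the body is either JSON with an `errors` list (as in the 401 case) or JSON with `title`, `reason` and `required_enrollment` keys (as in the 403 case).

```python
import json

def describe_error(status_code, body_text):
    """Summarize an error body in one line (hypothetical helper, not part of any library)."""
    raw = "HTTP {}: {}".format(status_code, body_text.strip())
    try:
        payload = json.loads(body_text)
    except ValueError:
        # Body was not JSON; fall back to the raw text
        return raw
    if not isinstance(payload, dict):
        return raw
    if "errors" in payload:
        # Shape seen in the 401 case: {"errors": [{"message": ..., "code": ...}]}
        details = ["{} (code {})".format(e.get("message"), e.get("code"))
                   for e in payload["errors"]]
    else:
        # Shape seen in the 403 case: title / reason / required_enrollment
        details = ["{}={}".format(key, payload[key])
                   for key in ("title", "reason", "required_enrollment")
                   if key in payload]
    return "HTTP {}: {}".format(status_code, "; ".join(details))
```

It could be called from an `except Exception as exc:` block around `connect_to_endpoint`, passing `exc.args[0]` and `exc.args[1]`, since the exception is raised with the status code and response text as its arguments.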
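I found this code on Towards Data Science and Kaggle and wanted to try running it. I would also like to get tweets only from a specific country (India); I know I need to use place_country, but I don't know how. Another thing I would like to do is get all tweets from the previous day (not only those matching the query keyword, and not just the 10 tweets in my code). If anyone can point me to working code for extracting tweets, that would be great too.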
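As a rough sketch of what is being asked for here, the pieces might look like the following. It rests on several assumptions: the country filter is written as the `place_country:` search operator with a two-letter country code (`IN` for India), which may not be available at every access level; "the previous day" means the previous UTC calendar day; and further pages are requested by passing back the `next_token` found under `meta` in each response. The names `previous_day_window`, `build_query` and `collect_all_pages` are hypothetical helpers, and the sketch still requires a keyword in the query.

```python
from datetime import datetime, timedelta, timezone

def previous_day_window(now=None):
    """Return (start_time, end_time) strings covering the previous UTC day."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    today_midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    yesterday_midnight = today_midnight - timedelta(days=1)
    fmt = "%Y-%m-%dT%H:%M:%S.000Z"  # same timestamp format as in the question
    return yesterday_midnight.strftime(fmt), today_midnight.strftime(fmt)

def build_query(keyword, country_code=None):
    """Append a place_country filter to the keyword query (assumed operator syntax)."""
    if country_code:
        return "{} place_country:{}".format(keyword, country_code)
    return keyword

def collect_all_pages(fetch_page, max_pages=10):
    """Call fetch_page(next_token) repeatedly and gather the tweets of every page.

    fetch_page is any callable that returns the parsed JSON of one page;
    max_pages is a safety cap chosen arbitrarily for this sketch.
    """
    tweets = []
    next_token = None
    for _ in range(max_pages):
        page = fetch_page(next_token)
        tweets.extend(page.get("data", []))
        next_token = page.get("meta", {}).get("next_token")
        if not next_token:
            break  # no further pages
    return tweets
```

With the functions from the question, this could be wired up as `start_time, end_time = previous_day_window()`, then `url = create_url(build_query("xbox lang:en", "IN"), start_time, end_time, 100)`, and finally `collect_all_pages(lambda token: connect_to_endpoint(url[0], headers, url[1], token))`.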
【Comments】:
Tags: python machine-learning twitter twitterapi-python