【Question title】: Python Tweepy get all tweets based on Geocode
【Posted】: 2020-05-30 11:40:43
【Question】:

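I am trying to fetch all tweets within a given radius around a given coordinate. The script runs without errors but returns zero entries. Strangely, exactly the same code worked for me a few days ago; now it does not, and I am stuck :(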

import pandas as pd
import tweepy

#Twitter credentials for the app
consumer_key = 'xxx'
consumer_secret = 'xxx'
access_key= 'xxx'
access_secret = 'xxx'

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth, wait_on_rate_limit=True)

#Create list for column names
COLS = ['id','created_at','lang','original text','user_name', 'place', 'place type', 'bbx', 'coordinates']

geo='48.136353, 11.575004, 25km'

def write_tweets(keyword):

    #create dataframe from defined column list
    df = pd.DataFrame(columns=COLS)

    #iterate through pages with given condition
    #using tweepy.Cursor object with items() method
    for page in tweepy.Cursor(api.search, q=keyword,
                                  include_rts=False,
                                  geocode=geo).pages():

                for tweet in page:
                    #creating string array
                    new_entry = []

                    #storing all JSON data from twitter API
                    tweet = tweet._json    

                    #Append the JSON parsed data to the string list:

                    new_entry += [tweet['id'], tweet['created_at'], tweet['lang'], tweet['text'], 
                                  tweet['user']['name']]

                    #check if place name is available, in case not the entry is named 'no place'
                    try:
                        place = tweet['place']['name']
                    except TypeError:
                        place = 'no place'
                    new_entry.append(place)

                    try:
                        place_type = tweet['place']['place_type']
                    except TypeError:
                        place_type = 'na'
                    new_entry.append(place_type)

                    try:
                        bbx = tweet['place']['bounding_box']['coordinates']
                    except TypeError:
                        bbx = 'na'
                    new_entry.append(bbx)

                    #check if coordinates is available, in case not the entry is named 'no coordinates'
                    try:
                        coord = tweet['coordinates']['coordinates']
                    except TypeError:
                        coord = 'no coordinates'
                    new_entry.append(coord)

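                    # wrap up all the data into a data frame
                    single_tweet_df = pd.DataFrame([new_entry], columns=COLS)
                    # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
                    df = pd.concat([df, single_tweet_df], ignore_index=True)

    #get rid of tweets without a place (done once, after all pages are read,
    #so df_cleaned is defined even when the search returns nothing)
    df_cleaned = df[df.place != 'no place']
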
    print("tweets with place:")
    print(len(df[df.place != 'no place']))

    print("tweets with coordinates:")
    print(len(df[df.coordinates != 'no coordinates']))

    df_cleaned.to_csv('tweets_'+geo+'.csv', columns=COLS,index=False)

#declare keywords as a query
keyword='*'

#call main method passing keywords and file path
write_tweets(keyword)

As far as I can tell, this is how the geocode parameter is supposed to work.

Does anyone have an idea?

【Comments】:

  • Did you ever figure this out?

Tags: python twitter tweepy geocode


【Solution 1】:

When you declare the variable geo, do not leave any spaces between the commas and the numbers.

It should look like this:

geo='48.136353,11.575004,25km'
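
If you would rather not hand-write the string, a small helper can build it for you. This is only a sketch: make_geocode and normalize_geocode are hypothetical names, not part of Tweepy, and the code assumes the search endpoint expects latitude,longitude,radius with a unit of km or mi and no spaces.

```python
def make_geocode(lat, lon, radius, unit="km"):
    """Build a 'latitude,longitude,radius' string with no spaces (hypothetical helper)."""
    if unit not in ("km", "mi"):
        raise ValueError("unit must be 'km' or 'mi'")
    return f"{lat},{lon},{radius}{unit}"


def normalize_geocode(geocode):
    """Remove whitespace around each comma-separated part of an existing string."""
    return ",".join(part.strip() for part in geocode.split(","))


# Build the value from numbers ...
geo = make_geocode(48.136353, 11.575004, 25)
print(geo)

# ... or clean up a string that was typed with spaces, as in the question.
print(normalize_geocode('48.136353, 11.575004, 25km'))
```

Both calls print 48.136353,11.575004,25km, which can then be passed as geocode=geo to the tweepy.Cursor call from the question.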
