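【Posted】: 2020-06-05 01:28:51
【Question】:
Code to reverse geocode latitude and longitude into US zip codes; originally used to determine the zip codes of shooting incidents in New York City.
【Question discussion】:
Tags: python pandas reverse-geocoding zipcode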
Sample output:
             lat        lon zipcode
0      40.896504 -73.859042   10470
1      40.732804 -74.005666   10014
2      40.674142 -73.936206   11213
3      40.648025 -73.904011   11236
4      40.764694 -73.914348   11103
...          ...        ...     ...
20654  40.710989 -73.942949   11211
20655  40.682398 -73.840079   11416
20656  40.651014 -73.945707   11226
20657  40.835990 -73.916276   10452
20658  40.857771 -73.894606   10458
Load the dataset (optional):
import pandas as pd
# Load the dataset used here
df_shooting = pd.read_csv('Shooting_NY.csv', sep=';', low_memory=False)
Reverse geocoding code:
# Install first (in a shell): pip install uszipcode
# Import packages
import pandas as pd
from uszipcode import SearchEngine

# simple_zipcode=True is accepted by older uszipcode releases; newer ones may need SearchEngine() instead
search = SearchEngine(simple_zipcode=True)

# Define zipcode search function
def get_zipcode(lat, lon):
    # Ask for the single zip code closest to the given point
    result = search.by_coordinates(lat=lat, lng=lon, returns=1)
    # Return None instead of raising IndexError when nothing is found
    return result[0].zipcode if result else None

# Load columns from dataframe
lat = df_shooting['Latitude']
lon = df_shooting['Longitude']

# Build a dataframe holding the latitude/longitude passed to the function
df = pd.DataFrame({'lat': lat, 'lon': lon})

# Add new column with generated zip code
df['zipcode'] = df.apply(lambda x: get_zipcode(x.lat, x.lon), axis=1)

# Print result
print(df)

# (Optional) save as csv
#df.to_csv(r'zip_codes.csv')
Note the long runtime (20k rows take 5-7 minutes). Still, this was the most efficient code we managed to come up with that does not rely on the (paid) Google API.
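One possible way to cut the runtime, assuming many rows share the same coordinates (which may or may not hold for this dataset), is to call the lookup only once per distinct point and reuse the result. The sketch below is not part of the original post: `lookup_unique` and its `ndigits` parameter are hypothetical names, and the lookup function is passed in as an argument so that `get_zipcode` from the code above can be plugged in unchanged.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Coord = Tuple[float, float]


def lookup_unique(
    coords: Iterable[Coord],
    lookup: Callable[[float, float], Optional[str]],
    ndigits: int = 6,
) -> List[Optional[str]]:
    """Return one zip code per input pair, calling `lookup` once per distinct pair.

    `ndigits` (hypothetical parameter) controls how coordinates are rounded
    before they are compared; points that round to the same key share a result.
    """
    cache: Dict[Coord, Optional[str]] = {}
    zipcodes: List[Optional[str]] = []
    for lat, lon in coords:
        key = (round(lat, ndigits), round(lon, ndigits))
        if key not in cache:
            # Only the first occurrence of a point triggers the slow lookup
            cache[key] = lookup(lat, lon)
        zipcodes.append(cache[key])
    return zipcodes
```

With the dataframe from above, this could be used as df['zipcode'] = lookup_unique(zip(df['lat'], df['lon']), get_zipcode). The saving depends entirely on how many coordinate pairs repeat; with no repeats it does the same work as the apply() version.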
【Discussion】:
Here is my code, which I think is a bit simpler:
# !pip install uszipcode
# Import packages
from uszipcode import SearchEngine

search = SearchEngine(simple_zipcode=True)

# Look up the closest zip code for every row
zipcodes = []
for index, row in df.iterrows():
    # 'lat' and 'lon' are the column names from the question; adjust them to your dataframe
    result = search.by_coordinates(lat=row["lat"], lng=row["lon"], returns=1)
    # Append None when no zip code is found, so the list stays aligned with the rows
    zipcodes.append(result[0].zipcode if result else None)

# Add zipcode to the dataframe (one value per row)
df["Zipcode"] = zipcodes

# Save dataframe to csv file (specify path)
df.to_csv("Resouces/df.csv", index=False)

# You can also use itertuples(). It is considerably faster than iterrows().
# The for loop would then look like this:
zipcodes = []
for row in df.itertuples(index=False):
    result = search.by_coordinates(lat=row.lat, lng=row.lon, returns=1)
    zipcodes.append(result[0].zipcode if result else None)