[Posted]: 2020-10-16 08:41:42
[Question]:
test.csv looks like this:
device_id,upload_time,latitude,longitude,mileage,other_vals,speed,upload_time_1
11115304371,2020-08-05 05:10:05+00:00,23.140366,114.18685,0,,0,202008
1234,2020-08-05 05:10:33+00:00,22.994716,114.2998,0,,0,202008
11115304371,2020-08-05 05:20:55+00:00,22.994716,114.2998,0,,3.8,202008
11115304371,2020-08-05 05:24:02+00:00,22.994916,114.299683,0,,2.1,202008
11115304371,2020-08-05 05:24:30+00:00,22.99545,114.2998,0,,6.5,202008
11115304371,2020-08-05 05:29:30+00:00,22.995433,114.299766,0,,3.4,202008
11115304371,2020-08-05 05:34:30+00:00,22.995433,114.299766,0,,3.4,202008
11115304371,2020-08-05 05:39:30+00:00,22.995433,114.299766,0,,3.4,202008
822649e2d142a486,2020-08-05 05:44:30+00:00,22.995433,114.299766,0,,3.4,202008
11115304371,2020-08-05 05:44:53+00:00,22.995433,114.299766,0,,3.4,202008
11115304371,2020-08-05 05:45:40+00:00,22.995433,114.299766,0,,5.8,202008
And info.csv looks like this:
car_id,device_id,car_type,car_num,marketer_name
1,11110110037,1,AAA,T1
2,11115304371,1,BBB,T2
3,11111100345,1,CCC,T3
4,11111100242,1,DDD,T4
5,12221100034,1,EEE,T5
6,12221100230,1,FFF,T6
7,14465301234,1,GGG,T7
When I merge the two DataFrames with this code:
import pandas as pd
df_device_data = pd.read_csv(r'E:/test.csv', encoding='utf-8', parse_dates=[1], low_memory=False)
df_common_car_info = pd.read_csv(r'E:/info.csv', encoding='utf-8', low_memory=False)
result = pd.merge(df_device_data, df_common_car_info, how='left', on='device_id')
result.to_csv(r'E:/result.csv', index=False, mode='w', header=True)
I get this error:
ValueError: You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat
How can I fix it?
[Comments]:
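- Because of the device_id "822649e2d142a486", the device_id column in test.csv is read as object dtype (strings), while device_id in info.csv is read as int64. Convert device_id in info.csv to a string so both merge keys have the same dtype.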
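A minimal sketch of the fix suggested in the comment, assuming it is acceptable to treat device_id as text in both files. The inline CSV strings are shortened stand-ins for test.csv and info.csv (only a few rows and columns from the question), not the real files; with the real files you would pass the file paths to read_csv instead.

```python
import io
import pandas as pd

# Shortened stand-ins for test.csv and info.csv (assumption: a subset of the question's rows).
test_csv = io.StringIO(
    "device_id,upload_time,speed\n"
    "11115304371,2020-08-05 05:10:05+00:00,0\n"
    "1234,2020-08-05 05:10:33+00:00,0\n"
    "822649e2d142a486,2020-08-05 05:44:30+00:00,3.4\n"
)
info_csv = io.StringIO(
    "car_id,device_id,car_type,car_num,marketer_name\n"
    "1,11110110037,1,AAA,T1\n"
    "2,11115304371,1,BBB,T2\n"
)

# Read device_id as a string in both files so the merge keys share one dtype.
df_device_data = pd.read_csv(test_csv, dtype={"device_id": str})
df_common_car_info = pd.read_csv(info_csv, dtype={"device_id": str})

# The left merge keeps every device row; unmatched devices get NaN in the car columns.
result = pd.merge(df_device_data, df_common_car_info, how="left", on="device_id")
print(result[["device_id", "car_num"]])
```

If the DataFrames are already loaded, converting the integer column in place with `df_common_car_info['device_id'] = df_common_car_info['device_id'].astype(str)` before calling `pd.merge` should have the same effect.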