【Question title】: Joining two data frames that appear to be same type gives error 'ValueError: You are trying to merge on object and int64 columns'
【Posted】: 2019-10-13 03:56:45
【Question】:

I have two data frames, sessions1 and sessions2, that I want to join on the field 'ga:dimension1'.

sessions1.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 15775 entries, 0 to 15774
Data columns (total 9 columns):
ga:dimension1                15775 non-null object
ga:date                      15775 non-null object
ga:deviceCategory            15775 non-null object
ga:landingPagePath           15775 non-null object
ga:userType                  15775 non-null object
ga:operatingSystem           15775 non-null object
ga:operatingSystemVersion    15775 non-null object
ga:sessions                  15775 non-null int64
ga:bounces                   15775 non-null int64
dtypes: int64(2), object(7)
memory usage: 1.1+ MB
sessions2.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 15774 entries, 0 to 15773
Data columns (total 9 columns):
ga:dimension1         15774 non-null object
ga:source             15774 non-null object
ga:medium             15774 non-null object
ga:campaign           15774 non-null object
ga:adContent          15774 non-null object
ga:keyword            15774 non-null object
ga:channelGrouping    15774 non-null object
ga:sessions           15774 non-null int64
ga:bounces            15774 non-null int64
dtypes: int64(2), object(7)
memory usage: 1.1+ MB

Looking at the first few rows, they at least appear to be the same:

sessions1.head()
            ga:dimension1   ga:date  ... ga:sessions ga:bounces
0  1567331564026.evxjzuot  20190901  ...           1          1
1  1567331572999.vtnsczsj  20190901  ...           1          1
2  1567331693070.fkdbmcj6  20190901  ...           1          1
3  1567335919816.ctz12xcl  20190901  ...           1          0
4  1567345181556.b3yowmbh  20190901  ...           1          1

sessions2.head()
            ga:dimension1 ga:source  ... ga:sessions ga:bounces
0  1567331564026.evxjzuot  (direct)  ...           1          1
1  1567331572999.vtnsczsj  (direct)  ...           1          1
2  1567331693070.fkdbmcj6  (direct)  ...           1          1
3  1567335919816.ctz12xcl  (direct)  ...           1          0
4  1567345181556.b3yowmbh  (direct)  ...           1          1

However, when I try this:

sessions_combined = sessions1.join(sessions2,
                                   on = 'ga:dimension1',
                                   how = 'left')

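I get an error message:

ValueError: You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat

Why is this happening, and how should I join the two data frames together?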

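【Comments】:

  • You need to use merge rather than join. With on='ga:dimension1', join matches the ga:dimension1 column of sessions1 (object) against the index of sessions2 (an int64 RangeIndex), not against its ga:dimension1 column.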
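
The mismatch can be reproduced on a small scale. The sketch below uses two hypothetical miniature frames (the values are taken from the head() output above, the column selection is made up for illustration) and assumes a reasonably recent pandas. It checks the two things join actually compares, then attempts the same call as in the question:

```python
import pandas as pd

# Hypothetical miniature versions of the two frames in the question.
sessions1 = pd.DataFrame({
    "ga:dimension1": ["1567331564026.evxjzuot", "1567331572999.vtnsczsj"],
    "ga:date": ["20190901", "20190901"],
})
sessions2 = pd.DataFrame({
    "ga:dimension1": ["1567331564026.evxjzuot", "1567331572999.vtnsczsj"],
    "ga:source": ["(direct)", "(direct)"],
})

# join(on=...) compares the caller's column with the *index* of the other
# frame, so these are the two things whose types have to be compatible.
left_key_is_numeric = pd.api.types.is_numeric_dtype(sessions1["ga:dimension1"])
right_index_is_numeric = pd.api.types.is_numeric_dtype(sessions2.index)
print(left_key_is_numeric, right_index_is_numeric)

# Same call as in the question: text keys on the left, integer index on the right.
try:
    sessions1.join(sessions2, on="ga:dimension1", how="left")
    join_error = None
except ValueError as exc:
    join_error = exc

print(type(join_error).__name__)
```

The key column on the left holds text while the default index on the right holds integers, which is why the two frames can look identical in info() and head() and still fail to join.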

Tags: python pandas


【Solution 1】:

Use merge, which matches the 'ga:dimension1' column of both data frames:

sessions_combined = sessions1.merge(sessions2,
                                    on = 'ga:dimension1',
                                    how = 'left')
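
For comparison, here is a rough sketch of the merge call next to a join-based equivalent, using small hypothetical frames rather than the real data. Both frames in the question also share 'ga:sessions' and 'ga:bounces', so the sketch includes one overlapping column and sets explicit suffixes; the suffix names "_1" and "_2" are an arbitrary choice for illustration. For join to work, the key has to be moved into the index of the right-hand frame first:

```python
import pandas as pd

# Hypothetical sample data; "c.3" has no match in sessions2 on purpose.
sessions1 = pd.DataFrame({
    "ga:dimension1": ["a.1", "b.2", "c.3"],
    "ga:date": ["20190901", "20190901", "20190901"],
    "ga:sessions": [1, 1, 1],
})
sessions2 = pd.DataFrame({
    "ga:dimension1": ["a.1", "b.2"],
    "ga:source": ["(direct)", "(direct)"],
    "ga:sessions": [1, 1],
})

# merge: column-to-column match on the shared key.
merged = sessions1.merge(
    sessions2,
    on="ga:dimension1",
    how="left",
    suffixes=("_1", "_2"),  # disambiguates the overlapping ga:sessions column
)

# join: works once the key is the index of the right-hand frame.
joined = sessions1.join(
    sessions2.set_index("ga:dimension1"),
    on="ga:dimension1",
    how="left",
    lsuffix="_1",
    rsuffix="_2",
)

print(merged)
print(joined)
```

With how='left', every row of sessions1 is kept, and rows without a match in sessions2 get missing values in the columns that come from sessions2.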
