【Question title】: Count unique occurrences per day from two or more columns in pandas
【Posted】: 2020-08-26 08:42:46
【Question】:

I want to count the unique names that occur on each day across two columns:

import pandas as pd

df = pd.DataFrame({
    'ColA':['john wick','bloody mary','peter pan','jeff bridges','billy boy'],
    'ColB':['bloody mary','jeff bridges','billy boy','billy boy','john wick'],
    'date':['2000-01-01', '2000-01-01', '2000-01-03', '2000-01-03', '2000-01-03'],})
datetime_series = pd.to_datetime(df['date'])
datetime_index = pd.DatetimeIndex(datetime_series.values)
df2 = df.set_index(datetime_index)
df2.drop('date',axis=1,inplace=True)
df2
Out[746]: 
                    ColA          ColB
2000-01-01  john wick     bloody mary 
2000-01-01  bloody mary   jeff bridges
2000-01-03  peter pan     billy boy   
2000-01-03  jeff bridges  billy boy   
2000-01-03  billy boy     john wick   

so that I get a Series, or something similar, like the following:

           unique occurrences of names
2000-01-01             3
2000-01-03             4

【Comments】:

    Tags: python pandas count unique


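    【Solution 1】:

    Use DataFrame.stack to gather both columns into a single Series, then group by the date index level and call nunique (SeriesGroupBy.nunique, since stack returns a Series here), and finally Series.to_frame: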

    df3 = df2.stack().groupby(level=0).nunique().to_frame(name='unique occurrences of names')
    print (df3)
                unique occurrences of names
    2000-01-01                            3
    2000-01-03                            4
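
To make the chain above easier to follow, here is a minimal self-contained sketch that runs the same three calls one at a time on the data from the question. The intermediate names `stacked` and `counts` are introduced here only for illustration, and the index is built in a single step rather than with the question's `DatetimeIndex` construction.

```python
import pandas as pd

df = pd.DataFrame({
    'ColA': ['john wick', 'bloody mary', 'peter pan', 'jeff bridges', 'billy boy'],
    'ColB': ['bloody mary', 'jeff bridges', 'billy boy', 'billy boy', 'john wick'],
    'date': ['2000-01-01', '2000-01-01', '2000-01-03', '2000-01-03', '2000-01-03'],
})
# Use the parsed dates as the index and drop the original string column.
df2 = df.set_index(pd.to_datetime(df['date'])).drop(columns='date')

# Step 1: stack folds ColA and ColB into one long Series whose index is
# (date, original column name), so every name sits in a single column.
stacked = df2.stack()

# Step 2: group by the first index level (the date) and count distinct names.
counts = stacked.groupby(level=0).nunique()

# Step 3: wrap the Series in a one-column DataFrame with a readable header.
df3 = counts.to_frame(name='unique occurrences of names')
print(df3)
```

Because the names from every column end up in one Series before grouping, a name that appears in both ColA and ColB on the same day is counted once.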
    

    Alternatively, use DataFrame.melt:

    df3 = (df2.reset_index()
              .melt('index')
              .groupby('index')['value']
              .nunique()
              .to_frame(name='unique occurrences of names'))
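
The melt version relies on the unnamed index becoming a column called 'index' after reset_index. As a hedged variation, the sketch below names the index explicitly and accepts any list of columns, which matches the "two or more columns" case in the title. The helper `count_unique_per_day` and the axis name 'day' are hypothetical names chosen for this example, not part of pandas or of the original answer.

```python
import pandas as pd

def count_unique_per_day(frame, cols):
    """Hypothetical helper: distinct values per index label across `cols`."""
    long = (frame[cols]
            .rename_axis('day')      # give the index an explicit name
            .reset_index()           # move it into a regular 'day' column
            .melt(id_vars='day'))    # one row per (day, source column, value)
    return long.groupby('day')['value'].nunique()

df = pd.DataFrame({
    'ColA': ['john wick', 'bloody mary', 'peter pan', 'jeff bridges', 'billy boy'],
    'ColB': ['bloody mary', 'jeff bridges', 'billy boy', 'billy boy', 'john wick'],
    'date': ['2000-01-01', '2000-01-01', '2000-01-03', '2000-01-03', '2000-01-03'],
})
df2 = df.set_index(pd.to_datetime(df['date'])).drop(columns='date')

result = count_unique_per_day(df2, ['ColA', 'ColB'])
print(result.to_frame(name='unique occurrences of names'))
```

Passing a longer column list should extend the count to additional name columns without any other change.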
    

    【Comments】:
