【Question title】: Sorting Pandas DataFrames
【Posted】: 2017-06-30 04:27:36
【Question】:

I have a Pandas DataFrame with car sales data. For each Maker I want to find the two best-selling models, and then rank the Makers in descending order.

Maker    Model         No Sold(,000s)
Ford     Kuga          35
Ford     Focus         47
Ford     Ka            31
Ford     Fiesta        68
Ford     Mondeo        55
Ford     S-Max         34
Ford     Galaxy        23
Nissan   Leaf          28
Nissan   Micra         31
Nissan   Note          43
Nissan   Pulsar        23
Nissan   Juke          57
Nissan   Qashqai       62
Nissan   X-Trail       38
Honda    Jazz          24
Honda    Civic         32
Honda    HRV           33
Honda    CRV           29
Honda    Accord        30
Honda    NSX           15
Toyota   Aygo          44
Toyota   Auris         45
Toyota   Avensis       35
Toyota   Prius         32
Toyota   Rav4          29
Toyota   Land Cruiser  14
Citroen  C1            40
Citroen  C3            25
Citroen  C4            46
Citroen  DS3           35
Citroen  DS4           31
Citroen  DS5           25
Audi     A1            23
Audi     A3            47
Audi     A4            30
Audi     A6            20
Audi     A8            18
BMW      1 Series      36
BMW      2 Series      20
BMW      3 Series      53
BMW      4 Series      21
BMW      5 Series      27
BMW      6 Series      24
BMW      7 Series      16

Sorry, I don't know how to post the DataFrame properly here.

【Comments】:

    Tags: python pandas dataframe group-by pandas-groupby


    【Solution 1】:

    Use groupby + nlargest:

    df.set_index('Model').groupby('Maker')['No Sold(,000s)'].nlargest(2)
    
    Maker    Model  
    Audi     A3         47
             A4         30
    Citroen  C4         46
             C1         40
    Ford     Fiesta     68
             Mondeo     55
    Honda    HRV        33
             Civic      32
    Nissan   Qashqai    62
             Juke       57
    Toyota   Auris      45
             Aygo       44
    Name: No Sold(,000s), dtype: int64
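
For readers who want to reproduce this, here is a minimal, self-contained sketch of the same groupby + nlargest call. It assumes a hand-built subset of the question's data (three makers only), so the output is shorter than the one above; the variable name `top2` is just illustrative.

```python
import pandas as pd

# Assumption: a small subset of the question's table, typed in by hand.
df = pd.DataFrame({
    "Maker": ["Ford"] * 4 + ["Audi"] * 3 + ["BMW"] * 3,
    "Model": ["Kuga", "Focus", "Fiesta", "Mondeo",
              "A1", "A3", "A4",
              "1 Series", "2 Series", "3 Series"],
    "No Sold(,000s)": [35, 47, 68, 55, 23, 47, 30, 36, 20, 53],
})

# Move Model into the index so it is carried along in the result,
# then take the two largest sales figures within each Maker.
top2 = (
    df.set_index("Model")
      .groupby("Maker")["No Sold(,000s)"]
      .nlargest(2)
)
print(top2)
```

The result is a Series indexed by (Maker, Model), so a single value can be looked up with something like `top2.loc[("Ford", "Fiesta")]`.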
    

    【Comments】:

      【Solution 2】:

      An alternative solution:

      In [222]: df.sort_values(['Maker', 'No Sold(,000s)'], ascending=[True, False]) \
                  .groupby('Maker', as_index=False).head(2)
      Out[222]:
            Maker     Model  No Sold(,000s)
      33     Audi        A3              47
      34     Audi        A4              30
      39      BMW  3 Series              53
      37      BMW  1 Series              36
      28  Citroen        C4              46
      26  Citroen        C1              40
      3      Ford    Fiesta              68
      4      Ford    Mondeo              55
      16    Honda       HRV              33
      15    Honda     Civic              32
      12   Nissan   Qashqai              62
      11   Nissan      Juke              57
      21   Toyota     Auris              45
      20   Toyota      Aygo              44
      

      PS: note that @piRSquared's solution is more idiomatic and should be faster.
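
The question also asks to rank the Makers in descending order, which the snippet above stops short of. A possible sketch follows, under the assumption (not stated in the question) that a Maker's score is the combined sales of its two best-selling models. The data is a hand-built subset of the question's table, and `maker_rank` is an illustrative name.

```python
import pandas as pd

# Assumption: a small subset of the question's table, typed in by hand.
df = pd.DataFrame({
    "Maker": ["Ford"] * 4 + ["Audi"] * 3 + ["BMW"] * 3,
    "Model": ["Kuga", "Focus", "Fiesta", "Mondeo",
              "A1", "A3", "A4",
              "1 Series", "2 Series", "3 Series"],
    "No Sold(,000s)": [35, 47, 68, 55, 23, 47, 30, 36, 20, 53],
})

# Top two rows per Maker, using the sort + head approach.
top2 = (
    df.sort_values(["Maker", "No Sold(,000s)"], ascending=[True, False])
      .groupby("Maker")
      .head(2)
)

# Assumed ranking criterion: sum of each Maker's top-two sales, highest first.
maker_rank = (
    top2.groupby("Maker")["No Sold(,000s)"]
        .sum()
        .sort_values(ascending=False)
)
print(maker_rank)
```

If "rank the Makers" is meant differently (for example, by the single best model), swapping `.sum()` for `.max()` would be the corresponding change.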

      【Comments】:

        【Solution 3】:

        I believe you can also do it this way:

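        # Column names must match the DataFrame; ascending=False makes rank 1 the best seller
        df[df.groupby('Maker')['No Sold(,000s)'].rank(ascending=False, method='first') <= 2]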
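
A runnable sketch of the rank-based filter, for comparison with the other approaches. Note that `rank()` ranks the smallest value first by default, so `ascending=False` is passed here to make rank 1 the best seller; `method="first"` is an added choice so that ties still yield exactly two rows per Maker. The data is a hand-built subset of the question's table.

```python
import pandas as pd

# Assumption: a small subset of the question's table, typed in by hand.
df = pd.DataFrame({
    "Maker": ["Ford"] * 4 + ["Audi"] * 3 + ["BMW"] * 3,
    "Model": ["Kuga", "Focus", "Fiesta", "Mondeo",
              "A1", "A3", "A4",
              "1 Series", "2 Series", "3 Series"],
    "No Sold(,000s)": [35, 47, 68, 55, 23, 47, 30, 36, 20, 53],
})

# Rank sales within each Maker, largest first (1 = best seller).
ranks = df.groupby("Maker")["No Sold(,000s)"].rank(ascending=False, method="first")

# Keep only the rows ranked 1 or 2; the original row order is preserved.
top2 = df[ranks <= 2]
print(top2)
```

Unlike the nlargest and sort-based versions, this keeps the rows in their original order and returns the full DataFrame columns.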
        

        【Comments】:
