【Question title】: Sort alphabetically in pandas dataframe
【Posted】: 2021-01-06 11:58:01
【Question】:

I have a dataframe:

Counties                        Numbers
Yabucoa Municipio, Puerto Rico  7766
Marion County, West Virginia    8756
Barbour County, Alabama         33445
Santa Cruz County, Arizona      447
Navajo County, Arizona          1500
Denver County, Colorado         67990

I am trying to sort it so that the state names are in alphabetical order and the county names are sorted within each state:

Counties                        Numbers
Barbour County, Alabama         33445
Navajo County, Arizona          1500
Santa Cruz County, Arizona      447
Denver County, Colorado         67990
Yabucoa Municipio, Puerto Rico  7766
Marion County, West Virginia    8756

Code for the dataframe:

import pandas as pd

df_test = pd.DataFrame([
    {'Counties': 'Yabucoa Municipio, Puerto Rico','Numbers': 7766},
    {'Counties': 'Marion County, West Virginia','Numbers': 8756},
    {'Counties': 'Barbour County, Alabama','Numbers': 33445},
    {'Counties': 'Santa Cruz County, Arizona','Numbers': 447},
    {'Counties': 'Navajo County, Arizona','Numbers': 1500},
    {'Counties': 'Denver County, Colorado','Numbers': 67990}
])

I have tried the following sorted/split code, but it does not give the desired output:

df_test['Counties'] = df_test['Counties'].apply(lambda x: ','.join(sorted(x.split(','))))

How should this be done? Please help. Thanks!
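For context, a minimal sketch of what the attempted line does to a single value, using one county string from the data above. It reorders the pieces inside each cell and never touches the row order, which is presumably why the output looks wrong. The variable names are illustrative only.

```python
# One cell from the Counties column.
county = "Santa Cruz County, Arizona"

# split(",") keeps the leading space on the state name.
pieces = sorted(county.split(","))

# The space sorts before any letter, so the state piece moves to the front.
rearranged = ",".join(pieces)
print(repr(rearranged))
```

So the lambda rewrites each string on its own; sorting the rows needs `sort_values` (or an equivalent reordering) on the dataframe itself.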

【Comments】:

    Tags: python pandas dataframe sorting


    【Solution 1】:

    One approach could be as follows:

    import pandas as pd

    df = pd.DataFrame(
        [
            {"Counties": "Yabucoa Municipio, Puerto Rico", "Numbers": 7766},
            {"Counties": "Marion County, West Virginia", "Numbers": 8756},
            {"Counties": "Barbour County, Alabama", "Numbers": 33445},
            {"Counties": "Santa Cruz County, Arizona", "Numbers": 447},
            {"Counties": "Navajo County, Arizona", "Numbers": 1500},
            {"Counties": "Denver County, Colorado", "Numbers": 67990},
        ]
    )
    

    Then create a key to reorder by:

    # Split "county, state" into two columns and sort by state, then county.
    re_order_key = (
        df["Counties"]
        .str.split(",", expand=True)
        .rename(columns={0: "county", 1: "state"})
        .sort_values(by=["state", "county"])
    )
    

    Use the index of this key with iloc:

    # iloc works here because df has the default 0..n-1 index,
    # so labels and positions coincide.
    df.iloc[re_order_key.index, :].reset_index(drop=True)
    

    This gives:

                             Counties  Numbers
    0         Barbour County, Alabama    33445
    1          Navajo County, Arizona     1500
    2      Santa Cruz County, Arizona      447
    3         Denver County, Colorado    67990
    4  Yabucoa Municipio, Puerto Rico     7766
    5    Marion County, West Virginia     8756
    
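A possible variant of the same idea, sketched with the data from the question: put the split pieces into temporary helper columns, sort on them, then drop them. The helper column names `county` and `state` are illustrative, and this assumes every value contains a comma separating county from state.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Counties": [
            "Yabucoa Municipio, Puerto Rico",
            "Marion County, West Virginia",
            "Barbour County, Alabama",
            "Santa Cruz County, Arizona",
            "Navajo County, Arizona",
            "Denver County, Colorado",
        ],
        "Numbers": [7766, 8756, 33445, 447, 1500, 67990],
    }
)

# Split once on the first comma; column 0 is the county, column 1 the state.
parts = df["Counties"].str.split(",", n=1, expand=True)

sorted_df = (
    df.assign(county=parts[0].str.strip(), state=parts[1].str.strip())
    .sort_values(["state", "county"])   # state first, county within state
    .drop(columns=["county", "state"])  # remove the temporary helper columns
    .reset_index(drop=True)
)
print(sorted_df)
```

Stripping the whitespace is not strictly needed for this data, since every state carries the same leading space, but it keeps the sort robust if the spacing after the comma is inconsistent.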

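    【Discussion】:

    • Note to the OP: I suspect there is a better way than the above, so I would encourage waiting a little, as someone may come up with something better.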