Using Python pandas aggregate operations

【Question title】: Using Python pandas aggregate operations
【Posted】: 2022-12-04 20:44:43
【Question description】:

I have a table like this:

Hotel         Earning
Abu           1000
Zain          400
Show          500
Zint          300
Abu           500
Zain          700
Abu           500
Abu           500
Abu           800
Abu           1600
Show          1300
Zint          600

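Using pandas, how can I group by hotel and calculate the minimum, median, and maximum of the earnings for each hotel, and then print the aggregated values for the hotel named "Abu"?

Expected output: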

[500.0, 650.0, 1600.0]

【Question comments】:

Does this answer your question? Pandas how to apply multiple functions to dataframe

Tags: python pandas

【Solution 1】:

The pandas DataFrame aggregate() method lets you apply a function, or a list of function names, along one of the DataFrame's axes. The default axis is 0, the index (row) axis. Note: agg() is an alias for aggregate().
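
As a rough illustration of the description above, the sketch below passes a list of function names to agg() on a small DataFrame. The earnings are the "Abu" rows from the question; the variable names are only for illustration.

```python
import pandas as pd

# Earnings for the hotel "Abu", taken from the table in the question
df = pd.DataFrame({"Earning": [1000, 500, 500, 500, 800, 1600]})

# A list of function names is applied along axis=0 (down the rows) by default
summary = df.agg(["min", "median", "max"])

# aggregate() is the same method under its longer name
same_summary = df.aggregate(["min", "median", "max"])

# One row per function name, one column per original column
print(summary)
```

The result has one row per function name, so summary["Earning"] holds the three statistics in the order they were requested.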


【Solution 2】:

import pandas as pd

# Read the data into a Pandas DataFrame
df = pd.read_csv('hotel_earnings.csv')

# Group the data by hotel
hotels = df.groupby('Hotel')

# Calculate the min, median, and max of the earning for each hotel
earnings = hotels['Earning'].agg(['min', 'median', 'max'])

# Print the aggregated values for the hotel named "Abu"
print(earnings.loc['Abu'])

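This code reads the data from the hotel_earnings.csv file into a pandas DataFrame, groups the data by hotel, and calculates the minimum, median, and maximum earning for each hotel. It then prints the aggregated values for the hotel named "Abu".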
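
The hotel_earnings.csv file is not part of the question, so here is a minimal self-contained sketch of the same approach. It assumes the file is comma-separated with Hotel and Earning columns, and uses an in-memory io.StringIO buffer in its place; the csv_text and abu_stats names are hypothetical. The final tolist() call gives the list format the question asks for.

```python
import io

import pandas as pd

# Assumed file contents: the question's table written as comma-separated text
csv_text = """Hotel,Earning
Abu,1000
Zain,400
Show,500
Zint,300
Abu,500
Zain,700
Abu,500
Abu,500
Abu,800
Abu,1600
Show,1300
Zint,600
"""

# read_csv accepts a file-like object, so StringIO stands in for the file on disk
df = pd.read_csv(io.StringIO(csv_text))

# Select the column before aggregating so the result has flat column names
earnings = df.groupby("Hotel")["Earning"].agg(["min", "median", "max"])

# The "Abu" row is a Series; tolist() turns it into a plain Python list
abu_stats = earnings.loc["Abu"].tolist()
print(abu_stats)
```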


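【Solution 3】:

To group the data by hotel and calculate the minimum, median, and maximum earning for each hotel, you can use the groupby and agg methods of a pandas DataFrame. Here is an example: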

import pandas as pd

# Create a DataFrame
df = pd.DataFrame(
    {
        "Hotel": ["Abu", "Zain", "Show", "Zint", "Abu", "Zain", "Abu", 
        "Abu", "Abu", "Abu", "Show", "Zint"],
        "Earning": [1000, 400, 500, 300, 500, 700, 500, 500, 800, 1600, 1300, 600],
    }
)

# Group the data by hotel and calculate the min, median, and max of the earning
df_grouped = df.groupby("Hotel").agg(["min", "median", "max"])

# Print the aggregated values for the hotel "Abu"
print(df_grouped.loc["Abu"])

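In the code above, we first create a pandas DataFrame from the given data. We then group the data by hotel and use the groupby and agg methods to calculate the minimum, median, and maximum earning for each hotel. Finally, we use the DataFrame's loc indexer to print the aggregated values for the hotel "Abu". Because agg is applied to the whole grouped DataFrame, the selected row is a Series with a two-level index, and the mixed integer and float values are upcast to floats. The output will be:

Earning  min        500.0
         median     650.0
         max       1600.0
Name: Abu, dtype: float64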

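You can then select the "Earning" level of df_grouped.loc["Abu"] to get the minimum, median, and maximum indexed by statistic name alone. Here is an example:

# Access the values of the min, median, and max for the hotel "Abu"
print(df_grouped.loc["Abu"]["Earning"])

The output will be:

min        500.0
median     650.0
max       1600.0
Name: Abu, dtype: float64

You can then use the tolist method to convert the values to a list:

# Convert the values of the min, median, and max to a list
print(df_grouped.loc["Abu"]["Earning"].tolist())

The output will be:

[500.0, 650.0, 1600.0]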
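
If only one hotel is of interest, grouping every hotel is not strictly necessary. A possible alternative, sketched here as an assumption rather than as part of the answer above, is to filter the rows for "Abu" first and aggregate the Earning column directly. The abu_earnings and abu_stats names are hypothetical.

```python
import pandas as pd

# Same data as in the question
df = pd.DataFrame(
    {
        "Hotel": ["Abu", "Zain", "Show", "Zint", "Abu", "Zain",
                  "Abu", "Abu", "Abu", "Abu", "Show", "Zint"],
        "Earning": [1000, 400, 500, 300, 500, 700, 500, 500, 800, 1600, 1300, 600],
    }
)

# Keep only the rows for "Abu", then take the Earning column
abu_earnings = df.loc[df["Hotel"] == "Abu", "Earning"]

# Series.agg with a list returns a Series indexed by the function names
abu_stats = abu_earnings.agg(["min", "median", "max"])

# Convert each value to a Python float to match the requested list format
print([float(v) for v in abu_stats])
```

This avoids the two-level column index entirely, at the cost of computing nothing for the other hotels.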
