Condition to groupby or not groupby in Pandas: answers

【Question title】: Condition to groupby or not groupby in Pandas
【Posted】: 2020-01-03 19:58:32
【Question】:

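I'm trying to create a "conditional groupby" that depends on a choice made from user input. If the column 'ZONE' exists in the dataframe, I want to group by that zone and then iterate through a list of features ['Var1', 'Var2'].

If there is no 'ZONE' column, I just want to iterate through the list of features without the groupby.

My pseudo-code example is: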

import pandas as pd

data = pd.DataFrame({'County' : [1, 2, 2, 2, 3, 3], 'ZONE' : [88, 88, 19, 19, 10, 19], 'Var1' : [78, 90, 97, 100, 12, 140], 'Var2' : [56, 92, 122, 134, 120, 140]})

features = ['Var1', 'Var2']

if 'ZONE' in data.columns:
    data_grouped = data.groupby(['ZONE'])
if 'ZONE' not in data.columns:
    data_grouped = data.copy()

# iterate over grouped zone data
for zone, zone_data in data_grouped:
    # iterate over feature columns
    for feature in features:
        data_feature = zone_data[feature]
        print(data_feature)
        # ... make graphs and other things with this grouped data ...

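The above code works for the groupby ZONE case, but if there is no ZONE column I don't know how to skip the groupby and iterate over the features alone. I'd like a single for loop that covers both cases, rather than splitting the two cases apart and repeating a bunch of graphing code.

Is there a way to do this? Perhaps an itertools solution?

【Comments】: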

try : data_grouped=data.groupby('ZONE') except KeyError: data_grouped=data.loc[:,features]?
When there is no Zone, do you just want a single DataFrame, or do you want to iterate over rows in that case?
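
The try/except idea from the first comment can be sketched as below. One caveat: iterating over a plain DataFrame such as data.loc[:, features] yields column names, not (key, frame) pairs, so this sketch falls back to a one-element list of pairs instead. The function name groups_or_whole and the 'NoZone' label are illustrative choices, not part of pandas.

```python
import pandas as pd

data = pd.DataFrame({
    'County': [1, 2, 2, 2, 3, 3],
    'ZONE': [88, 88, 19, 19, 10, 19],
    'Var1': [78, 90, 97, 100, 12, 140],
    'Var2': [56, 92, 122, 134, 120, 140],
})
features = ['Var1', 'Var2']


def groups_or_whole(df, by='ZONE'):
    """Return something that iterates as (key, frame) pairs (hypothetical helper)."""
    try:
        # groupby raises KeyError right away when the column is missing
        return df.groupby(by)
    except KeyError:
        # A single pair holding the whole frame, so the loop body runs once
        return [('NoZone', df)]


for zone, zone_data in groups_or_whole(data):
    for feature in features:
        print(zone, feature, len(zone_data[feature]))
```

Whether to test membership with `in data.columns` or catch the KeyError is mostly a matter of taste; the loop that consumes the result is the same either way.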

Tags: python python-3.x pandas group-by

【Solution 1】:

I'm not sure what final result you want. I think there are better ways to reach it without looping.
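
As one illustration of a loop-free route, suppose the goal were a summary statistic per zone. That is an assumption, since the question only mentions making graphs. Under that assumption, a sketch could look like the following; the helper name summarize and the 'NoZone' row label are hypothetical.

```python
import pandas as pd

data = pd.DataFrame({
    'County': [1, 2, 2, 2, 3, 3],
    'ZONE': [88, 88, 19, 19, 10, 19],
    'Var1': [78, 90, 97, 100, 12, 140],
    'Var2': [56, 92, 122, 134, 120, 140],
})
features = ['Var1', 'Var2']


def summarize(df, features, by='ZONE'):
    """Mean of each feature per group if `by` exists, else over the whole frame."""
    if by in df.columns:
        return df.groupby(by)[features].mean()
    # Build a one-row frame so both branches return a DataFrame of feature columns
    return df[features].mean().to_frame().T.rename(index={0: 'NoZone'})


per_zone = summarize(data, features)
overall = summarize(data.drop(columns='ZONE'), features)
print(per_zone)
print(overall)
```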

Anyway, there is a simple way to handle what you want:

if 'ZONE' in data.columns:
    data_grouped = data.groupby('ZONE')
else:
    # One-element list of (label, frame) pairs, so it unpacks like a groupby
    data_grouped = [('NoZone', data)]

for zone, zone_data in data_grouped:
    # iterate over feature columns
    for feature in features:
        data_feature = zone_data[feature]
        print(data_feature)
        # ... make graphs and other things with this grouped data ...
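
The same idea can be packaged as a small generator so the calling code never sees the branch. This is a minimal sketch: iter_zones and feature_maxima are hypothetical names, and taking the maximum is only a stand-in for the graphing step that the question leaves unspecified.

```python
import pandas as pd


def iter_zones(df, by='ZONE', fallback_label='NoZone'):
    """Yield (label, sub-frame) pairs: one per group, or one pair for the whole frame."""
    if by in df.columns:
        yield from df.groupby(by)
    else:
        yield fallback_label, df


data = pd.DataFrame({
    'County': [1, 2, 2, 2, 3, 3],
    'ZONE': [88, 88, 19, 19, 10, 19],
    'Var1': [78, 90, 97, 100, 12, 140],
    'Var2': [56, 92, 122, 134, 120, 140],
})
features = ['Var1', 'Var2']


def feature_maxima(df):
    """Collect one value per (zone, feature); placeholder for the plotting code."""
    result = {}
    for zone, zone_data in iter_zones(df):
        for feature in features:
            result[(zone, feature)] = int(zone_data[feature].max())
    return result


print(feature_maxima(data))
print(feature_maxima(data.drop(columns='ZONE')))
```

The nested loop is written once and behaves the same whether or not the 'ZONE' column is present.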

【Comments】: