Issue with columns in csv using pandas groupby: answers

【Question title】: Issue with columns in csv using pandas groupby
【Posted】: 2018-11-05 01:09:03
【Question】:

My csv has the columns below. Normally every column holds values like the following, and the code runs smoothly.

dec list_namme  list    device        Service    Gate
12   food      cookie   200.56.57.58    Shop     123

Now I have run into a problem: I received a csv file that has all of these columns but no content. It looks like this:

dec list_namme  list    device  Service Gate

As soon as the code runs on it, it creates a new csv with unexpected columns. I get a new column named index, and instead of the 3 columns (device, Service, Gate) I wrongly get only 2.

index   Gate

For a csv that has content I do not run into any problem, and the columns come out correct.

Here is the code:

import os
import pandas as pd

if os.path.isfile(client_csv_file):
    df = pd.read_csv(csv_file)  # Read CSV
    df['Gate'] = df.Gate.astype(str)
    df = df.groupby(['device', 'Service'])['Gate'].apply(lambda x: ', '.join(set(x))).reset_index()
    df.to_csv(client_out_file, index=False)

Please help me fix this issue in the code.

【Comments】:

Tags: python python-3.x pandas csv pandas-groupby

【Solution 1】:

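Performing a groupby on an empty DataFrame yields a result that lacks the grouping key columns (at least in the pandas version the question was run on), so reset_index() has no device or Service keys to restore and emits a generic index column instead.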
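
A minimal reproduction sketch of that behavior, assuming a comma-delimited file (the question does not state the delimiter) and using an in-memory io.StringIO in place of the real file. The asker reported the columns index and Gate; the exact result may differ between pandas versions, so no expected output is shown.

```python
import io
import pandas as pd

# Header-only CSV, mimicking the empty file from the question.
# The comma delimiter is an assumption.
header_only = io.StringIO("dec,list_namme,list,device,Service,Gate\n")
df = pd.read_csv(header_only)
print(df.empty)

df['Gate'] = df['Gate'].astype(str)
grouped = (
    df.groupby(['device', 'Service'])['Gate']
      .apply(lambda x: ', '.join(set(x)))
      .reset_index()
)
# Inspect which columns survived the groupby on the empty frame.
print(list(grouped.columns))
```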

One solution is to test whether your DataFrame is empty before performing the operation:

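import os
import pandas as pd

if os.path.isfile(client_csv_file):
    df = pd.read_csv(csv_file)
    if df.empty:
        # Header-only file: skip groupby and keep just the expected columns
        df = df[['device', 'Service', 'Gate']]
    else:
        # Join the distinct Gate values of each (device, Service) pair
        df['Gate'] = df.Gate.astype(str)
        df = df.groupby(['device', 'Service'])['Gate']\
               .apply(lambda x: ', '.join(set(x))).reset_index()
    df.to_csv(client_out_file, index=False)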
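
The same check can be tried end to end without touching the file system. The sketch below is only an illustration under stated assumptions: collapse_gates is a hypothetical helper name, the input is assumed to be comma-delimited, and io.StringIO stands in for the real csv files. It also sorts the joined Gate values, which the answer above does not do, purely to make the output order deterministic.

```python
import io
import pandas as pd

def collapse_gates(csv_source):
    """Hypothetical helper: one row per (device, Service), Gate values joined."""
    df = pd.read_csv(csv_source)
    if df.empty:
        # Header-only input: skip groupby and keep the three expected columns.
        return df[['device', 'Service', 'Gate']]
    df['Gate'] = df['Gate'].astype(str)
    return (
        df.groupby(['device', 'Service'])['Gate']
          # sorted() is an addition here so the joined order is stable.
          .apply(lambda gates: ', '.join(sorted(set(gates))))
          .reset_index()
    )

# Assumed comma-delimited header, matching the column names in the question.
header = "dec,list_namme,list,device,Service,Gate\n"

empty_result = collapse_gates(io.StringIO(header))
filled_result = collapse_gates(io.StringIO(
    header
    + "12,food,cookie,200.56.57.58,Shop,123\n"
    + "13,food,cake,200.56.57.58,Shop,456\n"
))

print(list(empty_result.columns))
print(filled_result)
```

With both inputs the result carries the device, Service and Gate columns, so writing it with to_csv(..., index=False) should give the same header either way.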

【Comments】:

This looks like a good idea. I will try it shortly and update you.