【Question title】:Value Error Saving nested list to DataFrame - Python, Pandas [duplicate]
【Posted】:2019-01-28 12:35:26
【Question】:

I am trying to send a nested list to a DataFrame, as follows:

import pandas as pd
import numpy as np

def save_data(data):

    df = pd.DataFrame(data=[data], columns=['Send/Collect', 'Hospital', 'Courier', 'Kit', 'Manufacturer'])


save_data([["One", "Two","Three", "Four", "Five"],
           ["One", "Two","Three", "Four", "Five"],
           ["One", "Two","Three", "Four", "Five"]])

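However, this raises an assertion error:

AssertionError: 5 columns passed, passed data had 3 columns

As suggested on Git and in another question, I tried putting the data into a numpy array, but that now returns this slightly more confusing error:

ValueError: Must pass 2-d input

In the actual code the number of rows in the list changes while the number of columns stays fixed, so I am not sure how to get around this!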

【Comments】:

  • Don't add [] around data when creating the DataFrame

Tags: python pandas


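【Solution 1】:

If you are working with nested lists and the maximum length of the nested lists equals the number of columns (here 5), removing the [] around data in the DataFrame constructor works for me: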

def save_data(data):

    df = pd.DataFrame(data=data, columns=['Send/Collect', 'Hospital',
                                          'Courier', 'Kit', 'Manufacturer'])
    return df

L = [["One", "Two","Three", "Four", "Five"],
     ["One", "Two","Three", "Four", "Five"],
     ["One", "Two","Three", "Four", "Five"]]
df = save_data(L)
print (df)
  Send/Collect Hospital Courier   Kit Manufacturer
0          One      Two   Three  Four         Five
1          One      Two   Three  Four         Five
2          One      Two   Three  Four         Five
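To see why the extra brackets trigger both errors, it can help to inspect the shape of the data. The sketch below assumes only NumPy and pandas; the names `rows` and `COLUMNS` are illustrative and do not come from the question.

```python
import numpy as np
import pandas as pd

COLUMNS = ['Send/Collect', 'Hospital', 'Courier', 'Kit', 'Manufacturer']

rows = [["One", "Two", "Three", "Four", "Five"],
        ["One", "Two", "Three", "Four", "Five"],
        ["One", "Two", "Three", "Four", "Five"]]

# The nested list is already 2-D: 3 rows by 5 columns.
shape_ok = np.array(rows).shape

# Wrapping it in another list adds a dimension: 1 x 3 x 5.
# As a plain list, [rows] looks like one row holding 3 cells, which cannot
# match 5 column names; as an array it is 3-D, hence "Must pass 2-d input".
shape_wrapped = np.array([rows]).shape

# Passing the nested list directly gives the expected 3 x 5 frame.
df = pd.DataFrame(rows, columns=COLUMNS)
print(shape_ok, shape_wrapped, df.shape)  # (3, 5) (1, 3, 5) (3, 5)
```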

You can also add a condition to check this:

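def save_data(data):
    # Check the rows that were passed in, not the global list L
    if max(len(x) for x in data) != 5:
        raise ValueError('the longest row must have exactly 5 values')
    df = pd.DataFrame(data=data, columns=['Send/Collect', 'Hospital', 'Courier',
                                          'Kit', 'Manufacturer'])
    return df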

L = [["One", "Two","Three", "Four", "Five"],
     ["One", "Two","Three", "Four", "Five"],
     ["One", "Two","Three", "Four"]]

df = save_data(L)
print (df)
  Send/Collect Hospital Courier   Kit Manufacturer
0          One      Two   Three  Four         Five
1          One      Two   Three  Four         Five
2          One      Two   Three  Four         None
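Short rows are padded silently, as the None in the last row shows. If it matters which rows were incomplete, the length check can report them before the frame is built. This is only a sketch: `check_rows` is a hypothetical helper, not part of pandas or of the answer above.

```python
COLUMNS = ['Send/Collect', 'Hospital', 'Courier', 'Kit', 'Manufacturer']

def check_rows(data, n_cols=len(COLUMNS)):
    """Return the indices of rows whose length differs from n_cols."""
    return [i for i, row in enumerate(data) if len(row) != n_cols]

L = [["One", "Two", "Three", "Four", "Five"],
     ["One", "Two", "Three", "Four", "Five"],
     ["One", "Two", "Three", "Four"]]

# Only the third row (index 2) is missing a value.
bad = check_rows(L)
print(bad)  # [2]
```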

【Comments】:

    【Solution 2】:

    Remove the brackets around data, e.g.:

    def save_data(data):
    
        df = pd.DataFrame(data=data, columns=['Send/Collect',
                                              'Hospital',
                                              'Courier',
                                              'Kit',
                                              'Manufacturer'])
        return df
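
    Since the question says the number of rows varies while the columns stay fixed, it may be worth confirming that the same function copes with different row counts. The sketch below repeats `save_data` from this answer; the sample rows are made up for illustration.

```python
import pandas as pd

def save_data(data):
    # data is a list of rows; each row is a list of 5 values
    return pd.DataFrame(data=data, columns=['Send/Collect', 'Hospital',
                                            'Courier', 'Kit', 'Manufacturer'])

row = ["A", "B", "C", "D", "E"]

# One row and four rows both fit the same five columns.
one_row = save_data([row])
four_rows = save_data([row] * 4)
print(one_row.shape, four_rows.shape)  # (1, 5) (4, 5)
```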
    

    【Comments】:
