【Question title】: Create pandas DataFrame from list of tuples
【Posted】: 2016-03-14 11:20:57
【Question】:

I have seen several similar threads, but they did not really help me, hence this new post.

I want to create the df below from a list of tuples:

Values         Total  extra
label                      
Pictionary  0.000000     12
Chess       4.609929     12
Cluedo      8.421986     12

Here are all the components needed to build it:

columns = ['Total', 'extra']

tups = [(u'Pictionary', 0.0, 12),
        (u'Chess', 4.6099290780141837, 12),
        (u'Cluedo', 8.4219858156028362, 12)]

My failed attempt:

pd.DataFrame(tups, columns=columns)

Error message:

AssertionError: 2 columns passed, passed data had 3 columns

【Comments】:

  • Why not simply pd.DataFrame(tups, columns=['Values']+columns)?
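
The comment's suggestion works because the error is a plain count mismatch: each tuple carries three fields, while only two column names are supplied. A minimal sketch of that check follows; the name 'label' for the first field is an assumption taken from the desired output (the comment itself uses 'Values'), and the helper variables n_fields, n_names and all_columns are illustrative only.

```python
import pandas as pd

columns = ['Total', 'extra']
tups = [('Pictionary', 0.0, 12),
        ('Chess', 4.6099290780141837, 12),
        ('Cluedo', 8.4219858156028362, 12)]

# Each tuple has three fields, but only two column names were given.
n_fields = len(tups[0])
n_names = len(columns)
print(n_fields, n_names)

# Prepending a name for the first field makes the counts match.
# 'label' is an assumed name, chosen to match the desired output.
all_columns = ['label'] + columns
df = pd.DataFrame(tups, columns=all_columns)
print(df)
```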

Tags: pandas tuples dataframe


【Solution 1】:

You can use pandas.DataFrame.from_records():

import pandas as pd

data = [(1,2,3),
        (4,5,6),
        (7,8,9)]

col_names = ['Col0', 'Col1', 'Col2']
row_names = ['Row0', 'Row1', 'Row2']

df = pd.DataFrame.from_records(data, columns=col_names, index=row_names)

print(df)

      Col0  Col1  Col2
Row0     1     2     3
Row1     4     5     6
Row2     7     8     9
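
Applied to the data from the question, the same call might look like the sketch below. It assumes the first tuple field should be named 'label' (taken from the desired output) and relies on from_records() accepting a column name for its index parameter, so that field becomes the row index rather than a regular column.

```python
import pandas as pd

tups = [('Pictionary', 0.0, 12),
        ('Chess', 4.6099290780141837, 12),
        ('Cluedo', 8.4219858156028362, 12)]

# Name all three fields, then use the 'label' field as the row index.
# 'label' is an assumed name taken from the desired output.
df = pd.DataFrame.from_records(
    tups,
    columns=['label', 'Total', 'extra'],
    index='label',
)

print(df)
```

Only 'Total' and 'extra' remain as columns, because the field used as the index is removed from the column set.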

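【Discussion】:

    【Solution 2】:

    I think you need to add one more value to the columns list, then build the DataFrame with a list comprehension and call set_index on the first column if you need that column as the index: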

    import pandas as pd
    
    columns = ['label', 'Total', 'extra']
    
    tups = [(u'Pictionary', 0.0, 12),
            (u'Chess', 4.6099290780141837, 12),
            (u'Cluedo', 8.4219858156028362, 12)]
    
    df = pd.DataFrame([x for x in tups], columns=columns)
    
    print(df)
            label     Total  extra
    0  Pictionary  0.000000     12
    1       Chess  4.609929     12
    2      Cluedo  8.421986     12
    
    df = df.set_index('label')
    # optional: give the columns axis a name
    df.columns.name = 'Values'
    
    print(df)
    Values         Total  extra
    label                      
    Pictionary  0.000000     12
    Chess       4.609929     12
    Cluedo      8.421986     12
    

    Alternatively, you can use the solution from Colonel Beauvel's comment:

    import pandas as pd
    
    columns = ['Total', 'extra']
    
    tups = [(u'Pictionary', 0.0, 12),
            (u'Chess', 4.6099290780141837, 12),
            (u'Cluedo', 8.4219858156028362, 12)]
    
    df = pd.DataFrame(tups, columns=['label']+columns)
    print(df)
            label     Total  extra
    0  Pictionary  0.000000     12
    1       Chess  4.609929     12
    2      Cluedo  8.421986     12
    
    df = df.set_index('label')
    df.columns.name = 'Values'
    print(df)
    Values         Total  extra
    label                      
    Pictionary  0.000000     12
    Chess       4.609929     12
    Cluedo      8.421986     12
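
The steps above can also be chained into a single expression. This is a sketch rather than part of the original answer: it assumes a pandas version in which rename_axis() accepts the columns keyword (0.24 or later), and it uses that call in place of assigning df.columns.name directly.

```python
import pandas as pd

columns = ['Total', 'extra']
tups = [('Pictionary', 0.0, 12),
        ('Chess', 4.6099290780141837, 12),
        ('Cluedo', 8.4219858156028362, 12)]

df = (
    pd.DataFrame(tups, columns=['label'] + columns)
    .set_index('label')              # first field becomes the row index
    .rename_axis(columns='Values')   # name the columns axis
)

print(df)
```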
    

    【Discussion】:
