【Question title】: Converting an n-dimensional list to several DataFrames in Python
【Posted】: 2017-12-08 23:06:20
【Question】:

I have a list of lists that I want to convert to DataFrames. For example, the following list:

[[{'count': 6L, 'eclipse_id': 11348}, {'count': 1L, 'eclipse_id': 11338},
{'count': 1L, 'eclipse_id': 11342}, {'count': 1L, 'eclipse_id': 11361},
{'count': 6L, 'eclipse_id': 11313}],
[[{'count': 1L, 'eclipse_id': 11374},{'count': 1L, 'eclipse_id': 11356},
{'count': 1L, 'eclipse_id': 11358}]]

Expected output

There should be several DataFrames, one for each inner list:

First DataFrame:

    count  eclipse_id
0       6     11348.0
1       1     11338.0
2       1     11342.0
3       1     11361.0
4       6     11313.0

Second DataFrame:

    count  eclipse_id
0       1     11374.0
1       1     11356.0
2       1     11358.0

It would be even better if the result could be sorted!

My attempt

Here is what I tried:

i = 0
for liste in listeGroupContentReactions:
    df_n[i] = pd.DataFrame(liste)
    i+1

But it fails with ValueError: Wrong number of items passed 2, placement implies 1. The full error is as follows:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-64-851a5559a58d> in <module>()
      6 i = 0
      7 for liste in listeGroupContentReactions:
----> 8     df_n[i] = pd.DataFrame(liste)
      9     i+1
     10 

/home/antoine/anaconda2/lib/python2.7/site-packages/pandas/core/frame.pyc in __setitem__(self, key, value)
   2427         else:
   2428             # set column
-> 2429             self._set_item(key, value)
   2430 
   2431     def _setitem_slice(self, key, value):

/home/antoine/anaconda2/lib/python2.7/site-packages/pandas/core/frame.pyc in _set_item(self, key, value)
   2494         self._ensure_valid_index(value)
   2495         value = self._sanitize_column(key, value)
-> 2496         NDFrame._set_item(self, key, value)
   2497 
   2498         # check if we are modifying a copy

/home/antoine/anaconda2/lib/python2.7/site-packages/pandas/core/generic.pyc in _set_item(self, key, value)
   1646 
   1647     def _set_item(self, key, value):
-> 1648         self._data.set(key, value)
   1649         self._clear_item_cache()
   1650 

/home/antoine/anaconda2/lib/python2.7/site-packages/pandas/core/internals.pyc in set(self, item, value, check)
   3716         except KeyError:
   3717             # This item wasn't present, just insert at end
-> 3718             self.insert(len(self.items), item, value)
   3719             return
   3720 

/home/antoine/anaconda2/lib/python2.7/site-packages/pandas/core/internals.pyc in insert(self, loc, item, value, allow_duplicates)
   3817 
   3818         block = make_block(values=value, ndim=self.ndim,
-> 3819                            placement=slice(loc, loc + 1))
   3820 
   3821         for blkno, count in _fast_count_smallints(self._blknos[loc:]):

/home/antoine/anaconda2/lib/python2.7/site-packages/pandas/core/internals.pyc in make_block(values, placement, klass, ndim, dtype, fastpath)
   2717                      placement=placement, dtype=dtype)
   2718 
-> 2719     return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
   2720 
   2721 # TODO: flexible with index=None and/or items=None

/home/antoine/anaconda2/lib/python2.7/site-packages/pandas/core/internals.pyc in __init__(self, values, placement, ndim, fastpath)
    113             raise ValueError('Wrong number of items passed %d, placement '
    114                              'implies %d' % (len(self.values),
--> 115                                              len(self.mgr_locs)))
    116 
    117     @property

ValueError: Wrong number of items passed 2, placement implies 1

A "clinical" approach that works

I tried this:

df_n_0 = pd.DataFrame(listeGroupContentReactions[0])

It works, but how can I make it iterate over listeGroupContentReactions?

【Comments】:

    Tags: python arrays python-2.7 loops dataframe


    【Solution 1】:

    If you are sure you want to append the data in a loop instead of cleaning up the initial list:

    import pandas as pd
    
    # The L suffix is a Python 2 long literal; drop it on Python 3
    values = [[{'count': 6L, 'eclipse_id': 11348},
               {'count': 1L, 'eclipse_id': 11338},
               {'count': 1L, 'eclipse_id': 11342},
               {'count': 1L, 'eclipse_id': 11361},
               {'count': 6L, 'eclipse_id': 11313}],
              [{'count': 754L, 'eclipse_id': 15428},
               {'count': 1L, 'eclipse_id': 11258},
               {'count': 1L, 'eclipse_id': 11342}, 
               {'count': 1L, 'eclipse_id': 14233},  
               {'count': 6L, 'eclipse_id': 11313}]]
    
    # This inits a DataFrame with the first value
    df = pd.DataFrame(values[0])
    
    # And this cycles the data appending to the first DataFrame
    for value in values[1:]:
        df = df.append(pd.DataFrame(value), ignore_index=True)
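
Note that `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so the loop above only runs on older versions. A minimal sketch of the same stacking with `pd.concat`, assuming Python 3 and plain integer counts (the shortened `values` list here is illustrative, not the asker's full data):

```python
import pandas as pd

# Illustrative, shortened data (Python 3 integers, no L suffix)
values = [
    [{'count': 6, 'eclipse_id': 11348}, {'count': 1, 'eclipse_id': 11338}],
    [{'count': 754, 'eclipse_id': 15428}, {'count': 1, 'eclipse_id': 11258}],
]

# Build one DataFrame per inner list, then stack them in a single call;
# ignore_index=True renumbers the rows 0..n-1 across the combined frame
df = pd.concat([pd.DataFrame(value) for value in values], ignore_index=True)
print(df)
```

Concatenating once over a list of frames also avoids copying the growing DataFrame on every loop iteration.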
    

    To create a list of DataFrames, you can simply do:

    dfs = []
    
    for value in values:
        dfs.append(pd.DataFrame(value))
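
For comparison with the attempt in the question: judging by the traceback, `df_n` there appears to be a DataFrame, so `df_n[i] = ...` is treated as a column assignment, and `i+1` never updates `i`. The sketch below keeps one DataFrame per inner list in a plain dict keyed by position and also sorts each one, since sorting was requested. The sort order (by `count` descending, then `eclipse_id` ascending) is an assumption, as the question does not say which column to sort on, and the shortened data is illustrative:

```python
import pandas as pd

# Illustrative, shortened data (Python 3 integers, no L suffix)
values = [
    [{'count': 6, 'eclipse_id': 11348},
     {'count': 1, 'eclipse_id': 11338},
     {'count': 1, 'eclipse_id': 11342}],
    [{'count': 1, 'eclipse_id': 11374},
     {'count': 1, 'eclipse_id': 11356}],
]

# One DataFrame per inner list, keyed by its position in the outer list
dfs = {i: pd.DataFrame(value) for i, value in enumerate(values)}

# Assumed sort order: 'count' descending, then 'eclipse_id' ascending;
# reset_index(drop=True) renumbers the rows after sorting
sorted_dfs = {
    i: df.sort_values(['count', 'eclipse_id'], ascending=[False, True])
         .reset_index(drop=True)
    for i, df in dfs.items()
}

for i, df in sorted_dfs.items():
    print(i)
    print(df)
```

Here `sorted_dfs[0]` plays the role of `df_n_0` from the question, without needing a separately named variable per DataFrame.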
    

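    【Discussion】:

    • My bad! I wanted to know how to make several DataFrames from values. Sorry, I have updated the question to show the expected output.
    • Your initial data is malformed: it has an extra [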