【Question title】: Appending a level (with fixed value) to pandas Series/DataFrame
【Posted】: 2017-03-30 21:20:39
【Question description】:

I have a pandas Series with a MultiIndex, as follows:

category_1  number
A           0         1.764052
            1         0.400157
            2         0.978738
            3         2.240893
            4         1.867558
C           0        -0.977278
            1         0.950088
            2        -0.151357
            3        -0.103219
            4         0.410599

It was generated by this code:

import pandas as pd
import numpy as np
idx = pd.MultiIndex.from_product([['A','C'],range(5)], names=['category_1','number'])
np.random.seed(0)
s = pd.Series(index=idx, data = np.random.randn(len(idx)))

I want to add another level to the index, named category_2, with a fixed value (here D), to get the following result:

category_1  category_2  number
A           D           0         1.764052
                        1         0.400157
                        2         0.978738
                        3         2.240893
                        4         1.867558
C           D           0        -0.977278
                        1         0.950088
                        2        -0.151357
                        3        -0.103219
                        4         0.410599

I have been doing it in this hacky way:

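# go through a temporary one-column DataFrame
df = s.to_frame('dummy')
df['category_2'] = 'D'
# append the constant column as the last index level
df.set_index('category_2', append=True, inplace=True)
# move category_2 between category_1 and number
df = df.reorder_levels([0, 2, 1])
res = df['dummy']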

Is there a better (more concise/pythonic) way to add a level with a fixed value to the existing levels of a pandas Series/DataFrame?

【Question comments】:

    Tags: python pandas dataframe multi-index


    【Solution 1】:

    You need to create a new MultiIndex and then replace the old one:

    #change multiindex
    new_index = list(zip(s.index.get_level_values('category_1'), 
                         ['D'] * len(s.index), 
                         s.index.get_level_values('number')))
    print (new_index)
    [('A', 'D', 0), ('A', 'D', 1),
     ('A', 'D', 2), ('A', 'D', 3), 
     ('A', 'D', 4), ('C', 'D', 0), 
     ('C', 'D', 1), ('C', 'D', 2), 
     ('C', 'D', 3), ('C', 'D', 4)]
    s.index = pd.MultiIndex.from_tuples(new_index, 
                                        names=['category_1','category_2','number'])
    print (s)
    category_1  category_2  number
    A           D           0         1.764052
                            1         0.400157
                            2         0.978738
                            3         2.240893
                            4         1.867558
    C           D           0        -0.977278
                            1         0.950088
                            2        -0.151357
                            3        -0.103219
                            4         0.410599
    dtype: float64
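
The same idea can be wrapped in a small helper that works on the level arrays directly instead of zipping tuples. This is only a sketch: the function name `add_constant_level` and its `position` parameter are made up for illustration and are not part of pandas. It relies on `Index.get_level_values` and `pd.MultiIndex.from_arrays`, and it returns a copy instead of modifying `s` in place.

```python
import numpy as np
import pandas as pd


def add_constant_level(obj, value, name, position):
    """Return a copy of obj (Series or DataFrame) whose row index has an
    extra level filled with a single constant value.

    Hypothetical helper, not a pandas function.
    """
    idx = obj.index
    # One array of labels per existing level, in row order.
    arrays = [idx.get_level_values(i) for i in range(idx.nlevels)]
    names = list(idx.names)
    # Insert the constant level at the requested position.
    arrays.insert(position, pd.Index([value] * len(idx)))
    names.insert(position, name)
    out = obj.copy()
    out.index = pd.MultiIndex.from_arrays(arrays, names=names)
    return out


# Rebuild the Series from the question.
idx = pd.MultiIndex.from_product([['A', 'C'], range(5)],
                                 names=['category_1', 'number'])
np.random.seed(0)
s = pd.Series(index=idx, data=np.random.randn(len(idx)))

res = add_constant_level(s, 'D', 'category_2', position=1)
print(res.index.names)
```

Because the helper only touches `obj.index`, the same call should also work for a DataFrame with the same row index, for example `add_constant_level(s.to_frame('x'), 'D', 'category_2', 1)`.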
    

    Another nice solution uses MultiIndex.from_product (taken from the comment and slightly changed, with 'D' wrapped in a list):

    s.index = pd.MultiIndex.from_product([s.index.levels[0], 
                                          ['D'], 
                                          s.index.levels[1]], names= ['c1','c2','number']) 
    print (s)
    c1  c2  number
    A   D   0         1.764052
            1         0.400157
            2         0.978738
            3         2.240893
            4         1.867558
    C   D   0        -0.977278
            1         0.950088
            2        -0.151357
            3        -0.103219
            4         0.410599
    dtype: float64
    

    Or:

    s.index = pd.MultiIndex.from_product([s.index.get_level_values('category_1').unique(), 
                                          ['D'],  
                                          s.index.get_level_values('number').unique()], 
                                         names= ['c1','c2','number']) 
    print (s)
    c1  c2  number
    A   D   0         1.764052
            1         0.400157
            2         0.978738
            3         2.240893
            4         1.867558
    C   D   0        -0.977278
            1         0.950088
            2        -0.151357
            3        -0.103219
            4         0.410599
    dtype: float64
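
One caveat about the two `from_product` variants: they rebuild the index as a full Cartesian product of the level values, so they only line up with the data when the original index is itself such a product in the same order, as it is in the question. For an index that is not a full product, a possible alternative (a sketch, not part of the original answer) is to prepend the level with `pd.concat` and a one-key dict, then move it into place with `reorder_levels`. The small Series below is invented for illustration.

```python
import pandas as pd

# An index that is NOT a full product of its levels and is not sorted.
idx = pd.MultiIndex.from_tuples([('C', 1), ('A', 0), ('A', 2)],
                                names=['category_1', 'number'])
s = pd.Series([10.0, 20.0, 30.0], index=idx)

# from_product would build 2 * 1 * 3 = 6 labels for only 3 rows,
# so assigning it to s.index would fail with a length mismatch.
product_idx = pd.MultiIndex.from_product(
    [s.index.levels[0], ['D'], s.index.levels[1]])
print(len(product_idx), len(s))

# pd.concat with a dict adds the key 'D' as a new outermost level;
# reorder_levels then moves it between category_1 and number.
res = (pd.concat({'D': s}, names=['category_2'])
         .reorder_levels(['category_1', 'category_2', 'number']))
print(res)
```

This keeps the rows in their original order and does not depend on the shape of the existing index.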
    

    【Comments】:

    • Thanks, another way is to use from_product: s.index = pd.MultiIndex.from_product([s.index.levels[0], 'D', s.index.levels[1]], names=['c1','c2','number'])
    • Nice, this works for me: s.index = pd.MultiIndex.from_product([s.index.levels[0], ['D'], s.index.levels[1]], names=['c1','c2','number'])