Partially unpivot a pandas DataFrame: answers

【Question title】: Unpivot pandas DataFrame partly
【Posted】: 2021-11-15 12:36:50
【Question】:

I have the following DataFrame:

Items	Description	Store 1 Qty	Store 1 Value	Store 2 Qty	Store 2 Value
item 1	Some item name	5	120	7	240
item 2	Some other item	9	1234	12	98

Is there a simple way in Python to unpivot only the stores, so that the result looks like this?

Items	Description	Store number	Qty	Value
item 1	Some item name	Store 1	5	120

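I was thinking of unpivoting it completely and then pivoting it back so that Value and Qty remain as value columns, but I think there should be a more efficient solution.

【Comments】:

Tags: python pandas dataframe pivot-table

【Solution 1】:

If you need Qty and Value as separate columns, first move the identifier columns into the index. You can then split the remaining column names on their last space with Series.str.rsplit to get a MultiIndex in the columns, and finally reshape with DataFrame.stack: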

df = df.set_index(['Items','Description'])
df.columns = df.columns.str.rsplit(n=1, expand=True)
df = df.rename_axis(('Store number',None), axis=1).stack(0).reset_index()
print(df)
    Items      Description Store number  Qty  Value
0  item 1   Some item name      Store 1    5    120
1  item 1   Some item name      Store 2    7    240
2  item 2  Some other item      Store 1    9   1234
3  item 2  Some other item      Store 2   12     98
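For readers who want to run this end to end, here is a minimal sketch that rebuilds the question's sample data and applies the same steps. The variable names wide and long_df are illustrative and not part of the original answer; newer pandas versions may emit a FutureWarning from stack, which should not change the result for this data.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Items": ["item 1", "item 2"],
    "Description": ["Some item name", "Some other item"],
    "Store 1 Qty": [5, 9],
    "Store 1 Value": [120, 1234],
    "Store 2 Qty": [7, 12],
    "Store 2 Value": [240, 98],
})

# Keep the identifier columns out of the reshape by moving them to the index.
wide = df.set_index(["Items", "Description"])

# Split each header on its last space: "Store 1 Qty" -> ("Store 1", "Qty").
wide.columns = wide.columns.str.rsplit(n=1, expand=True)
wide.columns.names = ["Store number", None]

# Move the store level from the columns into the rows.
long_df = wide.stack(0).reset_index()
print(long_df)
```

The intermediate wide frame has two column levels (store, then Qty/Value), which is what lets stack(0) move only the store part into the rows.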

【Discussion】:

【Solution 2】:

You can combine pd.melt() with pivot, as follows:

import pandas as pd

df_ = pd.melt(df, id_vars=['Items', 'Description'])
df_[['Store number', 'Type']] = df_['variable'].str.rsplit(n=1, expand=True)
df_ = df_.pivot(index=['Items', 'Description', 'Store number'], columns='Type', values='value').reset_index()
print(df_)
Type   Items      Description Store number  Qty  Value
0     item 1   Some item name      Store 1    5    120
1     item 1   Some item name      Store 2    7    240
2     item 2  Some other item      Store 1    9   1234
3     item 2  Some other item      Store 2   12     98
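The same melt-then-pivot idea can be sketched step by step as below. The sample data is copied from the question; the names melted and result are illustrative, and the final rename_axis call is an optional addition (not in the original answer) that drops the leftover "Type" label seen in the output above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Items": ["item 1", "item 2"],
    "Description": ["Some item name", "Some other item"],
    "Store 1 Qty": [5, 9],
    "Store 1 Value": [120, 1234],
    "Store 2 Qty": [7, 12],
    "Store 2 Value": [240, 98],
})

# Step 1: fully unpivot, so every store column becomes a (variable, value) row.
melted = df.melt(id_vars=["Items", "Description"])

# Step 2: split "Store 1 Qty" into "Store 1" and "Qty" on the last space.
melted[["Store number", "Type"]] = melted["variable"].str.rsplit(n=1, expand=True)

# Step 3: pivot only the Qty/Value labels back into columns.
result = (
    melted.pivot(
        index=["Items", "Description", "Store number"],
        columns="Type",
        values="value",
    )
    .reset_index()
    .rename_axis(None, axis=1)  # optional: remove the "Type" axis label
)
print(result)
```

Passing a list to index in pivot needs a reasonably recent pandas (1.1 or later, as far as I know).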

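【Discussion】:

【Solution 3】:

One option is pd.wide_to_long. The column names must first be rearranged so that Qty and Value come first, because wide_to_long expects each stub name at the start of the column name: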

import pandas as pd

columns = df.columns.str.rsplit(n=1).str[::-1].str.join(' ')
temp = df.set_axis(columns, axis = 'columns')
(pd.wide_to_long(temp, 
                 stubnames = ['Qty', 'Value'], 
                 i = ['Items', 'Description'], 
                 j = 'Store Number', 
                 sep = ' ', 
                 suffix='.+')
   .reset_index()
)
    Items      Description Store Number  Qty  Value
0  item 1   Some item name      Store 1    5    120
1  item 1   Some item name      Store 2    7    240
2  item 2  Some other item      Store 1    9   1234
3  item 2  Some other item      Store 2   12     98
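To make the renaming step easier to follow, here is a hedged sketch that builds the reordered headers with a plain list comprehension instead of the chained .str calls (a swapped-in but equivalent approach for these headers), then runs the same wide_to_long call. The sample data is copied from the question; new_columns, temp and result are illustrative names.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Items": ["item 1", "item 2"],
    "Description": ["Some item name", "Some other item"],
    "Store 1 Qty": [5, 9],
    "Store 1 Value": [120, 1234],
    "Store 2 Qty": [7, 12],
    "Store 2 Value": [240, 98],
})

# Swap the two halves around the last space: "Store 1 Qty" -> "Qty Store 1".
# Headers without a space ("Items", "Description") come back unchanged.
new_columns = [" ".join(reversed(name.rsplit(" ", 1))) for name in df.columns]
temp = df.set_axis(new_columns, axis="columns")

# Now every value column starts with its stub ("Qty" or "Value"),
# followed by the separator and the store label used as the suffix.
result = pd.wide_to_long(
    temp,
    stubnames=["Qty", "Value"],
    i=["Items", "Description"],
    j="Store Number",
    sep=" ",
    suffix=".+",
).reset_index()
print(result)
```

The suffix='.+' argument matters here: the default suffix only matches digits, so it would not accept labels such as "Store 1".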

Here is another possible option:

#pip install git+https://github.com/pyjanitor-devs/pyjanitor.git
import pandas as pd
import janitor
df.pivot_longer(index = ['Items', 'Description'], 
                names_to = ('Store Number', '.value'), 
                names_pattern = r"(.+\s\d)\s(.+)")

    Items      Description Store Number  Qty  Value
0  item 1   Some item name      Store 1    5    120
1  item 2  Some other item      Store 1    9   1234
2  item 1   Some item name      Store 2    7    240
3  item 2  Some other item      Store 2   12     98

This uses the pivot_longer function from pyjanitor.

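Explanation: the columns you want to reshape follow a pattern (the store number followed by Qty or Value). We exploit this in names_pattern with a regular expression containing two groups (r"(.+\s\d)\s(.+)"): the first group captures Store 1/Store 2, and the second captures Qty/Value.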
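What the two groups capture can be checked with the standard re module, independently of pyjanitor. This is only a small illustration; the headers list and the "Store 10 Qty" probe are made up for the check.

```python
import re

# Same pattern as in names_pattern above.
pattern = re.compile(r"(.+\s\d)\s(.+)")

headers = ["Store 1 Qty", "Store 1 Value", "Store 2 Qty", "Store 2 Value"]

# Each header splits into (store label, measure name).
pairs = [pattern.fullmatch(header).groups() for header in headers]
for header, pair in zip(headers, pairs):
    print(header, "->", pair)

# The first group ends in a single digit (\d), so a hypothetical
# two-digit store label does not match this pattern as written.
two_digit_match = pattern.fullmatch("Store 10 Qty")
print(two_digit_match)
```

If store numbers could have more than one digit, a pattern such as r"(.+\s\d+)\s(.+)" would be one way to cover them.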

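The names_to argument specifies how the new DataFrame should look. In this case, .value tells the function to keep the matching part of each column name as a column header. Here .value is the second entry, so it is paired with the second group in names_pattern; 'Store Number' in names_to is paired with the first group.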
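The pairing between names_to and the regex groups can be imitated in plain pandas, which may help if pyjanitor is not installed. The sketch below is an illustration of the idea only, not how pyjanitor implements pivot_longer; the names parts, wide and long_df are hypothetical, and the sample data is copied from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Items": ["item 1", "item 2"],
    "Description": ["Some item name", "Some other item"],
    "Store 1 Qty": [5, 9],
    "Store 1 Value": [120, 1234],
    "Store 2 Qty": [7, 12],
    "Store 2 Value": [240, 98],
})

wide = df.set_index(["Items", "Description"])

# One regex group per entry of names_to: group 1 -> "Store Number",
# group 2 -> ".value" (the part that stays as a column header).
parts = wide.columns.str.extract(r"(.+\s\d)\s(.+)")
parts.columns = ["Store Number", ".value"]
wide.columns = pd.MultiIndex.from_frame(parts)

# Only the "Store Number" level moves to the rows; the ".value" level
# (Qty/Value) remains as the column headers.
long_df = (
    wide.stack("Store Number")
    .rename_axis(None, axis=1)  # drop the ".value" axis label
    .reset_index()
)
print(long_df)
```

Row order may differ from the pivot_longer output shown above, but the same four item/store combinations are produced.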

【Discussion】: