python pandas select both head and tail: answers

【Question title】: python pandas select both head and tail
【Posted】: 2017-07-19 04:50:26
【Question description】:

For a DataFrame in pandas, how can I select both the first 5 values and the last 5 values?

For example

In [11]: df
Out[11]: 
            A  B  C
2012-11-29  0  0  0
2012-11-30  1  1  1
2012-12-01  2  2  2
2012-12-02  3  3  3
2012-12-03  4  4  4
2012-12-04  5  5  5
2012-12-05  6  6  6
2012-12-06  7  7  7
2012-12-07  8  8  8
2012-12-08  9  9  9

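How can I display the first two rows and the last two rows?

【Question discussion】:

Your question does not make sense as written: you say you want to select the first 5 values and the last 5 values, but do you mean rows or individual values? Please show the desired output.
Apart from the answers below, head and tail can also be chained (just as in bash) to get a slice from the middle: df.head(90).tail(10) returns the rows at positions 80 through 89.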
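
To illustrate the chaining described in the last comment, here is a minimal sketch. It assumes a made-up 100-row frame with a single `value` column, which is not part of the original question.

```python
import numpy as np
import pandas as pd

# Hypothetical 100-row frame, used only for illustration.
df = pd.DataFrame({"value": np.arange(100)})

# head(90) keeps positions 0..89; tail(10) then keeps the last 10 of those.
middle = df.head(90).tail(10)

# The same rows selected by position in a single step.
same = df.iloc[80:90]

print(middle.index.tolist())
```

Both forms select the same rows; `iloc` is the more direct choice when the positions are known in advance.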

Tags: python pandas slice tail head

【Solution 1】:

You can use iloc together with numpy.r_:

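import numpy as np

# np.r_ concatenates the two slices into a single array of positions
print(np.r_[0:2, -2:0])
[ 0  1 -2 -1]

# print the selection rather than assigning it back to df, so that the
# next example still starts from the full 10-row frame
print(df.iloc[np.r_[0:2, -2:0]])
            A  B  C
2012-11-29  0  0  0
2012-11-30  1  1  1
2012-12-07  8  8  8
2012-12-08  9  9  9

print(df.iloc[np.r_[0:4, -4:0]])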
            A  B  C
2012-11-29  0  0  0
2012-11-30  1  1  1
2012-12-01  2  2  2
2012-12-02  3  3  3
2012-12-05  6  6  6
2012-12-06  7  7  7
2012-12-07  8  8  8
2012-12-08  9  9  9
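
One caveat with the positional approach: when the frame has fewer than 2*n rows, the two slices overlap and rows are repeated, and positions beyond the end of the frame raise an IndexError. Below is a minimal sketch of a guarded helper. It assumes that returning the whole frame is the desired fallback, and the name `head_tail` is made up for this example.

```python
import numpy as np
import pandas as pd

def head_tail(df, n=2):
    """Return the first n and last n rows of df by position."""
    # Assumed fallback: a short frame is returned whole instead of
    # repeating the rows where the two slices would overlap.
    if 2 * n >= len(df):
        return df.copy()
    return df.iloc[np.r_[0:n, -n:0]]

# Rebuild the 10-row frame from the question.
df = pd.DataFrame(
    {"A": range(10), "B": range(10), "C": range(10)},
    index=pd.date_range("2012-11-29", periods=10),
)

print(head_tail(df, 2))
print(len(head_tail(df, 7)))
```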

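【Discussion】:

【Solution 2】:

You can use df.head(5) and df.tail(5) to get the first five and the last five rows. Optionally, you can combine the head and tail into a new DataFrame. DataFrame.append() was deprecated in pandas 1.4 and removed in 2.0, so pd.concat is used here:

new_df = pd.concat([df.head(5), df.tail(5)])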

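【Discussion】:

【Solution 3】:

Not quite the same question, but if you only want to display the top and bottom 5 rows (for example with display in Jupyter, or with a regular print), the pd.option_context context manager may be simpler than the approaches above.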

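import numpy as np
import pandas as pd

# make 100 random 3-D points
df = pd.DataFrame(np.random.randn(100, 3))

# reorder rows by the index of their row sums; without a sort_values() call
# before .index this keeps the original order, as the output below shows
df = df.loc[df.sum(axis=1).index]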

with pd.option_context('display.max_rows',10):
    print(df)

Output:

           0         1         2
0  -0.649105 -0.413335  0.374872
1   3.390490  0.552708 -1.723864
2  -0.781308 -0.277342 -0.903127
3   0.433665 -1.125215 -0.290228
4  -2.028750 -0.083870 -0.094274
..       ...       ...       ...
95  0.443618 -1.473138  1.132161
96 -1.370215 -0.196425 -0.528401
97  1.062717 -0.997204 -1.666953
98  1.303512  0.699318 -0.863577
99 -0.109340 -1.330882 -1.455040

[100 rows x 3 columns]
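
If the truncated view is needed as a string rather than printed, DataFrame.to_string accepts a max_rows argument that truncates in the same head-and-tail fashion. Below is a minimal sketch. It assumes a deterministic 100-row frame in place of the random one, and the column names `a`, `b`, `c` are made up for this example.

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the random frame above.
df = pd.DataFrame(np.arange(300).reshape(100, 3), columns=["a", "b", "c"])

# max_rows=10 keeps the first 5 and last 5 rows and inserts a ".." row
# in between; no global display option is changed.
text = df.to_string(max_rows=10)
print(text)
```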

【Discussion】:

I have extended this answer in several ways here...

【Solution 4】:

A simple little function:

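import numpy as np
import pandas as pd

def ends(df, x=5):
    # DataFrame.append() was removed in pandas 2.0, so concatenate instead
    return pd.concat([df.head(x), df.tail(x)])

And use it like this:

df = pd.DataFrame(np.random.rand(15, 6))
ends(df, 2)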

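I actually use this a lot, ~~and I think it would be a great feature to add to pandas.~~ (No new features are being added to the pandas.DataFrame core API.) I attach it right after the import, like so:

import pandas as pd
def ends(df, x=5):
    # DataFrame.append() was removed in pandas 2.0, so concatenate instead
    return pd.concat([df.head(x), df.tail(x)])
setattr(pd.DataFrame, 'ends', ends)

Used like this: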

import numpy as np
df = pd.DataFrame(np.random.rand(15,6))
df.ends(2)
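
Attaching attributes to pd.DataFrame directly works, but pandas also provides a documented extension hook, pd.api.extensions.register_dataframe_accessor, which keeps custom methods under their own namespace. Below is a minimal sketch. The accessor name `peek` and the class `PeekAccessor` are hypothetical and not part of pandas.

```python
import numpy as np
import pandas as pd

# "peek" is a made-up namespace chosen for this example.
@pd.api.extensions.register_dataframe_accessor("peek")
class PeekAccessor:
    def __init__(self, pandas_obj):
        # pandas passes in the DataFrame the accessor is called on.
        self._obj = pandas_obj

    def ends(self, n=5):
        """Return the first n and last n rows of the wrapped frame."""
        return pd.concat([self._obj.head(n), self._obj.tail(n)])

df = pd.DataFrame(np.arange(90).reshape(15, 6))
result = df.peek.ends(2)
print(result)
```

The namespace avoids clashing with any method pandas itself might add to DataFrame later.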

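【Discussion】:

Could you submit this to the pandas git repository? This should be a default feature.

【Solution 5】:

For this, you should use both head() and tail(). I think the simplest way (with pd.concat, since DataFrame.append() was removed in pandas 2.0) is:

pd.concat([df.head(5), df.tail(5)])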

【Discussion】:

【Solution 6】:

In Jupyter, expanding on @bolster's answer, we can create a reusable convenience function:

import pandas as pd
from IPython.display import display  # built in to Jupyter notebooks

def display_n(df, n):
    with pd.option_context('display.max_rows', n * 2):
        display(df)

Then

display_n(df, 2)

returns

         0           1           2
0        0.167961    -0.732745   0.952637
1        -0.050742   -0.421239   0.444715
...      ...         ...         ...
98       0.085264    0.982093    -0.509356
99       -0.758963   -0.578267   -0.115865

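(except as a nicely formatted HTML table)

when df is df = pd.DataFrame(np.random.randn(100,3))

Notes:

Of course, you can print the same content as text by changing display above to print.
On Unix-like systems, you can have the function above loaded automatically in every notebook by placing it in a py or ipy file in ~/.ipython/profile_default/startup, as described here.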
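
One detail worth knowing: since pandas 0.25 a second option, display.min_rows (default 10), caps how many rows a truncated view shows, so raising display.max_rows alone stops helping once 2*n exceeds 10. Below is a minimal text-only sketch that sets both options. The name `format_n` and the column names are made up for this example.

```python
import numpy as np
import pandas as pd

def format_n(df, n):
    """Return df rendered as text, showing the first n and last n rows."""
    # Setting min_rows together with max_rows keeps the truncated view
    # at 2*n rows even when 2*n is larger than the min_rows default.
    with pd.option_context("display.max_rows", n * 2, "display.min_rows", n * 2):
        return repr(df)

df = pd.DataFrame(np.arange(300).reshape(100, 3), columns=["a", "b", "c"])
text = format_n(df, 8)
print(text)
```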

【Discussion】:

【Solution 7】:

If you want to stay within pandas alone, you can use apply() to concatenate the head and tail of each column:

import pandas as pd
from string import ascii_lowercase, ascii_uppercase

df = pd.DataFrame(
    {"upper": list(ascii_uppercase), "lower": list(ascii_lowercase)}, index=range(1, 27)
)

df.apply(lambda x: pd.concat([x.head(2), x.tail(2)]))


   upper lower
1      A     a
2      B     b
25     Y     y
26     Z     z

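【Discussion】:

【Solution 8】:

Related to Linas Fx's answer.

Define the following (using pd.concat, since DataFrame.append() was removed in pandas 2.0):

pd.DataFrame.less = lambda df, n=10: pd.concat([df.head(n // 2), df.tail(n // 2)])

Then you only need to type df.less(),

which is equivalent to typing pd.concat([df.head(), df.tail()]).

If you type df.less(2), the result is the same as pd.concat([df.head(1), df.tail(1)]).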

【Discussion】: