The DataFrame class has many methods. This post introduces how several of them are used, with examples.
pandas.DataFrame Study Series 2: Methods (1)
1. abs() returns the absolute value of every element in the DataFrame. It requires all elements to be numeric.
    import pandas as pd
    import numpy as np

    # Load the daily quotes from the Excel file, using the Date column as the index
    df = pd.read_excel('南京银行.xlsx', index_col='Date')
    df1 = df[:5].copy()             # copy the first five rows so the edit below does not touch df
    df1.iat[0, 1] = -df1.iat[0, 1]  # flip the sign of one value for the demonstration
    df1

                Open   High   Low  Close  Turnover    Volume
    Date
    2017-09-15  8.06  -8.08  8.03   8.04    195.43  24272800
    2017-09-18  8.05   8.13  8.03   8.06    200.76  24867600
    2017-09-19  8.03   8.06  7.94   8.00    433.76  54253100
    2017-09-20  7.97   8.06  7.95   8.03    319.94  39909700
    2017-09-21  8.02   8.10  7.99   8.04    241.94  30056600

    df1.abs()

                Open   High   Low  Close  Turnover      Volume
    Date
    2017-09-15  8.06   8.08  8.03   8.04    195.43  24272800.0
    2017-09-18  8.05   8.13  8.03   8.06    200.76  24867600.0
    2017-09-19  8.03   8.06  7.94   8.00    433.76  54253100.0
    2017-09-20  7.97   8.06  7.95   8.03    319.94  39909700.0
    2017-09-21  8.02   8.10  7.99   8.04    241.94  30056600.0
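Because abs() needs numeric data, a table that also holds text columns has to be narrowed first. The following is a minimal sketch of one way to do that, assuming a small made-up frame (the Code column and all values are hypothetical, not taken from the Excel file above). It uses select_dtypes(include="number") to keep only the numeric columns before calling abs().

```python
import pandas as pd

# Hypothetical sample data; "Code" is a text column added for illustration.
df = pd.DataFrame(
    {
        "Open": [8.06, -8.05],
        "High": [-8.08, 8.13],
        "Code": ["A", "B"],
    }
)

# Keep only the numeric columns, then take absolute values.
numeric_abs = df.select_dtypes(include="number").abs()
print(numeric_abs)
```

The text column is simply left out of the result; it can be joined back afterwards if it is still needed.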
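2. add(other, axis='columns', level=None, fill_value=None) adds the elements of a Series or another DataFrame to the corresponding elements of this DataFrame. By default, a Series is matched against the column labels.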
    # Build a Series whose index matches the column names of df1
    ar1 = [8.1, 8.2, 8.0, 8.15, 200.00, 32000000]
    cl1 = ['Open', 'High', 'Low', 'Close', 'Turnover', 'Volume']
    se1 = pd.Series(data=ar1, index=cl1)
    se1

    Open               8.10
    High               8.20
    Low                8.00
    Close              8.15
    Turnover         200.00
    Volume      32000000.00
    dtype: float64

    # Each Series value is added to every row of the column with the same label
    df1.add(se1)

                 Open   High    Low  Close  Turnover      Volume
    Date
    2017-09-15  16.16   0.12  16.03  16.19    395.43  56272800.0
    2017-09-18  16.15  16.33  16.03  16.21    400.76  56867600.0
    2017-09-19  16.13  16.26  15.94  16.15    633.76  86253100.0
    2017-09-20  16.07  16.26  15.95  16.18    519.94  71909700.0
    2017-09-21  16.12  16.30  15.99  16.19    441.94  62056600.0
    # Adding a DataFrame to itself doubles every value
    df1.add(df1)

                 Open    High    Low  Close  Turnover     Volume
    Date
    2017-09-15  16.12  -16.16  16.06  16.08    390.86   48545600
    2017-09-18  16.10   16.26  16.06  16.12    401.52   49735200
    2017-09-19  16.06   16.12  15.88  16.00    867.52  108506200
    2017-09-20  15.94   16.12  15.90  16.06    639.88   79819400
    2017-09-21  16.04   16.20  15.98  16.08    483.88   60113200
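The fill_value and axis parameters in the signature are not exercised by the examples above. Here is a small sketch of both, assuming two hypothetical frames whose row labels only partly overlap (the labels d1 to d3 and all values are made up for illustration).

```python
import pandas as pd

# Hypothetical frames with partially overlapping row labels.
a = pd.DataFrame({"Open": [1.0, 2.0]}, index=["d1", "d2"])
b = pd.DataFrame({"Open": [10.0, 20.0]}, index=["d2", "d3"])

# Without fill_value, a label missing on either side yields NaN.
plain = a.add(b)

# With fill_value=0, a value missing on one side is treated as 0.
filled = a.add(b, fill_value=0)

# axis="index" matches a Series against the row labels instead of the columns.
per_row = pd.Series([100.0, 200.0], index=["d1", "d2"])
by_row = a.add(per_row, axis="index")

print(plain)
print(filled)
print(by_row)
```

Note that fill_value only helps when a value is missing on one side; if it is missing on both sides the result is still NaN.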
3. add_prefix() and add_suffix() add a prefix or a suffix to the column names.
    df1.add_prefix('list')

                listOpen  listHigh  listLow  listClose  listTurnover  listVolume
    Date
    2017-09-15      8.06      8.08     8.03       8.04        195.43    24272800
    2017-09-18      8.05      8.13     8.03       8.06        200.76    24867600
    2017-09-19      8.03      8.06     7.94       8.00        433.76    54253100
    2017-09-20      7.97      8.06     7.95       8.03        319.94    39909700
    2017-09-21      8.02      8.10     7.99       8.04        241.94    30056600

    df1.add_suffix('list')

                Openlist  Highlist  Lowlist  Closelist  Turnoverlist  Volumelist
    Date
    2017-09-15      8.06      8.08     8.03       8.04        195.43    24272800
    2017-09-18      8.05      8.13     8.03       8.06        200.76    24867600
    2017-09-19      8.03      8.06     7.94       8.00        433.76    54253100
    2017-09-20      7.97      8.06     7.95       8.03        319.94    39909700
    2017-09-21      8.02      8.10     7.99       8.04        241.94    30056600
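One detail worth checking is that both methods return a new DataFrame and leave the original column names untouched. A short sketch, using a hypothetical two-column frame and an underscore separator chosen only for readability:

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({"Open": [8.06, 8.05], "High": [8.08, 8.13]})

# Both calls return new objects; df itself is not modified.
prefixed = df.add_prefix("list_")
suffixed = df.add_suffix("_list")

print(list(prefixed.columns))
print(list(suffixed.columns))
print(list(df.columns))
```

To keep the renamed columns, assign the result to a variable as done here.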
4. agg(func, axis=0, *args, **kwargs) performs aggregation. Commonly used functions include min, max, prod, mean, std, var, and median.
    # Apply a single function to every column
    df1.agg('sum')

    Open        4.013000e+01
    High        4.043000e+01
    Low         3.994000e+01
    Close       4.017000e+01
    Turnover    1.391830e+03
    Volume      1.733598e+08
    dtype: float64

    # Apply two functions to every column
    df1.agg(['sum', 'min'])

          Open   High    Low  Close  Turnover     Volume
    sum  40.13  40.43  39.94  40.17   1391.83  173359800
    min   7.97   8.06   7.94   8.00    195.43   24272800

    # Apply different functions to different columns
    df1.agg({'Open': ['sum', 'min'], 'Close': ['sum', 'max']})

         Close   Open
    max   8.06    NaN
    min    NaN   7.97
    sum  40.17  40.13
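Two points from the examples above can be sketched on a smaller table: the axis parameter switches the aggregation from per-column to per-row, and the NaN cells in the dictionary form mark functions that were not requested for that column. The frame below is hypothetical and its values are chosen only to keep the arithmetic easy to follow.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({"Open": [8.0, 7.0], "Close": [9.0, 8.0]})

# axis="columns" aggregates across the columns, giving one value per row.
row_mean = df.agg("mean", axis="columns")

# Different functions per column; combinations that were not requested are NaN.
per_col = df.agg({"Open": ["sum", "min"], "Close": ["sum", "max"]})

print(row_mean)
print(per_col)
```

Here "max" was only requested for Close, so the Open cell in the max row is NaN, just as in the output above.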
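5. align() aligns a DataFrame with a Series or another DataFrame on their axes and returns the two aligned objects as a tuple. The join parameter selects how the labels are combined; the common choices are inner, outer, left, and right.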
    # df2 holds only the last two of the five rows in df1
    df2 = df[3:5]
    df2

                Open  High   Low  Close  Turnover    Volume
    Date
    2017-09-20  7.97  8.06  7.95   8.03    319.94  39909700
    2017-09-21  8.02  8.10  7.99   8.04    241.94  30056600

    # Inner join: keep only the labels present in both; the result is a tuple
    df1.align(df2, join='inner')

    (            Open  High   Low  Close  Turnover    Volume
     Date
     2017-09-20  7.97  8.06  7.95   8.03    319.94  39909700
     2017-09-21  8.02  8.10  7.99   8.04    241.94  30056600,
                 Open  High   Low  Close  Turnover    Volume
     Date
     2017-09-20  7.97  8.06  7.95   8.03    319.94  39909700
     2017-09-21  8.02  8.10  7.99   8.04    241.94  30056600)

    # Left join: keep the labels of df1; rows missing from df2 are filled with NaN
    df1.align(df2, join='left')

    (            Open  High   Low  Close  Turnover    Volume
     Date
     2017-09-15  8.06  8.08  8.03   8.04    195.43  24272800
     2017-09-18  8.05  8.13  8.03   8.06    200.76  24867600
     2017-09-19  8.03  8.06  7.94   8.00    433.76  54253100
     2017-09-20  7.97  8.06  7.95   8.03    319.94  39909700
     2017-09-21  8.02  8.10  7.99   8.04    241.94  30056600,
                 Open  High   Low  Close  Turnover      Volume
     Date
     2017-09-15   NaN   NaN   NaN    NaN       NaN         NaN
     2017-09-18   NaN   NaN   NaN    NaN       NaN         NaN
     2017-09-19   NaN   NaN   NaN    NaN       NaN         NaN
     2017-09-20  7.97  8.06  7.95   8.03    319.94  39909700.0
     2017-09-21  8.02  8.10  7.99   8.04    241.94  30056600.0)

    # Index the tuple to get one of the aligned DataFrames
    df1.align(df2, join='left')[0]

                Open  High   Low  Close  Turnover    Volume
    Date
    2017-09-15  8.06  8.08  8.03   8.04    195.43  24272800
    2017-09-18  8.05  8.13  8.03   8.06    200.76  24867600
    2017-09-19  8.03  8.06  7.94   8.00    433.76  54253100
    2017-09-20  7.97  8.06  7.95   8.03    319.94  39909700
    2017-09-21  8.02  8.10  7.99   8.04    241.94  30056600
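The examples above cover the inner and left joins. As a complement, here is a minimal sketch of an outer join, assuming two hypothetical frames with partly overlapping row labels (d1 to d3 and all values are made up). It also unpacks the returned tuple directly instead of indexing it.

```python
import pandas as pd

# Hypothetical frames with partially overlapping row labels.
a = pd.DataFrame({"Open": [1.0, 2.0]}, index=["d1", "d2"])
b = pd.DataFrame({"Open": [10.0, 20.0]}, index=["d2", "d3"])

# Outer join: both results carry the union of the labels,
# with NaN wherever the original frame had no such row.
left, right = a.align(b, join="outer")

print(left)
print(right)
```

After alignment the two results share the same index, so they can be compared or combined row by row.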