Pandas系列（十三）-其他常用功能

一、统计数据频率

Pandas系列（十三）-其他常用功能

　1. values_counts

pd.value_counts(df.column_name)
df.column_name.value_counts()

Series.value_counts(normalize=False, sort=True, ascending=False, bins=None, dropna=True)[source]
Return a Series containing counts of unique values.

　　参数详解

normalize : boolean, default False
If True then the object returned will contain the relative frequencies of the unique values.

sort : boolean, default True
Sort by values.

ascending : boolean, default False
Sort in ascending order.

bins : integer, optional
Rather than count values, group them into half-open bins, a convenience for pd.cut, only works with numeric data.

dropna : boolean, default True
Don’t include counts of NaN.

　　参数示例讲解

index = pd.Index([3, 1, 2, 3, 4, np.nan])
index.value_counts()
Out[144]: 
3.0    2
4.0    1
2.0    1
1.0    1
dtype: int64
index.value_counts(normalize=True)
Out[145]: 
3.0    0.4
4.0    0.2
2.0    0.2
1.0    0.2
dtype: float64
index.value_counts(bins=3)
Out[146]: 
(2.0, 3.0]      2
(0.996, 2.0]    2
(3.0, 4.0]      1
dtype: int64
index.value_counts(dropna=False)
Out[148]: 
 3.0    2
NaN     1
 4.0    1
 2.0    1
 1.0    1
dtype: int64

In [21]:  data=pd.DataFrame(pd.Series([1,2,3,4,5,6,11,1,1,1,1,2,2,2,2,3]).values.reshape(4,4),columns=['a','b','c','d'])

In [22]: data
Out[22]: 
   a  b   c  d
0  1  2   3  4
1  5  6  11  1
2  1  1   1  2
3  2  2   2  3

In [23]: pd.value_counts(data.a)
Out[23]: 
1    2
2    1
5    1
Name: a, dtype: int64

In [26]: pd.value_counts(data.a).sort_index()
Out[26]: 
1    2
2    1
5    1
Name: a, dtype: int64

In [27]: pd.value_counts(data.a).sort_index().index
Out[27]: Int64Index([1, 2, 5], dtype='int64')

In [28]: pd.value_counts(data.a).sort_index().values
Out[28]: array([2, 1, 1], dtype=int64)

values_count实例