Ten Essential Skills for Selecting Rows and Columns in Pandas

Thanks to the original bloggers. Reposted from:

https://blog.csdn.net/qq_38328378/article/details/81166518

http://www.datastudy.cc/article/ec8c50baa8fd93ea85432eb85fb34eee

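Skill 1: Select a single column

Skill 2: Select multiple columns

Skill 3: Select a row by its index label

Skill 4: Select rows from the start up to a given row position

Skill 5: Select rows between two row positions

Skill 6: Select rows from a given row position to the end

Skill 7: Select all columns from the first up to a given column position

Skill 8: Filter rows by condition

Skill 9: Select rows by string index labels

Skill 10: Select a single value by row label/position and column name/position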

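Today we'll walk through ten skills for selecting rows and columns in Pandas. You will need every one of them in day-to-day Pandas work, because sooner or later you will want to manipulate a Python DataFrame as freely as you would a sheet in Excel.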

First, let's load the demo data; you can copy and run the code below as-is.

# import the pandas module
import pandas as pd

# Create an example dataframe about a fictional army
raw_data = {
    'regiment': ['Nighthawks', 'Nighthawks', 'Nighthawks', 'Nighthawks', 'Dragoons',
                 'Dragoons', 'Dragoons', 'Dragoons', 'Scouts', 'Scouts', 'Scouts', 'Scouts'],
    'company': ['1st', '1st', '2nd', '2nd', '1st', '1st', '2nd', '2nd', '1st', '1st', '2nd', '2nd'],
    'deaths': [523, 52, 25, 616, 43, 234, 523, 62, 62, 73, 37, 35],
    'battles': [5, 42, 2, 2, 4, 7, 8, 3, 4, 7, 8, 9],
    'size': [1045, 957, 1099, 1400, 1592, 1006, 987, 849, 973, 1005, 1099, 1523],
    'veterans': [1, 5, 62, 26, 73, 37, 949, 48, 48, 435, 63, 345],
    'readiness': [1, 2, 3, 3, 2, 1, 2, 3, 2, 1, 2, 3],
    'armored': [1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1],
    'deserters': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
    'origin': ['Arizona', 'California', 'Texas', 'Florida', 'Maine', 'Iowa', 'Alaska', 'Washington', 'Oregon', 'Wyoming', 'Louisana', 'Georgia']
}
df = pd.DataFrame(
    raw_data,
    columns=['regiment', 'company', 'deaths', 'battles', 'size', 'veterans', 'readiness', 'armored', 'deserters', 'origin']
)

# Use the 'origin' column as the row index, then preview the first five rows
df = df.set_index('origin')
df.head()

	regiment	company	deaths	battles	size	veterans	readiness	armored	deserters
origin
Arizona	Nighthawks	1st	523	5	1045	1	1	1	4
California	Nighthawks	1st	52	42	957	5	2	0	24
Texas	Nighthawks	2nd	25	2	1099	62	3	1	31
Florida	Nighthawks	2nd	616	2	1400	26	3	1	2
Maine	Dragoons	1st	43	4	1592	73	2	0	3

Skill 1: Select a single column

df['size']

Output:

origin
Arizona       1045 
California     957 
Texas         1099 
Florida       1400 
Maine         1592 
Iowa          1006 
Alaska         987 
Washington     849 
Oregon         973 
Wyoming       1005 
Louisana      1099 
Georgia       1523 
Name: size, dtype: int64
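
Selecting one column with a plain string returns a Series, as the output above shows. If a one-column DataFrame is needed instead, wrapping the name in a list should do it. The sketch below assumes a trimmed-down copy of the demo data (first three rows, two columns); the variable names as_series and as_frame are illustrative only.

```python
import pandas as pd

# Trimmed-down copy of the demo data (assumption: first three rows only)
df = pd.DataFrame(
    {
        'size': [1045, 957, 1099],
        'veterans': [1, 5, 62],
    },
    index=pd.Index(['Arizona', 'California', 'Texas'], name='origin'),
)

as_series = df['size']    # single string  -> pandas Series
as_frame = df[['size']]   # list of names  -> DataFrame with one column

print(type(as_series).__name__)  # Series
print(type(as_frame).__name__)   # DataFrame
print(as_frame.shape)            # (3, 1)
```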

Skill 2: Select multiple columns

df[['size', 'veterans']]

	size	veterans
origin
Arizona	1045	1
California	957	5
Texas	1099	62
Florida	1400	26
Maine	1592	73
Iowa	1006	37
Alaska	987	949
Washington	849	48
Oregon	973	48
Wyoming	1005	435
Louisana	1099	63
Georgia	1523	345

Skill 3: Select a row by its index label

# Select the row with the index label "Arizona" (passing a list of labels returns a DataFrame)
df.loc[['Arizona']]

	regiment	company	deaths	battles	size	veterans	readiness	armored	deserters
origin
Arizona	Nighthawks	1st	523	5	1045	1	1	1	4
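
With .loc, a single label, a list of labels, and a label slice behave differently, and a label slice includes its end label. A minimal sketch of the three forms, assuming a trimmed-down copy of the demo data with only the deaths column:

```python
import pandas as pd

# Trimmed-down copy of the demo data (assumption: first four rows, one column)
df = pd.DataFrame(
    {'deaths': [523, 52, 25, 616]},
    index=pd.Index(['Arizona', 'California', 'Texas', 'Florida'], name='origin'),
)

row_as_series = df.loc['Texas']     # single label    -> Series holding that row
row_as_frame = df.loc[['Texas']]    # list of labels  -> DataFrame with one row
up_to_texas = df.loc[:'Texas']      # label slice     -> every row up to AND including Texas

print(row_as_series['deaths'])      # 25
print(row_as_frame.shape)           # (1, 1)
print(list(up_to_texas.index))      # ['Arizona', 'California', 'Texas']
```

A slice such as df.loc[:'Arizona'] returns a single row here only because Arizona happens to be the first label.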

Skill 4: Select rows from the start up to a given row position

# Select the first 2 rows (positions 0 and 1; the end position is excluded)
df.iloc[:2]

	regiment	company	deaths	battles	size	veterans	readiness	armored	deserters
origin
Arizona	Nighthawks	1st	523	5	1045	1	1	1	4
California	Nighthawks	1st	52	42	957	5	2	0	24

Skill 5: Select rows between two row positions

# Select the rows from position 1 up to, but not including, position 2
df.iloc[1:2]

	regiment	company	deaths	battles	size	veterans	readiness	armored	deserters
origin
California	Nighthawks	1st	52	42	957	5	2	0	24

Skill 6: Select rows from a given row position to the end

# Select every row from position 2 to the end
df.iloc[2:]

	regiment	company	deaths	battles	size	veterans	readiness	armored	deserters
origin
Texas	Nighthawks	2nd	25	2	1099	62	3	1	31
Florida	Nighthawks	2nd	616	2	1400	26	3	1	2
Maine	Dragoons	1st	43	4	1592	73	2	0	3
Iowa	Dragoons	1st	234	7	1006	37	1	1	4
Alaska	Dragoons	2nd	523	8	987	949	2	0	24
Washington	Dragoons	2nd	62	3	849	48	3	1	31
Oregon	Scouts	1st	62	4	973	48	2	0	2
Wyoming	Scouts	1st	73	7	1005	435	1	0	3
Louisana	Scouts	2nd	37	8	1099	63	2	1	2
Georgia	Scouts	2nd	35	9	1523	345	3	1	3
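
The three positional slices in Skills 4 to 6 follow Python's usual half-open convention: the start position is included and the end position is excluded. A small sketch, assuming a five-row copy of the demo data with only the deaths column:

```python
import pandas as pd

# Trimmed-down copy of the demo data (assumption: first five rows, one column)
df = pd.DataFrame(
    {'deaths': [523, 52, 25, 616, 43]},
    index=pd.Index(['Arizona', 'California', 'Texas', 'Florida', 'Maine'], name='origin'),
)

head = df.iloc[:2]     # positions 0 and 1
middle = df.iloc[1:2]  # position 1 only
tail = df.iloc[2:]     # position 2 through the last row

print(list(head.index))    # ['Arizona', 'California']
print(list(middle.index))  # ['California']
print(list(tail.index))    # ['Texas', 'Florida', 'Maine']
```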

Skill 7: Select all columns from the first up to a given column position

# Select the first 2 columns (all rows)
df.iloc[:, :2]

	regiment	company
origin
Arizona	Nighthawks	1st
California	Nighthawks	1st
Texas	Nighthawks	2nd
Florida	Nighthawks	2nd
Maine	Dragoons	1st
Iowa	Dragoons	1st
Alaska	Dragoons	2nd
Washington	Dragoons	2nd
Oregon	Scouts	1st
Wyoming	Scouts	1st
Louisana	Scouts	2nd
Georgia	Scouts	2nd
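
Column positions can be combined with row positions in one .iloc call, and the same columns can also be reached with a label slice through .loc. The following is a sketch under the assumption of a three-row, three-column copy of the demo data; by_position, by_label and top_left are illustrative names.

```python
import pandas as pd

# Trimmed-down copy of the demo data (assumption: first three rows and columns)
df = pd.DataFrame(
    {
        'regiment': ['Nighthawks', 'Nighthawks', 'Nighthawks'],
        'company': ['1st', '1st', '2nd'],
        'deaths': [523, 52, 25],
    },
    index=pd.Index(['Arizona', 'California', 'Texas'], name='origin'),
)

by_position = df.iloc[:, :2]       # columns at positions 0 and 1
by_label = df.loc[:, :'company']   # label slice; the end label is included
top_left = df.iloc[:2, :2]         # first 2 rows and first 2 columns

print(list(by_position.columns))     # ['regiment', 'company']
print(by_position.equals(by_label))  # True
print(top_left.shape)                # (2, 2)
```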

Skill 8: Filter rows by condition

# Select rows where df.deaths is greater than 50
df[df['deaths'] > 50]

	regiment	company	deaths	battles	size	veterans	readiness	armored	deserters
origin
Arizona	Nighthawks	1st	523	5	1045	1	1	1	4
California	Nighthawks	1st	52	42	957	5	2	0	24
Florida	Nighthawks	2nd	616	2	1400	26	3	1	2
Iowa	Dragoons	1st	234	7	1006	37	1	1	4
Alaska	Dragoons	2nd	523	8	987	949	2	0	24
Washington	Dragoons	2nd	62	3	849	48	3	1	31
Oregon	Scouts	1st	62	4	973	48	2	0	2
Wyoming	Scouts	1st	73	7	1005	435	1	0	3

# Select rows where df.deaths is greater than 500 or less than 50 
df[(df['deaths'] > 500) | (df['deaths'] < 50)]

	regiment	company	deaths	battles	size	veterans	readiness	armored	deserters
origin
Arizona	Nighthawks	1st	523	5	1045	1	1	1	4
Texas	Nighthawks	2nd	25	2	1099	62	3	1	31
Florida	Nighthawks	2nd	616	2	1400	26	3	1	2
Maine	Dragoons	1st	43	4	1592	73	2	0	3
Alaska	Dragoons	2nd	523	8	987	949	2	0	24
Louisana	Scouts	2nd	37	8	1099	63	2	1	2
Georgia	Scouts	2nd	35	9	1523	345	3	1	3

# Select all the regiments not named "Dragoons" 
df[~(df['regiment'] == 'Dragoons')]

	regiment	company	deaths	battles	size	veterans	readiness	armored	deserters
origin
Arizona	Nighthawks	1st	523	5	1045	1	1	1	4
California	Nighthawks	1st	52	42	957	5	2	0	24
Texas	Nighthawks	2nd	25	2	1099	62	3	1	31
Florida	Nighthawks	2nd	616	2	1400	26	3	1	2
Oregon	Scouts	1st	62	4	973	48	2	0	2
Wyoming	Scouts	1st	73	7	1005	435	1	0	3
Louisana	Scouts	2nd	37	8	1099	63	2	1	2
Georgia	Scouts	2nd	35	9	1523	345	3	1	3
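
The filters above use | for "or" and ~ for "not". The matching "and" operator is &, and each comparison needs its own parentheses because these operators bind more tightly than > or ==. A minimal sketch, assuming a six-row copy of the demo data; Series.isin is shown as one possible alternative for matching several values at once.

```python
import pandas as pd

# Trimmed-down copy of the demo data (assumption: first six rows, two columns)
df = pd.DataFrame(
    {
        'regiment': ['Nighthawks', 'Nighthawks', 'Nighthawks', 'Nighthawks',
                     'Dragoons', 'Dragoons'],
        'deaths': [523, 52, 25, 616, 43, 234],
    },
    index=pd.Index(['Arizona', 'California', 'Texas', 'Florida', 'Maine', 'Iowa'],
                   name='origin'),
)

# Both conditions must hold: more than 50 deaths AND not a Dragoons regiment
both = df[(df['deaths'] > 50) & (df['regiment'] != 'Dragoons')]

# Keep rows whose regiment is one of several values
listed = df[df['regiment'].isin(['Nighthawks', 'Scouts'])]

print(list(both.index))    # ['Arizona', 'California', 'Florida']
print(list(listed.index))  # ['Arizona', 'California', 'Texas', 'Florida']
```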

Skill 9: Select rows by string index labels

# Select the rows called Arizona and Texas
# (.ix was removed in pandas 1.0; use .loc for label-based selection)
df.loc[['Arizona', 'Texas']]

	regiment	company	deaths	battles	size	veterans	readiness	armored	deserters
Arizona	Nighthawks	1st	523	5	1045	1	1	1	4
Texas	Nighthawks	2nd	25	2	1099	62	3	1	31
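
When rows are picked with a list of labels, the result follows the order of the list rather than the order of the original index. A short sketch of that behaviour, assuming a three-row copy of the demo data:

```python
import pandas as pd

# Trimmed-down copy of the demo data (assumption: first three rows, one column)
df = pd.DataFrame(
    {'deaths': [523, 52, 25]},
    index=pd.Index(['Arizona', 'California', 'Texas'], name='origin'),
)

# Ask for Texas first, then Arizona
picked = df.loc[['Texas', 'Arizona']]

print(list(picked.index))         # ['Texas', 'Arizona']
print(picked['deaths'].tolist())  # [25, 523]
```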

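Skill 10: Select a single value by row label/position and column name/position

# Select the "deaths" cell in the row named Arizona (row label, column name)
df.loc['Arizona', 'deaths']

523

# Select the third cell in the row named Arizona (row label, column position)
df.loc['Arizona', df.columns[2]]

523

# Select the third cell down in the column named deaths (row position, column name)
df['deaths'].iloc[2]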
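
For single-value lookups, pandas also provides the scalar accessors .at (labels) and .iat (positions). When a row position has to be paired with a column name, df.columns.get_loc can translate the name into a position for .iloc. The sketch below assumes a three-row, three-column copy of the demo data; the variable names are illustrative.

```python
import pandas as pd

# Trimmed-down copy of the demo data (assumption: first three rows and columns)
df = pd.DataFrame(
    {
        'regiment': ['Nighthawks', 'Nighthawks', 'Nighthawks'],
        'company': ['1st', '1st', '2nd'],
        'deaths': [523, 52, 25],
    },
    index=pd.Index(['Arizona', 'California', 'Texas'], name='origin'),
)

by_labels = df.at['Arizona', 'deaths']            # row label + column name
by_positions = df.iat[0, 2]                       # row position + column position
mixed = df.iloc[2, df.columns.get_loc('deaths')]  # row position + column name

print(by_labels, by_positions, mixed)  # 523 523 25
```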