Thanks to the original authors. Reposted from:

https://blog.csdn.net/qq_38328378/article/details/81166518

http://www.datastudy.cc/article/ec8c50baa8fd93ea85432eb85fb34eee

 

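Contents

    Skill 1: Select one column

    Skill 2: Select multiple columns

    Skill 3: Select one row by its index label

    Skill 4: Given one row position, select the rows from the start up to that position

    Skill 5: Given two row positions, select the rows from the first position up to the second

    Skill 6: Given one row position, select the rows from that position to the end

    Skill 7: Given one column position, select all columns from the first up to that position

    Skill 8: Conditional filtering

    Skill 9: Select rows by their string index labels

    Skill 10: Select a single value by row label/row position and column name/column position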

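Ten Skills for Selecting Rows and Columns in Pandas

    Today we will go through ten skills for selecting rows and columns in Pandas. You will almost certainly need them when working with Pandas, because you will want to manipulate a Python data frame as freely as a spreadsheet in Excel.

Ten skills for selecting rows and columns in Pandas

    First, load the demo data. You can copy the code below and run it as-is.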

    

# Import the pandas module
import pandas as pd

# Create an example dataframe about a fictional army
raw_data = {
    'regiment': ['Nighthawks', 'Nighthawks', 'Nighthawks', 'Nighthawks', 'Dragoons',
                 'Dragoons', 'Dragoons', 'Dragoons', 'Scouts', 'Scouts', 'Scouts', 'Scouts'],
    'company': ['1st', '1st', '2nd', '2nd', '1st', '1st', '2nd', '2nd', '1st', '1st', '2nd', '2nd'],
    'deaths': [523, 52, 25, 616, 43, 234, 523, 62, 62, 73, 37, 35],
    'battles': [5, 42, 2, 2, 4, 7, 8, 3, 4, 7, 8, 9],
    'size': [1045, 957, 1099, 1400, 1592, 1006, 987, 849, 973, 1005, 1099, 1523],
    'veterans': [1, 5, 62, 26, 73, 37, 949, 48, 48, 435, 63, 345],
    'readiness': [1, 2, 3, 3, 2, 1, 2, 3, 2, 1, 2, 3],
    'armored': [1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1],
    'deserters': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
    'origin': ['Arizona', 'California', 'Texas', 'Florida', 'Maine', 'Iowa', 'Alaska', 'Washington', 'Oregon', 'Wyoming', 'Louisana', 'Georgia']
}

# Build the dataframe with an explicit column order
df = pd.DataFrame(
    raw_data,
    columns=['regiment', 'company', 'deaths', 'battles', 'size', 'veterans', 'readiness', 'armored', 'deserters', 'origin']
)

# Use the 'origin' column as the row index, then show the first five rows
df = df.set_index('origin')
df.head()
              regiment  company  deaths  battles  size  veterans  readiness  armored  deserters
origin
Arizona     Nighthawks      1st     523        5  1045         1          1        1          4
California  Nighthawks      1st      52       42   957         5          2        0         24
Texas       Nighthawks      2nd      25        2  1099        62          3        1         31
Florida     Nighthawks      2nd     616        2  1400        26          3        1          2
Maine         Dragoons      1st      43        4  1592        73          2        0          3

    Skill 1: Select one column

df['size']

Output:

origin
Arizona       1045 
California     957 
Texas         1099 
Florida       1400 
Maine         1592 
Iowa          1006 
Alaska         987 
Washington     849 
Oregon         973 
Wyoming       1005 
Louisana      1099 
Georgia       1523 
Name: size, dtype: int64
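
The output above is a Series, not a data frame. The sketch below illustrates the difference between single and double brackets; the three-row frame named demo is a hypothetical stand-in for the article's df, used only so the snippet runs on its own.

```python
import pandas as pd

# Hypothetical three-row frame, for illustration only
demo = pd.DataFrame(
    {"size": [1045, 957, 1099], "veterans": [1, 5, 62]},
    index=["Arizona", "California", "Texas"],
)

one_col_series = demo["size"]    # single brackets: a pandas Series
one_col_frame = demo[["size"]]   # a list with one name: a one-column DataFrame

print(type(one_col_series).__name__)
print(type(one_col_frame).__name__)
print(one_col_frame.shape)
```

Use the list form when later code expects a DataFrame, for example when the result is passed to something that reads .columns.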

    Skill 2: Select multiple columns

df[['size', 'veterans']]

            size  veterans
origin
Arizona     1045         1
California   957         5
Texas       1099        62
Florida     1400        26
Maine       1592        73
Iowa        1006        37
Alaska       987       949
Washington   849        48
Oregon       973        48
Wyoming     1005       435
Louisana    1099        63
Georgia     1523       345

 

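    Skill 3: Select one row by its index label

# Select the row with the index label "Arizona".
# Passing a list keeps the result as a one-row dataframe.
# (A slice such as df.loc[:'Arizona'] selects every row from the start
# through "Arizona"; it returns one row only because "Arizona" comes first.)
df.loc[['Arizona']]

              regiment  company  deaths  battles  size  veterans  readiness  armored  deserters
origin
Arizona     Nighthawks      1st     523        5  1045         1          1        1          4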
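
What .loc returns depends on what is passed to it. The following is a minimal sketch on a hypothetical three-row frame (demo is not part of the article's data) comparing a single label, a list of labels, and a label slice.

```python
import pandas as pd

# Hypothetical frame, for illustration only
demo = pd.DataFrame(
    {"deaths": [523, 52, 25], "battles": [5, 42, 2]},
    index=["Arizona", "California", "Texas"],
)

row_series = demo.loc["Arizona"]      # single label: a Series holding that row
row_frame = demo.loc[["Arizona"]]     # list of labels: a DataFrame with one row
up_to = demo.loc[:"California"]       # label slice: the end label is included

print(row_series["deaths"])
print(row_frame.shape)
print(list(up_to.index))
```

Note that a label slice includes its end label, unlike position-based slices.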

 

    Skill 4: Given one row position, select the rows from the start up to that position

# Select the first two rows (positions 0 and 1; the end position 2 is excluded)
df.iloc[:2]

              regiment  company  deaths  battles  size  veterans  readiness  armored  deserters
origin
Arizona     Nighthawks      1st     523        5  1045         1          1        1          4
California  Nighthawks      1st      52       42   957         5          2        0         24

    

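    Skill 5: Given two row positions, select the rows from the first position up to the second

# Select the rows from position 1 up to, but not including, position 2
df.iloc[1:2]

              regiment  company  deaths  battles  size  veterans  readiness  armored  deserters
origin
California  Nighthawks      1st      52       42   957         5          2        0         24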

 

    Skill 6: Given one row position, select the rows from that position to the end

# Select every row from position 2 (the third row) onward
df.iloc[2:]

              regiment  company  deaths  battles  size  veterans  readiness  armored  deserters
origin
Texas       Nighthawks      2nd      25        2  1099        62          3        1         31
Florida     Nighthawks      2nd     616        2  1400        26          3        1          2
Maine         Dragoons      1st      43        4  1592        73          2        0          3
Iowa          Dragoons      1st     234        7  1006        37          1        1          4
Alaska        Dragoons      2nd     523        8   987       949          2        0         24
Washington    Dragoons      2nd      62        3   849        48          3        1         31
Oregon          Scouts      1st      62        4   973        48          2        0          2
Wyoming         Scouts      1st      73        7  1005       435          1        0          3
Louisana        Scouts      2nd      37        8  1099        63          2        1          2
Georgia         Scouts      2nd      35        9  1523       345          3        1          3
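
Position-based slices with .iloc follow the same half-open rule as Python lists: the start position is included and the end position is not. Here is a small sketch on a hypothetical four-row frame (demo is an assumption for illustration) that collects the three slice forms in one place, plus a negative start position.

```python
import pandas as pd

# Hypothetical frame, for illustration only
demo = pd.DataFrame(
    {"deaths": [523, 52, 25, 616]},
    index=["Arizona", "California", "Texas", "Florida"],
)

first_two = demo.iloc[:2]     # positions 0 and 1
middle = demo.iloc[1:2]       # position 1 only
from_third = demo.iloc[2:]    # position 2 through the end
last_two = demo.iloc[-2:]     # negative positions count from the end

for name, part in [("first_two", first_two), ("middle", middle),
                   ("from_third", from_third), ("last_two", last_two)]:
    print(name, list(part.index))
```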

 

    Skill 7: Given one column position, select all columns from the first up to that position

# Select the first 2 columns (all rows)
df.iloc[:, :2]

              regiment  company
origin
Arizona     Nighthawks      1st
California  Nighthawks      1st
Texas       Nighthawks      2nd
Florida     Nighthawks      2nd
Maine         Dragoons      1st
Iowa          Dragoons      1st
Alaska        Dragoons      2nd
Washington    Dragoons      2nd
Oregon          Scouts      1st
Wyoming         Scouts      1st
Louisana        Scouts      2nd
Georgia         Scouts      2nd
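
The column axis accepts the same kinds of selectors as the row axis. As a rough sketch on a hypothetical two-row frame (demo is an assumption, not the article's df), the snippet below compares a position slice, a list of positions, and a label slice over columns.

```python
import pandas as pd

# Hypothetical frame, for illustration only
demo = pd.DataFrame(
    {
        "regiment": ["Nighthawks", "Dragoons"],
        "company": ["1st", "2nd"],
        "deaths": [523, 43],
        "battles": [5, 4],
    },
    index=["Arizona", "Maine"],
)

first_two = demo.iloc[:, :2]                   # columns at positions 0 and 1
picked = demo.iloc[:, [0, 2]]                  # columns at positions 0 and 2
by_label = demo.loc[:, "company":"deaths"]     # label slice, end label included

print(list(first_two.columns))
print(list(picked.columns))
print(list(by_label.columns))
```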

 

    Skill 8: Conditional filtering

# Select rows where df.deaths is greater than 50
df[df['deaths'] > 50]

              regiment  company  deaths  battles  size  veterans  readiness  armored  deserters
origin
Arizona     Nighthawks      1st     523        5  1045         1          1        1          4
California  Nighthawks      1st      52       42   957         5          2        0         24
Florida     Nighthawks      2nd     616        2  1400        26          3        1          2
Iowa          Dragoons      1st     234        7  1006        37          1        1          4
Alaska        Dragoons      2nd     523        8   987       949          2        0         24
Washington    Dragoons      2nd      62        3   849        48          3        1         31
Oregon          Scouts      1st      62        4   973        48          2        0          2
Wyoming         Scouts      1st      73        7  1005       435          1        0          3

# Select rows where df.deaths is greater than 500 or less than 50
df[(df['deaths'] > 500) | (df['deaths'] < 50)]

              regiment  company  deaths  battles  size  veterans  readiness  armored  deserters
origin
Arizona     Nighthawks      1st     523        5  1045         1          1        1          4
Texas       Nighthawks      2nd      25        2  1099        62          3        1         31
Florida     Nighthawks      2nd     616        2  1400        26          3        1          2
Maine         Dragoons      1st      43        4  1592        73          2        0          3
Alaska        Dragoons      2nd     523        8   987       949          2        0         24
Louisana        Scouts      2nd      37        8  1099        63          2        1          2
Georgia         Scouts      2nd      35        9  1523       345          3        1          3

# Select all the regiments not named "Dragoons"
df[~(df['regiment'] == 'Dragoons')]

              regiment  company  deaths  battles  size  veterans  readiness  armored  deserters
origin
Arizona     Nighthawks      1st     523        5  1045         1          1        1          4
California  Nighthawks      1st      52       42   957         5          2        0         24
Texas       Nighthawks      2nd      25        2  1099        62          3        1         31
Florida     Nighthawks      2nd     616        2  1400        26          3        1          2
Oregon          Scouts      1st      62        4   973        48          2        0          2
Wyoming         Scouts      1st      73        7  1005       435          1        0          3
Louisana        Scouts      2nd      37        8  1099        63          2        1          2
Georgia         Scouts      2nd      35        9  1523       345          3        1          3
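
The filters above use "or" (|) and "not" (~). For completeness, here is a minimal sketch of three related forms: combining conditions with "and" (&), a membership test with isin, and the same filter written with DataFrame.query. The four-row frame demo and its values are assumptions chosen only to keep the example short.

```python
import pandas as pd

# Hypothetical frame, for illustration only
demo = pd.DataFrame(
    {
        "regiment": ["Nighthawks", "Dragoons", "Scouts", "Scouts"],
        "deaths": [523, 43, 62, 35],
        "battles": [5, 4, 4, 9],
    },
    index=["Arizona", "Maine", "Oregon", "Georgia"],
)

# AND: wrap each condition in parentheses, because & binds tighter than > and <
both = demo[(demo["deaths"] > 40) & (demo["battles"] < 5)]

# Membership test instead of chaining several == comparisons with |
some_regiments = demo[demo["regiment"].isin(["Dragoons", "Scouts"])]

# The same AND filter expressed as a query string
queried = demo.query("deaths > 40 and battles < 5")

print(list(both.index))
print(list(some_regiments.index))
print(both.equals(queried))
```

Python's own and, or, and not keywords do not work on a whole Series; the element-wise operators &, |, and ~ are needed inside the brackets.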

 

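    Skill 9: Select rows by their string index labels

# Select the rows labeled Arizona and Texas.
# The original used df.ix, which was removed in pandas 1.0; df.loc is the label-based replacement.
df.loc[['Arizona', 'Texas']]

              regiment  company  deaths  battles  size  veterans  readiness  armored  deserters
origin
Arizona     Nighthawks      1st     523        5  1045         1          1        1          4
Texas       Nighthawks      2nd      25        2  1099        62          3        1         31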
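
One practical caveat, to the best of my knowledge: in pandas 1.0 and later, passing .loc a list that contains a label missing from the index raises a KeyError. The sketch below shows two ways to select only the labels that actually exist. The frame demo and the missing label "Nevada" are hypothetical.

```python
import pandas as pd

# Hypothetical frame, for illustration only
demo = pd.DataFrame(
    {"deaths": [523, 52, 25]},
    index=["Arizona", "California", "Texas"],
)

wanted = ["Texas", "Arizona", "Nevada"]  # "Nevada" is deliberately not in the index

# Boolean mask on the index: keeps matching rows in the frame's own order
present = demo[demo.index.isin(wanted)]

# Filter the label list first: keeps matching rows in the requested order
in_requested_order = demo.loc[[label for label in wanted if label in demo.index]]

print(list(present.index))
print(list(in_requested_order.index))
```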

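    Skill 10: Select a single value by row label/row position and column name/column position

# Note: the original used df.ix, which was removed in pandas 1.0.
# The lines below use .loc (labels) and .iloc (positions) instead.

# Select the cell in the row named Arizona and the column named deaths
df.loc['Arizona', 'deaths']

523

# Select the third cell (position 2) in the row named Arizona
df.loc['Arizona'].iloc[2]

523

# Select the third cell down (position 2) in the column named deaths
df['deaths'].iloc[2]

25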
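
For reading a single value, pandas also provides the scalar accessors .at (labels) and .iat (positions). The following is a hedged sketch on a hypothetical three-row frame; demo is an assumption, and get_loc is used here to translate a label into a position when the two styles need to be mixed.

```python
import pandas as pd

# Hypothetical frame, for illustration only
demo = pd.DataFrame(
    {
        "regiment": ["Nighthawks", "Nighthawks", "Nighthawks"],
        "company": ["1st", "1st", "2nd"],
        "deaths": [523, 52, 25],
    },
    index=["Arizona", "California", "Texas"],
)

by_labels = demo.at["Arizona", "deaths"]    # row label + column label
by_positions = demo.iat[2, 2]               # row position + column position

# Mixing styles: convert the label to a position first
row_pos = demo.index.get_loc("Arizona")
col_pos = demo.columns.get_loc("deaths")
label_row_pos_col = demo.iat[row_pos, 2]    # row given by label, column by position
pos_row_label_col = demo.iat[2, col_pos]    # row given by position, column by label

print(by_labels, by_positions, label_row_pos_col, pos_row_label_col)
```

This assumes the index and column labels are unique; with duplicate labels, get_loc may return a slice or a mask rather than a single integer.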
