For the given sample data:

[Image: Python data cleaning (using pandas)]

 

Apply missing-value imputation, name splitting, and duplicate removal to it:

import pandas as pd

# Load the sample data (read_excel already returns a DataFrame)
df = pd.read_excel("F:\\python入门\\数据1\\food.xlsx")
print('Original data:\n', df)
# Fill missing values with the column mean; assigning the result back avoids
# calling fillna(inplace=True) on a column selection, which newer pandas warns about
df['ounces'] = df['ounces'].fillna(df['ounces'].mean())
print('Data after mean imputation:\n', df)
# Split the food column into two columns on whitespace
df[['first_name', 'last_name']] = df['food'].str.split(expand=True)
df.drop('food', axis=1, inplace=True)
print('Data after splitting the food name:\n', df)
# Drop rows that repeat the same (first_name, last_name) pair
df.drop_duplicates(['first_name', 'last_name'], inplace=True)
print('Data after dropping duplicates:\n', df)
#df.to_excel("F:\\python入门\\数据1\\food_new.xlsx")

Result:

Original data:
           food  ounces  animal
0        bacon     4.0     pig
1  pulled pork     3.0     pig
2        bacon     NaN     pig
3     Pastrami     6.0     cow
4  corned beef     7.5     cow
5        Bacon     8.0     pig
6     pastrami    -3.0     cow
7    honey ham     5.0     pig
8     nova lox     6.0  salmon
Data after mean imputation:
           food  ounces  animal
0        bacon  4.0000     pig
1  pulled pork  3.0000     pig
2        bacon  4.5625     pig
3     Pastrami  6.0000     cow
4  corned beef  7.5000     cow
5        Bacon  8.0000     pig
6     pastrami -3.0000     cow
7    honey ham  5.0000     pig
8     nova lox  6.0000  salmon
Data after splitting the food name:
    ounces  animal first_name last_name
0  4.0000     pig      bacon      None
1  3.0000     pig     pulled      pork
2  4.5625     pig      bacon      None
3  6.0000     cow   Pastrami      None
4  7.5000     cow     corned      beef
5  8.0000     pig      Bacon      None
6 -3.0000     cow   pastrami      None
7  5.0000     pig      honey       ham
8  6.0000  salmon       nova       lox
Data after dropping duplicates:
    ounces  animal first_name last_name
0     4.0     pig      bacon      None
1     3.0     pig     pulled      pork
3     6.0     cow   Pastrami      None
4     7.5     cow     corned      beef
5     8.0     pig      Bacon      None
6    -3.0     cow   pastrami      None
7     5.0     pig      honey       ham
8     6.0  salmon       nova       lox
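
The script above depends on a local Excel file. As a minimal sketch, assuming the sample rows are exactly those shown in the output, the same three steps can be tried on an in-memory frame. The helper names `build_sample` and `clean_food` are hypothetical, and `n=1` is an addition not present in the original code: it limits the split to the first whitespace so that a name with more than two words would still yield only two columns.

```python
import numpy as np
import pandas as pd


def build_sample() -> pd.DataFrame:
    """Rebuild the sample rows shown in the output (assumed to match food.xlsx)."""
    return pd.DataFrame({
        "food": ["bacon", "pulled pork", "bacon", "Pastrami", "corned beef",
                 "Bacon", "pastrami", "honey ham", "nova lox"],
        "ounces": [4.0, 3.0, np.nan, 6.0, 7.5, 8.0, -3.0, 5.0, 6.0],
        "animal": ["pig", "pig", "pig", "cow", "cow",
                   "pig", "cow", "pig", "salmon"],
    })


def clean_food(df: pd.DataFrame) -> pd.DataFrame:
    """Mean-fill ounces, split food into two name columns, drop duplicate names."""
    out = df.copy()
    # Step 1: fill missing ounces with the mean of the observed values
    out["ounces"] = out["ounces"].fillna(out["ounces"].mean())
    # Step 2: split on the first run of whitespace only (n=1)
    parts = out["food"].str.split(n=1, expand=True)
    # reindex guarantees columns 0 and 1 exist even if no name contains a space
    parts = parts.reindex(columns=[0, 1])
    out["first_name"] = parts[0]
    out["last_name"] = parts[1]
    out = out.drop(columns="food")
    # Step 3: keep the first row for each (first_name, last_name) pair
    return out.drop_duplicates(["first_name", "last_name"])


cleaned = clean_food(build_sample())
print(cleaned)
```

Returning a new frame instead of modifying the input in place keeps the raw data available for comparison after cleaning.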

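In the final output, `bacon` and `Bacon` (and likewise `Pastrami` and `pastrami`) remain as separate rows, because `drop_duplicates` compares strings exactly. If those spellings should count as the same food, one possible approach, sketched here as an assumption rather than as part of the original steps, is to detect duplicates on a lower-cased copy of the name columns. The helper `drop_duplicates_ignore_case` and the small `names` frame are hypothetical.

```python
import pandas as pd


def drop_duplicates_ignore_case(df: pd.DataFrame, cols: list) -> pd.DataFrame:
    """Drop rows whose values in `cols` repeat an earlier row, ignoring letter case."""
    # Build a lower-cased comparison key; missing values stay missing
    key = df[cols].apply(lambda s: s.str.lower())
    # Keep the first occurrence of each key and preserve the original spelling
    return df.loc[~key.duplicated()]


names = pd.DataFrame({
    "first_name": ["bacon", "pulled", "Pastrami", "Bacon", "pastrami"],
    "last_name": [None, "pork", None, None, None],
})
deduped = drop_duplicates_ignore_case(names, ["first_name", "last_name"])
print(deduped)
```

The original capitalization of the first occurrence is kept; only the comparison is case-insensitive.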