【Question title】: Converting data types on a Python data frame
【Posted】: 2022-01-02 06:34:12
【Question】:
   House Number  Street        First Name  Surname   Age  Relationship to Head of House  Marital Status  Gender  Occupation                         Infirmity  Religion
0  1             Smith Radial  Grace       Patel     46   Head                           Widowed         Female  Petroleum engineer                 None       Catholic
1  1             Smith Radial  Ian         Nixon     24   Lodger                         Single          Male    Publishing rights manager          None       Christian
2  2             Smith Radial  Frederick   Read      87   Head                           Divorced        Male    Retired TEFL teacher               None       Catholic
3  3             Smith Radial  Daniel      Adams     58   Head                           Divorced        Male    Therapist, music                   None       Catholic
4  3             Smith Radial  Matthew     Hall      13   Grandson                       NaN             Male    Student                            None       NaN
5  3             Smith Radial  Steven      Fletcher  9    Grandson                       NaN             Male    Student                            None       NaN
6  4             Smith Radial  Alison      Jenkins   38   Head                           Single          Female  Physiotherapist                    None       Catholic
7  4             Smith Radial  Kelly       Jenkins   12   Daughter                       NaN             Female  Student                            None       NaN
8  5             Smith Radial  Kim         Browne    69   Head                           Married         Female  Retired Estate manager/land agent  None       Christian
9  5             Smith Radial  Oliver      Browne    69   Husband                        Married         Male    Retired Merchandiser, retail       None       None

I have a dataset; a sample of it is shown above. I want to convert its columns from object to integer and string types.

df = pd.read_csv('user-data.csv')
df[['Street','Relationship to Head of House','Marital Status','Gender','Occupation','Infirmity','Religion']] = df[['Street','Relationship to Head of House','Marital Status','Gender','Occupation','Infirmity','Religion']].astype('str') 
df[['House Number','Age']] = df[['House Number','Age']].astype('int') 

I tried two different approaches, but after running the second one below, the rest of the dataset disappeared.

df = df['Street'].astype(str)
df = df['Relationship to Head of House'].astype(str)
df = df['Marital Status'].astype(str)
df = df['Gender'].astype(str)
df = df['Occupation'].astype(str)
df = df['Infirmity'].astype(str)
df = df['Religion'].astype(str)
df = df['Gender'].astype(str)

Could you help me convert the columns? Thanks.

I still get the same types, as shown below:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10610 entries, 0 to 10609
Data columns (total 11 columns):
 #   Column                         Non-Null Count  Dtype 
---  ------                         --------------  ----- 
 0   House Number                   10610 non-null  int64 
 1   Street                         10610 non-null  object
 2   First Name                     10610 non-null  object
 3   Surname                        10610 non-null  object
 4   Age                            10610 non-null  object
 5   Relationship to Head of House  10610 non-null  object
 6   Marital Status                 7995 non-null   object
 7   Gender                         10610 non-null  object
 8   Occupation                     10610 non-null  object
 9   Infirmity                      10610 non-null  object
 10  Religion                       7928 non-null   object
dtypes: int64(1), object(10)
memory usage: 911.9+ KB

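The columns are reported as object rather than int or string. Can you help me fix this?

【Comments】:

  • String columns have the object dtype; that is normal.
  • I thought it should be shown as string rather than object.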

Tags: python pandas dataframe csv type-conversion


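【Solution 1】:

You need to assign the result back to the column on the left-hand side, not to df itself, for example df['Street'] = df['Street'].astype(str):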

df['Street'] = df['Street'].astype(str)
df['Relationship to Head of House'] = df['Relationship to Head of House'].astype(str)
df['Marital Status'] = df['Marital Status'].astype(str)
df['Gender'] = df['Gender'].astype(str)
df['Occupation'] = df['Occupation'].astype(str)
df['Infirmity'] = df['Infirmity'].astype(str)
df['Religion'] = df['Religion'].astype(str)
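A minimal sketch of why the asker's version lost the data, using a hypothetical two-row stand-in for the real file: binding the name df to the result of a single-column operation replaces the whole DataFrame with one Series, whereas assigning to the column leaves the frame intact.

```python
import pandas as pd

# Hypothetical two-row stand-in for the asker's data.
df = pd.DataFrame({"Street": ["Smith Radial", "Smith Radial"], "Age": [46, 24]})

# A column operation returns a Series. Writing `df = ...` here would
# throw away every other column, so the result is kept in `lost` instead.
lost = df["Street"].astype(str)
print(type(lost).__name__)   # Series

# Assigning to the column keeps the DataFrame and all of its columns.
df["Street"] = df["Street"].astype(str)
print(type(df).__name__)     # DataFrame
print(list(df.columns))      # ['Street', 'Age']
```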

To convert every column in one go, loop over the column names:

columns = df.columns
for column in columns:
    df[column] = df[column].astype(str)
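On the question of why the columns are still reported as object: in the pandas versions current when this was asked, astype(str) stores Python strings in an object column, so df.info() keeps showing object. If the goal is a dtype that is actually labelled as string, one option is pandas' dedicated "string" dtype. The sketch below uses a made-up Gender column and is only an illustration of that option, not part of the original answer.

```python
import pandas as pd

# Hypothetical column with one missing value.
df = pd.DataFrame({"Gender": ["Female", "Male", None]})

# "string" requests pandas' StringDtype instead of plain Python str.
as_string = df["Gender"].astype("string")

print(as_string.dtype)         # string
print(as_string.isna().sum())  # 1, the missing value stays missing
```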

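In pd.read_csv you can also set the types up front with the dtype parameter: pass a single type (dtype=str) to apply it to every column, or a dict that maps column names to types.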
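As a rough illustration of the dtype parameter, the sketch below reads a small inline CSV (a hypothetical stand-in for user-data.csv, with only three of the columns) and assigns a type per column through a dict.

```python
import io
import pandas as pd

# Hypothetical inline CSV standing in for user-data.csv.
csv_text = "House Number,Street,Age\n1,Smith Radial,46\n2,Smith Radial,87\n"

df = pd.read_csv(
    io.StringIO(csv_text),
    # Map each column name to the dtype it should be parsed as.
    dtype={"House Number": "int64", "Street": "string", "Age": "int64"},
)
print(df.dtypes)
```

Note that an integer dtype here only works if every value in the column parses as an integer; otherwise read_csv raises an error.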

numeric_df = df.select_dtypes(exclude='object')

returns a DataFrame containing only the non-object columns (here, the numeric ones), and

columns = numeric_df.columns

gives their names.
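A small sketch of select_dtypes on a hypothetical three-column frame, showing which columns each filter keeps. The Street column is forced to object dtype so that the example matches the asker's df.info() output.

```python
import pandas as pd

# Hypothetical frame: two integer columns and one object column.
df = pd.DataFrame({
    "House Number": [1, 2],
    "Street": pd.Series(["Smith Radial", "Smith Radial"], dtype=object),
    "Age": [46, 87],
})

non_object = df.select_dtypes(exclude="object")   # drops the object columns
object_only = df.select_dtypes(include="object")  # keeps only the object columns
numeric = df.select_dtypes(include="number")      # keeps only numeric columns

print(list(non_object.columns))   # ['House Number', 'Age']
print(list(object_only.columns))  # ['Street']
print(list(numeric.columns))      # ['House Number', 'Age']
```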

【Comments】:

  • Can I do this for all columns?
  • Yes sir, all columns.
  • Could you check the newly edited question?
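Regarding the edited question, where Age is still reported as object: one possible cause, and this is only an assumption since the raw file is not shown, is that the column contains some entries that are not plain integers. A hedged sketch with made-up values, using pd.to_numeric to coerce such entries to missing values before converting to an integer dtype:

```python
import pandas as pd

# Hypothetical Age values; the "unknown" entry is an assumption about
# why the real column loads as object.
df = pd.DataFrame({"Age": pd.Series(["46", "24", "unknown", "87"], dtype=object)})

# errors="coerce" turns anything that cannot be parsed into NaN instead of raising.
age = pd.to_numeric(df["Age"], errors="coerce")
print(age.isna().sum())  # 1

# The nullable Int64 dtype can hold integers alongside missing values.
df["Age"] = age.astype("Int64")
print(df["Age"].dtype)   # Int64
```

Inspecting the rows where the coerced value is missing shows which original entries prevented the integer conversion.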