【Posted】: 2021-05-07 15:28:34
【Question】:
I have a dataset with two columns:
import pandas as pd
df = pd.DataFrame({'Date': [195101, 195102, 195103, 195104, 195105],
                   'Value': [1.5, 0.9, -0.1, -0.3, -0.7]})
     Date  Value
0  195101    1.5
1  195102    0.9
2  195103   -0.1
3  195104   -0.3
4  195105   -0.7
Checking the column types:
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Date    5 non-null      int64
 1   Value   5 non-null      float64
dtypes: float64(1), int64(1)
memory usage: 208.0 bytes
The 'Date' column is of type int64. When I try to convert it to datetime with
df['Date'] = pd.to_datetime(df['Date'])
the result looks like this:
                           Date  Value
0 1970-01-01 00:00:00.000195101    1.5
1 1970-01-01 00:00:00.000195102    0.9
2 1970-01-01 00:00:00.000195103   -0.1
3 1970-01-01 00:00:00.000195104   -0.3
4 1970-01-01 00:00:00.000195105   -0.7
Instead, I would like to get a year-month format:
      Date  Value
0  1951-01    1.5
1  1951-02    0.9
2  1951-03   -0.1
3  1951-04   -0.3
4  1951-05   -0.7
The problem was solved by the accepted answer below, using:
df['Date'] = pd.to_datetime(df.Date.astype(str), format='%Y%m').dt.to_period('M')
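For context, the 1970 timestamps appear because pd.to_datetime treats bare integers as nanoseconds since the Unix epoch unless told otherwise, so 195101 becomes 195101 ns after 1970-01-01. Casting to str first lets format='%Y%m' read the digits as year and month. The following is a minimal sketch of that fix; the variable names naive, parsed, periods and labels are only illustrative, and the strftime variant is an optional alternative that is not part of the accepted answer.

```python
import pandas as pd

df = pd.DataFrame({'Date': [195101, 195102, 195103, 195104, 195105],
                   'Value': [1.5, 0.9, -0.1, -0.3, -0.7]})

# Integers with no unit are read as nanoseconds since 1970-01-01,
# which reproduces the unwanted result from the question.
naive = pd.to_datetime(df['Date'])

# Casting to str lets the format string parse the digits as YYYYMM.
parsed = pd.to_datetime(df['Date'].astype(str), format='%Y%m')

# Option 1 (accepted answer): monthly Period values, dtype period[M].
periods = parsed.dt.to_period('M')

# Option 2 (alternative, assumption: plain text labels are acceptable):
# ordinary 'YYYY-MM' strings, which lose the date semantics.
labels = parsed.dt.strftime('%Y-%m')

df['Date'] = periods
print(df)
```

The Period column keeps date semantics (sorting, month arithmetic), whereas the strftime strings are only suitable for display or export.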
【Comments】:
Tags: pandas dataframe jupyter-notebook