【Question title】: Python: how to avoid a loop when converting columns to datetime in a pandas DataFrame?
【Posted】: 2016-11-14 19:34:11
【Question】:
I have the following DataFrame:
df:
      y   m   d   val
0  2013  10   1  33.5
1  2013  10   2  37.1
2  2013  10   3  25.9
3  2013  10   4  31.3
4  2013  10   5  35.3
5  2013  10   6  55.4
6  2013  10   7  29.5
7  2013  10   8  31.3
8  2013  10   9  27.7
9  2013  10  10  25.9
Here y, m and d are the year, month and day. I want to combine them into a single datetime column. At the moment I do it with a loop:
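import datetime

df['date'] = pd.NaT  # placeholder column with a datetime dtype
for v in df.index:
    # build one datetime per row from the y, m, d columns
    df.loc[v, 'date'] = datetime.datetime(df.y[v], df.m[v], df.d[v])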
What is the best way to avoid this loop?
【Comments】:
Tags:
python
datetime
pandas
dataframe
【Solution 1】:
From the pd.to_datetime docstring:
Assembling a datetime from multiple columns of a DataFrame. The keys can be
common abbreviations like ['year', 'month', 'day', 'minute', 'second',
'ms', 'us', 'ns']) or plurals of the same
>>> df = pd.DataFrame({'year': [2015, 2016],
'month': [2, 3],
'day': [4, 5]})
>>> pd.to_datetime(df)
0 2015-02-04
1 2016-03-05
dtype: datetime64[ns]
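Applied to the DataFrame in the question (the column names are matched case-insensitively, and iloc[:, :3] drops the val column):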
In [135]: pd.to_datetime(df.rename(columns={'y':'Year','m':'Month','d':'Day'}).iloc[:, :3])
Out[135]:
0 2013-10-01
1 2013-10-02
2 2013-10-03
3 2013-10-04
4 2013-10-05
5 2013-10-06
6 2013-10-07
7 2013-10-08
8 2013-10-09
9 2013-10-10
dtype: datetime64[ns]
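To make the answer reproducible end to end, here is a minimal sketch that rebuilds the sample data from the question and stores the result in a new column. The variable name `parts` is only illustrative; the lowercase keys `year`, `month` and `day` are the ones listed in the docstring quoted above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "y": [2013] * 10,
    "m": [10] * 10,
    "d": list(range(1, 11)),
    "val": [33.5, 37.1, 25.9, 31.3, 35.3, 55.4, 29.5, 31.3, 27.7, 25.9],
})

# Select only the date components and give them the names to_datetime expects.
parts = df[["y", "m", "d"]].rename(columns={"y": "year", "m": "month", "d": "day"})

# Assemble the whole column in one vectorized call, with no Python-level loop.
df["date"] = pd.to_datetime(parts)
```

Selecting the columns by name instead of by position keeps this working if the column order of the frame changes.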
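【Solution 2】:
Here is one way: encode each date as an integer of the form YYYYMMDD, convert it to a string, and let pd.to_datetime parse it: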
pd.to_datetime((df['y']*10000 + df['m']*100 + df['d']).astype(str))
Out:
0 2013-10-01
1 2013-10-02
2 2013-10-03
3 2013-10-04
4 2013-10-05
5 2013-10-06
6 2013-10-07
7 2013-10-08
8 2013-10-09
9 2013-10-10
dtype: datetime64[ns]
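A possible variant of the same idea passes an explicit `format` so that the eight-digit strings are never parsed by guesswork. This is a sketch under the assumption that y, m and d are integer columns; the names `as_int` and `dates` and the three-row sample are made up for illustration.

```python
import pandas as pd

# Small sample with the same column layout as the question.
df = pd.DataFrame({"y": [2013, 2013, 2013], "m": [10, 10, 10], "d": [1, 2, 10]})

# 2013, 10, 1 becomes the integer 20131001; multiplying by 100 leaves room
# for a two-digit day, so single-digit days are zero-padded automatically.
as_int = df["y"] * 10000 + df["m"] * 100 + df["d"]

# Parse the strings with a fixed year-month-day layout.
dates = pd.to_datetime(as_int.astype(str), format="%Y%m%d")
```

With `format` given, a value that does not fit the YYYYMMDD layout raises an error instead of being silently interpreted some other way.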