【发布时间】:2015-07-08 14:31:16
【问题描述】:
下面的代码生成一个df:
import pandas as pd
from datetime import datetime as dt
import numpy as np
dates = [dt(2014, 1, 2, 2), dt(2014, 1, 2, 3), dt(2014, 1, 2, 4), None]
strings1 = ['A', 'B',None, 'C']
strings2 = [None, 'B','C', 'C']
strings3 = ['A', 'B','C', None]
vals = [1.,2.,np.nan, 4.]
df = pd.DataFrame(dict(zip(['A','B','C','D','E'],
[strings1, dates, strings2, strings3, vals])))
+---+------+---------------------+------+------+-----+
| | A | B | C | D | E |
+---+------+---------------------+------+------+-----+
| 0 | A | 2014-01-02 02:00:00 | None | A | 1 |
| 1 | B | 2014-01-02 03:00:00 | B | B | 2 |
| 2 | None | 2014-01-02 04:00:00 | C | C | NaN |
| 3 | C | NaT | C | None | 4 |
+---+------+---------------------+------+------+-----+
我想用''(空字符串)替换所有None(python中的真正None,而不是str)。
预期 df是
+---+---+---------------------+---+---+-----+
| | A | B | C | D | E |
+---+---+---------------------+---+---+-----+
| 0 | A | 2014-01-02 02:00:00 | | A | 1 |
| 1 | B | 2014-01-02 03:00:00 | B | B | 2 |
| 2 | | 2014-01-02 04:00:00 | C | C | NaN |
| 3 | C | NaT | C | | 4 |
+---+---+---------------------+---+---+-----+
我做的是
df = df.replace([None], [''], regex=True)
但我得到了
+---+---+---------------------+---+------+---+
| | A | B | C | D | E |
+---+---+---------------------+---+------+---+
| 0 | A | 1388628000000000000 | | A | 1 |
| 1 | B | 1388631600000000000 | B | B | 2 |
| 2 | | 1388635200000000000 | C | C | |
| 3 | C | | C | | 4 |
+---+---+---------------------+---+------+---+
- 所有日期都变成大数字
- 连
NaT和NaN都被替换了,我不希望这样。
我怎样才能正确有效地做到这一点?
【问题讨论】: