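[Posted]: 2014-06-19 13:41:22
[Question]:
Reposted from https://groups.google.com/forum/#!topic/pydata/5mhuatNAl5g
It looks like the data is copied when a DataFrame is created from a structured array. I get similar results when the data is a dict of numpy arrays.
Is it possible to create a DataFrame from a structured array, or something similar, without any copying or checking?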
In [43]: import numpy as np; import pandas as pd
In [44]: sarray = np.random.randn(10**7, 10).view([(name, float) for name in 'abcdefghij']).squeeze()
In [45]: for N in [10,100,1000,10000,100000,1000000,10000000]:
    ...:     s = sarray[:N]
    ...:     %timeit z = pd.DataFrame(s)
    ...:
1000 loops, best of 3: 830 µs per loop
1000 loops, best of 3: 834 µs per loop
1000 loops, best of 3: 872 µs per loop
1000 loops, best of 3: 1.33 ms per loop
100 loops, best of 3: 15.4 ms per loop
10 loops, best of 3: 161 ms per loop
1 loops, best of 3: 1.45 s per loop
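The timings above are roughly flat up to N = 1,000 and then grow about linearly with N, which is what a per-element copy would look like. A more direct check is to ask numpy whether the DataFrame's columns overlap the original buffer. The following is only a sketch: the helper `columns_sharing_memory` is a hypothetical name, the array is a small stand-in for the 1e7-row one, and what gets printed for the structured-array and `copy=False` cases may differ between pandas versions.

```python
import numpy as np
import pandas as pd


def columns_sharing_memory(df, source_columns):
    """Return names of DataFrame columns whose buffers overlap the source arrays.

    Hypothetical helper for illustration; `source_columns` maps column
    name -> the numpy array that column was built from.
    """
    return [
        name
        for name, src in source_columns.items()
        if np.shares_memory(df[name].to_numpy(), src)
    ]


names = list("abcdefghij")

# Small stand-in for the array in the question (1,000 rows instead of 1e7).
plain = np.random.randn(1000, len(names))
sarray = plain.view([(name, float) for name in names]).squeeze()

# Each field of the structured array is a view into the same buffer as `plain`.
fields = {name: sarray[name] for name in names}

df_struct = pd.DataFrame(sarray)                            # case from the question
df_copy = pd.DataFrame(plain, columns=names, copy=True)     # copy explicitly requested
df_nocopy = pd.DataFrame(plain, columns=names, copy=False)  # copy explicitly declined

print("structured array:", columns_sharing_memory(df_struct, fields))
print("2-D array, copy=True:", columns_sharing_memory(df_copy, fields))
print("2-D array, copy=False:", columns_sharing_memory(df_nocopy, fields))
```

An empty list means no column shares memory with the source, i.e. the data was copied. Only the `copy=True` case is guaranteed to come back empty; the other two are exactly what the check is meant to find out on a given installation.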
Thanks, Dave
[Discussion]: