【Posted】: 2016-12-21 07:39:00
【Question】:
I have a piece of code that runs fine on its own, but it fails when I put it inside a loop (or use the df.apply() method).
The code is:
import pandas as pd
from functools import partial

datadf=pd.DataFrame(data,columns=['X1','X2'])
for i in datadf.index.values.tolist():
    row=datadf.loc[i]
    x1=row['X1']
    x2=row['X2']
    set1=set([x1,x2])
    links=data2[data2['Xset']==set1]
    df1=pd.DataFrame(range(1,11),columns=['year'])
    def idlist1(row,var1):
        year=row['year']
        id1a=links[(links['xx1']==var1) & (links['year']==year)]
        id1a=id1a['id1'].values.tolist()
        id1b=links[(links['xx2']==var1) & (links['year']==year)]
        id1b=id1b['id2'].values.tolist()
        id1=list(set(id1a+id1b))
        return id1
    df1['id1a']=df1.apply(partial(idlist1,var1=x1),axis=1)
    #...(do other stuff to return a value using "df1")
    del df1
Here data2 is another DataFrame. What I am trying to do is match the values of (x1,x2) against data2.
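For reference, here is a minimal sketch of that matching step. It assumes the 'Xset' column of data2 holds Python sets; the sample rows and the 'year' values are made up for illustration. It builds the boolean mask one element at a time rather than relying on how pandas compares a whole column to a set.

```python
import pandas as pd

# Hypothetical data2: 'Xset' holds Python sets, as the question implies.
data2 = pd.DataFrame({
    'Xset': [{'a', 'b'}, {'a', 'c'}, {'b', 'a'}],
    'year': [1, 2, 3],
})

x1, x2 = 'a', 'b'
set1 = {x1, x2}

# Compare each stored set with set1; element order inside a set does not matter.
mask = [s == set1 for s in data2['Xset']]
links = data2[mask]

print(links['year'].tolist())
```

Building the mask explicitly keeps the comparison independent of the pandas version, which may matter here because a set is list-like and pandas may not treat it as a single value.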
The code works fine outside the loop, that is, when I specify (x1,x2) directly. But when I put the code in a loop or use df.apply, I always get the error message
ValueError: could not broadcast input array from shape (0) into shape (1)
I don't understand why. Can anyone help? Thanks!
(By the way, the pandas version is 0.18.0.)
The full error message is:
  File "<ipython-input-229-541c0f3a4d2f>", line 19, in <module>
    df1['id1a']=df1.apply(partial(idlist1,var1=x1),axis=1)
  File "/anaconda2/lib/python2.7/site-packages/pandas/core/frame.py", line 4042, in apply
    return self._apply_standard(f, axis, reduce=reduce)
  File "/anaconda2/lib/python2.7/site-packages/pandas/core/frame.py", line 4155, in _apply_standard
    result = self._constructor(data=results, index=index)
  File "/anaconda2/lib/python2.7/site-packages/pandas/core/frame.py", line 223, in __init__
    mgr = self._init_dict(data, index, columns, dtype=dtype)
  File "/anaconda2/lib/python2.7/site-packages/pandas/core/frame.py", line 359, in _init_dict
    return _arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
  File "/anaconda2/lib/python2.7/site-packages/pandas/core/frame.py", line 5250, in _arrays_to_mgr
    return create_block_manager_from_arrays(arrays, arr_names, axes)
  File "/anaconda2/lib/python2.7/site-packages/pandas/core/internals.py", line 3933, in create_block_manager_from_arrays
    construction_error(len(arrays), arrays[0].shape, axes, e)
  File "/anaconda2/lib/python2.7/site-packages/pandas/core/internals.py", line 3895, in construction_error
    raise e
ValueError: could not broadcast input array from shape (0) into shape (1)
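Update: I found that df.apply does not seem to be compatible with the loop, so I converted every apply inside the loop into an explicit loop, and the code now runs correctly. Although I have "sort of" solved the problem, I am still confused about why this happens. If anyone knows why, I would really appreciate an answer. Thanks!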
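A minimal sketch of the kind of rewrite described in the update, not the exact code used: the column of lists is built with a list comprehension instead of df1.apply(..., axis=1). The contents of links are made-up sample rows, and idlist1 here takes the year directly rather than a row, which is a change from the original signature.

```python
import pandas as pd

# Hypothetical stand-in for the filtered `links` frame from the question.
links = pd.DataFrame({
    'year': [1, 1, 2, 3],
    'xx1':  ['a', 'b', 'a', 'c'],
    'xx2':  ['b', 'a', 'c', 'a'],
    'id1':  [10, 11, 12, 13],
    'id2':  [20, 21, 22, 23],
})

def idlist1(year, var1, links):
    """Collect the ids linked to var1 in the given year (may be empty)."""
    id1a = links.loc[(links['xx1'] == var1) & (links['year'] == year), 'id1'].tolist()
    id1b = links.loc[(links['xx2'] == var1) & (links['year'] == year), 'id2'].tolist()
    return sorted(set(id1a + id1b))

df1 = pd.DataFrame({'year': range(1, 11)})

# Build the column of lists directly instead of going through df1.apply(axis=1),
# so pandas never tries to assemble the per-row lists into a DataFrame.
df1['id1a'] = [idlist1(year, 'a', links) for year in df1['year']]

print(df1.head())
```

Because each list is stored as a single object in the column, empty lists and lists of different lengths can sit side by side without any broadcasting.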
【Comments】:
- Please show the full error message, including which line is responsible.