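【Posted】: 2018-07-29 22:28:30
【Problem description】:
I have two DataFrames, all_df and df_good_sample. When I run one_hot_encoding on each of them separately, everything works. When I combine the two DataFrames and run it on the result, an error is raised.
My implementation of one_hot_encoding is: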
import pandas as pd

def one_hot_encoding(register_info, fea):
    flag = True
    fea_g_id = 1
    # flag is always True at this point, so only the first branch runs.
    if flag:
        X_df = pd.get_dummies(register_info[fea])
        fea_group_ids = [fea_g_id for i in range(X_df.shape[1])]
        flag = False
        fea_g_id = fea_g_id + 1
    else:
        X_cur = pd.get_dummies(register_info[fea])
        fea_group_ids += [fea_g_id for i in range(X_cur.shape[1])]
        fea_g_id = fea_g_id + 1
        X_df = pd.concat([X_df, X_cur], axis=1)
    X = X_df.values
    return X, X_df
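When I apply it to all_df, I get the one-hot encoding result for all_df.
The same works for df_good_sample.
But when I apply it to the two combined, I get:
NotImplementedError: > 1 ndim Categorical are not supported at this time
The full error message: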
NotImplementedError Traceback (most recent call last)
<ipython-input-325-54e4d184cdb1> in <module>()
2 record_column_length = []
3 for i in range(0, len(category_feature)):
----> 4 category_df[i] = one_hot_encoding(all_df.append(df_good_sample).replace(0, np.nan), category_feature[i])[1]
5 record_column_length.append(len(category_df[i].columns))
6 concat_group = pd.concat(category_df, ignore_index=True, axis=1)
<ipython-input-312-82e782b3856b> in one_hot_encoding(register_info, fea)
16 # print fea
17 if flag:
---> 18 X_df = pd.get_dummies(register_info[fea])
19 fea_group_ids = [fea_g_id for i in range(X_df.shape[1])]
20 flag = False
/home/ubuntu/app/anaconda2/lib/python2.7/site-packages/pandas/core/reshape/reshape.pyc in get_dummies(data, prefix, prefix_sep, dummy_na, columns, sparse, drop_first)
1211 else:
1212 result = _get_dummies_1d(data, prefix, prefix_sep, dummy_na,
-> 1213 sparse=sparse, drop_first=drop_first)
1214 return result
1215
/home/ubuntu/app/anaconda2/lib/python2.7/site-packages/pandas/core/reshape/reshape.pyc in _get_dummies_1d(data, prefix, prefix_sep, dummy_na, sparse, drop_first)
1218 sparse=False, drop_first=False):
1219 # Series avoids inconsistent NaN handling
-> 1220 codes, levels = _factorize_from_iterable(Series(data))
1221
1222 def get_empty_Frame(data, sparse):
/home/ubuntu/app/anaconda2/lib/python2.7/site-packages/pandas/core/categorical.pyc in _factorize_from_iterable(values)
2142 codes = values.codes
2143 else:
-> 2144 cat = Categorical(values, ordered=True)
2145 categories = cat.categories
2146 codes = cat.codes
/home/ubuntu/app/anaconda2/lib/python2.7/site-packages/pandas/core/categorical.pyc in __init__(self, values, categories, ordered, fastpath)
294
295 # FIXME
--> 296 raise NotImplementedError("> 1 ndim Categorical are not "
297 "supported at this time")
298
NotImplementedError: > 1 ndim Categorical are not supported at this time
I hope someone can help me solve this problem!
【Discussion】:
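The traceback only shows that building a Categorical from the selected column failed inside pd.get_dummies; the post does not say what the combined column actually contains. A minimal sketch of two checks that might narrow it down is below: whether register_info[fea] is really a 1-D Series (a duplicated column label would make it a DataFrame), and which Python types appear in its cells. The helper describe_column, the column names "city" and "age", and the sample data are all hypothetical and not taken from the question. pd.concat is used in place of DataFrame.append, which was removed in pandas 2.0.

```python
import numpy as np
import pandas as pd


def describe_column(frame, fea):
    """Hypothetical diagnostic helper: summarize what frame[fea] returns."""
    selected = frame[fea]
    # A duplicated column label makes frame[fea] a DataFrame, not a Series.
    is_series = isinstance(selected, pd.Series)
    if is_series:
        values = selected
    else:
        # Flatten the 2-D selection so the cells can be inspected one by one.
        values = pd.Series(selected.to_numpy().ravel())
    # Names of the Python types found among the non-missing cells.
    cell_types = sorted({type(v).__name__ for v in values.dropna()})
    return {"ndim": selected.ndim, "is_series": is_series, "cell_types": cell_types}


# Made-up stand-ins for the two frames in the question.
all_df = pd.DataFrame({"city": ["a", "b", 0], "age": [1, 2, 3]})
df_good_sample = pd.DataFrame({"city": ["c", "a"], "age": [4, 5]})

# Row-wise combination, as in the question's all_df.append(df_good_sample).
combined = pd.concat([all_df, df_good_sample], ignore_index=True).replace(0, np.nan)
report = describe_column(combined, "city")

# With a clean 1-D column of scalars, get_dummies works on the combined frame.
dummies = pd.get_dummies(combined["city"])

# A frame with a repeated "city" label, to show the 2-D case being detected.
duplicated = pd.concat([all_df, df_good_sample[["city"]]], axis=1)
duplicate_report = describe_column(duplicated, "city")
```

If the report for the real data shows is_series as False, or cell types such as list or ndarray mixed in with scalars, that column is a likely place to look first. This is only a guess at where the problem could be, not a confirmed diagnosis.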