【Question title】: NotImplementedError: > 1 ndim Categorical are not supported at this time
【Posted】: 2018-07-29 22:28:30
【Question】:

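I have two DataFrames, all_df and df_good_sample. When I run one_hot_encoding on each of them separately, everything works. But when I combine the two DataFrames, an error is raised.

My implementation of one_hot_encoding is: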

import pandas as pd

def one_hot_encoding(register_info, fea):
    # One-hot encode column `fea` of the DataFrame `register_info`
    flag = True
    fea_g_id = 1
    if flag:
        X_df = pd.get_dummies(register_info[fea])
        fea_group_ids = [fea_g_id for i in range(X_df.shape[1])]
        flag = False
        fea_g_id = fea_g_id + 1
    else:
        X_cur = pd.get_dummies(register_info[fea])
        fea_group_ids += [fea_g_id for i in range(X_cur.shape[1])]
        fea_g_id = fea_g_id + 1
        X_df = pd.concat([X_df, X_cur], axis=1)
    X = X_df.values
    # Return both the raw array and the labelled DataFrame
    return X, X_df
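
As posted, `flag` is set to True immediately before it is tested, so the `else` branch never runs and the function reduces to a single `pd.get_dummies` call. A minimal sketch of that equivalent form follows; the name `one_hot_encoding_simple` and the sample frame are made up for illustration and are not part of the question.

```python
import pandas as pd


def one_hot_encoding_simple(register_info, fea):
    """Hypothetical condensed form: one-hot encode a single column."""
    X_df = pd.get_dummies(register_info[fea])
    return X_df.values, X_df


# Made-up sample data, not taken from the question
sample = pd.DataFrame({"city": ["a", "b", "a"]})
X, X_df = one_hot_encoding_simple(sample, "city")
```

Each distinct value of the column becomes one indicator column, so the array here has three rows and two columns.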

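When I apply it to all_df, I get the expected one-hot encoding result for all_df.

The same holds for df_good_sample.

But when I apply it to the two combined, I get:

NotImplementedError: > 1 ndim Categorical are not supported at this time

Full traceback: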

    NotImplementedError  Traceback (most recent call last)
    <ipython-input-325-54e4d184cdb1> in <module>()
          2 record_column_length = []
          3 for i in range(0, len(category_feature)):
    ----> 4     category_df[i] = one_hot_encoding(all_df.append(df_good_sample).replace(0, np.nan), category_feature[i])[1]
          5     record_column_length.append(len(category_df[i].columns))
          6 concat_group = pd.concat(category_df, ignore_index=True, axis=1)

    <ipython-input-312-82e782b3856b> in one_hot_encoding(register_info, fea)
         16 #     print fea
         17     if flag:
    ---> 18         X_df = pd.get_dummies(register_info[fea])
         19         fea_group_ids = [fea_g_id for i in range(X_df.shape[1])]
         20         flag = False

    /home/ubuntu/app/anaconda2/lib/python2.7/site-packages/pandas/core/reshape/reshape.pyc in get_dummies(data, prefix, prefix_sep, dummy_na, columns, sparse, drop_first)
       1211     else:
       1212         result = _get_dummies_1d(data, prefix, prefix_sep, dummy_na,
    -> 1213                                  sparse=sparse, drop_first=drop_first)
       1214     return result
       1215 

    /home/ubuntu/app/anaconda2/lib/python2.7/site-packages/pandas/core/reshape/reshape.pyc in _get_dummies_1d(data, prefix, prefix_sep, dummy_na, sparse, drop_first)
       1218                     sparse=False, drop_first=False):
       1219     # Series avoids inconsistent NaN handling
    -> 1220     codes, levels = _factorize_from_iterable(Series(data))
       1221 
       1222     def get_empty_Frame(data, sparse):

    /home/ubuntu/app/anaconda2/lib/python2.7/site-packages/pandas/core/categorical.pyc in _factorize_from_iterable(values)
       2142         codes = values.codes
       2143     else:
    -> 2144         cat = Categorical(values, ordered=True)
       2145         categories = cat.categories
       2146         codes = cat.codes

    /home/ubuntu/app/anaconda2/lib/python2.7/site-packages/pandas/core/categorical.pyc in __init__(self, values, categories, ordered, fastpath)
        294 
        295                 # FIXME
    --> 296                 raise NotImplementedError("> 1 ndim Categorical are not "
        297                                           "supported at this time")
        298 

    NotImplementedError: > 1 ndim Categorical are not supported at this time

I hope someone can help me solve this problem!
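
For readers on current pandas: `DataFrame.append`, used in the traceback above, was removed in pandas 2.0, so the same combination would be written with `pd.concat`. The following is only a sketch of that loop under made-up data; the frames and the `category_feature` list are stand-ins, and `prefix=` is used here to keep column names distinct instead of the original `ignore_index=True`.

```python
import numpy as np
import pandas as pd

# Made-up stand-ins for all_df and df_good_sample
all_df = pd.DataFrame({"city": ["a", "b"], "level": [1, 0]})
df_good_sample = pd.DataFrame({"city": ["b", "c"], "level": [2, 1]})
category_feature = ["city", "level"]

# pd.concat stacks the rows; ignore_index avoids duplicate row labels
combined = pd.concat([all_df, df_good_sample], ignore_index=True).replace(0, np.nan)

# One dummy frame per categorical feature, then joined column-wise
category_df = [pd.get_dummies(combined[fea], prefix=fea) for fea in category_feature]
record_column_length = [len(d.columns) for d in category_df]
concat_group = pd.concat(category_df, axis=1)
```

Rows where a value was replaced by NaN get all-zero indicators for that feature, because `get_dummies` skips missing values unless `dummy_na=True` is passed.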

【Comments】:

    Tags: python pandas


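    【Answer 1】:

    I got this error when calling get_dummies on columns that contained Unicode characters, either in the column name or in the actual values of the records. I replaced the column names and their values with plain placeholders, and that solved the problem: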

    import pandas as pd

    # df is the DataFrame to be encoded
    # replace the column names with 'col1', 'col2' and so forth
    df.columns = ['col' + str(i) for i in range(1, len(df.columns) + 1)]

    # replace the column values with 'val0', 'val1' and so forth
    for colname in list(df):
        f_values = df[colname].unique().tolist()
        mapping = dict(zip(f_values, ['val' + str(i) for i in range(len(f_values))]))
        df.replace({colname: mapping}, inplace=True)

    # now running get_dummies will work
    df = pd.get_dummies(df)
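
The placeholder approach above discards the original labels. If the underlying cause is byte strings mixed with Unicode text (an assumption; the answer only reports that Unicode characters were involved), an alternative that keeps the labels is to decode everything to text before encoding. A minimal sketch, where `to_text` is a hypothetical helper and UTF-8 is an assumed encoding:

```python
import pandas as pd


def to_text(value, encoding="utf-8"):
    """Hypothetical helper: decode bytes to text, leave other values unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Made-up data mixing a byte string and text for the same label
df = pd.DataFrame({"city": ["北京".encode("utf-8"), "北京", "上海"]})
df["city"] = df["city"].map(to_text)
df.columns = [to_text(c) for c in df.columns]

dummies = pd.get_dummies(df["city"])
```

After normalization the byte string and the text form of the same label fall into one category, so the result has one column per distinct city.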
    
