【Question Title】:pandas dataframe concat is giving unwanted NA/NaN columns
【Posted】:2014-05-25 13:27:31
【Question】:

This is not the horizontal case from "After Pandas Dataframe pd.concat I get NaNs"; here the concatenation is vertical:

import pandas
a=[['Date', 'letters', 'numbers', 'mixed'], ['1/2/2014', 'a', '6', 'z1'], ['1/2/2014', 'a', '3', 'z1'], ['1/3/2014', 'c', '1', 'x3']]
df = pandas.DataFrame.from_records(a[1:],columns=a[0])

f=[]
for i in range(0,len(df)):
    f.append(df['Date'][i] + ' ' + df['letters'][i])

df['new']=f

c=[x for x in range(0,5)]
b=[]
b += [['NA'] * (5 - len(b))]
df_a = pandas.DataFrame.from_records(b,columns=c)

df_b=pandas.concat([df,df_a], ignore_index=True)

The output of df_b is the same as with df_b=pandas.concat([df,df_a], axis=0)

Result:

     0    1    2    3    4      Date letters mixed         new numbers
0  NaN  NaN  NaN  NaN  NaN  1/2/2014       a    z1  1/2/2014 a       6
1  NaN  NaN  NaN  NaN  NaN  1/2/2014       a    z1  1/2/2014 a       3
2  NaN  NaN  NaN  NaN  NaN  1/3/2014       c    x3  1/3/2014 c       1
0   NA   NA   NA   NA   NA       NaN     NaN   NaN         NaN     NaN

Wanted:

       Date letters numbers mixed         new
0  1/2/2014       a       6    z1  1/2/2014 a
1  1/2/2014       a       3    z1  1/2/2014 a
2  1/3/2014       c       1    x3  1/3/2014 c
0        NA      NA      NA    NA          NA

【Question Comments】:

    Tags: python pandas dataframe concat na


    【Solution 1】:

    I would create the dataframe df_a directly with the correct columns.

    With a little refactoring of your code, that gives

    import pandas
    a=[['Date', 'letters', 'numbers', 'mixed'],
       ['1/2/2014', 'a', '6', 'z1'],
       ['1/2/2014', 'a', '3', 'z1'],
       ['1/3/2014', 'c', '1', 'x3']]
    df = pandas.DataFrame.from_records(a[1:],columns=a[0])
    df['new'] = df['Date'] + ' ' + df['letters']
    
    n = len(df.columns)
    b = [['NA'] * n]
    df_a = pandas.DataFrame.from_records(b,columns=df.columns)
    df_b = pandas.concat([df,df_a])
    

    It gives

           Date letters numbers mixed         new
    0  1/2/2014       a       6    z1  1/2/2014 a
    1  1/2/2014       a       3    z1  1/2/2014 a
    2  1/3/2014       c       1    x3  1/3/2014 c
    0        NA      NA      NA    NA          NA
    

    Finally, to get a continuous index:

    df_b = pandas.concat([df,df_a]).reset_index(drop=True)
    

    It gives

           Date letters numbers mixed         new
    0  1/2/2014       a       6    z1  1/2/2014 a
    1  1/2/2014       a       3    z1  1/2/2014 a
    2  1/3/2014       c       1    x3  1/3/2014 c
    3        NA      NA      NA    NA          NA
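
The extra columns in the question come from label alignment: pandas.concat matches columns by label, so a filler frame labelled 0..4 shares no column with df and the result holds the union of both label sets. A minimal sketch of the difference, using a shortened copy of the data (the names mismatched, matched, wide and tall are made up for this illustration):

```python
import pandas as pd

df = pd.DataFrame(
    [["1/2/2014", "a", "6", "z1"], ["1/3/2014", "c", "1", "x3"]],
    columns=["Date", "letters", "numbers", "mixed"],
)

# Filler row labelled with the integers 0..3, as in the question.
mismatched = pd.DataFrame([["NA"] * 4], columns=range(4))
# Filler row that reuses df's column labels, as in this answer.
matched = pd.DataFrame([["NA"] * len(df.columns)], columns=df.columns)

# concat aligns on labels: with no shared labels the result has
# the union of both label sets, padded with NaN.
wide = pd.concat([df, mismatched], ignore_index=True)
tall = pd.concat([df, matched], ignore_index=True)

print(wide.shape)  # (3, 8)
print(tall.shape)  # (3, 4)
```

Passing ignore_index=True to concat renumbers the rows in the same step, so a separate reset_index(drop=True) is not needed in that case.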
    

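    【Comments】:

      【Solution 2】:

      On a recent pandas version, this gives you what you want (the original answer used .ix, which was deprecated in pandas 0.20 and removed in 1.0; .loc is the equivalent here)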

      df.loc[len(df), :]='NA'
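
As a runnable sketch of this in-place row append on a current pandas version: .loc is used because .ix no longer exists in pandas 1.0 and later, and the example assumes df has the default 0..n-1 index, so that len(df) is a label not yet in use. The data is a shortened copy of the question's.

```python
import pandas as pd

df = pd.DataFrame(
    [["1/2/2014", "a"], ["1/3/2014", "c"]],
    columns=["Date", "letters"],
)

# On a default 0..n-1 index, len(df) is the first unused label,
# so this assignment adds a new row instead of overwriting one.
df.loc[len(df)] = ["NA"] * len(df.columns)

print(df)
```

If the index is not the default one, len(df) may already be an existing label, and the assignment would then overwrite that row.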
      

      Edit: or, if you want to use concat, use df's columns as the columns when you define df_a

      df_a = pandas.DataFrame.from_records(b,columns=df.columns)
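
If df_a already exists with integer labels, a possible variant is to relabel it before concatenating. This is only a sketch on a shortened copy of the data, and it assumes df_a has exactly as many columns as df, because the relabelling is positional:

```python
import pandas as pd

df = pd.DataFrame(
    [["1/2/2014", "a", "6"]],
    columns=["Date", "letters", "numbers"],
)
# Filler frame with integer column labels, as in the question.
df_a = pd.DataFrame([["NA"] * 3], columns=range(3))

# Positional relabel: raises ValueError if the column counts differ.
df_a.columns = df.columns

df_b = pd.concat([df, df_a], ignore_index=True)
print(df_b)
```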
      

      【Comments】:

      • Also, there is a simpler way to build df['new']: df['new']= df['Date']+' '+df['letters']
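
To illustrate the comment above, a small sketch comparing the question's loop with the vectorised expression on a shortened copy of the data. The + operator on two string columns works element by element, so both approaches produce the same values:

```python
import pandas as pd

df = pd.DataFrame({"Date": ["1/2/2014", "1/3/2014"], "letters": ["a", "c"]})

# Loop version from the question: one Python-level concatenation per row.
f = []
for i in range(len(df)):
    f.append(df["Date"][i] + " " + df["letters"][i])

# Vectorised version from the comment: + is applied element-wise.
df["new"] = df["Date"] + " " + df["letters"]

print(df["new"].tolist() == f)  # True
```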