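【Posted】: 2016-08-07 22:06:25
【Question】:
I am splitting a "full name" field from the data in an Excel file into "first name", "middle name", and "last name" fields. I could not figure out how to do this in pandas, so I turned to openpyxl and split the values the way I need. However, since adding columns for the new fields is not easy in openpyxl, I thought I would pass the values to pandas.
When I run the code, it generates the DataFrame I need, but once I send the df to ExcelWriter, only the last row is written to the Excel file. The data does land in the correct columns, though.
Here is the code: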
import pandas as pd

# sheet is an openpyxl worksheet; row 1 holds the headers
for cellObj in range(2, sheet.max_row+1):
    #print cellObj
    id = sheet['A' + str(cellObj)].value
    fullname = sheet['B' + str(cellObj)].value.strip()
    namelist = fullname.split(' ')
    for i in namelist:
        firstname = namelist[0]
        if len(namelist) == 2:
            lastname = namelist[1]
            middlename = ''
        elif len(namelist) == 3:
            middlename = namelist[1]
            lastname = namelist[2]
        elif len(namelist) == 4:
            middlename = namelist[1]
            lastname = namelist[2] + " " + namelist[3]
        # surname particles belong to the last name, not the middle name
        if (namelist[1] == 'Del') | (namelist[1] == 'El') | (namelist[1] == 'Van'):
            middlename = ''
            lastname = namelist[1] + " " + namelist[2]
    # one single-row DataFrame per worksheet row
    df = pd.DataFrame({'personID':id,'lastName':lastname,'firstName':firstname,'middleName':middlename}, index=[id])
writer = pd.ExcelWriter('output.xlsx')
df.to_excel(writer,'Sheet1', columns=['ID','lastName','firstName','middleName'])
writer.save()
Any ideas?
Thanks
【Comments】:
Tags: python excel pandas openpyxl
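A likely cause of the symptom described in the question: `df` is reassigned on every pass through the loop, so by the time it reaches the writer it only holds the final row. The sketch below is one possible way around that, not a definitive fix. It collects one dict per row in a list and builds the DataFrame once at the end. The helper names `split_full_name`, `build_rows`, and `write_rows` are hypothetical, and the handling of names with one part or more than four parts is an assumption, since the question does not cover those cases.

```python
# Surname particles taken from the question's own rule.
PARTICLES = {"Del", "El", "Van"}

def split_full_name(fullname):
    """Return (first, middle, last) following the rules in the question."""
    parts = fullname.strip().split()
    if not parts:
        return ("", "", "")
    first = parts[0]
    if len(parts) == 1:
        # Assumption: a single token is treated as a first name only.
        return (first, "", "")
    if len(parts) == 2:
        return (first, "", parts[1])
    if len(parts) == 3 and parts[1] in PARTICLES:
        # e.g. "Juan Del Rio": the particle is part of the last name.
        return (first, "", parts[1] + " " + parts[2])
    # Three parts: first, middle, last. Four or more (assumption for >4):
    # everything after the middle name is joined into the last name.
    return (first, parts[1], " ".join(parts[2:]))

def build_rows(records):
    """Turn (id, fullname) pairs into a list of row dicts, one per person."""
    rows = []
    for person_id, fullname in records:
        first, middle, last = split_full_name(fullname)
        rows.append({
            "personID": person_id,
            "lastName": last,
            "firstName": first,
            "middleName": middle,
        })
    return rows

def write_rows(rows, path="output.xlsx"):
    """Build the DataFrame once from all rows, then write it in one call."""
    import pandas as pd  # imported here so the helpers above stay dependency-free
    df = pd.DataFrame(
        rows, columns=["personID", "lastName", "firstName", "middleName"]
    )
    df.to_excel(path, sheet_name="Sheet1", index=False)
```

With the worksheet from the question, the pairs could probably be read with `sheet.iter_rows(min_row=2, max_col=2, values_only=True)` (assuming a reasonably recent openpyxl and no empty name cells) and passed to `build_rows`, followed by `write_rows(rows)`. Note also that the question's `columns=` list asks for `'ID'`, while the DataFrame column is named `'personID'`.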
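On the original aim of doing the split in pandas itself, here is a minimal sketch using the vectorized `.str` methods. It makes simplifying assumptions: the column name `fullName` and the function name `split_names` are hypothetical, the first token is the first name, the last token is the last name, and everything in between is the middle name. It does not apply the question's special handling of particles such as "Del" or "Van".

```python
import pandas as pd

def split_names(df, column="fullName"):
    """Add firstName, middleName and lastName columns split from `column`."""
    # Split on whitespace; each element becomes a list of tokens.
    parts = df[column].str.strip().str.split()
    out = df.copy()
    out["firstName"] = parts.str[0]
    # Tokens between the first and the last, joined back with spaces;
    # this is an empty string when the name has only two tokens.
    out["middleName"] = parts.str[1:-1].str.join(" ")
    out["lastName"] = parts.str[-1]
    return out

# Small in-memory example; real data would come from the Excel file.
people = pd.DataFrame({
    "personID": [1, 2],
    "fullName": ["Ana Lopez", " Juan Carlos Rio "],
})
result = split_names(people)
```

The input frame could be loaded with `pd.read_excel` and the result written back with `DataFrame.to_excel`, so the whole table is written in a single call rather than row by row. A name with a single token would end up with the same value in both `firstName` and `lastName`, which may need separate handling.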