Copying several columns from a csv file to an existing xls file using Python: answers

【Question title】: Copying several columns from a csv file to an existing xls file using Python
【Posted】: 2016-11-30 14:56:43
【Question description】:

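I am still fairly new to Python and am having some difficulty getting started. I am using Python 3.

I googled and found quite a few Python modules that could help with this, but I am hoping for a more definite answer here. Basically, I need to read certain columns from a csv file, namely G, H, I, K, and M. The columns I need are not consecutive.

I need to read these columns from the csv file and transfer them to empty columns in an existing xls file that already contains data.

I looked at openpyxl, but it does not seem to work with csv/xls files, only with xlsx. Can I use the xlwt module?

Any guidance on which module would best fit my use case would be much appreciated. In the meantime, I will tinker with xlwt/xlrd.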

【Question discussion】:

I have used xlwt/xlrd. They seem to work well for me. Python also has a csv reader.
I forgot to mention: each column in the csv file has about 9k entries...
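
For reference, picking the non-consecutive columns G, H, I, K, and M with the csv reader from the standard library could look roughly like the sketch below. The helper names `column_index` and `read_columns` and the sample data are made up for illustration and are not from the thread.

```python
import csv
import io


def column_index(letters):
    """Convert a spreadsheet column label such as 'G' or 'AA' to a zero-based index."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


def read_columns(lines, letters):
    """Return one list per requested column, in the order the labels were given."""
    indices = [column_index(label) for label in letters]
    columns = [[] for _ in indices]
    for row in csv.reader(lines):
        for out, i in zip(columns, indices):
            # Rows that are too short yield an empty string instead of an IndexError.
            out.append(row[i] if i < len(row) else "")
    return columns


# Hypothetical sample standing in for the real file: 13 fields per row (columns A..M).
sample = io.StringIO("a,b,c,d,e,f,g,h,i,j,k,l,m\n0,1,2,3,4,5,6,7,8,9,10,11,12\n")
g, h, i_col, k, m = read_columns(sample, ["G", "H", "I", "K", "M"])
```

With a real file, `sample` would be replaced by something like `open(path, newline="")`. About 9k rows per column is a small amount of data for this kind of loop.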

Tags: python excel python-3.x csv

【Solution 1】:

I suggest using pandas. It has convenient functions for reading and writing csv and xls files.

import pandas as pd

# read the csv file
df_1 = pd.read_csv('c:/test/test.csv')

# let's say df_1 has columns colA and colB
print(df_1)

# read the xls(x) file
df_2 = pd.read_excel('c:/test/test.xlsx')
# let's say df_2 has columns aa and bb

# now add a column from df_1 to df_2 (rows are aligned on the index)
df_2['colA'] = df_1['colA']

# save the combined output; the with block saves and closes the file on exit
# (writer.save() is no longer available in recent pandas versions)
with pd.ExcelWriter('c:/test/combined.xlsx') as writer:
    df_2.to_excel(writer)

# alternatively, if you want to add just one column to an existing xlsx file:

# i.e. get colA from df_1 into a new dataframe
df_3 = pd.DataFrame(df_1['colA'])

column_to_write = 16  # zero-based index, so this goes to column Q
write_row_index = False  # don't write the row index
sheet_name = 'Sheet1'  # which sheet to write on

# open the existing file in append mode with the openpyxl engine;
# if_sheet_exists='overlay' (pandas 1.4+) writes into the existing sheet
# instead of replacing it, so the old writer.book / writer.sheets
# workaround with load_workbook is not needed
with pd.ExcelWriter('c:/test/combined.xlsx', engine='openpyxl', mode='a',
                    if_sheet_exists='overlay') as writer:
    # now write the single column df_3 to the file
    df_3.to_excel(writer, sheet_name=sheet_name, columns=['colA'],
                  startcol=column_to_write, index=write_row_index)

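【Discussion】:

Wow! I will definitely give it a try! So the pandas module has methods for selecting specific columns from the csv file and storing them in specific columns of the xls file? Edit: these files also contain a lot of data, around 10k entries per column.
I suggest reading both the csv and the xls file, merging the columns you want, and saving everything to a new file. Is that an option?
Yes, I can do that too. I will have to look into how to combine the columns. Will the number of entries cause problems?
I understand everything except df_2.to_excel(writer, 'Scores 1'). What does that mean?
That line saves the contents of df_2 (now extended with the new column) to the writer object, which is linked to the output xlsx file. I removed 'Scores 1', which was left over from another example. That string specifies which sheet to save to. If this helped, please accept the answer.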
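
The approach suggested in this thread (read both files, combine the wanted columns, save everything to a new file) might be sketched as follows. This is only an illustration under assumptions: the in-memory csv text, the column names `c0`..`c12`, and the `existing` frame are made-up stand-ins for the real csv and xls contents. Around 10k rows per column should not be a problem, since even the old .xls format allows 65,536 rows per sheet.

```python
import io

import pandas as pd

# Hypothetical stand-in for the csv file: 13 columns (A..M) and 3 data rows.
header = ",".join(f"c{i}" for i in range(13))
rows = "\n".join(",".join(str(r * 100 + i) for i in range(13)) for r in range(3))
csv_text = header + "\n" + rows + "\n"

# Zero-based positions of spreadsheet columns G, H, I, K, and M.
wanted = [6, 7, 8, 10, 12]
picked = pd.read_csv(io.StringIO(csv_text), usecols=wanted)

# Hypothetical stand-in for the data already in the workbook
# (in practice this would come from pd.read_excel).
existing = pd.DataFrame({"aa": [1, 2, 3], "bb": [4, 5, 6]})

# Put the csv columns to the right of the existing ones, aligned on the row index.
combined = pd.concat([existing, picked], axis=1)
```

Calling `combined.to_excel('combined.xlsx', index=False)` would then save the result to a new file, where the file name is again just a placeholder.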

【Solution 2】:

You could try XlsxWriter, a full-featured Python module for writing files in the Excel 2007+ XLSX format. https://pypi.python.org/pypi/XlsxWriter

【Discussion】: