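Python Read unique CSV Values for Loop: Answers

【Question title】: Python Read unique CSV Values for Loop
【Posted】: 2018-02-10 05:35:37
【Question】:

I am generating 3 CSV files that I want to combine into one. I only need certain columns from each file, but the rows have to match on the switch name and the interface.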

File1

switch1,Gi1/0/22,connected,716,a-full,a-100,10/100/1000BaseTX
switch2,Fa3/0/8,connected,716,a-full,a-100,10/100BaseTX
switch3,Fa2/0/5,connected,716,a-full,a-100,10/100BaseTX

File2

switch1,716,0040.0020.0010,DYNAMIC,Gi1/0/22
switch2,716,0030.0020.1010,DYNAMIC,Fa3/0/8
switch3,716,0050.0030.1010,DYNAMIC,Fa2/0/5

File3

switch1,Gi1/0/22,0,32,0,33,0,9
switch2,Fa3/0/8,0,0,0,0,0,362
switch3,Fa2/0/5,0,10,20,0,0,100

I am trying to get the final CSV to look like this:

switch1,Gi1/0/22,0040.0020.0010,0,32,0,33,0,9
switch2,Fa3/0/8,0030.0020.1010,0,0,0,0,0,362
switch3,Fa2/0/5,0050.0030.1010,0,10,20,0,0,100

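That is: the switch name, the interface, column 3 of File2, and columns 3-8 of File3.

I'm not necessarily looking for an exact answer; a general idea or direction is fine if you'd rather not give one. I'm still very new to Python.

【Question comments】:

You may want to take a look at pandas. You can read each file as a separate DataFrame, combine them, and write the final CSV. read_csv, merge and to_csv are the methods you will need.

Tags: python csv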

【Solution 1】:

You can do this with either pandas or the standard library. Pandas is usually faster and easier to read.

Setup:

from textwrap import dedent

def write_file(name, string):
    with open(name, 'w') as f:
        f.write(dedent(string).lstrip())

write_file('File1.csv', """
    switch1,Gi1/0/22,connected,716,a-full,a-100,10/100/1000BaseTX
    switch2,Fa3/0/8,connected,716,a-full,a-100,10/100BaseTX
    switch3,Fa2/0/5,connected,716,a-full,a-100,10/100BaseTX
""")

write_file('File2.csv', """
    switch1,716,0040.0020.0010,DYNAMIC,Gi1/0/22
    switch2,716,0030.0020.1010,DYNAMIC,Fa3/0/8
    switch3,716,0050.0030.1010,DYNAMIC,Fa2/0/5
""")

write_file('File3.csv', """
    switch1,Gi1/0/22,0,32,0,33,0,9
    switch2,Fa3/0/8,0,0,0,0,0,362
    switch3,Fa2/0/5,0,10,20,0,0,100
""")

Pandas code:

import pandas as pd

t1 = pd.read_csv('File1.csv', names=['switch_name', 'interface', 'col3', 'col4', 'col5', 'col6', 'col7'])
t2 = pd.read_csv('File2.csv', names=['switch_name', 'col2', 'col3', 'col4', 'interface'])
t3 = pd.read_csv('File3.csv', names=['switch_name', 'interface', 'col3', 'col4', 'col5', 'col6', 'col7', 'col8'])

result = t2[['switch_name', 'interface', 'col3']].merge(t3, on=['switch_name', 'interface'])
result.to_csv('Final.csv', header=False, index=False)

with open('Final.csv') as f:
    print(f.read())

# switch1,Gi1/0/22,0040.0020.0010,0,32,0,33,0,9
# switch2,Fa3/0/8,0030.0020.1010,0,0,0,0,0,362
# switch3,Fa2/0/5,0050.0030.1010,0,10,20,0,0,100

Standard library code:

import csv

# Store the File3 data in a dictionary keyed on (switch name, interface) for later lookup
with open('File3.csv', newline='') as f:
    f3_data = {(r[0], r[1]): r[2:8] for r in csv.reader(f)}

# newline='' lets the csv module control line endings (Python 3)
with open('File2.csv', newline='') as f2, open('Final.csv', 'w', newline='') as f:
    final = csv.writer(f)
    for switch_name, col2, col3, col4, interface in csv.reader(f2):
        if (switch_name, interface) in f3_data:
            final.writerow([switch_name, interface, col3] + f3_data[switch_name, interface])

with open('Final.csv') as f:
    print(f.read())

# switch1,Gi1/0/22,0040.0020.0010,0,32,0,33,0,9
# switch2,Fa3/0/8,0030.0020.1010,0,0,0,0,0,362
# switch3,Fa2/0/5,0050.0030.1010,0,10,20,0,0,100
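
One thing to be aware of with the dictionary lookup above: a File2 row whose (switch name, interface) pair is missing from File3 is skipped without any notice. Below is a minimal sketch, not part of the original answer, that collects those keys so they can be inspected. The function name `join_file2_file3` and the `switch9` row are hypothetical and only there for illustration; the in-memory `io.StringIO` objects stand in for the real files.

```python
import csv
import io

def join_file2_file3(file2, file3):
    """Return (merged_rows, unmatched_keys) for two open CSV handles."""
    # Index File3 by (switch name, interface), as in the answer above.
    f3_data = {(r[0], r[1]): r[2:8] for r in csv.reader(file3)}
    merged, unmatched = [], []
    for switch_name, col2, col3, col4, interface in csv.reader(file2):
        key = (switch_name, interface)
        if key in f3_data:
            merged.append([switch_name, interface, col3] + f3_data[key])
        else:
            # Remember the key instead of dropping the row silently.
            unmatched.append(key)
    return merged, unmatched

# Hypothetical sample: switch9 is in File2 but has no row in File3.
file2 = io.StringIO(
    "switch1,716,0040.0020.0010,DYNAMIC,Gi1/0/22\n"
    "switch9,716,0060.0020.0010,DYNAMIC,Gi1/0/1\n"
)
file3 = io.StringIO("switch1,Gi1/0/22,0,32,0,33,0,9\n")

merged, unmatched = join_file2_file3(file2, file3)
print(merged)
print(unmatched)
```

With real files, the same function can be called with handles opened via `open(..., newline='')`.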

【Comments】:

【Solution 2】:

You can open all 3 files at once, read them row by row with the csv library, then extract the columns you need and write them to a file:

from csv import reader

# open all files at once
with open('file1.csv') as f1, \
     open('file2.csv') as f2, \
     open('file3.csv') as f3:

     # convert them to reader objects
     csv_files = reader(f1), reader(f2), reader(f3)

     # open file to write to
     with open('combined.csv', 'w') as out:

         # go over each row from the files at once using zip()
         for row1, row2, row3 in zip(*csv_files):

             # extract columns into a list
             line = row1[:2] + [row2[2]] + row3[2:]

             # write to the file
             out.write(','.join(line) +'\n')

# print contents of new file
print(open('combined.csv').read())

Which outputs:

switch1,Gi1/0/22,0040.0020.0010,0,32,0,33,0,9
switch2,Fa3/0/8,0030.0020.1010,0,0,0,0,0,362
switch3,Fa2/0/5,0050.0030.1010,0,10,20,0,0,100

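【Comments】:

This works as long as all the CSVs are in the same order. If any of them is ordered differently, the rows will be mismatched.
@Idlehands Yes. The OP didn't really specify that. This should still be a good foundation for solving the problem.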
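
Regarding the ordering concern raised in the comments: one possible way to keep the zip() approach while tolerating differently ordered files is to sort each file's rows by (switch name, interface) first and check that the keys line up. This is only a sketch under the assumption that all three files contain the same set of switch/interface pairs; the helper `sorted_rows` is a hypothetical name, and the in-memory `io.StringIO` objects stand in for the real files (File2 and File3 are deliberately shuffled here).

```python
import csv
import io

# Sample data from the question; File2 and File3 are reordered on purpose.
file1 = io.StringIO(
    "switch1,Gi1/0/22,connected,716,a-full,a-100,10/100/1000BaseTX\n"
    "switch2,Fa3/0/8,connected,716,a-full,a-100,10/100BaseTX\n"
    "switch3,Fa2/0/5,connected,716,a-full,a-100,10/100BaseTX\n"
)
file2 = io.StringIO(
    "switch3,716,0050.0030.1010,DYNAMIC,Fa2/0/5\n"
    "switch1,716,0040.0020.0010,DYNAMIC,Gi1/0/22\n"
    "switch2,716,0030.0020.1010,DYNAMIC,Fa3/0/8\n"
)
file3 = io.StringIO(
    "switch2,Fa3/0/8,0,0,0,0,0,362\n"
    "switch3,Fa2/0/5,0,10,20,0,0,100\n"
    "switch1,Gi1/0/22,0,32,0,33,0,9\n"
)

def sorted_rows(handle, switch_col, interface_col):
    """Read all rows and sort them by (switch name, interface)."""
    return sorted(csv.reader(handle),
                  key=lambda r: (r[switch_col], r[interface_col]))

rows1 = sorted_rows(file1, 0, 1)
rows2 = sorted_rows(file2, 0, 4)  # File2 keeps the interface in its last column
rows3 = sorted_rows(file3, 0, 1)

# zip() stops at the shortest input, so check the lengths explicitly.
if not (len(rows1) == len(rows2) == len(rows3)):
    raise ValueError("files do not have the same number of rows")

combined = []
for row1, row2, row3 in zip(rows1, rows2, rows3):
    key = (row1[0], row1[1])
    # Guard against files that do not contain the same set of keys.
    if (row2[0], row2[4]) != key or (row3[0], row3[1]) != key:
        raise ValueError("rows do not line up for %r" % (key,))
    combined.append(row1[:2] + [row2[2]] + row3[2:])

for line in combined:
    print(",".join(line))
```

If the files may contain different sets of keys, a dictionary lookup keyed on (switch name, interface), as in Solution 1, is probably the safer choice.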