【Question title】: Python | Read from multiple csv files and add to a nested list
【Posted】: 2018-12-31 04:13:40
【Question】:

I have csv files (all in the same directory) that look like this:

File1:

Id,Param1,Param2
1,10,12
2,16,18
3,24,28
4,22,26

File2:

Id,Param1,Param2
1,13,19
2,15,23
3,21,25

I want to read the files and create nested lists like this:

Param1 = [[10, 16, 24, 22], [13, 15, 21]]
Param2 = [[12, 18, 28, 26], [19, 23, 25]]

What I tried:

for i in range(1,nof+1,1):
    with open("File%i.csv" %i, "rb") as f1:
        reader = csv.reader(f1)

        for row in reader:
            Param1.append(row[1])
            Param2.append(row[2])

And finally:

[Param1[i:i + n] for i in range(0, len(Param1), n)]
[Param2[i:i + n] for i in range(0, len(Param2), n)]

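This would work if all my files had the same number of rows, but they do not: the row counts differ from file to file. Could someone help me figure out how to create these splits? Thanks a lot.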
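A minimal sketch of one way around the fixed chunk size `n`: start a fresh sublist for each file while reading, so no slicing is needed afterwards. It assumes every file has the `Id,Param1,Param2` header shown above; the helper name `read_columns` and the temporary-directory demo are illustrative and not part of the original attempt.

```python
import csv
import os
import tempfile


def read_columns(paths):
    """Return (param1, param2), each holding one sublist per file."""
    param1, param2 = [], []
    for path in paths:
        col1, col2 = [], []  # fresh sublists for this file
        with open(path, newline="") as f:
            reader = csv.reader(f)
            next(reader)  # skip the header row (Id,Param1,Param2)
            for row in reader:
                col1.append(int(row[1]))
                col2.append(int(row[2]))
        param1.append(col1)
        param2.append(col2)
    return param1, param2


# Demo: write the sample data from the question into a temporary directory.
samples = {
    "File1.csv": "Id,Param1,Param2\n1,10,12\n2,16,18\n3,24,28\n4,22,26\n",
    "File2.csv": "Id,Param1,Param2\n1,13,19\n2,15,23\n3,21,25\n",
}
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, text in samples.items():
        path = os.path.join(tmp, name)
        with open(path, "w", newline="") as f:
            f.write(text)
        paths.append(path)
    Param1, Param2 = read_columns(paths)

print(Param1)  # [[10, 16, 24, 22], [13, 15, 21]]
print(Param2)  # [[12, 18, 28, 26], [19, 23, 25]]
```

Because each sublist is closed when its file ends, the number of rows per file no longer matters.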

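【Comments on the question】:

  • Your input is not a csv file. Please edit the question to show the actual input rather than its Excel representation.
  • I don't understand. The "actual input" would be comma-separated values of these same "Excel format" numbers.
  • Exactly. That way we can reproduce and solve your problem without having to write those inputs ourselves.

Tags: python list pandas csv nested-lists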


【Solution 1】:

Could you use pandas?

import pandas as pd

dfs = []
nof = 2
for i in range(1, nof+1, 1):
    dfs.append(pd.read_csv("File{}.csv".format(i)))

param1_list = [list(df['Param1']) for df in dfs]
param2_list = [list(df['Param2']) for df in dfs] 

print(param1_list)
print(param2_list)


【Comments】:

  • I was trying to avoid pandas, but I have to admit it is very concise.

【Solution 2】:

I slightly modified the sample input.

cat file1
1|10|12
2|16|18
3|24|28
4|22|26

cat file2
1|13|19
2|15|23
3|21|25

Sample code

def process(filename):
    first_list = []
    second_list = []
    with open(filename, 'r') as fh:
        for line in fh:
            line = line.rstrip()
            # Each line looks like "1|10|12"; the leading Id field is ignored.
            _, first_field, second_field = line.split('|')
            first_list.append(first_field)
            second_list.append(second_field)

    # Values are returned as strings, one list per column.
    return [first_list, second_list]

print(process('file1'))
print(process('file2'))

Output

[['10', '16', '24', '22'], ['12', '18', '28', '26']]
[['13', '15', '21'], ['19', '23', '25']]
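The output above is grouped per file and still holds strings, while the question asks for integers grouped per parameter. A minimal sketch of that last step follows, using the printed results as literal data; the names `per_file` and `as_ints` are illustrative.

```python
# Per-file results in the shape returned by process() above:
# [first_column, second_column], with values still as strings.
per_file = [
    [['10', '16', '24', '22'], ['12', '18', '28', '26']],  # file1
    [['13', '15', '21'], ['19', '23', '25']],              # file2
]

# Convert every value to int, keeping the per-file / per-column nesting.
as_ints = [[[int(v) for v in column] for column in result] for result in per_file]

# Regroup: zip(*as_ints) pairs the first columns of all files,
# then the second columns of all files.
Param1, Param2 = (list(columns) for columns in zip(*as_ints))

print(Param1)  # [[10, 16, 24, 22], [13, 15, 21]]
print(Param2)  # [[12, 18, 28, 26], [19, 23, 25]]
```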

【Comments】:

【Solution 3】:

Here is one approach using a dictionary and csv.reader:

from io import StringIO
import csv

# In-memory stand-ins for the two files (space-separated here).
file1 = StringIO("""Id Param1 Param2
1   10     12
2   16     18
3   24     28
4   22     26""")

file2 = StringIO("""Id Param1 Param2
1   13     19
2   15     23
3   21     25""")

res = {}
for i, file in enumerate([file1, file2]):
    # replace file with open('...', 'r')
    with file as fin:
        reader = csv.reader(fin, delimiter=' ', skipinitialspace=True)
        next(reader)  # exclude header row
        # zip(*reader) transposes rows into columns: column index -> list of ints
        res[i] = {idx: list(map(int, x)) for idx, x in enumerate(zip(*reader))}

Param1 = [res[0][1], res[1][1]]
Param2 = [res[0][2], res[1][2]]

print(Param1, Param2, sep='\n')

[[10, 16, 24, 22], [13, 15, 21]]
[[12, 18, 28, 26], [19, 23, 25]]
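The key step in this answer is `zip(*reader)`, which turns a sequence of rows into a sequence of columns. A small sketch on hard-coded rows (illustrative data, not read from a file) shows the effect in isolation:

```python
rows = [
    ["1", "10", "12"],
    ["2", "16", "18"],
    ["3", "24", "28"],
]

# zip(*rows) yields one tuple per column instead of one list per row.
columns = list(zip(*rows))
print(columns)
# [('1', '2', '3'), ('10', '16', '24'), ('12', '18', '28')]

# Same dictionary shape as res[i] above: column index -> list of ints.
by_index = {idx: list(map(int, col)) for idx, col in enumerate(columns)}
print(by_index[1])  # [10, 16, 24]
print(by_index[2])  # [12, 18, 28]
```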
    

【Comments】:

【Solution 4】:

>>> from collections import defaultdict
>>> from csv import DictReader
>>>
>>> def solution(filenames):
...     result = defaultdict(list)
...     for filename in filenames:
...         d = defaultdict(list)
...         with open(filename, 'r') as f:
...             reader = DictReader(f)
...             for line in reader:
...                 for k, v in line.items():
...                     d[k].append(int(v))
...
...         for k, v in d.items():
...             result[k].append(v)
...     return result
...
>>> result = solution(['file1.csv', 'file2.csv'])
>>> result['Param1']
[[10, 16, 24, 22], [13, 15, 21]]
>>> result['Param2']
[[12, 18, 28, 26], [19, 23, 25]]
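Since `DictReader` yields every column, the result above also contains an `Id` entry. If only some columns are wanted, a variant along these lines could restrict the grouping. This is a sketch: the function name `collect`, its `wanted` parameter, and the `io.StringIO` stand-ins for real files are assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict


def collect(files, wanted=("Param1", "Param2")):
    """Group only the wanted columns into one sublist per file."""
    result = defaultdict(list)
    for f in files:
        per_file = {name: [] for name in wanted}
        for row in csv.DictReader(f):
            for name in wanted:
                per_file[name].append(int(row[name]))
        for name, values in per_file.items():
            result[name].append(values)
    return dict(result)


# In-memory stand-ins for file1.csv and file2.csv.
file1 = io.StringIO("Id,Param1,Param2\n1,10,12\n2,16,18\n3,24,28\n4,22,26\n")
file2 = io.StringIO("Id,Param1,Param2\n1,13,19\n2,15,23\n3,21,25\n")

result = collect([file1, file2])
print(result["Param1"])  # [[10, 16, 24, 22], [13, 15, 21]]
print(result["Param2"])  # [[12, 18, 28, 26], [19, 23, 25]]
print("Id" in result)    # False
```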
      

【Comments】:
