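【Question title】:Create a JSON/text file based on contents of a csv file
【Posted】:2018-01-22 04:47:12
【Question description】:

I am trying to iterate over a csv file (about 91 million records) and use a Python dict to create a new JSON/text file based on the sample records below (the file is sorted by id, then type).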

id,type,value
4678367,1,1001
4678367,2,1007
4678367,2,1008
5678945,1,9000
5678945,2,8000

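The code should append the value when both id and type match, and otherwise create a new record, as shown below. I want to write the result to a target file.

How can I do this in Python?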

{'id':4678367,
 'id_1':[1001],
 'id_2':[1007,1008]
},
{'id':5678945,
 'id_1':[9000],
 'id_2':[8000]
}

【Question comments】:

    Tags: python json python-3.x csv


    【Solution 1】:

    Here is one way to collect the items. I have left writing to the file as an exercise:

    Code:

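    import csv

    with open('test.csv', newline='') as f:
        reader = csv.reader(f)
        columns = next(reader)  # skip the header row: id,type,value
        results = []
        record = {}
        current_type = None
        items = []
        for id_, type_, value in reader:
            # A new id or a new type closes the current group of values.
            if id_ != record.get('id') or type_ != current_type:
                if record:
                    record['id_{}'.format(current_type)] = items
                    items = []
                current_type = type_
    
            # A new id also closes the current record and starts the next one.
            if id_ != record.get('id'):
                if record:
                    results.append(record)
                record = dict(id=id_)
    
            items.append(value)
    
        # Flush the last group and the last record once the loop ends.
        if record:
            record['id_{}'.format(current_type)] = items
            results.append(record)
    
    print(results)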
    

    Results:

    [
        {'id': '4678367', 'id_1': ['1001'], 'id_2': ['1007', '1008']}, 
        {'id': '5678945', 'id_1': ['9000'], 'id_2': ['8000']}
    ]
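
Since the answer above leaves the file-writing step as an exercise, here is a minimal sketch of one way it could be done without holding all 91 million rows in memory. It is not the answerer's code: the function name `csv_to_json_lines`, the choice of one JSON object per output line, and the conversion of ids and values to integers are all assumptions made for illustration. It relies on the input being sorted by id, as the question states.

```python
import csv
import json
from itertools import groupby
from operator import itemgetter


def csv_to_json_lines(src_path, dst_path):
    """Stream a CSV sorted by id into one JSON object per output line.

    Hypothetical helper: returns the number of records written.
    """
    count = 0
    with open(src_path, newline='') as src, open(dst_path, 'w') as dst:
        reader = csv.reader(src)
        next(reader, None)  # skip the header row: id,type,value
        # groupby only merges adjacent rows, so the input must be sorted by id.
        for id_, rows in groupby(reader, key=itemgetter(0)):
            record = {'id': int(id_)}  # assumption: ids are integers
            for _, type_, value in rows:
                # Create the 'id_<type>' list on first use, then append to it.
                record.setdefault('id_' + type_, []).append(int(value))
            dst.write(json.dumps(record) + '\n')
            count += 1
    return count
```

Calling `csv_to_json_lines('test.csv', 'out.jsonl')` would write each record as soon as its id group ends, so memory use stays proportional to the largest single id group rather than to the whole file. The output file name here is only a placeholder.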
    

    【Comments】:

      【Solution 2】:
      import csv
      from collections import namedtuple
      
      with open("data.csv","r") as f:
          read = csv.reader(f)
          header = next(read)
          col = namedtuple('col',header)
          dictionary = {}
          for values in read:
              data = col(*values)
              type_ = 'id_' + str(data.type)
              if data.id in dictionary:
                  local_dict = dictionary[data.id]                
                  if type_ in local_dict:
                      local_dict[type_].append(data.value)
                  else:
                      local_dict[type_] = [data.value]
              else:
                  dictionary.setdefault(data.id,{'id':data.id,type_:[data.value]})
      print(*dictionary.values(), sep="\n")
      # Output:
      # {'id': '4678367', 'id_1': ['1001'], 'id_2': ['1007', '1008']}
      # {'id': '5678945', 'id_1': ['9000'], 'id_2': ['8000']}
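
The nested membership checks in the answer above can also be expressed with `dict.setdefault`, and the grouped records can then be serialized with the standard `json` module. The following is only a sketch under assumptions: the helper name `group_rows` is hypothetical, an in-memory sample stands in for `data.csv`, and, like the answer, it keeps every record in memory, so it suits inputs that fit in RAM. Unlike a streaming approach, it does not need the input to be sorted.

```python
import csv
import io
import json


def group_rows(rows):
    """Group (id, type, value) rows into one dict per id, keyed by id."""
    grouped = {}
    for id_, type_, value in rows:
        # Fetch the record for this id, creating it on first sight.
        record = grouped.setdefault(id_, {'id': id_})
        # Fetch the 'id_<type>' list, creating it on first sight, and append.
        record.setdefault('id_' + type_, []).append(value)
    return grouped


# Small in-memory sample standing in for data.csv (assumption for illustration).
sample = io.StringIO(
    "id,type,value\n"
    "4678367,1,1001\n"
    "4678367,2,1007\n"
    "4678367,2,1008\n"
    "5678945,1,9000\n"
    "5678945,2,8000\n"
)
reader = csv.reader(sample)
next(reader)  # skip the header row
records = list(group_rows(reader).values())
text = json.dumps(records)  # JSON array holding one object per id
print(text)
```

To write to a target file instead of printing, `json.dump(records, f)` on a file opened for writing would produce the same JSON array.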
      

      【Comments】:
