【Question title】:Convert CSV to JSON while dropping certain columns
【Posted】:2019-09-24 00:22:02
【Question】:

I downloaded a dataset on income inequality from the OECD as a CSV file. I only want to keep the following data: LOCATION, TIME, VALUE.

Here is part of the head of the CSV:

"LOCATION","INDICATOR","SUBJECT","MEASURE","FREQUENCY","TIME","Value","Flag Codes"
"AUS","INCOMEINEQ","GINI","INEQ","A","2014",0.337,
"AUS","INCOMEINEQ","GINI","INEQ","A","2016",0.33,
"AUT","INCOMEINEQ","GINI","INEQ","A","2014",0.274,
"AUT","INCOMEINEQ","GINI","INEQ","A","2015",0.276,
"AUT","INCOMEINEQ","GINI","INEQ","A","2016",0.284,

This is my converter code so far:

#!/usr/bin/env python

"""Universal CSV to JSON converter with scalability options"""

__author__      = "Tim Verlaan 11669128"

import csv  
import json  

def convert():
    """Convert CSV file to JSON file"""

    # Open the CSV  
    f = open( 'data.csv')  

    # Change each fieldname to the appropriate field name.    
    reader = csv.DictReader( f, fieldnames = ( "LOCATION","INDICATOR","SUBJECT","MEASURE","FREQUENCY","TIME","Value","Flag Codes" ))  

    # skip the header 
    next(reader)

    # Parse the CSV into JSON  
    out = json.dumps( [ row for row in reader ] )  

    # Save the JSON  
    f = open( 'data_oecd.json', 'w')  
    f.write(out)  


if __name__ == "__main__":
    """Separating the function, for scalability purposes"""

    convert()

Current result:

[{"LOCATION": "AUS", "INDICATOR": "INCOMEINEQ", "SUBJECT": "GINI", "MEASURE": "INEQ", "FREQUENCY": "A", "TIME": "2014", "Value": "0.337", "Flag Codes": ""}, {"LOCATION": "AUS", "INDICATOR": "INCOMEINEQ", "SUBJECT": "GINI", "MEASURE": "INEQ", "FREQUENCY": "A", "TIME": "2016", "Value": "0.33", "Flag Codes": ""}, {"LOCATION": "AUT", "INDICATOR": "INCOMEINEQ", "SUBJECT": "GINI", "MEASURE": "INEQ", "FREQUENCY": "A", "TIME": "2014", "Value": "0.274", "Flag Codes": ""}, {"LOCATION": "AUT", "INDICATOR": "INCOMEINEQ", "SUBJECT": "GINI", "MEASURE": "INEQ", "FREQUENCY": "A", "TIME": "2015", "Value": "0.276", "Flag Codes": ""}, {"LOCATION": "AUT", "INDICATOR": "INCOMEINEQ", "SUBJECT": "GINI", "MEASURE": "INEQ", "FREQUENCY": "A", "TIME": "2016", "Value": "0.284", "Flag Codes": ""}

Desired result:

[{"LOCATION": "AUS", "TIME": 2014, "VALUE": 0.337}, {"LOCATION": "AUS", "TIME": 2016, "VALUE": 0.33}

【Comments】:

    Tags: python json csv merge


    【Solution 1】:

    You can use pandas and select only the columns you need:

     import pandas as pd
    
     df = pd.read_csv('data.csv')
     # The CSV header spells the column 'Value', not 'VALUE'
     df1 = df.loc[:, ['LOCATION', 'TIME', 'Value']]
    

    【Comments】:

      【Solution 2】:

      You can pull out the keys you need in a list comprehension.

      For example:

      import csv
      import json
      
      with open('data.csv', newline='') as infile:
          reader = csv.DictReader(infile)
          out = [{"LOCATION": row["LOCATION"], "TIME": row["TIME"], "VALUE": row["Value"]} for row in reader]
      
      with open('data_oecd.json', 'w') as outfile:
          json.dump(out, outfile)  # Write to JSON.
      

      Output:

      [{'LOCATION': 'AUS', 'TIME': '2014', 'VALUE': '0.337'},
       {'LOCATION': 'AUS', 'TIME': '2016', 'VALUE': '0.33'},
       {'LOCATION': 'AUT', 'TIME': '2014', 'VALUE': '0.274'},
       {'LOCATION': 'AUT', 'TIME': '2015', 'VALUE': '0.276'},
       {'LOCATION': 'AUT', 'TIME': '2016', 'VALUE': '0.284'}]
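
The desired result in the question has numeric TIME and VALUE fields, whereas csv.DictReader returns every field as a string. Below is a minimal sketch of one way to cast them, assuming TIME always holds a whole year and Value is never empty. The function name convert_rows and the in-memory sample are illustrative and not part of the question.

```python
import csv
import io
import json


def convert_rows(lines):
    """Keep LOCATION, TIME and Value from CSV lines, casting the last two to numbers."""
    reader = csv.DictReader(lines)
    return [
        {
            "LOCATION": row["LOCATION"],
            "TIME": int(row["TIME"]),      # assumes TIME is always a whole year
            "VALUE": float(row["Value"]),  # assumes Value is never empty
        }
        for row in reader
    ]


# Small in-memory sample copied from the question's data.
sample = io.StringIO(
    '"LOCATION","INDICATOR","SUBJECT","MEASURE","FREQUENCY","TIME","Value","Flag Codes"\n'
    '"AUS","INCOMEINEQ","GINI","INEQ","A","2014",0.337,\n'
    '"AUS","INCOMEINEQ","GINI","INEQ","A","2016",0.33,\n'
)

records = convert_rows(sample)
print(json.dumps(records))
```

For a real file, the same function should accept the open file object in place of the StringIO sample, for instance the handle from open('data.csv', newline=''). It could also be called from the question's convert() function, leaving only the json.dump step there.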
      

      【Comments】:

      • How can I implement this in my existing code as efficiently as possible?
      • I get this error: Traceback (most recent call last): File "converter.py", line 6, in <module> out = [{"LOCATION": row['LOCATION'],"TIME": row["TIME"], "VALUE": ["Value"]} for row in reader] File "converter.py", line 6, in <listcomp> out = [{"LOCATION": row['LOCATION'],"TIME": row["TIME"], "VALUE": ["Value"]} for row in reader] KeyError: 'LOCATION'
      • It looks like your CSV file has no header named "LOCATION".
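
To see which header names the reader actually finds, it can help to print reader.fieldnames. One possible cause of the KeyError above, which this thread does not confirm, is a UTF-8 byte order mark (BOM) at the start of the file: it becomes part of the first header name, so the key "LOCATION" is never found. The sketch below uses made-up in-memory bytes in place of data.csv to show the difference between decoding as utf-8 and as utf-8-sig.

```python
import csv
import io

# Simulated file contents (an assumption): a UTF-8 BOM followed by a quoted header row.
raw = b'\xef\xbb\xbf"LOCATION","TIME","Value"\n"AUS","2014",0.337\n'

# Decoding as plain UTF-8 keeps the BOM as part of the first header name.
plain = csv.DictReader(io.StringIO(raw.decode("utf-8")))
print(plain.fieldnames)

# Decoding as utf-8-sig drops the BOM, so "LOCATION" is found as expected.
clean = csv.DictReader(io.StringIO(raw.decode("utf-8-sig")))
print(clean.fieldnames)
```

If the first printed list shows a first header that is not exactly LOCATION, opening the real file with open('data.csv', encoding='utf-8-sig', newline='') is worth trying.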
      【Solution 3】:

      This is easy to do with pandas:

      import pandas as pd
      df = pd.read_csv('data.csv')
      df[['LOCATION', 'TIME', 'Value']].to_json(orient='records')
      

      The orient='records' part is important; otherwise the output is grouped by column instead of by row.
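
The question asks for the key VALUE, while the CSV header spells it Value. A sketch of the same pandas approach with a rename step added, using an in-memory sample copied from the question instead of data.csv (the variable names here are illustrative):

```python
import io

import pandas as pd

# Small in-memory sample copied from the question's data.
sample = io.StringIO(
    '"LOCATION","INDICATOR","SUBJECT","MEASURE","FREQUENCY","TIME","Value","Flag Codes"\n'
    '"AUS","INCOMEINEQ","GINI","INEQ","A","2014",0.337,\n'
    '"AUS","INCOMEINEQ","GINI","INEQ","A","2016",0.33,\n'
)

df = pd.read_csv(sample)

# Keep three columns and rename 'Value' to match the desired key 'VALUE'.
subset = df[["LOCATION", "TIME", "Value"]].rename(columns={"Value": "VALUE"})

# With no path given, to_json returns the JSON text as a string.
json_text = subset.to_json(orient="records")
print(json_text)
```

Passing a file path as the first argument to to_json, such as 'data_oecd.json', writes the result to disk instead of returning a string.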

      【Comments】:
