【Question title】: csv file to list within dictionary
【Posted】: 2020-09-26 02:43:29
【Question】:
with open('exoplanets.csv') as infile:
    planets = {} 
    lines = infile.readline()
    for line in infile:
        reader = csv.reader(infile)
        number =  [line]

        methods, number, orbital_period, mass, distance, year = (s.strip(' ') for s in line.split(','))
        planets[methods] = (number, orbital_period, mass, distance, year)
        print(planets)

My code currently looks like the above, with sample input like this:

My output looks like this:

However, I want it to look like this:

{
  "Radial Velocity" : {"number":[1,1,1], "orbital_period":[269.3, 874.774, 763.0], "mass":[7.1, 2.21, 2.6], "distance":[77.4, 56.95, 19.84], "year":[2006.0, 2008.0, 2011.0] } , 
  "Transit" : {"number":[1,1,1], "orbital_period":[1.5089557, 1.7429935, 4.2568], "mass":[], "distance":[200.0, 680.0], "year":[2008.0, 2008.0, 2008.0] }
}

Can anyone help me?

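【Comments】:

  • Are the lists in the dictionary items, such as "orbital_period":[269.3, 874.774, 763.0], always 3 elements long?
  • I understand the dictionary in the first line of your desired output: "Radial Velocity" : {"number":[1,1,1], "orbital_period":[269.3, 874.774, 763.0], "mass":[7.1, 2.21, 2.6], "distance":[77.4, 56.95, 19.84], "year":[2006.0, 2008.0, 2011.0] }. But where do the numbers in the second line, "Transit" : {"number":[1,1,1], "orbital_period":[1.5089557, 1.7429935, 4.2568], "mass":[], "distance":[200.0, 680.0], "year":[2008.0, 2008.0, 2008.0] }, come from? It seems to me that they are not in the input file...
  • It was originally a .csv file, but I just formatted it in Google Sheets for easier access. The assignment does not allow me to use either pandas or the csv module.
  • You can copy and paste from the spreadsheet.

Tags: python python-3.x list csv dictionary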


【Solution 1】:

Check this code:

# import nan
from math import nan

# define source file
filename = 'EXOPLANETS.CSV - Sheet1.csv'

# read source file
with open(filename, 'r') as file:
    data = file.readlines()

# prepare output dictionary
output = {}

# read line by line
for idx, line in enumerate(data, 0):

    # split columns
    items = line.replace('\n', '').split(',')

    # extract inner dictionary's keys in a list: 'number', 'orbital_period', 'mass', 'distance', 'year'
    if idx == 0:
        values = [key for key in items[1:]]

    else:

        # add main key to the output dictionary: 'Radial Velocity', 'Imaging', 'Transit'
        if items[0] not in output.keys():
            output[items[0]] = {key : [] for key in values}

        # add value to the inner dictionary
        for jdx, key in enumerate(values, 1):

            # if the cell is not empty, convert it to float
            if items[jdx] != '':
                output[items[0]][key].append(float(items[jdx]))

            # if the cell is empty, add a 'nan' placeholder
            else:
                output[items[0]][key].append(nan)

for items in output.items():
    print(items)

It performs your task without using pandas or csv:

("Radial Velocity" : {"number":[1.0, 1.0, ...], "orbital_period":[269.3, 874.774, ...], "mass":[7.1, 2.21, ...], "distance":[77.4, 56.95, ...], "year":[2006.0, 2008.0, ...] }),

("Imaging" : {"number":[1.0, 1.0, ...], "orbital_period":[nan, nan, ...], "mass":[nan, nan, ...], "distance":[45.52, 165.0, ...], "year":[2005.0, 2007.0, ...] }),

("Transit" : {"number":[1.0, 1.0, ...], "orbital_period":[1.5089557, 1.7429935, ...], "mass":[nan, nan, ...], "distance":[nan, 200.0, ...], "year":[2008.0, 2008.0, ...] })

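If a value in the source data is an empty cell, the code above adds a nan to output. If that is undesired behavior and you want to skip empty cells instead, use the following code: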

# define source file
filename = 'EXOPLANETS.CSV - Sheet1.csv'

# read source file
with open(filename, 'r') as file:
    data = file.readlines()

# prepare output dictionary
output = {}

# read line by line
for idx, line in enumerate(data, 0):

    # split columns
    items = line.replace('\n', '').split(',')

    # extract inner dictionary's keys in a list: 'number', 'orbital_period', 'mass', 'distance', 'year'
    if idx == 0:
        values = [key for key in items[1:]]

    else:

        # add main key to the output dictionary: 'Radial Velocity', 'Imaging', 'Transit'
        if items[0] not in output.keys():
            output[items[0]] = {key : [] for key in values}

        # add value to the inner dictionary
        for jdx, key in enumerate(values, 1):

            # if the cell is not empty, convert it to float; empty cells are skipped
            if items[jdx] != '':
                output[items[0]][key].append(float(items[jdx]))

for items in output.items():
    print(items)
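
To make the difference between the two variants concrete, here is a minimal sketch that runs on a few in-memory lines instead of the real file. The helper name `group_rows`, its `skip_empty` flag, the `method` header name and the sample rows are all assumptions for illustration; the rows are assembled from values quoted in this thread, not copied from the actual spreadsheet.

```python
from math import nan


def group_rows(lines, skip_empty=False):
    """Group CSV-style lines by their first column (hypothetical helper).

    With skip_empty=False an empty cell becomes nan; with skip_empty=True
    the empty cell is simply left out of its list.
    """
    # The header's first column names the grouping key; the rest name the inner keys.
    header = lines[0].strip().split(',')[1:]
    output = {}
    for line in lines[1:]:
        items = line.strip().split(',')
        # Create the inner dictionary the first time a method is seen.
        inner = output.setdefault(items[0], {key: [] for key in header})
        for key, cell in zip(header, items[1:]):
            if cell != '':
                inner[key].append(float(cell))
            elif not skip_empty:
                inner[key].append(nan)
    return output


# Illustrative rows only (assumed layout, values taken from the thread).
sample = [
    "method,number,orbital_period,mass,distance,year",
    "Radial Velocity,1,269.3,7.1,77.4,2006",
    "Transit,1,1.5089557,,,2008",
    "Transit,1,1.7429935,,200.0,2008",
]

padded = group_rows(sample)
skipped = group_rows(sample, skip_empty=True)

print(len(padded["Transit"]["distance"]))   # 2: one nan placeholder plus 200.0
print(len(skipped["Transit"]["distance"]))  # 1: only 200.0 remains
```

With nan padding, every list under a method has one entry per source row, so index i refers to the same planet in every column. When empty cells are skipped, the lists can end up with different lengths and that row alignment is lost, which matches the desired output in the question ("mass":[] for Transit) but is worth keeping in mind. Note that `zip` silently ignores cells missing from a short row, which may or may not be what you want.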

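【Comments】:

  • Thank you! I really appreciate your help!
  • In the output, a comma follows items[0]; how can I change it to a semicolon?
  • I think you should open another question for that, to keep the site clean.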
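
On the separator question: `print(items)` prints each `(key, value)` pair as a tuple, which is why a comma appears after the key. A minimal sketch of one way around it is to unpack the pair and build the line yourself. The `format_entry` helper and the small `output` dictionary below are hypothetical stand-ins, not part of the answer's code.

```python
# Hypothetical stand-in for the `output` dictionary built by the answer's code.
output = {
    "Radial Velocity": {"number": [1.0], "year": [2006.0]},
    "Transit": {"number": [1.0], "year": [2008.0]},
}


def format_entry(key, value, sep=" : "):
    """Render one key/value pair with a chosen separator (hypothetical helper)."""
    return f'"{key}"{sep}{value}'


# Unpack each (key, value) pair instead of printing the tuple itself.
lines = [format_entry(key, value) for key, value in output.items()]
print("\n".join(lines))
```

Passing `sep=" ; "` would give a semicolon instead of a colon.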