【Question title】: Python dynamic nested dictionary to CSV
【Posted】: 2021-06-21 08:30:05
【Question】:

The output below comes from a query result.

{'_id': ObjectId('651f3e6e5723b7c1'), 'fruits': {'pineapple': '2', 'grape': '0', 'apple': 'unknown'},'day': 'Tues', 'month': 'July', 'address': 'long', 'buyer': 'B1001', 'seller': 'S1301', 'date': {'date': 210324}}

{'_id': ObjectId('651f3e6e5723b7c1'), 'fruits': {'lemon': '2', 'grape': '0', 'apple': 'unknown', 'strawberry': '1'},'day': 'Mon', 'month': 'January', 'address': 'longer', 'buyer': 'B1001', 'seller': 'S1301', 'date': {'date': 210324}}



#worked but not with fruits and dynamic header

date = json.dumps(q['date'])  #convert it to string  
date = re.split("(:|\}| )", date)[4] #and split to get value
    
for q in db.fruits.aggregate(query):

               print('"' + q['day'] + '","' + q['month'] + '","' + date + '","' + q['time'] + '","' + q['buyer'] + '","' + q['seller'] + '"')

 
               #below close to what I want but having issue with nested and repeated rows

               ffile = open("fruits.csv", "w")
               w = csv.DictWriter(ffile, q.keys())
               w.writeheader()
               w.writerow(q)

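I want to create a CSV from this.

I can already produce every column I need except the fruits. I'm stuck on the nested dictionary field and the dynamic header.

Mongoexport is not an option for me at the moment.

The fruits field can contain a different set of nested keys and values each time.
I'm still experimenting with csv.writer and trying to add a condition for when I encounter a nested dict. [I'll update with an answer if I manage to create the CSV.]
Hints for creating this CSV would be welcome, as would links to similar questions. Thanks.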

【Comments】:

    Tags: python json csv dictionary


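    【Solution 1】:

    No problem!

    We need to flatten the nested structure so that we can build the CSV from all the keys that can occur in it. That takes a recursive function (here, flatten_dict) that takes an input dict and turns it into an output dict containing no further dicts; its keys are tuples, e.g. ('foo', 'bar', 'baz').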
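To make the tuple-key idea concrete, here is a minimal sketch of what flattening produces for a small nested dict. The helper name `iter_paths` and the sample data are assumptions made for illustration; they are not part of the answer's code.

```python
def iter_paths(value, prefix=()):
    """Yield (path_tuple, leaf_value) pairs for nested dicts and lists."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from iter_paths(child, prefix + (key,))
    elif isinstance(value, list):
        # List items are addressed by their index.
        for index, child in enumerate(value):
            yield from iter_paths(child, prefix + (index,))
    else:
        # Anything that is not a dict or list is treated as a leaf.
        yield prefix, value


# Hypothetical sample document, not taken from the question.
sample = {"foo": {"bar": {"baz": 1}}, "tags": ["a", "b"], "day": "Mon"}
flat = dict(iter_paths(sample))
print(flat)
```

Note that in this sketch an empty dict or empty list yields no pairs at all, so such fields disappear from the flattened result.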

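    We run that function over every input row, collecting the keys we encounter along the way into the known_keys set.

    That set is then sorted (since we assume the original dicts have no real intrinsic order either), and each key is dot-joined to form the CSV header row.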
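One caveat when sorting tuple keys: if the same path holds a list in one row and a dict in another, the keys mix integers and strings at the same position, and Python 3 refuses to compare them. Below is a minimal sketch of a sort that sidesteps this by comparing stringified paths. The sample keys and the name `header_for` are assumptions made for illustration.

```python
# Hypothetical key set: "fruits" was a list in one row and a dict in another.
known_keys = {("fruits", "apple"), ("fruits", 0), ("day",), ("date", "date")}


def header_for(key):
    """Join a path tuple into a dotted column name, e.g. ('date', 'date') -> 'date.date'."""
    return ".".join(map(str, key))


# A plain sorted(known_keys) would raise TypeError for these keys,
# because 0 and "apple" cannot be ordered against each other.
ordered_keys = sorted(known_keys, key=lambda key: tuple(map(str, key)))
header = [header_for(key) for key in ordered_keys]
print(header)
```

The trade-off is that list indexes are then ordered as text, so index 10 sorts before index 2. If the data never mixes types at one path, the plain sort is enough.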

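    Then the flattened rows are simply iterated over and written out (taking care to write an empty string for missing values).

    The output is, for example: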

    _id,address,buyer,date.date,day,fruits.apple,fruits.grape,fruits.lemon,fruits.pineapple,fruits.strawberry,month,seller
    651f3e6e5723b7c1,long,B1001,210324,Tues,unknown,0,,2,,July,S1301
    651f3e6e5723b7c2,longer,B1001,210324,Mon,unknown,0,2,,1,January,S1301
    
    import csv
    import sys
    
    rows = [
        {
            "_id": "651f3e6e5723b7c1",
            "fruits": {"pineapple": "2", "grape": "0", "apple": "unknown"},
            "day": "Tues",
            "month": "July",
            "address": "long",
            "buyer": "B1001",
            "seller": "S1301",
            "date": {"date": 210324},
        },
        {
            "_id": "651f3e6e5723b7c2",
            "fruits": {
                "lemon": "2",
                "grape": "0",
                "apple": "unknown",
                "strawberry": "1",
            },
            "day": "Mon",
            "month": "January",
            "address": "longer",
            "buyer": "B1001",
            "seller": "S1301",
            "date": {"date": 210324},
        },
    ]
    
    
    def flatten_dict(d: dict) -> dict:
        """
        Flatten hierarchical dicts into a dict of path tuples -> deep values.
        """
        out = {}
    
        def _flatten_into(into, pairs, prefix=()):
            for key, value in pairs:
                p_key = prefix + (key,)
                if isinstance(value, list):
                    # Lists are walked by index, so the index becomes part of the path.
                    _flatten_into(into, enumerate(value), p_key)
                elif isinstance(value, dict):
                    _flatten_into(into, value.items(), p_key)
                else:
                    # Leaf value: store it under its full path.
                    into[p_key] = value
    
        _flatten_into(out, d.items())
        return out
    
    
    known_keys = set()
    flat_rows = []
    for row in rows:
        flat_row = flatten_dict(row)
        known_keys |= set(flat_row.keys())
        flat_rows.append(flat_row)
    
    ordered_keys = sorted(known_keys)
    writer = csv.writer(sys.stdout)
    writer.writerow([".".join(map(str, key)) for key in ordered_keys])
    for flat_row in flat_rows:
        writer.writerow([str(flat_row.get(key, "")) for key in ordered_keys])
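Since the question started from csv.DictWriter, the same idea can also be sketched with it. DictWriter's `restval` parameter supplies the value written for keys missing from a row, which replaces the manual `.get(key, "")`. This is only a sketch under assumptions: the `flatten` helper, its dotted string keys, and the shortened sample rows are illustrative and are not the answer's code.

```python
import csv
import io


def flatten(d, prefix=""):
    """Flatten nested dicts into a single dict with dotted string keys."""
    flat = {}
    for key, value in d.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Shortened, hypothetical versions of the rows from the question.
rows = [
    {"_id": "651f3e6e5723b7c1", "fruits": {"pineapple": "2", "grape": "0"}, "day": "Tues"},
    {"_id": "651f3e6e5723b7c2", "fruits": {"lemon": "2", "grape": "0"}, "day": "Mon"},
]

flat_rows = [flatten(row) for row in rows]
# The header is the sorted union of every key seen in any row.
fieldnames = sorted(set().union(*flat_rows))

# Swap the buffer for open("fruits.csv", "w", newline="") to write a real file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
writer.writeheader()
writer.writerows(flat_rows)
print(buffer.getvalue())
```

The header has to be known before the first row is written, so all rows are flattened first. Dotted string keys are simpler than tuples but can collide if an original key itself contains a dot, which is one reason to prefer tuple paths as in the answer above.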
    

    【Comments】:
