【Question Title】:Merging values together to one key in an Ordered Dict
【Posted】:2018-03-15 21:27:30
【Question】:

I was wondering whether there is a more elegant solution than the one I currently have for merging the values of an ordered dict.

I have an ordered dict that looks like this

'fields': OrderedDict([
    ("Sample Code", "Vendor Sample ID"),
    ("Donor ID", "Vendor Subject ID"),
    ("Format", "Material Format"),
    ("Sample Type", "Sample Type"),
    ("Age", "Age"),
    ("Gender", "Gender"),
    ("Ethnicity/ Race", "Race"),
]),

If I pass in a list of indices as a parameter, such as

[2,3] or [2,4,5]

is there an elegant way to merge the values at those positions under a new key, so that

[2,3], "Random_Key"

would return

'fields': OrderedDict([
        ("Sample Code", "Vendor Sample ID"),
        ("Donor ID", "Vendor Subject ID"),
        ("Random_Key", "Material Format Sample Type"),
        ("Age", "Age"),
        ("Gender", "Gender"),
        ("Ethnicity/ Race", "Race"),
    ]),

while also deleting the merged keys from the dict?

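【Comments】:

  • At least an interesting question about dictionaries in 2018. I wish you would reduce your input data a bit more; too many values drown out the difference between the start and end dictionaries.
  • @Jean-FrançoisFabre I can do that!
  • Look: it reads better this way.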

Tags: python python-3.x ordereddictionary


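【Solution 1】:

This can also be done nicely with a generator.

The generator yields each key/item pair unchanged if it does not have to be squashed. If it does, the item is saved until the last index to be squashed is reached, and then a single pair is yielded with the new key and the saved items joined together.

A new OrderedDict can then be constructed from the generator.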

from collections import OrderedDict

def squashDict(d, ind, new_key):
    """ Takes an OrderedDict d and yields its key/item pairs,
    except the ones at an index in indices (ind); these items are merged
    and yielded at the last position of indices (ind) with a new key (new_key).
    Assumes ind is sorted in ascending order.
    """
    if not all(x < len(d) for x in ind):
        raise IndexError("Index out of bounds")
    vals = []
    for n, (k, i) in enumerate(d.items()):
        if n in ind:
            vals += [i]              # collect the value to be merged
            if n == ind[-1]:         # last merge position: emit the merged pair
                yield (new_key, " ".join(vals))
        else:
            yield (k, i)             # untouched pair, key first

d = OrderedDict([
    ("Sample Code", "Vendor Sample ID"),
    ("Donor ID", "Vendor Subject ID"),
    ("Format", "Material Format"),
    ("Sample Type", "Sample Type"),
    ("Age", "Age"),
    ("Gender", "Gender"),
])

t = OrderedDict(squashDict(d, [2, 3], "Random"))
print(t)
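
The generator relies on `ind` being sorted, because the merged pair is emitted when `n == ind[-1]`. As a minimal sketch only, here is a variant that tolerates unsorted or repeated indices. It assumes the merged entry should sit at the position of the largest index and that values are joined in dictionary order; the name `squash_pairs` is hypothetical and not part of the answer.

```python
from collections import OrderedDict

def squash_pairs(d, ind, new_key):
    """Yield (key, value) pairs of d, merging the values at positions `ind`."""
    wanted = set(ind)                      # O(1) membership, ignores duplicates
    if any(not 0 <= x < len(d) for x in wanted):
        raise IndexError("Index out of bounds")
    last = max(wanted) if wanted else -1   # position where the merged pair goes
    vals = []
    for n, (k, v) in enumerate(d.items()):
        if n in wanted:
            vals.append(v)                 # hold the value back for merging
            if n == last:
                yield new_key, " ".join(vals)
        else:
            yield k, v                     # untouched pair

d = OrderedDict([
    ("Sample Code", "Vendor Sample ID"),
    ("Donor ID", "Vendor Subject ID"),
    ("Format", "Material Format"),
    ("Sample Type", "Sample Type"),
    ("Age", "Age"),
    ("Gender", "Gender"),
])

merged = OrderedDict(squash_pairs(d, [3, 2], "Random"))
print(merged)
```

Since Python 3.7 a plain `dict` also preserves insertion order, so `dict(squash_pairs(...))` should work the same way if the OrderedDict-specific methods are not needed.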

【Comments】:

  • This is definitely what I wanted, thank you!
【Solution 2】:

Not sure there is an elegant way. OrderedDict has a move_to_end method that moves a key to the start or to the end, but not to an arbitrary position.
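
For reference, a small illustration of what `move_to_end` can and cannot do (the keys `a`, `b`, `c` are made up for this example): it only ever targets one of the two ends, selected with the `last` argument.

```python
from collections import OrderedDict

d = OrderedDict([("a", 1), ("b", 2), ("c", 3)])

d.move_to_end("a")              # last=True by default: "a" goes to the end
after_end = list(d)             # key order is now b, c, a

d.move_to_end("c", last=False)  # last=False: "c" goes to the front
after_front = list(d)           # key order is now c, b, a

print(after_end, after_front)
```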

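I would try to be as efficient as possible and keep the number of loops to a minimum:

  • get the list of keys
  • find the index of the key you want to merge with the one that follows it
  • remove the next key from the dictionary
  • create a list from the items of d
  • replace the entry at the stored index in this list with the new key and merged value
  • rebuild an OrderedDict from it

Like this (I removed some keys to shorten the example):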

from collections import OrderedDict

d = OrderedDict([
    ("Sample Code", "Vendor Sample ID"),
    ("Donor ID", "Vendor Subject ID"),
    ("Format", "Material Format"),
    ("Sample Type", "Sample Type"),
    ("Age", "Age"),
    ("Gender", "Gender"),
])

lk = list(d.keys())
index = lk.index("Sample Type")
v = d.pop(lk[index+1])

t = list(d.items())
t[index] = ("new key",t[index][1]+" "+v)

d = OrderedDict(t)

print(d)

Result:

OrderedDict([('Sample Code', 'Vendor Sample ID'), ('Donor ID', 'Vendor Subject ID'), ('Format', 'Material Format'), ('new key', 'Sample Type Age'), ('Gender', 'Gender')])
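
The same steps can be wrapped in a function that takes a list of positions, as the question asks. This is only a sketch under a few assumptions: the merged entry takes the place of the smallest index, the values are strings, and a new OrderedDict is returned instead of modifying the input. The name `merge_at` and the `sep` parameter are hypothetical.

```python
from collections import OrderedDict

def merge_at(d, indices, new_key, sep=" "):
    """Return a new OrderedDict with the entries at `indices` merged into one."""
    items = list(d.items())
    idx = sorted(set(indices))
    if not idx:
        return OrderedDict(items)          # nothing to merge
    if idx[0] < 0 or idx[-1] >= len(items):
        raise IndexError("Index out of bounds")
    merged = sep.join(items[i][1] for i in idx)
    items[idx[0]] = (new_key, merged)      # merged entry replaces the first one
    for i in reversed(idx[1:]):            # delete from the back so positions stay valid
        del items[i]
    return OrderedDict(items)

d = OrderedDict([
    ("Sample Code", "Vendor Sample ID"),
    ("Donor ID", "Vendor Subject ID"),
    ("Format", "Material Format"),
    ("Sample Type", "Sample Type"),
    ("Age", "Age"),
    ("Gender", "Gender"),
])

result = merge_at(d, [3, 4], "new key")
print(result)
```

Called with `[3, 4]` this reproduces the result shown above; the original `d` is left unchanged.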

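【Comments】:

    【Solution 3】:

    You could optimize this by sorting the indices in descending order and using dict.pop(key, None) to retrieve and remove each key/value in one step, but I decided against that so the values are appended in the order in which they appear in indices.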

    from collections import OrderedDict
    from pprint import pprint
    
    def mergeEm(d,indices,key):
        """Merges the values at index given by 'indices' on OrderedDict d into a list.        
        Appends this list with key as key to the dict. Deletes keys used to build list."""
    
        if not all(x < len(d) for x in indices):
            raise IndexError ("Index out of bounds")
    
        vals = []                      # stores the values to be removed in order
        allkeys = list(d.keys())
        for i in indices:
            vals.append(d[allkeys[i]])   # append to temporary list
        d[key] = vals                  # add to dict, use ''.join(vals) to combine str
        for i in indices:              # remove all indices keys
            d.pop(allkeys[i],None)
        pprint(d)
    
    
    fields= OrderedDict([
        ("Sample Code", "Vendor Sample ID"),
        ("Donor ID", "Vendor Subject ID"),
        ("Format", "Material Format"),
        ("Sample Type", "Sample Type"),
        ("Age", "Age"),
        ("Gender", "Gender"),
        ("Ethnicity/ Race", "Race"),
        ("Sample Type", "Sample Type"),
        ("Organ", "Organ"),
        ("Pathological Diagnosis", "Diagnosis"),
        ("Detailed Pathological Diagnosis", "Detailed Diagnosis"),
        ("Clinical Diagnosis/Cause of Death", "Detailed Diagnosis option 2"),
        ("Dissection", "Dissection"),
        ("Quantity (g, ml, or ug)", "Quantity"),
        ("HIV", "HIV"),
        ("HEP B", "HEP B")
    ])
    pprint(fields)
    mergeEm(fields, [5,4,2], "tata")
    

    Output:

    OrderedDict([('Sample Code', 'Vendor Sample ID'),
                 ('Donor ID', 'Vendor Subject ID'),
                 ('Format', 'Material Format'),
                 ('Sample Type', 'Sample Type'),
                 ('Age', 'Age'),
                 ('Gender', 'Gender'),
                 ('Ethnicity/ Race', 'Race'),
                 ('Organ', 'Organ'),
                 ('Pathological Diagnosis', 'Diagnosis'),
                 ('Detailed Pathological Diagnosis', 'Detailed Diagnosis'),
                 ('Clinical Diagnosis/Cause of Death',
                  'Detailed Diagnosis option 2'),
                 ('Dissection', 'Dissection'),
                 ('Quantity (g, ml, or ug)', 'Quantity'),
                 ('HIV', 'HIV'),
                 ('HEP B', 'HEP B')])
    
    
    OrderedDict([('Sample Code', 'Vendor Sample ID'),
                 ('Donor ID', 'Vendor Subject ID'),
                 ('Sample Type', 'Sample Type'),
                 ('Ethnicity/ Race', 'Race'),
                 ('Organ', 'Organ'),
                 ('Pathological Diagnosis', 'Diagnosis'),
                 ('Detailed Pathological Diagnosis', 'Detailed Diagnosis'),
                 ('Clinical Diagnosis/Cause of Death',
                  'Detailed Diagnosis option 2'),
                 ('Dissection', 'Dissection'),
                 ('Quantity (g, ml, or ug)', 'Quantity'),
                 ('HIV', 'HIV'),
                 ('HEP B', 'HEP B'),
                 ('tata', ['Gender', 'Age', 'Material Format'])])
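
The answer mentions that `pop` could retrieve and remove each entry in one step. A minimal sketch of that idea follows. It assumes the key list is snapshotted before anything is removed, so the positions stay stable and the values can still be collected in the order given in `indices`. The name `merge_pop` is hypothetical, and repeated indices are not supported in this sketch.

```python
from collections import OrderedDict

def merge_pop(d, indices, key):
    """Merge the values at `indices` into a list stored under `key` (in place)."""
    if not all(0 <= x < len(d) for x in indices):
        raise IndexError("Index out of bounds")
    allkeys = list(d.keys())      # snapshot: positions refer to the original order
    # pop() returns the value and removes the key in a single call;
    # a repeated index would raise KeyError on the second pop
    d[key] = [d.pop(allkeys[i]) for i in indices]
    return d

fields = OrderedDict([
    ("Sample Code", "Vendor Sample ID"),
    ("Donor ID", "Vendor Subject ID"),
    ("Format", "Material Format"),
    ("Sample Type", "Sample Type"),
    ("Age", "Age"),
    ("Gender", "Gender"),
    ("Ethnicity/ Race", "Race"),
])

merge_pop(fields, [5, 4, 2], "tata")
print(fields)
```

As in the answer's own function, the merged entry ends up at the end of the dict rather than at the position of the removed keys.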
    

    【Comments】:
