【Question title】: Store the values for each key as an array in a dictionary
【Posted】: 2019-08-12 05:23:20
【Question】:

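I want to normalize all the values in the dictionary data and store them in another dictionary with the same keys, where the values for each key are stored in a one-dimensional array. So I did the following: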

>>> data = {1: [0.6065306597126334], 2: [0.6065306597126334, 0.6065306597126334, 0.1353352832366127], 3: [0.6065306597126334, 0.6065306597126334, 0.1353352832366127], 4: [0.6065306597126334, 0.6065306597126334]}

>>> norm = {k: [v / sum(vals) for v in vals] for k, vals in data.items()} 

>>> norm
{1: [1.0], 2: [0.4498162176582741, 0.4498162176582741, 0.10036756468345168], 3: [0.4498162176582741, 0.4498162176582741, 0.10036756468345168], 4: [0.5, 0.5]}

Now suppose the dictionary data contains only a zero value for one of its keys, for example the first key, 1:

>>> data = {1: [0.0], 2: [0.6065306597126334, 0.6065306597126334, 0.1353352832366127], 3: [0.6065306597126334, 0.6065306597126334, 0.1353352832366127], 4: [0.6065306597126334, 0.6065306597126334]}

Normalizing the values of this dictionary then yields [nan] because of the division by zero:

>>> norm = {k: [v / sum(vals) for v in vals] for k, vals in data.items()}

__main__:1: RuntimeWarning: invalid value encountered in double_scalars
>>> norm
{1: [nan], 2: [0.4498162176582741, 0.4498162176582741, 0.10036756468345168], 3: [0.4498162176582741, 0.4498162176582741, 0.10036756468345168], 4: [0.5, 0.5]}

So I added an if statement to work around this, but I cannot get the values for each key stored as a 1D array.

Code:

>>> norm = {}
>>> for k, vals in data.items():
...     values = []
...     if sum(vals) == 0:
...         values.append(list(vals))
...     else:
...         for v in vals:
...             values.append(list([v/sum(vals)]))
...     norm[k]=values
... 
>>> norm
{1: [[1.0]], 2: [[0.4498162176582741], [0.4498162176582741], [0.10036756468345168]], 3: [[0.4498162176582741], [0.4498162176582741], [0.10036756468345168]], 4: [[0.5], [0.5]]}

I want the norm dictionary to be:

norm = {1: [1.0], 2: [0.4498162176582741, 0.4498162176582741, 0.10036756468345168], 3: [0.4498162176582741, 0.4498162176582741, 0.10036756468345168], 4: [0.5, 0.5]}

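Also, is there a better way to normalize the dictionary data when one of its keys has only a zero value? I don't think my solution is efficient.

P.S.: I tried norm[k] = np.array(values) instead of norm[k] = values at the end of the for loop, but the result was not what I wanted.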

【Comments】:

  • Change both of your append calls to extend. Also, there is no need to create a list from what you are extending with; it is already a list.

Tags: python arrays python-3.x dictionary for-loop


【Solution 1】:

Your dict/list comprehension fails when sum(vals) == 0:

>>> data = {1: [0.0], 2: [0.6065306597126334, 0.6065306597126334, 0.1353352832366127], 3: [0.6065306597126334, 0.6065306597126334, 0.1353352832366127], 4: [0.6065306597126334, 0.6065306597126334]}
>>> {k: [v / sum(vals) for v in vals] for k, vals in data.items()}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <dictcomp>
  File "<stdin>", line 1, in <listcomp>
ZeroDivisionError: float division by zero
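
This traceback differs from the nan in the question, presumably because the values there were NumPy floats, which produce nan with a RuntimeWarning instead of raising. With plain Python floats, the exception can also be caught directly. A minimal sketch, where safe_normalize is a hypothetical helper name and returning the values unchanged for a zero sum is an assumption about the desired behavior:

```python
def safe_normalize(vals):
    """Divide each value by the list's sum; return an unchanged copy if the sum is zero."""
    total = sum(vals)  # computed once per list
    try:
        return [v / total for v in vals]
    except ZeroDivisionError:
        # Plain Python numbers raise here instead of producing nan.
        return list(vals)


print(safe_normalize([0.0]))       # [0.0]
print(safe_normalize([1.0, 3.0]))  # [0.25, 0.75]
```

This only helps with built-in numbers; values that yield nan instead of raising would still need an explicit check on the sum.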

A conditional (ternary) expression can handle this case:

>>> {k: [v / sum(vals) if sum(vals)!=0 else 1.0 for v in vals] for k, vals in data.items()}
{1: [1.0], 2: [0.4498162176582741, 0.4498162176582741, 0.10036756468345168], 3: [0.4498162176582741, 0.4498162176582741, 0.10036756468345168], 4: [0.5, 0.5]}

If you want to avoid evaluating sum(vals) more than once:

>>> {k: [v / s if s!=0 else 1.0 for v in vals] for k,vals,s in ((k, vals, sum(vals)) for k, vals in data.items())}
{1: [1.0], 2: [0.4498162176582741, 0.4498162176582741, 0.10036756468345168], 3: [0.4498162176582741, 0.4498162176582741, 0.10036756468345168], 4: [0.5, 0.5]}

((k, vals, sum(vals)) for k, vals in data.items()) is a generator that yields the tuple (k, vals, sum(vals)) for each item.

【Comments】:
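
To make the intermediate tuples visible, the sketch below builds them as a list rather than a generator (a change made only so they can be printed and reused) and then runs the same comprehension over them. The sample data and the name triples are illustrative assumptions:

```python
data = {1: [0.0], 2: [0.5, 0.5, 1.0]}

# Each item becomes a (key, values, total) tuple; the sum is computed once per key.
triples = [(k, vals, sum(vals)) for k, vals in data.items()]
print(triples)  # [(1, [0.0], 0.0), (2, [0.5, 0.5, 1.0], 2.0)]

# The outer comprehension unpacks each tuple into k, vals, s.
norm = {k: [v / s if s != 0 else 1.0 for v in vals] for k, vals, s in triples}
print(norm)  # {1: [1.0], 2: [0.25, 0.25, 0.5]}
```

Note that the else 1.0 branch replaces every element of a zero-sum list with 1.0, so a list such as [0.0, 0.0] would become [1.0, 1.0], which no longer sums to 1.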

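    【Solution 2】:

    As already noted in this thread, extend can be used to solve your problem. If you really want to use append, you can append the first element of the list.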

    norm = {}
    for k, vals in data.items():
        values = []
        if sum(vals) == 0:
            values.append(vals[0])
        else:
            for v in vals:
                values.append([v / sum(vals)][0])
        norm[k] = values
    

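    For examples of append versus extend, see difference between append vs extend list methods in python

    As for optimization: removing the for loop entirely is not possible, but you can shorten the solution while keeping it readable: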

    norm = {}
    for k, vals in data.items():
        total = sum(vals)  # compute the sum once per key
        if total == 0:
            norm[k] = list(vals)  # copy, so norm does not share the list with data
        else:
            norm[k] = [x / total for x in vals]
    

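    【Comments】:

    • Thanks for your help. Is there a more efficient way to build this dictionary than all these loops?
    • @Noah16 I have updated my answer. If it solves your problem, please upvote and accept it.
    【Solution 3】: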

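    As mentioned above, append adds a single element to a list, and that element can itself be a list, which is why you currently end up with a list inside a list. Ideally you should use extend, which concatenates the first list with another list.

    【Comments】: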
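
The difference can be seen in a short sketch, followed by the loop from the question rewritten with extend. The sample data and the variable total are assumptions made for illustration, and the zero-sum branch keeps the original values as the question's code does:

```python
values = []
values.append([0.5, 0.5])  # adds the list itself as a single element
print(values)              # [[0.5, 0.5]]

values = []
values.extend([0.5, 0.5])  # adds each element of the list
print(values)              # [0.5, 0.5]

# The question's loop with extend in place of append.
data = {1: [0.0], 4: [2.0, 2.0]}
norm = {}
for k, vals in data.items():
    values = []
    total = sum(vals)  # computed once per key
    if total == 0:
        values.extend(vals)
    else:
        values.extend(v / total for v in vals)
    norm[k] = values
print(norm)  # {1: [0.0], 4: [0.5, 0.5]}
```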
