Answers: Python dictionary with variables as keys

【Question title】: Python dictionary with variables as keys
【Posted】: 2012-01-07 18:44:15
【Question】:

I am new to Python and I am trying to parse a file to build a table of memory allocations. My input file has the following format:

48 bytes allocated at 0x8bb970a0
24 bytes allocated at 0x8bb950c0
48 bytes allocated at 0x958bd0e0
48 bytes allocated at 0x8bb9b060
96 bytes allocated at 0x8bb9afe0
24 bytes allocated at 0x8bb9af60

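My first goal is to build a table that counts how many times each allocation size occurs. In other words, my desired output for the input above is something like this: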

48 bytes -> 3 times
96 bytes -> 1 times
24 bytes -> 2 times

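(For now, I don't care about the memory addresses.)

Since I am using Python, I figured a dictionary would be the right way to do this (based on about 3 hours of reading Python tutorials). Is that a good idea?

While trying to do this with a dictionary, I decided to make the number of bytes the "key" and a counter the "value". My plan is to increment the counter every time the key occurs. So far, my code snippet looks like this: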

# Create an empty dictionary
allocationList = {}

# Open file for reading
with open("allocFile.txt") as fp: 
    for line in fp: 
        # Split the line into a list (using space as delimiter)
        lineList = line.split(" ")

        # Extract the number of bytes
        numBytes = lineList[0];

        # Store in a dictionary
        if allocationList.has_key('numBytes')
            currentCount = allocationList['numBytes']
            currentCount += 1
            allocationList['numBytes'] = currentCount
        else
            allocationList['numBytes'] = 1 

for bytes, count in allocationList.iteritems()
    print bytes, "bytes -> ", count, " times"

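With this, I get a syntax error on the "has_key" call, which makes me wonder whether variables can even be used as dictionary keys. All the examples I have seen so far assume the keys are known in advance. In my case, the keys only become available while parsing the input file.

(Note that my input file may have thousands of lines, with hundreds of distinct keys.)

Thanks for any help you can provide.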

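【Question comments】:

I see you put 'numBytes' in quotes, so you are always referring to the same constant string.
You left out the colon at the end of the if allocationList.has_key('numBytes') line and after else, which would be a syntax error.

Tags: python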

【Solution 1】:

The dict.has_key() method was removed in Python 3. To replace it, use the in keyword:

if numBytes in allocationList:    # do not use numBytes as a string, use the variable directly
    #do the stuff

But in your case, you can also replace all of

if allocationList.has_key('numBytes')
    currentCount = allocationList['numBytes']
    currentCount += 1
    allocationList['numBytes'] = currentCount
else
    allocationList['numBytes'] = 1

with a single line using get:

allocationList[numBytes] = allocationList.get(numBytes, 0) + 1
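To make the one-liner concrete, here is a minimal sketch that applies it to the sample lines from the question. It is written in Python 3 syntax, and the names sample_lines and allocation_counts are assumptions introduced for illustration (a list stands in for the input file).

```python
# Sample input copied from the question; a list stands in for the file.
sample_lines = [
    "48 bytes allocated at 0x8bb970a0",
    "24 bytes allocated at 0x8bb950c0",
    "48 bytes allocated at 0x958bd0e0",
    "48 bytes allocated at 0x8bb9b060",
    "96 bytes allocated at 0x8bb9afe0",
    "24 bytes allocated at 0x8bb9af60",
]

allocation_counts = {}
for line in sample_lines:
    # The first whitespace-separated field is the allocation size.
    num_bytes = line.split()[0]
    # get() returns 0 when the key is missing, so no membership test is needed.
    allocation_counts[num_bytes] = allocation_counts.get(num_bytes, 0) + 1

for size, count in allocation_counts.items():
    print(size, "bytes ->", count, "times")
```

The keys stay strings here, exactly as they come out of split(); wrapping the field in int() would give numeric keys instead.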

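【Comments】:

There is no need to set the value twice with setdefault; use dict.get instead.
@FerdinandBeyer: You are right, using setdefault is overkill and pointless here.
Removed "has_key" and used "in" instead. Thanks for the tip. I was probably reading some outdated tutorials.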

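【Solution 2】:

You can absolutely use variables as dictionary keys. However, you have a variable named numBytes but are using a string containing the text "numBytes": that is a string constant, not the variable. It will not raise an error, but it is a bug, because every line of input then updates the same key. Instead, try: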

if numBytes in allocationList:
    # do stuff
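The difference between the quoted and unquoted forms can be seen in a small sketch like the one below (Python 3 syntax; the sizes list and both dictionary names are made up for illustration).

```python
# Hypothetical allocation sizes, as they would come out of line.split()[0].
sizes = ["48", "24", "48"]

by_literal = {}
by_variable = {}
for num_bytes in sizes:
    # Quoted: the key is always the literal text "num_bytes".
    by_literal["num_bytes"] = by_literal.get("num_bytes", 0) + 1
    # Unquoted: the key is whatever value the variable currently holds.
    by_variable[num_bytes] = by_variable.get(num_bytes, 0) + 1

print(by_literal)   # {'num_bytes': 3}
print(by_variable)  # {'48': 2, '24': 1}
```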

Also, consider Counter. It is a convenient class for exactly the case you are looking at.

【Comments】:

【Solution 3】:

You are getting a syntax error because you are missing the colon at the end of this line:

if allocationList.has_key('numBytes')
                                     ^

Your approach is fine, but it may be easier to use dict.get() with a default value:

allocationList[numBytes] = allocationList.get(numBytes, 0) + 1

Since your allocationList is a dictionary and not a list, you may want to choose a different name for the variable.
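Putting the corrections from this thread together (colons after if, else and for, the in test, and the unquoted variable), the original if/else loop might look roughly like the sketch below. It uses Python 3 syntax, io.StringIO stands in for allocFile.txt so the sketch runs on its own, and allocation_counts is just one possible replacement name.

```python
import io

# io.StringIO stands in for open("allocFile.txt") in this sketch.
fp = io.StringIO(
    "48 bytes allocated at 0x8bb970a0\n"
    "24 bytes allocated at 0x8bb950c0\n"
    "48 bytes allocated at 0x958bd0e0\n"
)

allocation_counts = {}
for line in fp:
    # Extract the number of bytes (first space-separated field).
    num_bytes = line.split(" ")[0]
    if num_bytes in allocation_counts:   # note the colon and the unquoted name
        allocation_counts[num_bytes] += 1
    else:                                # else needs a colon as well
        allocation_counts[num_bytes] = 1

for size, count in allocation_counts.items():
    print(size, "bytes ->", count, "times")
```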

【Comments】:

Thanks. I did not know about the ":". I just found out that I need one at the end of the "for" statement as well.

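【Solution 4】:

Learning the standard library is as important as learning the language itself. Python already has a class that makes your task very simple: collections.Counter.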

from collections import Counter

with open("allocFile.txt") as fp:
    counter = Counter(line.split()[0] for line in fp)

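# "size" avoids shadowing the built-in name "bytes"; this print form
# works on both Python 2.7 and Python 3.
for size, count in counter.most_common():
    print("%s bytes -> %d times" % (size, count))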
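For readers who want to try Counter without creating a file, here is a self-contained sketch using the sample lines from the question. The list sample_lines and the conversion to int are choices made for this illustration, not part of the answer above.

```python
from collections import Counter

# Sample input copied from the question; a list stands in for the file.
sample_lines = [
    "48 bytes allocated at 0x8bb970a0",
    "24 bytes allocated at 0x8bb950c0",
    "48 bytes allocated at 0x958bd0e0",
    "48 bytes allocated at 0x8bb9b060",
    "96 bytes allocated at 0x8bb9afe0",
    "24 bytes allocated at 0x8bb9af60",
]

# Count how often each allocation size occurs; int() gives numeric keys.
counter = Counter(int(line.split()[0]) for line in sample_lines)

# most_common() yields (size, count) pairs, highest count first.
for size, count in counter.most_common():
    print("%d bytes -> %d times" % (size, count))
```

With this input, the pairs come out as (48, 3), (24, 2), (96, 1).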

【Comments】:

I think your answer is the most genuine one here.
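+1: If you are only interested in counts, Counter is your best option. On the other hand, the OP wrote "for now, I don't care about the memory addresses", so I suspect that sooner or later he will need a custom solution that goes beyond Counter.
Thank you very much for this solution. I tried it, but it did not work. That is because Counter is only available in Python 2.7 and later, and I am using 2.6.4. But it led me to stackoverflow.com/questions/3594514/…, where I found a way to solve my problem. I am still marking this answer as the solution, because it is probably the best way to solve the problem.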
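For interpreters older than 2.7, one standard-library option that does not depend on Counter is collections.defaultdict, which has been available since Python 2.5. The sketch below is only an illustration of that idea and is not necessarily what the linked question recommends; it also keeps the addresses per size, in case they are needed later. The names sample_lines, counts and addresses_by_size are assumptions.

```python
from collections import defaultdict

# Sample input copied from the question; a list stands in for the file.
sample_lines = [
    "48 bytes allocated at 0x8bb970a0",
    "24 bytes allocated at 0x8bb950c0",
    "48 bytes allocated at 0x958bd0e0",
    "48 bytes allocated at 0x8bb9b060",
    "96 bytes allocated at 0x8bb9afe0",
    "24 bytes allocated at 0x8bb9af60",
]

counts = defaultdict(int)               # missing keys start at 0
addresses_by_size = defaultdict(list)   # missing keys start as an empty list

for line in sample_lines:
    fields = line.split()
    size, address = fields[0], fields[-1]
    counts[size] += 1
    addresses_by_size[size].append(address)

for size in counts:
    print("%s bytes -> %d times" % (size, counts[size]))
```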