Storing Python dictionaries: answers

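【Question title】: Storing Python dictionaries
【Posted】: 2011-10-29 07:57:25
【Question】:

I'm used to moving data in and out of Python with CSV files, but that has obvious drawbacks. Is there a simple way to store a dictionary (or a set of dictionaries) in a JSON or pickle file?

For example: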

data = {}
data['key1'] = "keyinfo"
data['key2'] = "keyinfo2"

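I'd like to know how to save it, and then how to load it back in.

【Question comments】:

Have you read the documentation for the json or pickle standard modules?
See "Save a dictionary to a file (alternative to pickle) in Python?"

Tags: python json dictionary save pickle

【Solution 1】:

Saving with pickle: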

try:
    import cPickle as pickle
except ImportError:  # Python 3.x
    import pickle

with open('data.p', 'wb') as fp:
    pickle.dump(data, fp, protocol=pickle.HIGHEST_PROTOCOL)

For more information on the protocol argument, see the pickle module documentation.

Loading with pickle:

with open('data.p', 'rb') as fp:
    data = pickle.load(fp)
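
To make the protocol argument a little more concrete, here is a minimal sketch (not part of the original answer) that serializes the question's dictionary with every protocol the running interpreter supports and records the size of each result. The `sizes` dictionary is just an illustrative name; the actual byte counts depend on your Python version, so none are shown.

```python
import pickle

data = {'key1': 'keyinfo', 'key2': 'keyinfo2'}

# Serialize the same dictionary with every protocol this interpreter supports.
sizes = {}
for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
    blob = pickle.dumps(data, protocol=protocol)
    sizes[protocol] = len(blob)
    # Every protocol should round-trip to an equal dictionary.
    assert pickle.loads(blob) == data

for protocol, size in sizes.items():
    print(protocol, size)
```

Protocol 0 is the ASCII-based format; the higher protocols are binary. Keep in mind that pickle files should only be loaded from sources you trust, since unpickling can execute arbitrary code.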

Saving with JSON:

import json

with open('data.json', 'w') as fp:
    json.dump(data, fp)

Supply extra arguments such as sort_keys or indent to get prettier output. sort_keys sorts the keys alphabetically, and indent=N indents the data structure by N spaces per nesting level.

json.dump(data, fp, sort_keys=True, indent=4)

Loading with JSON:

with open('data.json', 'r') as fp:
    data = json.load(fp)
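
One difference between the two formats is worth seeing once. The sketch below (an illustration, not part of the original answer) round-trips a dictionary with an integer key and a tuple value through both modules. The file lives in a temporary directory, and the names `path`, `from_json`, and `from_pickle` are made up for the example.

```python
import json
import os
import pickle
import tempfile

# A dictionary with a non-string key and a tuple value.
data = {1: 'one', 'point': (1, 2)}

path = os.path.join(tempfile.mkdtemp(), 'data.json')

with open(path, 'w') as fp:
    json.dump(data, fp)
with open(path, 'r') as fp:
    from_json = json.load(fp)

# pickle keeps Python types exactly; JSON maps them onto its own types.
from_pickle = pickle.loads(pickle.dumps(data))

print(from_json)    # keys become strings, tuples become lists
print(from_pickle)  # equal to the original dictionary
```

So JSON is the better choice when other programs need to read the file, while pickle is the one that preserves Python-specific types.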

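【Comments】:

JSON itself is essentially dictionaries (they obviously don't behave exactly like Python dictionaries in memory, but for persistence purposes they are the same). In fact, the basic unit in JSON is the "object", which is defined as { <string> : <value> }. Look familiar? The json module in the standard library supports the Python built-in types that have a JSON counterpart (dict, list, str, numbers, booleans, None) and can easily be extended to support user-defined classes with minimal knowledge of JSON. The JSON homepage defines the language completely in just over 3 printed pages, so it is quick to absorb.
The third argument to pickle.dump is also worth knowing about. If the file doesn't need to be human-readable, it can speed things up a lot.
If you add the sort_keys and indent arguments to the dump call, you get a much prettier result, e.g. json.dump(data, fp, sort_keys=True, indent=4). More information can be found here.
You should use pickle.dump(data, fp, protocol=pickle.HIGHEST_PROTOCOL)
For Python 3, use import pickle.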
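
As a rough illustration of the comment above about extending json to user-defined classes, the sketch below uses the `default` argument of `json.dumps` and the `object_hook` argument of `json.loads`. The `Point` class, the `'__point__'` marker key, and the two helper functions are all hypothetical names invented for this example.

```python
import json

class Point:  # hypothetical user-defined class
    def __init__(self, x, y):
        self.x = x
        self.y = y

def encode_point(obj):
    # json calls this only for objects it cannot serialize on its own.
    if isinstance(obj, Point):
        return {'__point__': True, 'x': obj.x, 'y': obj.y}
    raise TypeError('Cannot serialize ' + type(obj).__name__)

def decode_point(dct):
    # json calls this for every decoded JSON object (dict), innermost first.
    if dct.get('__point__'):
        return Point(dct['x'], dct['y'])
    return dct

data = {'origin': Point(0, 0), 'label': 'start'}
json_str = json.dumps(data, default=encode_point)
restored = json.loads(json_str, object_hook=decode_point)

print(json_str)
print(type(restored['origin']).__name__)
```

The marker key is only a convention chosen here so the decoder can tell a serialized Point apart from an ordinary dictionary.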

【Solution 2】:

A minimal example, writing directly to a file:

import json
json.dump(data, open(filename, 'w'))  # text mode: json.dump writes str, not bytes
data = json.load(open(filename))

Or, opening and closing the file safely:

import json
with open(filename, 'w') as outfile:  # text mode: json.dump writes str, not bytes
    json.dump(data, outfile)
with open(filename) as infile:
    data = json.load(infile)

If you want to save it to a string instead of a file:

import json
json_str = json.dumps(data)
data = json.loads(json_str)

【Comments】:

【Solution 3】:

See also the faster ujson package:

import ujson

with open('data.json', 'w') as fp:  # text mode, as with the standard json module
    ujson.dump(data, fp)

【Comments】:

Can this package do everything json does? I mean, is it always a full drop-in replacement for json?

【Solution 4】:

Writing to a file:

import json
myfile.write(json.dumps(mydict))

Reading from a file:

import json
mydict = json.loads(myfile.read())

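myfile is the file object for the file in which you store the dictionary.

【Comments】:

Are you aware that json has functions that take a file as an argument and write to it directly?
json.dump(mydict, myfile) and json.load(myfile)

【Solution 5】:

If you want an alternative to pickle or json, you can use klepto.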

>>> init = {'y': 2, 'x': 1, 'z': 3}
>>> import klepto
>>> cache = klepto.archives.file_archive('memo', init, serialized=False)
>>> cache        
{'y': 2, 'x': 1, 'z': 3}
>>>
>>> # dump dictionary to the file 'memo.py'
>>> cache.dump() 
>>> 
>>> # import from 'memo.py'
>>> from memo import memo
>>> print(memo)
{'y': 2, 'x': 1, 'z': 3}

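With klepto, if you had used serialized=True, the dictionary would have been written to memo.pkl as a pickled dictionary instead of as clear text.

You can get klepto here: https://github.com/uqfoundation/klepto

dill is probably a better choice for pickling than pickle itself, as dill can serialize almost anything in Python. klepto can also use dill.

You can get dill here: https://github.com/uqfoundation/dill

The additional mumbo-jumbo on the first few lines is there because klepto can be configured to store dictionaries to a file, to a directory context, or to a SQL database. The API is the same whichever backend archive you choose. It gives you an "archivable" dictionary with which you can use load and dump to interact with the archive.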

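【Comments】:

【Solution 6】:

If you're after serialization but won't need the data in other programs, I'd strongly recommend the shelve module. Think of it as a persistent dictionary.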

import shelve

myData = shelve.open('/path/to/file')

# Check for values.
keyVar in myData

# Set values
myData[anotherKey] = someValue

# Save the data for future use.
myData.close()
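
The snippet above leaves keyVar, anotherKey, and someValue undefined, so here is a self-contained sketch of the same idea, assuming Python 3.4 or later (where a shelf can be used as a context manager). The temporary directory and the names `path`, `has_key`, and `restored` are assumptions made for the example.

```python
import os
import shelve
import tempfile

# Store the shelf in a throwaway directory for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'shelf')

# Write a couple of values; leaving the with block closes and saves the shelf.
with shelve.open(path) as db:
    db['key1'] = 'keyinfo'
    db['nested'] = {'x': 1}

# Reopen the shelf later and read the values back.
with shelve.open(path) as db:
    has_key = 'key1' in db
    restored = dict(db)

print(has_key)
print(restored)
```

Note that changing a nested value in place (for example db['nested']['x'] = 2) is not written back unless the shelf is opened with writeback=True; reassigning the whole value always works.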

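【Comments】:

If you need to store or load a whole dict, json is more convenient. shelve is better suited to accessing one key at a time.

【Solution 7】:

For completeness, we should include ConfigParser and configparser, which are part of the standard library in Python 2 and 3, respectively. This module reads and writes config/ini files and (at least in Python 3) behaves like a dictionary in many ways. It has the added benefit that you can store multiple dictionaries in separate sections of a config/ini file and recall them. Sweet!

Python 2.7.x example: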

import ConfigParser

config = ConfigParser.ConfigParser()

dict1 = {'key1':'keyinfo', 'key2':'keyinfo2'}
dict2 = {'k1':'hot', 'k2':'cross', 'k3':'buns'}
dict3 = {'x':1, 'y':2, 'z':3}

# Make each dictionary a separate section in the configuration
config.add_section('dict1')
for key in dict1.keys():
    config.set('dict1', key, dict1[key])
   
config.add_section('dict2')
for key in dict2.keys():
    config.set('dict2', key, dict2[key])

config.add_section('dict3')
for key in dict3.keys():
    config.set('dict3', key, dict3[key])

# Save the configuration to a file
with open('config.ini', 'w') as f:
    config.write(f)

# Read the configuration from a file
config2 = ConfigParser.ConfigParser()
config2.read('config.ini')

dictA = {}
for item in config2.items('dict1'):
    dictA[item[0]] = item[1]

dictB = {}
for item in config2.items('dict2'):
    dictB[item[0]] = item[1]

dictC = {}
for item in config2.items('dict3'):
    dictC[item[0]] = item[1]

print(dictA)
print(dictB)
print(dictC)

Python 3.x example:

import configparser

config = configparser.ConfigParser()

dict1 = {'key1':'keyinfo', 'key2':'keyinfo2'}
dict2 = {'k1':'hot', 'k2':'cross', 'k3':'buns'}
dict3 = {'x':1, 'y':2, 'z':3}

# Make each dictionary a separate section in the configuration
config['dict1'] = dict1
config['dict2'] = dict2
config['dict3'] = dict3

# Save the configuration to a file
with open('config.ini', 'w') as f:
    config.write(f)

# Read the configuration from a file
config2 = configparser.ConfigParser()
config2.read('config.ini')

# ConfigParser objects are a lot like dictionaries, but if you really
# want a dictionary you can ask it to convert a section to a dictionary
dictA = dict(config2['dict1'])
dictB = dict(config2['dict2'])
dictC = dict(config2['dict3'])

print(dictA)
print(dictB)
print(dictC)

Console output:

{'key2': 'keyinfo2', 'key1': 'keyinfo'}
{'k1': 'hot', 'k2': 'cross', 'k3': 'buns'}
{'z': '3', 'y': '2', 'x': '1'}

Contents of config.ini:

[dict1]
key2 = keyinfo2
key1 = keyinfo

[dict2]
k1 = hot
k2 = cross
k3 = buns

[dict3]
z = 3
y = 2
x = 1
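
Notice in the console output above that the integers in dict3 come back as strings. A possible way to recover them in Python 3 is the getint method on a section, sketched below. The in-memory io.StringIO buffer stands in for config.ini, and `as_strings` and `as_ints` are illustrative names.

```python
import configparser
import io

config = configparser.ConfigParser()
config['dict3'] = {'x': 1, 'y': 2, 'z': 3}

# Write to an in-memory buffer instead of a real config.ini file.
buffer = io.StringIO()
config.write(buffer)

config2 = configparser.ConfigParser()
config2.read_string(buffer.getvalue())

# Plain conversion gives string values; getint converts each one back to int.
as_strings = dict(config2['dict3'])
as_ints = {key: config2['dict3'].getint(key) for key in config2['dict3']}

print(as_strings)
print(as_ints)
```

There are matching getfloat and getboolean methods. Also note that configparser lowercases option names by default, so keys such as 'Key1' would come back as 'key1'.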

【Comments】:

【Solution 8】:

If you're saving to a JSON file, the best and easiest way of doing it is:

import json
with open("file.json", "wb") as f:
    f.write(json.dumps(data).encode("utf-8"))  # data is your dictionary; avoid naming it dict, which shadows the built-in

【Comments】:

Why is this easier than json.dump() as outlined in the other answers?
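
For comparison, a sketch of the json.dump() route with an explicit UTF-8 encoding might look like the following. The sample dictionary, the temporary path, and the names `text`, `restored`, and `escaped` are assumptions for the example. By default json escapes non-ASCII characters, so ensure_ascii=False is what actually puts UTF-8 text into the file.

```python
import json
import os
import tempfile

data = {'name': '字典'}
path = os.path.join(tempfile.mkdtemp(), 'file.json')

# Open in text mode with an explicit encoding and let json.dump write directly.
with open(path, 'w', encoding='utf-8') as f:
    json.dump(data, f, ensure_ascii=False)

with open(path, encoding='utf-8') as f:
    text = f.read()

restored = json.loads(text)

# With the default ensure_ascii=True, the output contains only ASCII escapes.
escaped = json.dumps(data)

print(text)
print(escaped)
```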

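【Solution 9】:

My use case was to save multiple JSON objects to one file, and marty's answer helped me somewhat. But for my use case the answer was incomplete, as it overwrites the old data every time a new entry is saved.

To save multiple entries in one file, you have to check for the old content first (i.e., read before write). A typical file holding JSON data has either a list or an object as its root. So I decided that my JSON file always holds a list of objects, and every time I add data to it, I simply load the list, append my new data to it, and dump it back to a file opened for writing: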

import json

def saveJson(url, sc):  # This function writes the two values to the file
    newdata = {'url':url,'sc':sc}
    json_path = "db/file.json"

    old_list= []
    with open(json_path) as myfile:  # Read the contents first
        old_list = json.load(myfile)
    old_list.append(newdata)

    with open(json_path,"w") as myfile:  # Overwrite the whole content
        json.dump(old_list, myfile, sort_keys=True, indent=4)

    return "success"

The resulting JSON file will look like this:

[
    {
        "sc": "a11",
        "url": "www.google.com"
    },
    {
        "sc": "a12",
        "url": "www.google.com"
    },
    {
        "sc": "a13",
        "url": "www.google.com"
    }
]

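Note: for this approach to work, a file named file.json must already exist with [] as its initial content.

PS: not related to the original question, but this approach could be improved further by first checking whether the entry already exists (based on one or more keys) and only then appending and saving the data.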
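
One way the two points above could be handled is sketched here: start from an empty list when the file does not exist yet, and skip entries that are already stored. This is only an illustration; the function name `save_entry`, its True/False return value, and the temporary path are assumptions, and duplicates are detected by comparing the whole entry rather than selected keys.

```python
import json
import os
import tempfile

def save_entry(json_path, url, sc):
    """Append {'url': url, 'sc': sc} to the list in json_path unless already present."""
    new_entry = {'url': url, 'sc': sc}
    try:
        with open(json_path) as infile:  # Read the existing list first
            entries = json.load(infile)
    except FileNotFoundError:
        entries = []  # No file yet: start with an empty list

    if new_entry in entries:  # Skip exact duplicates
        return False

    entries.append(new_entry)
    with open(json_path, 'w') as outfile:  # Overwrite the whole content
        json.dump(entries, outfile, sort_keys=True, indent=4)
    return True

json_path = os.path.join(tempfile.mkdtemp(), 'file.json')
first = save_entry(json_path, 'www.google.com', 'a11')
duplicate = save_entry(json_path, 'www.google.com', 'a11')
second = save_entry(json_path, 'www.google.com', 'a12')

with open(json_path) as infile:
    stored = json.load(infile)
print(stored)
```

Reading and rewriting the whole file on every call is fine for small files, but it gets slow as the list grows.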

【Comments】: