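Concept
This is a general solution that does what you need. The idea is to recursively walk every value of the top-level "persons" dictionary and decide how to proceed based on the type of each value it finds.
For every value in a dict that is neither a dict nor a list, it puts that value into the top-level object you need.
If it finds a dict or a list instead, it recurses and does the same thing again, finding more plain values, lists, or dicts.
Using collections.defaultdict also makes it easy to fill a dictionary with an unknown number of lists, one per key, which is how you get the 4 top-level objects you want.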
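As a minimal sketch of the two ideas above (type-based recursion plus defaultdict(list)), the snippet below groups plain values by the path that leads to them. The function name collect_scalars and the sample data are hypothetical and only serve as illustration; this is not the full solution.

```python
from collections import defaultdict


def collect_scalars(value, path, collected):
    """Recursively walk `value`, grouping non-container values by path."""
    if isinstance(value, dict):
        # Dicts extend the path with each key and recurse.
        for key, sub_value in value.items():
            collect_scalars(sub_value, path + "_" + key, collected)
    elif isinstance(value, list):
        # Lists recurse into each item under the same path.
        for item in value:
            collect_scalars(item, path, collected)
    else:
        # defaultdict(list) creates the list the first time a path is seen.
        collected[path].append(value)


sample = {"name": "Ann", "scores": [{"points": 3}, {"points": 5}]}
collected = defaultdict(list)
collect_scalars(sample, "person", collected)
print(dict(collected))
```

Because the default factory is list, there is no need to check whether a key already exists before appending to it.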
Code example
from collections import defaultdict


class DictFlattener(object):

    def __init__(self, object_id_key, object_name):
        """Constructor.

        :param object_id_key: String key that identifies each base object
        :param object_name: String name given to the base object in data.
        """
        self._object_id_key = object_id_key
        self._object_name = object_name
        # Store each of the top-level results lists.
        self._collected_results = None

    def parse(self, data):
        """Parse the given nested dictionary data into separate lists.

        Each nested dictionary is transformed into its own list of objects,
        associated with the original object via the object id.

        :param data: Dictionary of data to parse.
        :returns: Single dictionary containing the resulting lists of
            objects, where each key is the object name combined with the
            list name via an underscore.
        """
        self._collected_results = defaultdict(list)
        for value_to_parse in data[self._object_name]:
            object_id = value_to_parse[self._object_id_key]
            parsed_object = {}
            for key, value in value_to_parse.items():
                sub_object_name = self._object_name + "_" + key
                parsed_value = self._parse_value(
                    value,
                    object_id,
                    sub_object_name,
                )
                # Compare against None so falsy values (0, "") are kept.
                if parsed_value is not None:
                    parsed_object[key] = parsed_value
            self._collected_results[self._object_name].append(parsed_object)
        return self._collected_results

    def _parse_value(self, value_to_parse, object_id, current_object_name,
                     index=None):
        """Parse some value of an unknown type.

        If it's a list or a dict, keep parsing, otherwise return it as-is.

        :param value_to_parse: Value to parse
        :param object_id: String id of the current top object being parsed.
        :param current_object_name: Name of the current level being parsed.
        :param index: Position in the enclosing list, if there is one.
        :returns: None if value_to_parse is a dict or a list, otherwise returns
            value_to_parse.
        """
        if isinstance(value_to_parse, dict):
            self._parse_dict(
                value_to_parse,
                object_id,
                current_object_name,
                index=index,
            )
        elif isinstance(value_to_parse, list):
            self._parse_list(
                value_to_parse,
                object_id,
                current_object_name,
            )
        else:
            return value_to_parse
        return None

    def _parse_dict(self, dict_to_parse, object_id, current_object_name,
                    index=None):
        """Parse some value of a dict type and store it in self._collected_results.

        :param dict_to_parse: Dict to parse
        :param object_id: String id of the current top object being parsed.
        :param current_object_name: Name of the current level being parsed.
        :param index: Position in the enclosing list, if there is one.
        """
        parsed_dict = {
            self._object_id_key: object_id,
        }
        if index is not None:
            parsed_dict["__index"] = index
        for key, value in dict_to_parse.items():
            sub_object_name = current_object_name + "_" + key
            parsed_value = self._parse_value(
                value,
                object_id,
                sub_object_name,
                index=index,
            )
            # Compare against None so falsy values (0, "") are kept.
            if parsed_value is not None:
                parsed_dict[key] = parsed_value
        self._collected_results[current_object_name].append(parsed_dict)

    def _parse_list(self, list_to_parse, object_id, current_object_name):
        """Parse some value of a list type and store it in self._collected_results.

        :param list_to_parse: List to parse
        :param object_id: String id of the current top object being parsed.
        :param current_object_name: Name of the current level being parsed.
        """
        # Note: plain (non-dict, non-list) list items are not stored anywhere.
        for index, sub_dict in enumerate(list_to_parse):
            self._parse_value(
                sub_dict,
                object_id,
                current_object_name,
                index=index,
            )
Then use it, with test_data being your nested input dictionary:
parser = DictFlattener("id", "persons")
results = parser.parse(test_data)
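Notes
- Your sample data and your expected output are inconsistent in places, for example whether scores are strings or integers. You will need to reconcile those values before comparing the actual output against the expected output.
- There is always more refactoring that could be done, and this could be written in a more functional style instead of as a class. Hopefully seeing it still helps you understand how to approach the problem.
- As @jbernardo said, if you are going to insert these into a relational database, they should not all have just "id" as the key; it should be "person_id".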
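
To illustrate the last two notes, here is a rough sketch of a function-based variant that writes the parent id into child rows under a separate key such as "person_id". The names flatten, visit, and foreign_key, as well as the sample data, are assumptions made for this example and are not part of the class above. It keeps falsy values such as 0 by testing against None, and like the class it does not store plain values that sit directly inside a list.

```python
from collections import defaultdict


def flatten(data, object_name, id_key, foreign_key):
    """Split nested dicts/lists under data[object_name] into flat tables."""
    tables = defaultdict(list)

    def visit(value, object_id, table_name, index=None):
        # Plain values are handed back to the caller; dicts become rows.
        if isinstance(value, list):
            for position, item in enumerate(value):
                visit(item, object_id, table_name, index=position)
            return None
        if not isinstance(value, dict):
            return value
        # Child rows point back at the parent via foreign_key.
        row = {foreign_key: object_id}
        if index is not None:
            row["__index"] = index
        for key, sub_value in value.items():
            scalar = visit(sub_value, object_id, table_name + "_" + key, index)
            if scalar is not None:
                row[key] = scalar
        tables[table_name].append(row)
        return None

    for item in data[object_name]:
        top_row = {}
        for key, value in item.items():
            scalar = visit(value, item[id_key], object_name + "_" + key)
            if scalar is not None:
                top_row[key] = scalar
        tables[object_name].append(top_row)
    return dict(tables)


# Hypothetical input, only to show the shape of the result.
sample = {
    "persons": [
        {
            "id": "p1",
            "name": "Ann",
            "address": {"city": "Oslo"},
            "scores": [{"points": 3}, {"points": 0}],
        },
    ]
}
tables = flatten(sample, "persons", id_key="id", foreign_key="person_id")
for table_name, rows in tables.items():
    print(table_name, rows)
```

The top-level "persons" rows keep their own "id", while rows in "persons_address" and "persons_scores" carry "person_id", which maps more directly onto a foreign key column.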