How to solve a "no such node" error in PyTables and h5py

【Question title】: How to solve no such node error in pytables and h5py
【Posted】: 2022-01-13 00:50:07
【Question description】:

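I built an HDF5 dataset with PyTables. It contains thousands of nodes, each one an image stored without compression (shape 512x512x3). When I run a deep learning training loop over it (with a PyTorch DataLoader), it crashes at random, saying that a node does not exist. However, the missing node is never the same one, and whenever I open the file myself to verify that the node is there, it always is.

I ran everything sequentially, because I thought the cause might be multithreaded/multiprocess access to the file. That did not fix the problem. I have tried many things, none of which worked.

Does anyone know what to do? Should I add a timer between calls to give the machine time to reallocate the file?

Originally I used only PyTables, but while trying to solve the problem I also tried loading the file with h5py. Unfortunately, it did not work any better.

This is the error I get with h5py: "RuntimeError: Unable to get link info (bad symbol table node signature)"

The exact error can vary, but it always includes "bad symbol table node signature".

PS: I cannot share the full code because it is large and is part of a bigger code base that belongs to my company. I can still share the part below to show how I load the images: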

with h5py.File(dset_filepath, "r", libver='latest', swmr=True) as h5file:
    node = h5file["/train_group_0/sample_5"] # <- this line breaks
    target = node.attrs.get('TITLE').decode('utf-8')
    img = Image.fromarray(np.uint8(node))
    return img, int(target.strip())

【Question comments】:

Tags: python-3.x pytorch h5py pytables

【Solution 1】:

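Before accessing a dataset (node), add a test to confirm that it exists. While you are adding checks, do the same for the attribute 'TITLE'. If you are going to use hard-coded path names (like 'group_0'), you should check that every node along the path exists (for example, does 'group_0' exist?). Alternatively, use one of the recursive visitor functions (.visit() or .visititems()) to make sure you only access existing nodes.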
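As a rough illustration of the visitor approach, the sketch below collects the path of every dataset in a file with h5py's `Group.visititems()`. The function name `list_dataset_paths` and the inner callback `collect` are hypothetical names chosen for this example, and the file is assumed to be readable with a plain `h5py.File(path, "r")`.

```python
import h5py


def list_dataset_paths(h5_path):
    """Return the full path of every dataset stored in the file."""
    paths = []

    def collect(name, obj):
        # visititems() calls this once for each group and dataset below the
        # root; returning None tells h5py to keep walking.
        if isinstance(obj, h5py.Dataset):
            paths.append("/" + name)

    with h5py.File(h5_path, "r") as h5file:
        h5file.visititems(collect)
    return paths
```

The returned list can be turned into a set once, before training starts, so that each requested path (for example "/train_group_0/sample_5") is checked against nodes that are known to exist.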

The modified h5py code with basic checks looks like this:

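import h5py
import numpy as np
from PIL import Image

def load_sample(dset_filepath, sample='sample_5'):
    """Return (image, label); a missing dataset or attribute yields None instead of raising."""
    with h5py.File(dset_filepath, 'r', libver='latest', swmr=True) as h5file:
        # Check the hard-coded parent group before looking inside it.
        if 'train_group_0' not in h5file or sample not in h5file['train_group_0']:
            print(f'Dataset Read Error: {sample} not found')
            return None, None
        node = h5file[f'/train_group_0/{sample}']
        # node[()] reads the whole dataset into a NumPy array.
        img = Image.fromarray(np.uint8(node[()]))
        if 'TITLE' not in node.attrs:
            print('Attribute Read Error: TITLE not found')
            return img, None
        title = node.attrs['TITLE']
        # The attribute may be stored as bytes; decode only when needed.
        if isinstance(title, bytes):
            title = title.decode('utf-8')
        return img, int(title.strip())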

You said you are using PyTables. Here is code that does the same with the PyTables package:

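import numpy as np
import tables as tb
from PIL import Image

def load_sample(dset_filepath, sample='sample_5'):
    """Return (image, label); a missing node or attribute yields None instead of raising."""
    # open_file() is the documented way to open a file in PyTables; libver and
    # swmr are h5py keyword arguments and are left out here.
    with tb.open_file(dset_filepath, 'r') as h5file:
        # Check the hard-coded parent group before looking inside it.
        if '/train_group_0' not in h5file or sample not in h5file.get_node('/train_group_0'):
            print(f'Dataset Read Error: {sample} not found')
            return None, None
        node = h5file.get_node(f'/train_group_0/{sample}')
        # read() loads the whole array node into a NumPy array.
        img = Image.fromarray(np.uint8(node.read()))
        if 'TITLE' not in node._v_attrs:
            print('Attribute Read Error: TITLE not found')
            return img, None
        title = node._v_attrs['TITLE']
        # TITLE may come back as bytes or str; decode only when needed.
        if isinstance(title, bytes):
            title = title.decode('utf-8')
        return img, int(title.strip())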

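【Comments】:

Thanks, I will try it :)
Added a similar PyTables example, in case you prefer to keep using that package.
In the end there was a bug in my code, and your code helped me find it: the key was not found at runtime, but my checking script, run outside the training code, did find it, because there I was looping over len(dataset). In my runtime code, however, I was inadvertently allowing the index to go up to len(dataset)+1. I also found this because the same key was now failing again and again, whereas at first the failures were random. So I think removing the parallelism helped as well. I would say it was a sum of errors. Thanks for your help!
Yes, that happens: you get an error when you think you are opening a valid dataset... :-)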
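The off-by-one index described in the comments above can be caught early with an explicit bounds check before the node path is built. The following is only a minimal sketch: the class name `SampleIndex`, the method `path_for`, zero-based indexing, and the `sample_<i>` naming scheme (inferred from the path "/train_group_0/sample_5" in the question) are all assumptions, not the asker's actual code.

```python
class SampleIndex:
    """Maps integer indices to node paths and rejects out-of-range indices."""

    def __init__(self, group, n_samples):
        self.group = group          # e.g. "train_group_0" (assumed layout)
        self.n_samples = n_samples  # number of sample_<i> nodes in the group

    def __len__(self):
        return self.n_samples

    def path_for(self, idx):
        # Valid indices are 0 .. len(self) - 1, so idx == len(self) is
        # already out of range and fails here, not inside the HDF5 library.
        if not 0 <= idx < len(self):
            raise IndexError(f"index {idx} out of range for {len(self)} samples")
        return f"/{self.group}/sample_{idx}"
```

Raising IndexError at this point gives a clear message naming the offending index, instead of a "no such node" error from the file layer that looks like file corruption.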