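[Posted]: 2022-01-08 19:29:39
[Question]:
Below is a screenshot of the data branches in my .hdf5 file. I am trying to extract the existing column names (i.e. experiment_id, session_id, ...) from this particular BlinkStartEvent segment.
I have the following code, which accesses this part of the data and extracts the numeric values. For some reason, though, I cannot extract the corresponding column names. I would like to append them to a separate list so that I can build a dictionary from the whole dataset. I thought .keys() would do this, but it does not.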
import h5py

def traverse_datasets(hdf_file):
    """Yield the path of every dataset in the file, descending into groups."""
    def h5py_dataset_iterator(g, prefix=''):
        for key in g.keys():
            #print(key)
            item = g[key]
            path = f'{prefix}/{key}'
            if isinstance(item, h5py.Dataset): # test for dataset
                yield (path, item)
            elif isinstance(item, h5py.Group): # test for group (go down)
                yield from h5py_dataset_iterator(item, path)

    for path, _ in h5py_dataset_iterator(hdf_file):
        yield path

# filenameHDF is a list of file names defined earlier in my script
with h5py.File(filenameHDF[0], 'r') as f:
    for dset in traverse_datasets(f):
        if str(dset[-15:]) == 'BlinkStartEvent':
            print('-----Path:', dset) # path that leads to the data
            print('-----Shape:', f[dset].shape) # the length dimension of the data
            print('-----Data type:', f[dset].dtype) # prints the field name and type of every column
            data2 = f[dset][()] # The entire dataset
            # print('Check column names', f[dset].keys()) # I tried this but got AttributeError: 'Dataset' object has no attribute 'keys'
I get the following output:
-----Path: /data_collection/events/eyetracker/BlinkStartEvent
-----Shape: (220,)
-----Data type: [('experiment_id', '<u4'), ('session_id', '<u4'), ('device_id', '<u2'), ('event_id', '<u4'), ('type', 'u1'), ('device_time', '<f4'), ('logged_time', '<f4'), ('time', '<f4'), ('confidence_interval', '<f4'), ('delay', '<f4'), ('filter_id', '<i2'), ('eye', 'u1'), ('status', 'u1')]
Traceback (most recent call last):
  File "C:\Users\angjw\Dropbox\NUS PVT\Analysis\PVT analysis_hdf5access.py", line 64, in <module>
    print('Check column names', f[dset].keys())
AttributeError: 'Dataset' object has no attribute 'keys'
What am I doing wrong here?
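For context on the error: `.keys()` exists on h5py groups (and on the file object), not on datasets, which is why the traversal code can call `g.keys()` but `f[dset].keys()` fails. The column names already appear in the printed dtype, and since `Dataset.dtype` is a NumPy dtype, they should be reachable through `dtype.names`. A minimal sketch, assuming a shortened copy of the dtype printed above (only the first three fields, rebuilt with NumPy so no file is needed):

```python
import numpy as np

# Assumption: a trimmed-down copy of the compound dtype shown in the output
# above (first three fields only), standing in for f[dset].dtype.
event_dtype = np.dtype([
    ('experiment_id', '<u4'),
    ('session_id', '<u4'),
    ('device_id', '<u2'),
])

# dtype.names is a tuple of the field (column) names, in declaration order.
column_names = list(event_dtype.names)
print(column_names)
```

With the real file, the equivalent expression would presumably be `list(f[dset].dtype.names)` inside the `with` block.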
Also, is there a more efficient way to access the data, so that I could do something (hypothetical) like:
data2[0]['experiment_id'] = 1
data2[1]['time'] = 78.35161
data2[2]['logged_time'] = 80.59253
without having to set up a dictionary for every row of data?
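On the access pattern itself: reading a compound dataset with `[()]` returns a NumPy structured array, which supports indexing by row and by field name directly, so a per-row dictionary should not be needed. A minimal sketch, assuming a made-up three-field array in place of `data2` (the field subset and the values are illustrative only, not taken from the real file):

```python
import numpy as np

# Hypothetical stand-in for data2 = f[dset][()]; fields and values are invented.
event_dtype = np.dtype([
    ('experiment_id', '<u4'),
    ('time', '<f4'),
    ('logged_time', '<f4'),
])
data2 = np.array(
    [(1, 75.5, 76.0), (1, 78.25, 79.0), (1, 80.0, 80.5)],
    dtype=event_dtype,
)

# Row first, then field: a single value.
first_id = data2[0]['experiment_id']

# Field only: the whole column as a regular 1-D array.
times = data2['time']

# One dictionary for the whole dataset, keyed by column name.
columns = {name: data2[name] for name in data2.dtype.names}

print(first_id, times, list(columns))
```

The `columns` dictionary holds one array per field rather than one dictionary per row, which keeps the data in NumPy form for later analysis.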
[Discussion]:
Tags: python-3.x hdf5 h5py