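【Posted】: 2021-11-02 09:54:11
【Question】:
I am trying to read data line by line from a .gro file and write it out in .h5 format, but I get TypeError: "No conversion path for dtype: dtype('<U7')". I assume the data is being read as strings. I tried converting it to an array with np.array, but that did not work. Can anyone help me solve this, or is there a better way to read the data? I cannot use np.loadtxt because the data is about 50 GB.
The .gro file is formatted as follows: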
Generated by trjconv : P/L=1/400 t= 0.00000
11214
1P1 aP1 1 80.48 35.36 4.25
2P1 aP1 2 37.45 3.92 3.96
Generated by trjconv : P/L=1/400 t= 10.00000
11214
1P1 aP1 1 80.48 35.36 4.25
2P1 aP1 2 37.45 3.92 3.96
Generated by trjconv : P/L=1/400 t= 20.00000
11214
1P1 aP1 1 80.48 35.36 4.25
2P1 aP1 2 37.45 3.92 3.96
Generated by trjconv : P/L=1/400 t= 30.00000
11214
1P1 aP1 1 80.48 35.36 4.25
2P1 aP1 2 37.45 3.92 3.96
Generated by trjconv : P/L=1/400 t= 40.00000
11214
1P1 aP1 1 80.48 35.36 4.25
2P1 aP1 2 37.45 3.92 3.96
Error:
ValueError: Some errors were detected !
Line #5 (got 7 columns instead of 6)
Line #6 (got 1 columns instead of 6)
Line #9 (got 7 columns instead of 6)
Line #10 (got 1 columns instead of 6)
Line #13 (got 7 columns instead of 6)
Line #14 (got 1 columns instead of 6)
Line #17 (got 7 columns instead of 6)
Line #18 (got 1 columns instead of 6)
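The column counts in this error line up with the sample above: each "Generated by trjconv" title line splits into 7 whitespace-separated fields, the atom-count line into 1, and every atom line into 6. A minimal sketch of a line filter built on that observation follows. The function name `iter_atom_rows` is hypothetical, and the rule "exactly 6 fields means an atom line" is an assumption taken from the sample only; a real file with differently shaped title lines or merged fixed-width columns would need a stricter check.

```python
def iter_atom_rows(lines):
    """Yield (x, y, z) as floats for each 6-field atom line; skip all other lines."""
    for line in lines:
        fields = line.split()
        if len(fields) != 6:
            # Assumption: title lines (7 fields in the sample) and
            # atom-count lines (1 field) are the only non-atom lines.
            continue
        # The last three fields are the coordinates in the sample.
        yield tuple(float(value) for value in fields[3:6])


sample = """Generated by trjconv : P/L=1/400 t= 0.00000
11214
1P1 aP1 1 80.48 35.36 4.25
2P1 aP1 2 37.45 3.92 3.96
""".splitlines()

rows = list(iter_atom_rows(sample))
print(rows)
```

Because the function is a generator, it can be fed an open file object directly and never holds more than one line in memory.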
Here is my short code:
import h5py
import numpy as np

# First step is to read .gro file
f = open('pep.gro', 'r')
data = f.readlines()
for line in data:
    reading = line.split()
    #print(type(reading))
    #dat = np.array(reading).astype(int)

# Next step is to write the data to .h5 file
with h5py.File('pep1.h5', 'w') as hdf:
    hdf.create_dataset('dataset1', data=reading)
【Discussion】:
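Two things in the posted code appear to cause the trouble. `f.readlines()` loads the whole file into memory, and after the loop `reading` holds only the tokens of the last line as a list of Python strings; NumPy turns such a list into a fixed-width Unicode array (a `<U…` dtype), which h5py does not store. One possible approach, sketched below under stated assumptions, is to iterate over the file object line by line, convert the coordinate fields to floats, and append them in batches to a resizable HDF5 dataset. The function names `gro_to_h5` and `_append`, the dataset name `coords`, the batch size, and the "6 fields means an atom line" rule are all illustrative choices, not part of the original post.

```python
import h5py
import numpy as np


def _append(dset, rows):
    """Grow the dataset along axis 0 and write the new rows at the end."""
    block = np.asarray(rows, dtype=dset.dtype)
    start = dset.shape[0]
    dset.resize(start + len(block), axis=0)
    dset[start:] = block


def gro_to_h5(gro_path, h5_path, chunk_rows=100_000):
    """Stream x, y, z from a multi-frame .gro file into one (N, 3) dataset."""
    with open(gro_path) as src, h5py.File(h5_path, "w") as hdf:
        # maxshape=(None, 3) makes the first axis unlimited so it can grow.
        dset = hdf.create_dataset(
            "coords", shape=(0, 3), maxshape=(None, 3),
            dtype="f8", chunks=(4096, 3),
        )
        buffer = []
        for line in src:  # reads one line at a time, not the whole file
            fields = line.split()
            if len(fields) != 6:
                # Assumption: only atom lines have exactly 6 fields.
                continue
            buffer.append([float(value) for value in fields[3:6]])
            if len(buffer) >= chunk_rows:
                _append(dset, buffer)
                buffer.clear()
        if buffer:  # flush whatever is left after the last full batch
            _append(dset, buffer)
```

Calling `gro_to_h5('pep.gro', 'pep1.h5')` would then produce a single float dataset. This sketch flattens all frames into one array; if per-frame structure matters, the atom count on the second line of each frame could be used to reshape the result afterwards.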