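【Posted】: 2020-04-23 22:37:44
【Question】:
I have many files, which I have split into groups of five. I want to iterate over every group. I don't want to change the element by hand each time, because there are more than 500 groups. Is there a way to loop over them?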
import glob
import numpy as np
import pandas as pd
path = r'/Users/Documents/Data'
files= sorted(glob.glob(path + '/**/*.dat', recursive=True))
chunks = [files[x:x+5] for x in range(0, len(files), 5)]  # group 5 files at a time
chunks = [['file1.dat', 'file2.dat', 'file3.dat', 'file4.dat', 'file5.dat'],
          ['file6.dat', 'file7.dat', 'file8.dat', 'file9.dat', 'file10.dat'], [...]]
This works, but I don't want to change the element by hand 500 times.
df=[]
for i in chunks[0]:
    indat = pd.read_fwf(i, skiprows=4, header=None, engine='python')
    indat = df.append(indat)
indat = pd.concat(df, axis=0, ignore_index=False)
I tried to do this with a loop:
df=[]
for i, file in enumerate(chunks,1):
    indat = pd.read_fwf(file, skiprows=4, header=None, engine='python')
    indat = df.append(indat)
My attempt gives me the following error:
  File "/Users/Documents/test.py", line 30, in <module>
    indat = pd.read_fwf(file, skiprows=4, header=None, engine='python')
  File "/opt/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 782, in read_fwf
    return _read(filepath_or_buffer, kwds)
  File "/opt/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 431, in _read
    filepath_or_buffer, encoding, compression
  File "/opt/anaconda3/lib/python3.7/site-packages/pandas/io/common.py", line 200, in get_filepath_or_buffer
    raise ValueError(msg)
ValueError: Invalid file path or buffer object type: <class 'list'>
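The message says that `pd.read_fwf` received a list: each element of `chunks` is itself a list of five paths, while `read_fwf` expects a single path or buffer. A minimal sketch of one possible way around this, assuming one combined DataFrame per group is what is wanted, is to loop over the groups and then over the paths inside each group. The helper names `load_chunk` and `load_all` are hypothetical, and `ignore_index=True` is an assumption that differs from the `ignore_index=False` used above.

```python
import pandas as pd


def load_chunk(paths, skiprows=4):
    """Read every file in one group and stack the rows into one DataFrame."""
    # read_fwf is called once per path, never on the whole list.
    frames = [pd.read_fwf(path, skiprows=skiprows, header=None) for path in paths]
    # Assumption: a fresh 0..n-1 row index is acceptable for the stacked result.
    return pd.concat(frames, axis=0, ignore_index=True)


def load_all(chunks, skiprows=4):
    """Return one combined DataFrame per group, in the same order as chunks."""
    return [load_chunk(group, skiprows=skiprows) for group in chunks]
```

With the `chunks` list built earlier, `combined = load_all(chunks)` would give a list in which `combined[0]` corresponds to `chunks[0]`, `combined[1]` to `chunks[1]`, and so on.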
【Comments】:
- Why do you declare chunks only to overwrite it immediately? Same with indat.
- Do you want all the DataFrames in memory?
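On the memory question: if the groups can be processed one at a time, a generator avoids holding all 500+ combined DataFrames at once. The following is only a sketch under that assumption. The name `iter_chunk_frames` is hypothetical, and the `skiprows=4` and `header=None` settings are copied from the question.

```python
import pandas as pd


def iter_chunk_frames(chunks, skiprows=4):
    """Yield (group_number, DataFrame) pairs, building one group at a time."""
    for number, group in enumerate(chunks, start=1):
        # Read each path in this group separately, then stack the rows.
        frames = [pd.read_fwf(path, skiprows=skiprows, header=None) for path in group]
        yield number, pd.concat(frames, axis=0, ignore_index=True)
```

A loop such as `for number, frame in iter_chunk_frames(chunks): ...` then sees one group's DataFrame per iteration, and earlier frames can be garbage-collected once nothing references them.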
Tags: python-3.x pandas loops