【Question title】: IsADirectoryError: [Errno 21] Is a directory: '/' error while using multiprocessing
【Posted】: 2020-11-20 07:02:40
【Question】:

Say I have a function that processes multiple dataframes from a list, like this:

import os
import pandas as pd

listdF = [os.path.join(os.sep, path, x) for x in os.listdir(path) if x.endswith('.csv')]

def corre_arrys(listdF):
    data = []
    for files in listdF:
        df = pd.read_csv(files, sep='\t', header=0, engine='python')
        # do something
    return df


When I run the function above as is, there is no error and it prints the output I need. However, when I try to run it with multiprocessing, as shown below,

from multiprocessing import Pool
NUM_PROCS = 8    
pool = Pool(processes=NUM_PROCS)
allDfs = pool.map(corre_arrys,listdF)

it throws the following error message:

RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/alva/anaconda3/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/home/alva/anaconda3/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "<ipython-input-42-e4b97b52ffff>", line 4, in corre_arrys
    df = pd.read_csv(files,sep='\t',header=0,engine='python')
  File "/home/alva/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 676, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/home/alva/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 448, in _read
    parser = TextFileReader(fp_or_buf, **kwds)
  File "/home/alva/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 880, in __init__
    self._make_engine(self.engine)
  File "/home/alva/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 1126, in _make_engine
    self._engine = klass(self.f, **self.options)
  File "/home/alva/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 2269, in __init__
    memory_map=self.memory_map,
  File "/home/alva/anaconda3/lib/python3.7/site-packages/pandas/io/common.py", line 431, in get_handle
    f = open(path_or_buf, mode, errors="replace", newline="")
IsADirectoryError: [Errno 21] Is a directory: '/'
"""

The above exception was the direct cause of the following exception:

IsADirectoryError                         Traceback (most recent call last)
<ipython-input-46-4971753cdf30> in <module>
      4 NUM_PROCS = 8
      5 pool = Pool(processes=NUM_PROCS)
----> 6 allDfs = pool.map(corre_arrys,listdF)

~/anaconda3/lib/python3.7/multiprocessing/pool.py in map(self, func, iterable, chunksize)
    266         in a list that is returned.
    267         '''
--> 268         return self._map_async(func, iterable, mapstar, chunksize).get()
    269 
    270     def starmap(self, func, iterable, chunksize=None):

~/anaconda3/lib/python3.7/multiprocessing/pool.py in get(self, timeout)
    655             return self._value
    656         else:
--> 657             raise self._value
    658 
    659     def _set(self, i, obj):

IsADirectoryError: [Errno 21] Is a directory: '/'

listdF looks like this, with each entry containing both the path and the file name:

['/path/scripts/pc_2_lc_1_T.csv',
 '/path/scripts/pc_2_lc_2_T.csv',
 '/path/scripts/pc_1_lc_1_T.csv',
 '/path/scripts/pc_1_lc_2_T.csv']

I don't understand where the problem is.

Any help is much appreciated. Thanks!!

【Comments】:

  • The first path in "listDF" is relative. Try to avoid that.
  • @MichaelButscher, that was a typo. All the paths are actually absolute.

Tags: python pandas multithreading list numpy


【Solution 1】:

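From your stack trace, it looks like a directory is creeping into your listdF, and pandas.read_csv() fails when it tries to load that directory. Try filtering out directories explicitly:

listDf = [x for x in os.listdir(path) if os.path.isfile(os.path.join(path, x)) and x.endswith('.csv')]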
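
Note that the list comprehension above yields bare file names rather than joined paths. A minimal sketch that keeps the same directory filter but returns full paths might look like the following; the helper name `list_csv_files` and the file names in the self-check are hypothetical, chosen only for illustration.

```python
import os
import tempfile


def list_csv_files(path):
    """Return full paths of regular files in `path` whose names end in .csv."""
    full_paths = (os.path.join(path, name) for name in sorted(os.listdir(path)))
    # os.path.isfile() is False for directories, even ones named like "x.csv".
    return [p for p in full_paths if p.endswith(".csv") and os.path.isfile(p)]


# Small self-check in a temporary directory (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "a.csv"), "w").close()      # regular CSV file
    open(os.path.join(tmp, "notes.txt"), "w").close()  # wrong extension
    os.mkdir(os.path.join(tmp, "folder.csv"))          # directory that looks like a CSV
    found = [os.path.basename(p) for p in list_csv_files(tmp)]

print(found)  # ['a.csv']
```

Returning joined paths means the result can be handed to read_csv regardless of the current working directory.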

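【Comments】:

  • Thanks for the solution. However, when the list contains file names without paths, the function throws a FileNotFoundError: FileNotFoundError: [Errno 2] No such file or directory: 'p'. listDF is ['pc_2_lc_1_T.csv', 'pc_2_lc_2_T.csv', 'pc_1_lc_1_T.csv', 'pc_1_lc_2_T.csv']
  • Try printing files right before the read_csv call. Your trace clearly shows that the error originates from line 4 of the function, where read_csv is passed a directory instead of a CSV.
  • print(files) gives all the files with their absolute paths.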
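
One reading that fits both error messages in this thread ('/' with absolute paths, 'p' with bare file names) is that pool.map calls the worker once per list element, so corre_arrys receives a single path string and its for loop iterates over the characters of that string. The traceback's mapstar frame (`list(map(*args))`) points the same way. The sketch below illustrates this with the built-in map and no file access; `what_the_loop_sees`, `read_one`, and `load_all` are hypothetical names, and the one-path-per-call worker is a suggested shape, not something confirmed by the original posters.

```python
from multiprocessing import Pool

# Paths copied from the question; they need not exist for this demonstration.
listdF = [
    "/path/scripts/pc_2_lc_1_T.csv",
    "/path/scripts/pc_2_lc_2_T.csv",
]


def what_the_loop_sees(listdF):
    """Mimic the question's `for files in listdF` loop without reading anything."""
    return [files for files in listdF]


def read_one(path):
    """Suggested worker shape: one path in, one DataFrame out."""
    import pandas as pd  # imported lazily so the demonstration runs without pandas

    return pd.read_csv(path, sep="\t", header=0, engine="python")


def load_all(paths, processes=8):
    """Read every path in parallel; call this under `if __name__ == "__main__":`."""
    with Pool(processes=processes) as pool:
        return pool.map(read_one, paths)


# The built-in map hands the worker one element per call, as Pool.map does.
per_call_args = list(map(what_the_loop_sees, listdF))
print(per_call_args[0][:3])  # ['/', 'p', 'a'] -> the first "file" opened is '/'
```

With bare file names such as 'pc_2_lc_1_T.csv', the first character is 'p', which matches the FileNotFoundError reported in the comments. This would also explain why print(files) looked fine when the function was called directly with the whole list.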