【发布时间】:2018-07-10 03:15:33
【问题描述】:
我在并行计算集群的不同处理器上将 Python 3.6 脚本作为多个单独的进程运行。
最多 35 个进程同时运行没有问题,但第 36 个(以及更多)进程在第二行出现分段错误import pandas as pd 时崩溃。有趣的是,第一行 import os 不会引起问题。
完整的错误信息是:
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1024 current, 2067021 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1024 current, 2067021 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1024 current, 2067021 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1024 current, 2067021 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1024 current, 2067021 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1024 current, 2067021 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1024 current, 2067021 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1024 current, 2067021 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1024 current, 2067021 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1024 current, 2067021 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1024 current, 2067021 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1024 current, 2067021 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1024 current, 2067021 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1024 current, 2067021 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1024 current, 2067021 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1024 current, 2067021 max
Traceback (most recent call last):
File "/home/.../myscript.py", line 32, in <module>
import pandas as pd
File "/home/.../python_venv2/lib/python3.6/site-packages/pandas/__init__.py", line 13, in <module>
__import__(dependency)
File "/home/.../python_venv2/lib/python3.6/site-packages/numpy/__init__.py", line 142, in <module>
from . import add_newdocs
File "/home/.../python_venv2/lib/python3.6/site-packages/numpy/add_newdocs.py", line 13, in <module>
from numpy.lib import add_newdoc
File "/home/.../python_venv2/lib/python3.6/site-packages/numpy/lib/__init__.py", line 8, in <module>
from .type_check import *
File "/home/.../python_venv2/lib/python3.6/site-packages/numpy/lib/type_check.py", line 11, in <module>
import numpy.core.numeric as _nx
File "/home/.../python_venv2/lib/python3.6/site-packages/numpy/core/__init__.py", line 16, in <module>
from . import multiarray
SystemError: initialization of multiarray raised unreported exception
/var/spool/slurmd/job04590/slurm_script: line 11: 26963 Segmentation fault python /home/.../myscript.py -x 38
Pandas 和其他一些软件包安装在虚拟环境中。我已经复制了虚拟环境,因此每个 venv 中运行的进程不超过 24 个。例如,上面的错误脚本来自一个在虚拟环境中运行的名为python_venv2的脚本。
无论有多少进程从特定的 Pandas 实例导入,每次都会在第 36 个进程上出现问题。 (我什至没有降低并行计算集群的容量。)
那么,如果不是限制访问 Pandas 的进程数,是不是限制了运行 Python 的进程数?为什么限制为 35?
是否可以在机器上安装多个 Python 副本(在不同的虚拟环境中?)以便我可以运行 35 个以上的进程?
【问题讨论】:
-
您的用户在被命中的集群上似乎存在线程限制。无论如何,这与 Python 无关,而是与操作系统相关。
-
尝试从命令行运行
export OPENBLAS_NUM_THREADS=1。 -
您使用的是哪个操作系统?
-
export OPENBLAS_NUM_THREADS=1似乎已经解决了这个问题非常感谢@Richard - 你能告诉我我在那里做了什么吗?
标签: python linux parallel-processing multiprocessing openblas