【Question title】: Raise limitations in CPU use when running Singularity containers
【Posted】: 2020-06-04 08:30:13
【Question】:

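I developed some (complex) code in Python 3. When I run it on my laptop from a shell (Ubuntu 18.04), CPU usage is 550% (as reported by the "top" command). When I run it from a Singularity container (based on Ubuntu 16.04), CPU usage is 250% and the execution time increases. I cannot figure out why Singularity does not use more CPU.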

I read the manual at https://sylabs.io/guides/3.0/admin-guide/configfiles.html#singularity-conf, but my singularity.conf is the default file, and I have not created any /sys/fs/cgroup files either. The Singularity version is 3.0.3.
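One way to check whether a CPU limit is actually in effect is to run the same small script on the host and inside the container and compare the output. The following is a minimal sketch, assuming Linux; the helper name cpu_limits is hypothetical, and the cgroup paths are the conventional mount points, which may differ on a given system.

```python
import os
from pathlib import Path


def _read(path):
    """Return the stripped contents of a file, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def cpu_limits():
    """Collect the CPU limits visible to the current process (hypothetical helper)."""
    info = {"cpu_count": os.cpu_count()}

    # CPUs this process may be scheduled on (affected by taskset or cpusets).
    if hasattr(os, "sched_getaffinity"):
        info["affinity"] = len(os.sched_getaffinity(0))

    # cgroup v1 (assumed mount point): a quota of -1 means "no limit".
    quota = _read("/sys/fs/cgroup/cpu/cpu.cfs_quota_us")
    period = _read("/sys/fs/cgroup/cpu/cpu.cfs_period_us")
    if quota is not None and period is not None and int(quota) > 0:
        info["cgroup_cpus"] = int(quota) / int(period)

    # cgroup v2 (assumed mount point): "<quota> <period>", or "max <period>".
    cpu_max = _read("/sys/fs/cgroup/cpu.max")
    if cpu_max is not None:
        quota_v2, period_v2 = cpu_max.split()
        if quota_v2 != "max":
            info["cgroup_cpus"] = int(quota_v2) / int(period_v2)

    # Environment variables that cap the thread pools of BLAS, OpenMP and numba.
    for var in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS",
                "MKL_NUM_THREADS", "NUMBA_NUM_THREADS"):
        info[var] = os.environ.get(var)
    return info


if __name__ == "__main__":
    for key, value in sorted(cpu_limits().items()):
        print("{}: {}".format(key, value))
```

If the two outputs are identical, the container is probably not being throttled, and the difference is more likely to come from the libraries installed in each environment.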

Does anyone know about this issue?

Thanks!

JB

Edit: the case can be reproduced with the following simple example:

Usage: python3 nb_cpu_singularity.py 300000 10000

nb_cpu_singularity.py:

import numpy as np
import numba as nb
import argparse


parser = argparse.ArgumentParser(description="compute dot products")
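# nargs="?" makes the positional arguments optional so the defaults take effect
parser.add_argument("sample_size", type=int, nargs="?", default=10000,
                    help="number of dot products")
parser.add_argument("dim", type=int, nargs="?", default=1000,
                    help="dimension of vectors")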

args = parser.parse_args()

# Inputs
sample_size = args.sample_size
dim = args.dim

@nb.jit(nopython=True, nogil=True, fastmath=True, parallel=False)
def build_vector(offset, dim):
    # Build the vector (offset, offset+1, ..., offset+dim-1)
    v = np.zeros(dim, dtype=np.float64)
    for i in range(dim):
        v[i] += i+offset
    return v


@nb.jit(nopython=True, nogil=True, fastmath=True, parallel=False)
def dot_products(sample_size, dim):
    # Compute sample_size dot products; the results are discarded on purpose
    for i in range(sample_size):
        np.dot(build_vector(i, dim), build_vector(i+1, dim))


dot_products(sample_size, dim)

Edit: following Jakub's answer, I am adding two Singularity recipes that produce different behavior.

Bootstrap: docker
From: ubuntu:18.04

# .def files for Singularity image to be used with bnp-mrf for count data.
# Includes R packages for post-processing

# Tips:
#   + Use export TMPDIR=my_tmp_dir to specify the directory for temporary files
#   + Build images as root: sudo singularity build ...

# Tested with singularity 3.0.3

%help
This singularity image contains python libraries to run BNP MRF models without tensorflow.
You may run the image by using
singularity run --app jupyter -e -B /my_scratch:/scratch:rw notensorflow-1-4-1_minimal_count.simg
where /my_scratch is the name of a host directory containing some jupyter notebook(s) you want to run within the container and assuming notensorflow-1-4-1_minimal_count.simg is the name of the file produced by singularity build on the present definition file.
If you just want to run an ipython console, use
singularity run --app console notensorflow-1-4-1_minimal_count.simg


%labels
BUILD.CMD="sudo singularity build notensorflow-1-4-1_minimal_count.simg make_simg_count_data_minimal.singularity"

%setup

# Just an example, not used here
%files
#basic_classification.py     /opt/scripts/

%environment
export LANG="C.UTF-8" LC_ALL="C.UTF-8"

%post

export TZ=Europe/Minsk

apt update && DEBIAN_FRONTEND=noninteractive apt install -y gedit python3-pip llvm software-properties-common apt-transport-https

# R installation
export R_REPOS="https://cloud.r-project.org"
# apt-key adv --keyserver keys.gnupg.net --recv-key 'E19F5F87128899B192B1A2C2AD5F960A256A04AF' ?
# apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 51716619E084DAB9
# apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E084DAB9
apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 51716619E084DAB9
add-apt-repository "deb $R_REPOS/bin/linux/ubuntu bionic-cran35/"
export R_VERSION="3.6.3-1bionic"
apt update && DEBIAN_FRONTEND=noninteractive apt install -y r-base=$R_VERSION libudunits2-dev libgdal-dev

# Python env
python3 -m pip install --upgrade pip

python3 -m pip install llvmlite matplotlib numba numpy opencv-python pandas scikit-image scikit-learn scipy ipython jupyterlab rpy2 tbb

rm -rf /var/lib/apt/lists/*

%runscript
cd /scratch
ipython3

%apprun console
cd /scratch
ipython

%apprun jupyter
cd /scratch/
jupyter lab

The other image is:

Bootstrap: docker
From: ubuntu:16.04

# .def files for Singularity image to be used with bnp-mrf for count data.
# Includes R packages for post-processing

# Tips:
#   + Use export TMPDIR=my_tmp_dir to specify the directory for temporary files
#   + Build images as root: sudo singularity build ...

# Tested with singularity 3.0.3

%help
This singularity image contains python libraries to run BNP MRF models without tensorflow.
You may run the image by using
singularity run --app jupyter -e -B /my_scratch:/scratch:rw notensorflow-1-4-1_cpu_count.simg
where /my_scratch is the name of a host directory containing some jupyter notebook(s) you want to run within the container and assuming notensorflow-1-4-1_cpu_count.simg is the name of the file produced by singularity build on the present definition file.
If you just want to run an ipython console, use
singularity run --app console notensorflow-1-4-1_cpu_count.simg


%labels
BUILD.CMD="sudo singularity build notensorflow-1-4-1_cpu_count.simg make_simg_count_data_cpu.singularity"

%setup
#mkdir -p ${SINGULARITY_ROOTFS}/r_analysis

# Just an example, not used here
%files
#basic_classification.py     /opt/scripts/

%environment
export LANG="C.UTF-8" LC_ALL="C.UTF-8"

%post

apt update && apt install -y gedit 

apt install -y --no-install-recommends \
    ca-certificates apt-transport-https gnupg curl dirmngr vim cmake

apt update --allow-insecure-repositories

apt update && apt install -y python3-pip

apt install -y llvm

python3 -m pip install --upgrade pip

python3 -m pip install py
python3 -m pip install urllib3
python3 -m pip install pylint
python3 -m pip install wordcloud
python3 -m pip install tornado
python3 -m pip install theano
python3 -m pip install cython
python3 -m pip install dlib
python3 -m pip install h5py
python3 -m pip install html5lib
python3 -m pip install jupyter
python3 -m pip install joblib
python3 -m pip install llvmlite==0.30.0
python3 -m pip install nltk
python3 -m pip install jupyter 
python3 -m pip install notebook 
python3 -m pip install matplotlib
python3 -m pip install numba==0.46.0
python3 -m pip install numpy
python3 -m pip install opencv-python
python3 -m pip install pandas
python3 -m pip install pillow
python3 -m pip install scikit-image
python3 -m pip install scikit-learn
python3 -m pip install scipy
python3 -m pip install seaborn
python3 -m pip install simplegeneric

# # Install pymc3 from source
apt install -y git
cd /root
git clone --branch v3.6 https://github.com/pymc-devs/pymc3/ /root/pymc3

# Note that 
# cd /root/pymc3/ 
# /usr/bin/python3 setup.py install
# is not necessarily equivalent to 
# cd /root/pymc3/ && \
#    /usr/bin/python3 setup.py install
# since the side effect of cd might be lost in subsequent instructions

cd /root/pymc3/ && \
    /usr/bin/python3 setup.py install && \
    cd ../ && \
    rm -Rf pymc3

# R packages

apt install -y apt-transport-https software-properties-common

apt update

export R_REPOS="https://cloud.r-project.org"

add-apt-repository "deb $R_REPOS/bin/linux/ubuntu xenial-cran35/"

apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 51716619E084DAB9

apt update

# To find R versions
# apt-cache policy r-base

export R_VERSION="3.6.3-1xenial"
apt install -y r-base=$R_VERSION
apt install -y libudunits2-dev
apt install -y libgdal-dev

mkdir /root/r_analysis
cd /root/r_analysis
export R_CONTRIBS="https://cloud.r-project.org"
echo 'install.packages("INLA", repos=c("'$R_CONTRIBS'", INLA="https://inla.r-inla-download.org/R/testing"), dep=TRUE)' >> r_install.txt
echo 'install.packages("diseasemapping", repos="'$R_CONTRIBS'")' >> r_install.txt
echo 'install.packages("sp", repos="'$R_CONTRIBS'")' >> r_install.txt
echo 'install.packages("spdep", repos="'$R_CONTRIBS'")' >> r_install.txt
echo 'install.packages("geostatsp", repos="'$R_CONTRIBS'")' >> r_install.txt
echo 'install.packages("mapmisc", repos="'$R_CONTRIBS'")' >> r_install.txt
Rscript r_install.txt

rm -Rf /root/r_analysis

# Install rpy2
apt install -y python3-rpy2=2.9.3-1xenial0

%runscript
python3 /opt/scripts/basic_classification.py


%apprun console
ipython

%apprun jupyter
jupyter notebook --ip 0.0.0.0 --no-browser --allow-root

【Comments】:

    Tags: python-3.x cpu-usage singularity-container


    【Answer 1】:

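    I cannot reproduce this, also with the default singularity.conf, using Singularity version 3.5.1. Could the difference be due to differences between the Python environment on your host and the one in the container? Can you share your Singularity definition file?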
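To compare the two Python environments, one option is to print the interpreter and library versions on the host and in the container and diff the results. This is a minimal sketch; the function name environment_report and the package list are assumptions chosen for illustration, and a package that cannot be imported is reported as None.

```python
import importlib
import platform
import sys


def environment_report(packages=("numpy", "scipy", "numba", "llvmlite")):
    """Return interpreter and package versions (hypothetical helper)."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Package is missing (or broken) in this environment.
            report[name] = None
        else:
            report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    for key, value in sorted(environment_report().items()):
        print("{}: {}".format(key, value))
```

Running it once with the host interpreter and once through `singularity exec` on the image gives two short listings that can be compared line by line.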

    My computer has 8 cores, and I see usage of up to 798% both in the container and on the host. Here are the code and commands I used.

    Singularity

    Bootstrap: docker
    From: continuumio/miniconda3:4.8.2
    
    %post
    /opt/conda/bin/conda install --yes nomkl numba numpy scipy
    
    %runscript
    /opt/conda/bin/python3 "$@"
    
    sudo singularity build conda.sif Singularity
    singularity run conda.sif nb_cpu_singularity.py 30000 100000
    

    Host

    conda create --yes -n sotest nomkl numba numpy scipy
    conda activate sotest
    python nb_cpu_singularity.py 30000 100000
    

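    【Comments】:

    • Thanks for trying to reproduce this. I can reproduce your results. Actually, I cannot reproduce mine, sorry. I suspect that is because I rebooted the system. However, the problem remains with my initial complex code. I tried several Singularity images, and the problem seems to be related to nested calls to numba functions, because with one image I get the error Terminating: Nested parallel kernel launch detected, the workqueue threading layer does not supported nested parallelism. Try the TBB threading layer. Aborted (core dumped)
    • With the second image, CPU usage is 100%. On the host it is 800%, and I cannot explain this. The same behavior occurs with Docker. I will try to build a simple example that reproduces the problem.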
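The error message quoted above names numba's workqueue threading layer, so it may help to check which layer numba actually selects in each image. The sketch below is an illustration rather than a fix: the function name active_threading_layer is hypothetical, and it returns None when numba or numpy is not installed. numba only reports its threading layer after a parallel function has run, which is why a small prange loop is compiled first.

```python
def active_threading_layer():
    """Return the threading layer numba selected, or None if numba is missing."""
    try:
        import numba as nb
        import numpy as np
    except ImportError:
        return None

    @nb.njit(parallel=True)
    def total(x):
        # Simple parallel reduction, used only to trigger layer selection.
        acc = 0.0
        for i in nb.prange(x.size):
            acc += x[i]
        return acc

    total(np.ones(1000))
    # Typically one of "tbb", "omp" or "workqueue".
    return nb.threading_layer()


if __name__ == "__main__":
    print("numba threading layer:", active_threading_layer())
```

If an image reports workqueue while the host reports tbb or omp, installing TBB in that image, or setting the NUMBA_THREADING_LAYER environment variable to tbb before numba is imported, is a reasonable next thing to try.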