[Question title]: Increase the size of /dev/shm in Azure ML Studio
[Posted]: 2016-03-13 12:18:26
[Question]:

I am trying to run the following code in an Azure ML Studio notebook:

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.cross_validation import KFold, cross_val_score

for C in np.linspace(0.01, 0.2, 30):
    cv = KFold(n=X_train.shape[0], n_folds=7, shuffle=True, random_state=12345)
    clf = LogisticRegression(C=C, random_state=12345)
    print C, sum(cross_val_score(clf, X_train_scaled, y_train, scoring='roc_auc', cv=cv, n_jobs=2)) / 7.0

I got this error:

Failed to save <type 'numpy.ndarray'> to .npy file:
Traceback (most recent call last):
  File "/home/nbcommon/env/lib/python2.7/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 271, in save
    obj, filename = self._write_array(obj, filename)
  File "/home/nbcommon/env/lib/python2.7/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 231, in _write_array
    self.np.save(filename, array)
  File "/home/nbcommon/env/lib/python2.7/site-packages/numpy/lib/npyio.py", line 491, in save
    pickle_kwargs=pickle_kwargs)
  File "/home/nbcommon/env/lib/python2.7/site-packages/numpy/lib/format.py", line 585, in write_array
    array.tofile(fp)
IOError: 19834920 requested and 8384502 written

---------------------------------------------------------------------------
IOError                                   Traceback (most recent call last)
<ipython-input-29-9740e9942629> in <module>()
      6     cv = KFold(n=X_train.shape[0], n_folds=7, shuffle=True, random_state=12345)
      7     clf = LogisticRegression(C=C, random_state=12345)
----> 8     print C, sum(cross_val_score(clf, X_train_scaled, y_train, scoring='roc_auc', cv=cv, n_jobs=2)) / 7.0

/home/nbcommon/env/lib/python2.7/site-packages/sklearn/cross_validation.pyc in cross_val_score(estimator, X, y, scoring, cv, n_jobs, verbose, fit_params, pre_dispatch)
   1431                                               train, test, verbose, None,
   1432                                               fit_params)
-> 1433                       for train, test in cv)
   1434     return np.array(scores)[:, 0]
   1435 

/home/nbcommon/env/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in __call__(self, iterable)
    808                 # consumption.
    809                 self._iterating = False
--> 810             self.retrieve()
    811             # Make sure that we get a last message telling us we are done
    812             elapsed_time = time.time() - self._start_time

/home/nbcommon/env/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in retrieve(self)
    725                 job = self._jobs.pop(0)
    726             try:
--> 727                 self._output.extend(job.get())
    728             except tuple(self.exceptions) as exception:
    729                 # Stop dispatching any new job in the async callback thread

/home/nbcommon/env/lib/python2.7/multiprocessing/pool.pyc in get(self, timeout)
    565             return self._value
    566         else:
--> 567             raise self._value
    568 
    569     def _set(self, i, obj):

IOError: [Errno 28] No space left on device

With n_jobs=1 it works fine.

I think this happens because the joblib library tries to save my data to /dev/shm. The problem is that /dev/shm only has 64M of capacity:

Filesystem         Size  Used Avail Use% Mounted on
none               786G  111G  636G  15% /
tmpfs               56G     0   56G   0% /dev
shm                 64M     0   64M   0% /dev/shm
tmpfs               56G     0   56G   0% /sys/fs/cgroup
/dev/mapper/crypt  786G  111G  636G  15% /etc/hosts

I could not change this folder by setting the JOBLIB_TEMP_FOLDER environment variable (export does not work).
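One possible explanation, offered as an assumption rather than a confirmed diagnosis: a shell `export` issued from a notebook cell runs in a short-lived child shell, so the variable never reaches the Python kernel process in which joblib runs. A minimal sketch of the alternative is to set the variable through `os.environ` inside the kernel before calling `cross_val_score`. The scratch directory and its `joblib_` prefix are illustrative choices, and whether the joblib copy bundled with this scikit-learn version honors the variable has not been verified here.

```python
import os
import tempfile

# Create a scratch directory on the regular disk (not /dev/shm).
# The "joblib_" prefix is an illustrative choice, not a joblib requirement.
joblib_tmp = tempfile.mkdtemp(prefix="joblib_")

# Set the variable inside the kernel process itself; a shell "export"
# run from a notebook cell only affects its own child shell.
os.environ["JOBLIB_TEMP_FOLDER"] = joblib_tmp

print(os.environ["JOBLIB_TEMP_FOLDER"])
```

This has to run in the same kernel session, and before the parallel call, for it to have any chance of taking effect.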

In [35]: X_train_scaled.nbytes

Out[35]: 158679360
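The numbers in the error message and in the `df` output can be checked against each other. The sketch below assumes that the "requested" and "written" figures in the IOError are element counts rather than bytes, and that the 64M shown by `df` means 64 MiB; both are assumptions, although the first is consistent with `nbytes` being an exact multiple of the requested count.

```python
nbytes = 158679360          # X_train_scaled.nbytes from Out[35]
requested = 19834920        # "requested" figure from the IOError
written = 8384502           # "written" figure from the IOError
shm_bytes = 64 * 1024 ** 2  # 64M from df, assumed to mean 64 MiB

# Bytes per element implied by the two figures (assumes element counts).
itemsize = nbytes // requested
written_bytes = written * itemsize

print(itemsize)            # bytes per array element
print(written_bytes)       # bytes written before the failure
print(shm_bytes)           # capacity of /dev/shm in bytes
print(nbytes > shm_bytes)  # does the array exceed the capacity?
```

Under these assumptions each element takes 8 bytes, the write stopped at 67,076,016 bytes, just below the 67,108,864-byte capacity, and the full array of 158,679,360 bytes could not have fit.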

Thanks for any suggestions!

[Comments]:

    Tags: python azure scikit-learn joblib azure-machine-learning-studio


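    [Answer 1]:

    /dev/shm is a virtual filesystem that implements traditional shared memory on Linux and is used to pass data between programs.

    So you cannot increase its size by setting an option at the application level.

    You can, however, remount /dev/shm with a larger size, for example 8G, from a Linux shell with administrator privileges, as root: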

    mount -o remount,size=8G /dev/shm

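    However, Azure ML Studio does not seem to support remote access over SSH, so if you are currently on the free tier, a feasible option is to upgrade to the standard tier.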
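Without shell access, the size limit discussed above can still be inspected from inside a notebook. The following is a minimal sketch using `os.statvfs` from the standard library; the helper name `fs_size_bytes` is hypothetical, and the fallback to `/` only exists so the snippet also runs on machines without `/dev/shm`.

```python
import os

def fs_size_bytes(path):
    """Return (total, available) bytes for the filesystem containing path."""
    st = os.statvfs(path)
    total = st.f_frsize * st.f_blocks
    available = st.f_frsize * st.f_bavail
    return total, available

# "/dev/shm" is the mount from the question; fall back to "/" if absent.
path = "/dev/shm" if os.path.exists("/dev/shm") else "/"
total, available = fs_size_bytes(path)
print("%s: total=%d bytes, available=%d bytes" % (path, total, available))
```

Comparing the reported total with the array's `nbytes` gives a quick indication of whether the data could fit there at all.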

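    [Comments]:

    • I tried asking this on the MSDN forum, and I think they will increase the size of /dev/shm in the future. As a temporary workaround, I decided to just copy the data and run a ThreadPool instead of joblib.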
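The ThreadPool workaround mentioned in the comment could look roughly like the sketch below. Threads share the memory of the process, so the input array does not have to be written out for worker processes. This is not the commenter's actual code: `make_folds` and `score_fold` are hypothetical helpers, and `score_fold` is only a placeholder where a real version would fit the classifier and compute ROC AUC for one fold. Threads also only give a speed-up where the underlying code releases the GIL.

```python
from multiprocessing.pool import ThreadPool

def make_folds(n_samples, n_folds):
    """Split range(n_samples) into n_folds lists of test indices."""
    indices = list(range(n_samples))
    return [indices[i::n_folds] for i in range(n_folds)]

def score_fold(test_idx):
    # Hypothetical placeholder: a real version would fit the classifier on
    # the remaining rows and return its ROC AUC on the rows in test_idx.
    return len(test_idx)

folds = make_folds(n_samples=70, n_folds=7)

pool = ThreadPool(2)  # two worker threads, mirroring n_jobs=2
try:
    # map preserves the order of the folds in the returned list.
    scores = pool.map(score_fold, folds)
finally:
    pool.close()
    pool.join()

print(sum(scores) / 7.0)
```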