Configuring a data science environment can be a pain. Dealing with inconsistent package versions, digging through obscure error messages, and waiting hours for packages to compile is frustrating. It makes data science hard to get started with in the first place, and it is a completely arbitrary barrier to entry.
The past few years have seen the rise of technologies that help with this by creating isolated environments. We'll be exploring one in particular, Docker. Docker makes it fast and easy to create new data science environments and to use tools such as Jupyter notebooks to explore your data.
With Docker, we can download an image file that contains a set of packages and data science tools. We can then boot up a data science environment from this image within seconds, without manually installing packages or waiting around. This running environment is called a Docker container. Containers eliminate configuration problems: when you start a Docker container, it is in a known good state, and all the packages work properly.
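To make the image-then-container workflow concrete, here is a minimal sketch in Python that builds the two Docker commands involved: one to download an image and one to start a container from it. The image name `jupyter/datascience-notebook`, the port 8888, and the helper names `pull_command`, `run_command`, and `execute` are assumptions chosen for illustration; this section does not name the image it uses.

```python
from __future__ import annotations

import shlex
import subprocess

# Assumption: an example image from the Jupyter Docker Stacks project.
# Substitute whichever image you actually want to use.
IMAGE = "jupyter/datascience-notebook"


def pull_command(image: str = IMAGE) -> list[str]:
    """Build the command that downloads an image to the local machine."""
    return ["docker", "pull", image]


def run_command(
    image: str = IMAGE, host_port: int = 8888, container_port: int = 8888
) -> list[str]:
    """Build the command that starts a container from an image.

    -d runs the container in the background; -p publishes the container's
    port on the host so a browser can reach the notebook server.
    """
    return ["docker", "run", "-d", "-p", f"{host_port}:{container_port}", image]


def execute(command: list[str]) -> None:
    """Run one command, raising CalledProcessError on failure (needs Docker)."""
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print the commands rather than running them, so this works without Docker.
    for command in (pull_command(), run_command()):
        print(shlex.join(command))
```

On a machine with Docker installed, passing each command to `execute` would perform the download and start the container. Building the commands as lists, rather than as one shell string, avoids quoting problems if an image name or path contains unusual characters.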
The Docker whale is here to help
Translated from: https://www.pybloggers.com/2015/11/docker-data-science-environment-with-jupyter/