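Problem 1: Two Debian 8 machines, with debian-phy as the manager node and debian-vm as the worker node. When deploying the swarm, debian-vm failed to join the cluster with an error about the CA certificate.
Solution:
1. The manager node's clock was 30 minutes ahead of the worker node's. After the clocks were synchronized, the worker node still failed to join the cluster.
2. After the Docker service on the manager node was restarted, the worker node joined the cluster successfully.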
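The clock comparison from step 1 can be sketched as a small shell helper. This is a minimal sketch, not the procedure actually used: the function names `clock_skew_seconds` and `clocks_in_sync` are hypothetical, the 60-second default tolerance is an assumed value rather than a documented Docker limit, and the timestamps are expected to come from running `date +%s` on each node.

```shell
#!/bin/sh
# Hypothetical helper: absolute difference between two Unix timestamps
# (for example, the output of `date +%s` collected on each node).
clock_skew_seconds() {
    manager_ts=$1
    worker_ts=$2
    diff=$((manager_ts - worker_ts))
    if [ "$diff" -lt 0 ]; then
        diff=$((0 - diff))
    fi
    echo "$diff"
}

# Hypothetical helper: succeed when the skew is within a tolerance.
# The 60-second default is an assumed threshold, not a Docker-defined limit.
clocks_in_sync() {
    skew=$(clock_skew_seconds "$1" "$2")
    [ "$skew" -le "${3:-60}" ]
}
```

For example, `clocks_in_sync "$(date +%s)" "$(ssh debian-vm date +%s)"` run on the manager would compare its clock with the worker's, assuming SSH access to debian-vm. Once the clocks agree, restarting Docker on the manager (for instance `systemctl restart docker` on a systemd host) before re-running `docker swarm join` on the worker mirrors step 2.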
Problem 2: When creating a service with multiple replicas, the tasks on the manager node ran normally, but the tasks on the worker node failed with:
Error: Failed to find a load balancer IP to use for network
Solution: The manager node was running docker-ce-17.04.1, while the worker node was running docker-ce-18.06.1. Upgrading Docker on the manager node to docker-ce-18.06.1 resolved the problem.
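A version mismatch like this can be caught before any service is created. The following is a minimal sketch under stated assumptions: `versions_match` is a hypothetical helper, and the two version strings are expected to be gathered on each node with `docker version --format '{{.Server.Version}}'`.

```shell
#!/bin/sh
# Hypothetical helper: compare the Docker Engine versions reported by two nodes.
# Arguments are version strings, e.g. gathered on each node with:
#   docker version --format '{{.Server.Version}}'
versions_match() {
    manager_version=$1
    worker_version=$2
    if [ "$manager_version" = "$worker_version" ]; then
        echo "OK: both nodes run $manager_version"
        return 0
    fi
    echo "MISMATCH: manager=$manager_version worker=$worker_version"
    return 1
}
```

Once a node has joined, its engine version can also be read from the manager with `docker node inspect --format '{{.Description.Engine.EngineVersion}}' <node>`, which avoids logging in to every machine.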
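Problem 3: With docker-17.04 installed on SUSE12SP2, two issues appeared when deploying Docker Swarm. They are described below and remain unresolved (the same setup was verified to work on Debian).
1. When creating a service, the port exposed with --publish cannot be reached from the physical network, but it can be reached through the 172.x subnet between the container and the host.
2. With several different services created, service discovery does not work across nodes (ping server fails), although the services can reach each other by IP address.
Node version information: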
linux # cat /etc/SuSE-release
SUSE Linux Enterprise Server 12 (x86_64)
VERSION = 12
PATCHLEVEL = 2
# This file is deprecated and will be removed in a future service pack or release.
# Please check /etc/os-release for details about this release.
linux #

linux # docker version
Client:
 Version:      17.04.0-ce
 API version:  1.28
 Go version:   go1.7.5
 Git commit:   78d1802
 Built:        Tue May 30 18:21:18 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.04.0-ce
 API version:  1.28 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   78d1802
 Built:        Tue May 30 18:21:18 2017
 OS/Arch:      linux/amd64
 Experimental: false
linux:/app/original/worker #

linux # docker service create --replicas 1 --name server -e APP_PORT=5000 --network docker-net --publish 5000:5000 env/server:v0.1
Log:
linux:/app/env # docker swarm init --advertise-addr 10.9.23.241 --listen-addr 10.9.23.241
Swarm initialized: current node (pcrsf5o2corbm6ol3dlmgtjtt) is now a manager.

To add a worker to this swarm, run the following command:

    docker swarm join \
    --token SWMTKN-1-25isp458n3vftu7cj3p9gul68pe291hn58ekswq9ox8m52e6x9-5dk5aw452oe7ismwekz942xaq \
    10.9.23.241:2377

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

linux:/app/env # docker node ls
ID                           HOSTNAME      STATUS  AVAILABILITY  MANAGER STATUS
nsf5u8othzfaw26121v06a44e    SUSE12-29161  Ready   Active
pcrsf5o2corbm6ol3dlmgtjtt *  linux         Ready   Active        Leader
linux:/app/env # docker network ls
NETWORK ID    NAME             DRIVER   SCOPE
6aa823f9ce29  bridge           bridge   local
0d741bdf766c  docker_gwbridge  bridge   local
c9767a06fa1c  host             host     local
pyx9mde4js3o  ingress          overlay  swarm
393c7ca2630e  none             null     local
linux:/app/env # docker service create --replicas 1 --name server -e APP_PORT=5000 --network docker-net --publish 5000:5000 env/server:v0.1
image env/server:v0.1 could not be accessed on a registry to record
its digest. Each node will access env/server:v0.1 independently,
possibly leading to different nodes running different
versions of the image.

tqhnrmfln4v8m5z858py9f8gv
linux:/app/env # docker service ls
ID            NAME    MODE        REPLICAS  IMAGE
tqhnrmfln4v8  server  replicated  1/1       env/server:v0.1
linux:/app/env # docker service ps server
ID            NAME      IMAGE            NODE   DESIRED STATE  CURRENT STATE           ERROR  PORTS
5fa4bkyjfitt  server.1  env/server:v0.1  linux  Running        Running 34 seconds ago
linux:/app/env # docker ps -a
CONTAINER ID  IMAGE            COMMAND             CREATED         STATUS         PORTS  NAMES
c00de9e5d3d1  env/server:v0.1  "python server.py"  39 seconds ago  Up 37 seconds         server.1.5fa4bkyjfittr95stqk0fcjcy
linux:/app/env #
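Both symptoms of problem three remain unresolved in these notes. One thing that may be worth ruling out is a host firewall blocking swarm traffic: Docker's swarm documentation lists TCP 2377 (cluster management), TCP and UDP 7946 (node-to-node communication), and UDP 4789 (overlay network traffic) as ports that must be open between nodes. The sketch below only prints that list, plus the published service port, in a form that can be fed to a firewall or port-checking tool. The function names `swarm_required_ports` and `ports_to_check` are hypothetical, and whether a firewall is actually the cause on SUSE was not verified here.

```shell
#!/bin/sh
# Hypothetical helper: print the ports swarm mode needs open between nodes,
# one "port/protocol" pair per line.
swarm_required_ports() {
    echo "2377/tcp"   # cluster management
    echo "7946/tcp"   # node-to-node communication
    echo "7946/udp"   # node-to-node communication
    echo "4789/udp"   # overlay network (VXLAN) traffic
}

# Hypothetical helper: the swarm ports plus one published service port.
# The default of 5000 matches the --publish 5000:5000 used in the log above.
ports_to_check() {
    swarm_required_ports
    echo "${1:-5000}/tcp"
}
```

Running `ports_to_check` on each node and comparing the result against the local firewall rules gives a quick checklist; it does not test connectivity by itself.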