[Posted]: 2022-08-23 23:24:13
[Question]:
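I am trying to wrap a scraping project in a Docker container so I can run it on a droplet. The spider scrapes a website and then writes the data to a postgres database. The postgres database is already running and managed by Digitalocean.
When I run the following command locally to test, everything works fine: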
docker compose up
I can see the spider writing to the database.
Then, every time I push code, I use a GitHub Action to build my docker image and push it to the registry with this script:
name: CI
# 1
# Controls when the workflow will run.
on:
  # Triggers the workflow on push events but only for the master branch
  push:
    branches: [ master ]
  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:
    inputs:
      version:
        description: 'Image version'
        required: true
#2
env:
  REGISTRY: "registry.digitalocean.com/*****-registery"
  IMAGE_NAME: "******-scraper"
  POSTGRES_USERNAME: ${{ secrets.POSTGRES_USERNAME }}
  POSTGRES_PASSWORD: ${{ secrets.POSTGRES_PASSWORD }}
  POSTGRES_HOSTNAME: ${{ secrets.POSTGRES_HOSTNAME }}
  POSTGRES_PORT: ${{ secrets.POSTGRES_PORT }}
  POSTGRES_DATABASE: ${{ secrets.POSTGRES_DATABASE }}
  SPLASH_URL: ${{ secrets.SPLASH_URL }}
#3
jobs:
  build-compose:
    name: Build docker-compose
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Install doctl
        uses: digitalocean/action-doctl@v2
        with:
          token: ${{ secrets.DIGITALOCEAN_ACCESS_TOKEN }}
      - name: Login to DO Container Registry with short-lived creds
        run: doctl registry login --expiry-seconds 1200
      - name: Remove all old images
        run: if [ ! -z "$(doctl registry repository list | grep "****-scraper")" ]; then doctl registry repository delete-manifest ****-scraper $(doctl registry repository list-tags ****-scraper | grep -o "sha.*") --force; else echo "No repository"; fi
      - name: Build compose
        run: docker compose -f docker-compose.yaml up -d
      - name: Push to Digital Ocean registery
        run: docker compose push
  deploy:
    name: Deploy from registery to droplet
    runs-on: ubuntu-latest
    needs: build-compose
Then I manually ssh root@ipv4 into my droplet to install docker and docker compose, and run the image from the registry with the following commands:
# Login to registry
docker login -u DO_TOKEN -p DO_TOKEN registry.digitalocean.com
# Stop running container
docker stop ****-scraper
# Remove old container
docker rm ****-scraper
# Run a new container from a new image
docker run -d --restart always --name ****-scraper registry.digitalocean.com/****-registery/****-scraper
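As soon as the python script starts on the droplet, I get this error:
psycopg2.OperationalError: could not connect to server: No such file or directory
    Is the server running locally and accepting connections on
    Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?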
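For context on the socket path in this message: libpq, the client library that psycopg2 wraps, only uses TCP when it is given a host name or address. With an empty or missing host it tries a local Unix-domain socket instead, and a host value starting with a slash is treated as a socket directory. The sketch below only mimics that selection rule to make the difference visible. `describe_target` is a hypothetical helper, not a psycopg2 or libpq function; the default socket directory is taken from the error message above (it is fixed when libpq is built), and the sketch ignores `PGHOST` and other libpq settings.

```python
def describe_target(host, port=5432, default_socket_dir="/var/run/postgresql"):
    """Describe where libpq would try to connect for a given host value.

    Simplified illustration only: it ignores PGHOST, service files and
    multi-host connection strings.
    """
    port = port or 5432  # libpq falls back to 5432 when no port is given
    if not host:
        # No host at all: local Unix-domain socket in the default directory.
        return f"unix:{default_socket_dir}/.s.PGSQL.{port}"
    if host.startswith("/"):
        # A host beginning with a slash names a socket directory.
        return f"unix:{host}/.s.PGSQL.{port}"
    # Anything else is treated as a TCP host name or address.
    return f"tcp:{host}:{port}"


if __name__ == "__main__":
    print(describe_target(None))
    print(describe_target("db.example.com", 25060))  # hypothetical host and port
```

Read this way, a socket path in the error would suggest that the host value reaching psycopg2 inside the container is empty, rather than that the managed database is unreachable.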
It seems I am doing something wrong, and so far I haven't found a way to fix it. I would appreciate some help or an explanation.
Thanks,
My Dockerfile:
# As Scrapy runs on Python, I run the official Python 3 Docker image.
FROM python:3.9.7-slim
# Set the working directory to /usr/src/app.
WORKDIR /usr/src/app
# Install libpq-dev for psycopg2 python package
RUN apt-get update \
    && apt-get -y install libpq-dev gcc
# Copy the file from the local host to the filesystem of the container at the working directory.
COPY requirements.txt ./
# Install Scrapy specified in requirements.txt.
RUN pip3 install --no-cache-dir -r requirements.txt
# Copy the project source code from the local host to the filesystem of the container at the working directory.
COPY . .
# For Splash
EXPOSE 8050
# Run the crawler when the container launches.
CMD [ "python3", "./****/launch_spiders.py" ]
My docker-compose.yaml:
version: "3"
services:
  splash:
    image: scrapinghub/splash
    restart: always
    command: --maxrss 2048 --max-timeout 3600 --disable-lua-sandbox --verbosity 1
    ports:
      - "8050:8050"
  launch_spiders:
    restart: always
    build: .
    volumes:
      - .:/usr/src/app
    image: registry.digitalocean.com/****-registery/****-scraper
    depends_on:
      - splash
- How are you using the POSTGRES_* values? Where is your postgresql database?
- @AdrianKrupa Hey! My postgresql database is already running and hosted by digitalocean (separately). The POSTGRES_* values are environment variables, so I can retrieve them in my python script, e.g. USERNAME = os.environ.get('POSTGRES_USERNAME'). I use them to connect to my database through psycopg2 like this: self.connection = psycopg2.connect(host=HOSTNAME, user=USERNAME, password=PWD, dbname=DBNAME, port=PORT)
- @AdrianKrupa I tried corrupting my password to see whether that could be the cause, but with wrong credentials I get this error instead: psycopg2.OperationalError: connection to server at "*.*.*.*", port 5432 failed: FATAL: password authentication failed for user "***"
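A note on the connection code described in the comments: `os.environ.get` returns `None` for any variable that is not set inside the container, and the `docker run` command in the question passes no `-e` or `--env-file` options. As far as I know, psycopg2 leaves out keyword arguments whose value is `None`, so the connection would then be attempted without a host. A minimal sketch of a fail-fast check, assuming the five `POSTGRES_*` names from the workflow file; `REQUIRED_VARS` and `load_pg_settings` are hypothetical names, not part of the original project:

```python
import os

# Variable names taken from the workflow's env block in the question.
REQUIRED_VARS = (
    "POSTGRES_HOSTNAME",
    "POSTGRES_PORT",
    "POSTGRES_USERNAME",
    "POSTGRES_PASSWORD",
    "POSTGRES_DATABASE",
)


def load_pg_settings(env=os.environ):
    """Build keyword arguments for psycopg2.connect() from the environment.

    Raises RuntimeError naming every variable that is unset or empty,
    instead of silently passing None on to the database driver.
    """
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "host": env["POSTGRES_HOSTNAME"],
        "port": int(env["POSTGRES_PORT"]),
        "user": env["POSTGRES_USERNAME"],
        "password": env["POSTGRES_PASSWORD"],
        "dbname": env["POSTGRES_DATABASE"],
    }
```

In the spider this could be used as `self.connection = psycopg2.connect(**load_pg_settings())`. The check only makes the problem visible earlier; the values would still have to be handed to the container on the droplet, for example with `docker run --env-file` or individual `-e` options.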
Tags: python postgresql docker docker-compose psycopg2