【问题标题】:apache airflow: initdb vs resetdbapache气流:initdb vs resetdb
【发布时间】:2020-01-01 22:11:20
【问题描述】:

“airflow initdb”命令和“airflow resetdb”命令有什么区别?

真的有必要有两个不同的命令吗?

什么时候适合使用一个与另一个?

doc 说...

airflow initdb:初始化元数据库

airflow resetdb:烧毁并重建元数据库

这并不能告诉我太多。

我最好的猜测是

airflow initdb 仅在第一次从airflow.cfg 创建数据库时使用
airflow resetdb 如果对该配置是必需的。

当我运行它们时,不会更改 sqlite 数据库上的时间戳,但 resetdb 似乎更嘈杂。

气流初始化数据库

(.sandbox) [airflow@localhost airflow]$ airflow initdb
[2020-01-01 21:49:21,603] {settings.py:252} INFO - settings.configure_orm(): Using pool settings. pool_size=5, max_overflow=10, pool_recycle=1800, pid=24917
DB: postgresql+psycopg2://airflow@localhost:5432/airflow_mdb
[2020-01-01 21:49:22,257] {db.py:368} INFO - Creating tables
INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO  [alembic.runtime.migration] Will assume transactional DDL.
Done.

气流重置数据库

(.sandbox) [airflow@localhost airflow]$ airflow resetdb
[2020-01-01 21:49:46,579] {settings.py:252} INFO - settings.configure_orm(): Using pool settings. pool_size=5, max_overflow=10, pool_recycle=1800, pid=25045
DB: postgresql+psycopg2://airflow@localhost:5432/airflow_mdb
This will drop existing tables if they exist. Proceed? (y/n)y
[2020-01-01 21:49:49,984] {db.py:389} INFO - Dropping tables that exist
[2020-01-01 21:49:50,062] {migration.py:154} INFO - Context impl PostgresqlImpl.
[2020-01-01 21:49:50,063] {migration.py:161} INFO - Will assume transactional DDL.
[2020-01-01 21:49:50,070] {db.py:368} INFO - Creating tables
INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO  [alembic.runtime.migration] Will assume transactional DDL.
INFO  [alembic.runtime.migration] Running upgrade  -> e3a246e0dc1, current schema
INFO  [alembic.runtime.migration] Running upgrade e3a246e0dc1 -> 1507a7289a2f, create is_encrypted
INFO  [alembic.runtime.migration] Running upgrade 1507a7289a2f -> 13eb55f81627, maintain history for compatibility with earlier migrations
INFO  [alembic.runtime.migration] Running upgrade 13eb55f81627 -> 338e90f54d61, More logging into task_instance
INFO  [alembic.runtime.migration] Running upgrade 338e90f54d61 -> 52d714495f0, job_id indices
INFO  [alembic.runtime.migration] Running upgrade 52d714495f0 -> 502898887f84, Adding extra to Log
INFO  [alembic.runtime.migration] Running upgrade 502898887f84 -> 1b38cef5b76e, add dagrun
INFO  [alembic.runtime.migration] Running upgrade 1b38cef5b76e -> 2e541a1dcfed, task_duration
INFO  [alembic.runtime.migration] Running upgrade 2e541a1dcfed -> 40e67319e3a9, dagrun_config
INFO  [alembic.runtime.migration] Running upgrade 40e67319e3a9 -> 561833c1c74b, add password column to user
INFO  [alembic.runtime.migration] Running upgrade 561833c1c74b -> 4446e08588, dagrun start end
INFO  [alembic.runtime.migration] Running upgrade 4446e08588 -> bbc73705a13e, Add notification_sent column to sla_miss
INFO  [alembic.runtime.migration] Running upgrade bbc73705a13e -> bba5a7cfc896, Add a column to track the encryption state of the 'Extra' field in connection
INFO  [alembic.runtime.migration] Running upgrade bba5a7cfc896 -> 1968acfc09e3, add is_encrypted column to variable table
INFO  [alembic.runtime.migration] Running upgrade 1968acfc09e3 -> 2e82aab8ef20, rename user table
INFO  [alembic.runtime.migration] Running upgrade 2e82aab8ef20 -> 211e584da130, add TI state index
INFO  [alembic.runtime.migration] Running upgrade 211e584da130 -> 64de9cddf6c9, add task fails journal table
INFO  [alembic.runtime.migration] Running upgrade 64de9cddf6c9 -> f2ca10b85618, add dag_stats table
INFO  [alembic.runtime.migration] Running upgrade f2ca10b85618 -> 4addfa1236f1, Add fractional seconds to mysql tables
INFO  [alembic.runtime.migration] Running upgrade 4addfa1236f1 -> 8504051e801b, xcom dag task indices
INFO  [alembic.runtime.migration] Running upgrade 8504051e801b -> 5e7d17757c7a, add pid field to TaskInstance
INFO  [alembic.runtime.migration] Running upgrade 5e7d17757c7a -> 127d2bf2dfa7, Add dag_id/state index on dag_run table
INFO  [alembic.runtime.migration] Running upgrade 127d2bf2dfa7 -> cc1e65623dc7, add max tries column to task instance
INFO  [alembic.runtime.migration] Running upgrade cc1e65623dc7 -> bdaa763e6c56, Make xcom value column a large binary
INFO  [alembic.runtime.migration] Running upgrade bdaa763e6c56 -> 947454bf1dff, add ti job_id index
INFO  [alembic.runtime.migration] Running upgrade 947454bf1dff -> d2ae31099d61, Increase text size for MySQL (not relevant for other DBs' text types)
INFO  [alembic.runtime.migration] Running upgrade d2ae31099d61 -> 0e2a74e0fc9f, Add time zone awareness
INFO  [alembic.runtime.migration] Running upgrade d2ae31099d61 -> 33ae817a1ff4, kubernetes_resource_checkpointing
INFO  [alembic.runtime.migration] Running upgrade 33ae817a1ff4 -> 27c6a30d7c24, kubernetes_resource_checkpointing
INFO  [alembic.runtime.migration] Running upgrade 27c6a30d7c24 -> 86770d1215c0, add kubernetes scheduler uniqueness
INFO  [alembic.runtime.migration] Running upgrade 86770d1215c0, 0e2a74e0fc9f -> 05f30312d566, merge heads
INFO  [alembic.runtime.migration] Running upgrade 05f30312d566 -> f23433877c24, fix mysql not null constraint
INFO  [alembic.runtime.migration] Running upgrade f23433877c24 -> 856955da8476, fix sqlite foreign key
INFO  [alembic.runtime.migration] Running upgrade 856955da8476 -> 9635ae0956e7, index-faskfail
INFO  [alembic.runtime.migration] Running upgrade 9635ae0956e7 -> dd25f486b8ea, add idx_log_dag
INFO  [alembic.runtime.migration] Running upgrade dd25f486b8ea -> bf00311e1990, add index to taskinstance
INFO  [alembic.runtime.migration] Running upgrade 9635ae0956e7 -> 0a2a5b66e19d, add task_reschedule table
INFO  [alembic.runtime.migration] Running upgrade 0a2a5b66e19d, bf00311e1990 -> 03bc53e68815, merge_heads_2
INFO  [alembic.runtime.migration] Running upgrade 03bc53e68815 -> 41f5f12752f8, add superuser field
INFO  [alembic.runtime.migration] Running upgrade 41f5f12752f8 -> c8ffec048a3b, add fields to dag
INFO  [alembic.runtime.migration] Running upgrade c8ffec048a3b -> dd4ecb8fbee3, Add schedule interval to dag
INFO  [alembic.runtime.migration] Running upgrade dd4ecb8fbee3 -> 939bb1e647c8, task reschedule fk on cascade delete
INFO  [alembic.runtime.migration] Running upgrade c8ffec048a3b -> a56c9515abdc, Remove dag_stat table
INFO  [alembic.runtime.migration] Running upgrade 939bb1e647c8 -> 6e96a59344a4, Make TaskInstance.pool not nullable
INFO  [alembic.runtime.migration] Running upgrade 6e96a59344a4 -> 74effc47d867, change datetime to datetime2(6) on MSSQL tables
INFO  [alembic.runtime.migration] Running upgrade 939bb1e647c8 -> 004c1210f153, increase queue name size limit
(.sandbox) [airflow@localhost airflow]$ 

当然,您可以将数据库从 sqlite 移动到 postgres。
目前还不清楚哪种情况适合这种情况。
也不清楚 webserver 和 scheduler 是如何知道去哪里寻找配置的?
也许他们先查看airflow.cfg 以找出数据库在哪里,然后再查看数据库?这似乎是多余的。

【问题讨论】:

    标签: airflow


    【解决方案1】:

    db reset 将从元数据数据库中删除所有条目。这包括所有 dag 运行、变量和连接。

    db init 仅在安装气流时运行一次。

    一般来说,我们不太担心 dag 奔跑。但重新创建变量和连接可能会很烦人,因为它们通常包含秘密和敏感数据,出于安全最佳实践的考虑,这些数据可能不会被复制。

    db init 也是幂等的,因此可以随意运行,无需担心数据库更改。

    【讨论】:

    • 谢谢,我想知道它是否是幂等的!
    • 谢谢!我的安装出了点问题,以至于 DAG 没有在 Web 服务器上更新,但作为一种解决方法,如果我运行 $ airflow db init 它会刷新 DAG 信息!知道它不会影响运行历史非常有用
    【解决方案2】:

    上面的解释是对的,但是命令是错误的。 重置的工作命令是 - airflow db reset init (set db) 的工作命令是 - airflow db init

    【讨论】:

    猜你喜欢
    • 1970-01-01
    • 2020-06-26
    • 2016-12-21
    • 2020-05-05
    • 1970-01-01
    • 2021-04-01
    • 1970-01-01
    • 1970-01-01
    • 2021-12-09
    相关资源
    最近更新 更多