【Question title】: Apache airflow Bash_operator success while python file failed
【Posted】: 2020-11-25 08:04:03
【Question】:

I have the following log in the Apache Airflow GUI:

*** Reading local file: /home/ubuntu/airflow/logs/risk_position/delta_risk/2020-11-25T06:38:38.444673+00:00/1.log
[2020-11-25 06:52:40,950] {taskinstance.py:670} INFO - Dependencies all met for <TaskInstance: risk_position.delta_risk 2020-11-25T06:38:38.444673+00:00 [queued]>
[2020-11-25 06:52:40,964] {taskinstance.py:670} INFO - Dependencies all met for <TaskInstance: risk_position.delta_risk 2020-11-25T06:38:38.444673+00:00 [queued]>
[2020-11-25 06:52:40,965] {taskinstance.py:880} INFO - 
--------------------------------------------------------------------------------
[2020-11-25 06:52:40,965] {taskinstance.py:881} INFO - Starting attempt 1 of 6
[2020-11-25 06:52:40,965] {taskinstance.py:882} INFO - 
--------------------------------------------------------------------------------
[2020-11-25 06:52:40,974] {taskinstance.py:901} INFO - Executing <Task(BashOperator): delta_risk> on 2020-11-25T06:38:38.444673+00:00
[2020-11-25 06:52:40,977] {standard_task_runner.py:54} INFO - Started process 18650 to run task
[2020-11-25 06:52:41,002] {standard_task_runner.py:77} INFO - Running: ['airflow', 'run', 'risk_position', 'delta_risk', '2020-11-25T06:38:38.444673+00:00', '--job_id', '64', '--pool', 'default_pool', '--raw', '-sd', '/home/ubuntu/.local/lib/python3.6/site-packages/airflow/example_dags/risk_position.py', '--cfg_path', '/tmp/tmp1kqxd_yj']
[2020-11-25 06:52:41,003] {standard_task_runner.py:78} INFO - Job 64: Subtask delta_risk
[2020-11-25 06:52:41,024] {logging_mixin.py:112} INFO - Running %s on host %s <TaskInstance: risk_position.delta_risk 2020-11-25T06:38:38.444673+00:00 [running]> ip-************.ap-northeast-1.compute.internal
[2020-11-25 06:52:41,035] {bash_operator.py:113} INFO - Tmp dir root location: 
 /tmp
[2020-11-25 06:52:41,036] {bash_operator.py:136} INFO - Temporary script location: /tmp/airflowtmpowss08ak/delta_riskhmnyrm0e
[2020-11-25 06:52:41,036] {bash_operator.py:146} INFO - Running command: /home/ubuntu/extra/cronjobs/unify_report.sh delta_risk || exit 1 
[2020-11-25 06:52:41,042] {bash_operator.py:153} INFO - Output:
[2020-11-25 06:52:41,044] {bash_operator.py:157} INFO - delta_risk report
[2020-11-25 06:52:41,046] {bash_operator.py:157} INFO - output  /home/ubuntu/market_risk/delta_risk/output 2020-11-24delta_risk.xlsx
[2020-11-25 06:52:41,046] {bash_operator.py:157} INFO - exists /home/ubuntu/extra/cronjobs/unify_reports
[2020-11-25 06:52:41,971] {bash_operator.py:161} INFO - Command exited with return code 0
[2020-11-25 06:52:41,976] {taskinstance.py:1070} INFO - Marking task as SUCCESS.dag_id=risk_position, task_id=delta_risk, execution_date=20201125T063838, start_date=20201125T065240, end_date=20201125T065241
[2020-11-25 06:52:45,928] {local_task_job.py:102} INFO - Task exited with return code 0

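As the log shows, the bash command (Running command: /home/ubuntu/extra/cronjobs/unify_report.sh delta_risk || exit 1) runs the delta_risk.py file. That .py file raises a traceback and does not complete correctly. I added || exit 1 hoping it would also propagate an error from the bash script and make the task fail in the Airflow GUI. Instead, the task is marked as SUCCESS in Airflow. When the Python file fails with a traceback, I want the Airflow task to fail with it.

How can I achieve this?

Edit:

Here is the part of the .sh file that runs the Python file and creates the log file: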

(
    # run each code to generate files
    cd "${SOURCE_PATH_base}/${SUB_PATH}"
    python3 $REPORT_TYPE.py || exit 1
    # output file link
    dir=$top_dir/output_$REPORT_TYPE
    echo $dir 'output link'
    if [ -e $dir ]; then
        echo 'exists' $dir
    else
        # create link to real path
        echo 'no output link' $dir
        echo 'real output path' ${OUTPATH}
        cd ${TOPDIR}/${JOBDIR}
        trap `ln -s "${OUTPATH}" "output_${REPORT_TYPE}"` 1 2 3 15
        echo 'created output link' $dir
    fi
    # upload to gdrive
    cd "${SOURCE_PATH_base}/${INTERAPI_PATH}"
    python3 teamdrive_control.py Market_Risk $REPORT_TYPE "${OUTPATH}/${OUT}" "${TO}" "${FILENO_FLAG}" "${SUBJECT}"
) 2>&1 | xz -9ec > logs/${JOBDIR}-$(date +%s)_$REPORT_TYPE.log.xz

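【Comments】:

  • Probably the script is masking the error. We can't see its source, so we can't tell you how to fix it. A properly written shell script propagates failure to its caller (a poorly written one might simply exit 0 at the end regardless of what happened earlier).
  • @tripleee Thanks for your comment. Do you think that creating the log the way I do in the sh file could "mask" the error as you mentioned?
  • Yes, the exit status of a pipeline is the exit status of the last command in the pipeline.

Tags: bash airflow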


【Answer 1】:

Indeed, the pipeline in your script will mask any errors inside the subshell.
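The masking can be reproduced in isolation. The following is a minimal sketch, assuming the script runs under bash; failing_job is a hypothetical stand-in for the subshell that runs the Python file, and cat stands in for xz. It compares the default pipeline status with PIPESTATUS and with set -o pipefail:

```shell
#!/usr/bin/env bash
# Stand-in for the subshell that runs the failing Python script (hypothetical).
failing_job() {
    echo "Traceback (most recent call last): ..."
    exit 1   # only leaves the subshell that the pipeline creates for this stage
}

# Default: the pipeline reports the status of its last command (cat here).
failing_job 2>&1 | cat > /dev/null
masked_status=$?

# PIPESTATUS records the status of each stage of the most recent pipeline.
failing_job 2>&1 | cat > /dev/null
first_stage_status=${PIPESTATUS[0]}

# pipefail: the pipeline fails if any stage fails.
set -o pipefail
pipefail_status=0
failing_job 2>&1 | cat > /dev/null || pipefail_status=$?
set +o pipefail

echo "masked=$masked_status first_stage=$first_stage_status pipefail=$pipefail_status"
```

Under bash this prints masked=0 first_stage=1 pipefail=1. The first value is the status Airflow ends up seeing when the script finishes with an unguarded pipeline.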

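I'm guessing you use the subshell simply so that the output redirection applies to the whole group of commands. A better design is to put the code in a function, which can then also perform a non-local exit in case of a fatal error.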

fun () {
    # run each code to generate files
    cd "${SOURCE_PATH_base}/${SUB_PATH}"
    python3 "$REPORT_TYPE.py" || exit 1
    # output file link
    dir=$top_dir/output_$REPORT_TYPE
    echo "$dir output link"
    if [ -e "$dir" ]; then
        echo "exists $dir"
    else
        # create link to real path
        echo "no output link $dir"
        echo "real output path ${OUTPATH}"
        cd "${TOPDIR}/${JOBDIR}"
        # This is really weird; did you mean to put single quotes?
        trap `ln -s "${OUTPATH}" "output_${REPORT_TYPE}"` 1 2 3 15
        echo "created output link $dir"
    fi
    # upload to gdrive
    cd "${SOURCE_PATH_base}/${INTERAPI_PATH}"
    python3 teamdrive_control.py Market_Risk "$REPORT_TYPE" "${OUTPATH}/${OUT}" "${TO}" "${FILENO_FLAG}" "${SUBJECT}" || exit
}

# Assumes bash: without pipefail, the status of xz would hide a failure inside fun
set -o pipefail
fun 2>&1 | xz -9ec > "logs/${JOBDIR}-$(date +%s)_$REPORT_TYPE.log.xz"

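The trap looks really wrong; did you mean to use single quotes there? The current code will try to run ln -s immediately and then use its output as the trap handler.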
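The difference between the two quoting styles can be sketched as follows. This is an illustration only: it uses an EXIT trap inside a subshell instead of signals 1 2 3 15 so that it is safe to run, and the marker file and messages are hypothetical.

```shell
#!/usr/bin/env bash
marker=$(mktemp)   # scratch file that records the order in which things ran

# Backticks are command substitution: the echo runs while the trap line itself
# is executed. Its output is empty, so the shell effectively runs "trap EXIT".
(
    trap `echo "ran immediately" >> "$marker"` EXIT
    echo "end of first subshell" >> "$marker"
)
backtick_first=$(head -n 1 "$marker")

: > "$marker"      # truncate the scratch file

# Single quotes: the handler text is stored and only runs when the subshell exits.
(
    trap 'echo "ran from handler" >> "$marker"' EXIT
    echo "end of second subshell" >> "$marker"
)
quoted_first=$(head -n 1 "$marker")
quoted_last=$(tail -n 1 "$marker")

rm -f "$marker"
echo "backticks, first line:     $backtick_first"
echo "single quotes, first line: $quoted_first"
echo "single quotes, last line:  $quoted_last"
```

With backticks the command has already run before any handler is installed; with single quotes it runs last, from the handler.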

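Notice also the various quoting fixes; you were generally quoting things that don't require quoting, and leaving out quotes around things that absolutely need them. Perhaps see also When to wrap quotes around a shell variable; and probably get into the habit of running http://shellcheck.net/ on your code.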
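Why the quotes matter can be seen with a small sketch. It assumes the default IFS and a hypothetical REPORT_TYPE value that contains a space; count_args is a helper introduced only for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints how many arguments it received.
count_args() {
    echo "$#"
}

REPORT_TYPE="delta risk"    # hypothetical value containing a space

# Unquoted: the expansion is split on whitespace into two separate arguments.
unquoted=$(count_args $REPORT_TYPE.py)

# Quoted: the expansion stays a single argument, as python3 would need.
quoted=$(count_args "$REPORT_TYPE.py")

echo "unquoted=$unquoted quoted=$quoted"
```

This prints unquoted=2 quoted=1, which is why python3 $REPORT_TYPE.py would receive two file names if the variable ever contained whitespace.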

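It doesn't matter here, since you are redirecting everything to the same file anyway, but diagnostic messages should generally be sent to standard error. A good practice is to include the name of the script that prints the diagnostic, so that you can see where it came from even when you have scripts calling scripts calling scripts, etc.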

echo "$0: values of β will lead to dom" >&2
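One way to apply this consistently is a small helper function. The sketch below is an assumption rather than part of the original script: warn and report are hypothetical names, and the two captures only show that diagnostics and data travel on separate streams.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prefix each diagnostic with the script name, send it to stderr.
warn() {
    echo "$0: $*" >&2
}

# Hypothetical step that emits both a diagnostic and real output.
report() {
    warn "no output link found"
    echo "report data"
}

stdout_only=$(report 2>/dev/null)        # diagnostics discarded, data kept
stderr_only=$(report 2>&1 >/dev/null)    # data discarded, diagnostics kept

echo "stdout: $stdout_only"
echo "stderr: $stderr_only"
```

Keeping the two streams apart means a caller can log or compress diagnostics without mixing them into the data, and the $0 prefix shows which script in a chain produced each message.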
