Airflow: DAG marked as "success" if one task fails, because of trigger rule ALL_DONE

Date: 2018-08-07 13:49:57

Tags: python-2.7 airflow

I have the following DAG with 3 tasks:

start --> special_task --> end

The task in the middle can succeed or fail, but end must always be executed (imagine this is a task that cleans up resources). For that, I used the trigger rule ALL_DONE:

end.trigger_rule = trigger_rule.TriggerRule.ALL_DONE

If special_task fails, end is still correctly executed. However, since end is the last task and succeeds, the DAG is always marked as SUCCESS.

How can I configure my DAG so that if one of the tasks fails, the whole DAG is marked as FAILED?

Example to reproduce:

import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator
from airflow.utils import trigger_rule

dag = DAG(
    dag_id='my_dag',
    start_date=datetime.datetime.today(),
    schedule_interval=None
)

start = BashOperator(
    task_id='start',
    bash_command='echo start',
    dag=dag
)

special_task = BashOperator(
    task_id='special_task',
    bash_command='exit 1', # force failure
    dag=dag
)

end = BashOperator(
    task_id='end',
    bash_command='echo end',
    dag=dag
)
end.trigger_rule = trigger_rule.TriggerRule.ALL_DONE

start.set_downstream(special_task)
special_task.set_downstream(end)

This post seems related, but its answer does not suit my needs, since the downstream task end must be executed (hence the mandatory trigger_rule).

3 Answers:

Answer 0 (score: 1):

As @JustinasMarozas explained in a comment, one solution is to create a dummy task like:

from airflow.operators.dummy_operator import DummyOperator  # import needed (Airflow 1.x path)

dummy = DummyOperator(
    task_id='test',
    dag=dag
)

and to bind it downstream of special_task:

special_task.set_downstream(dummy)

That way, the DAG is marked as failed, and the dummy task is marked as upstream_failed.
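
For completeness, a minimal sketch of this workaround applied to the question's DAG (assuming the Airflow 1.x import paths used in the question; the watcher task name is illustrative):

import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator
from airflow.operators.dummy_operator import DummyOperator
from airflow.utils.trigger_rule import TriggerRule

dag = DAG(
    dag_id='my_dag',
    start_date=datetime.datetime(2018, 8, 1),
    schedule_interval=None
)

start = BashOperator(task_id='start', bash_command='echo start', dag=dag)
special_task = BashOperator(task_id='special_task', bash_command='exit 1', dag=dag)

# end keeps ALL_DONE so it always runs, even when special_task fails
end = BashOperator(
    task_id='end',
    bash_command='echo end',
    trigger_rule=TriggerRule.ALL_DONE,
    dag=dag
)

# Leaf task with the default ALL_SUCCESS trigger rule: when special_task
# fails, watcher becomes upstream_failed and the DAG run is marked failed.
watcher = DummyOperator(task_id='watcher', dag=dag)

start >> special_task >> end
special_task >> watcher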

I was hoping for an out-of-the-box solution, but in the meantime this workaround does the job.

Answer 1 (score: 1):

I thought this was an interesting question and spent some time figuring out how to achieve it without an extra dummy task. It became a bit of a superfluous task, but here's the end result:

This is the full DAG:

import airflow
from airflow import AirflowException
from airflow.models import DAG, TaskInstance
from airflow.operators.bash_operator import BashOperator
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.python_operator import PythonOperator
from airflow.utils.db import provide_session
from airflow.utils.state import State
from airflow.utils.trigger_rule import TriggerRule

default_args = {"owner": "airflow", "start_date": airflow.utils.dates.days_ago(3)}

dag = DAG(
    dag_id="finally_task_set_end_state",
    default_args=default_args,
    schedule_interval="0 0 * * *",
    description="Answer for question https://stackoverflow.com/questions/51728441",
)

start = BashOperator(task_id="start", bash_command="echo start", dag=dag)
failing_task = BashOperator(task_id="failing_task", bash_command="exit 1", dag=dag)


@provide_session
def _finally(task, execution_date, dag, session=None, **_):
    upstream_task_instances = (
        session.query(TaskInstance)
        .filter(
            TaskInstance.dag_id == dag.dag_id,
            TaskInstance.execution_date == execution_date,
            TaskInstance.task_id.in_(task.upstream_task_ids),
        )
        .all()
    )
    upstream_states = [ti.state for ti in upstream_task_instances]
    fail_this_task = State.FAILED in upstream_states

    print("Do logic here...")

    if fail_this_task:
        raise AirflowException("Failing task because one or more upstream tasks failed.")


finally_ = PythonOperator(
    task_id="finally",
    python_callable=_finally,
    trigger_rule=TriggerRule.ALL_DONE,
    provide_context=True,
    dag=dag,
)

successful_task = DummyOperator(task_id="successful_task", dag=dag)

start >> [failing_task, successful_task] >> finally_

Look at the _finally function, which is called by the PythonOperator. There are a few key points here:

  1. Decorate it with @provide_session and add the argument session=None, so you can query the Airflow DB with session (see the sketch after this list for this pattern in isolation).
  2. Query all upstream task instances of the current task:

upstream_task_instances = (
    session.query(TaskInstance)
    .filter(
        TaskInstance.dag_id == dag.dag_id,
        TaskInstance.execution_date == execution_date,
        TaskInstance.task_id.in_(task.upstream_task_ids),
    )
    .all()
)

  3. From the returned task instances, take their states and check whether State.FAILED is among them:

upstream_states = [ti.state for ti in upstream_task_instances]
fail_this_task = State.FAILED in upstream_states

  4. Perform your own logic:

print("Do logic here...")

  5. Finally, fail the task if fail_this_task=True:

if fail_this_task:
    raise AirflowException("Failing task because one or more upstream tasks failed.")
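
As a standalone illustration of point 1, here is a hedged sketch of the @provide_session pattern (assuming Airflow 1.x, where the decorator lives in airflow.utils.db; the count_failed helper is hypothetical):

from airflow.models import TaskInstance
from airflow.utils.db import provide_session
from airflow.utils.state import State

@provide_session
def count_failed(dag_id, execution_date, session=None):
    # The decorator injects a SQLAlchemy session (and closes it afterwards)
    # whenever the caller does not pass one explicitly.
    return (
        session.query(TaskInstance)
        .filter(
            TaskInstance.dag_id == dag_id,
            TaskInstance.execution_date == execution_date,
            TaskInstance.state == State.FAILED,
        )
        .count()
    )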

The end result:

[screenshot: the DAG run is marked as failed, with the finally task in failed state]

Answer 2 (score: 0):

To expand on Bas Harenslak's answer, a simpler _finally function which checks the state of all tasks (not only the upstream ones) can be:

from airflow.utils.state import State  # import needed for State.SUCCESS


def _finally(**kwargs):
    # Inspect every task instance of the current DAG run, skipping this task itself.
    for task_instance in kwargs['dag_run'].get_task_instances():
        if task_instance.current_state() != State.SUCCESS and \
                task_instance.task_id != kwargs['task_instance'].task_id:
            raise Exception("Task {} failed. Failing this DAG run".format(task_instance.task_id))
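
As with the previous answer, this callable would be wired into the DAG through a PythonOperator with the ALL_DONE trigger rule. A minimal sketch, assuming Airflow 1.x and an existing dag object, mirroring the operator configuration from the previous answer:

from airflow.operators.python_operator import PythonOperator
from airflow.utils.trigger_rule import TriggerRule

finally_ = PythonOperator(
    task_id="finally",
    python_callable=_finally,
    trigger_rule=TriggerRule.ALL_DONE,  # run no matter how the other tasks ended
    provide_context=True,               # injects dag_run and task_instance into kwargs
    dag=dag,
)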