Airflow - run a task if any task fails

Date: 2019-09-27 14:45:24

Tags: airflow

I have a problem: if any task in the DAG fails, a task should run automatically to "reset" the table and end the process. Example:

# Task that needs to be performed if any of the tasks below fails
drop_bq_op = BigQueryOperator(
    task_id='drop_bq',
    use_legacy_sql=False,
    allow_large_results=True,
    bql="""DELETE FROM dataset.table1 WHERE ID IS NOT NULL""",
    bigquery_conn_id='gcp',
    dag=dag)

#task1
MsSql = MsSqlToGoogleCloudStorageOperator(
    task_id='import',
    mssql_conn_id=mssql,
    google_cloud_storage_conn_id='gcp',
    sql=sql_query,
    bucket=nm_bucket,
    filename=nm_arquivo,
    schema_filename=sc_arquivo,
    dag=dag)

#task2
Google = GoogleCloudStorageToBigQueryOperator(
    task_id='gcs_to_bq',
    bucket='bucket',
    source_objects=[nm_arquivo],
    destination_project_dataset_table=dataset_bq_tbl,
    schema_fields=sc_tbl_bq,
    source_format='NEWLINE_DELIMITED_JSON',
    create_disposition='CREATE_IF_NEEDED',
    write_disposition=wrt_disposition,
    time_partitioning=tp_particao,
    cluster_fields=nm_cluster,
    bigquery_conn_id='gcp',
    google_cloud_storage_conn_id='gcp',
    dag=dag
)

task_3 = BigQueryOperator(
    task_id='test3',
    use_legacy_sql=False,
    allow_large_results=True,
    bql="""select ...""",
    bigquery_conn_id='gcp',
    dag=dag)

Update: I added the following code to the script:

import uuid

from google.cloud import bigquery


def delete_bigquery():
    """Delete all rows from dataset.table1 in BigQuery."""
    client = bigquery.Client()
    query = "DELETE FROM dataset.table1 WHERE ID IS NOT NULL"
    dataset = client.dataset('dataset')
    table = dataset.table(name='table1')
    job_name = 'delete_{}'.format(uuid.uuid4())
    job = client.run_async_query(job_name, query)
    job.destination = table
    job.write_disposition = 'WRITE_TRUNCATE'
    job.begin()
    return job.state
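
A side note on the delete_bigquery helper: run_async_query belongs to an older google-cloud-bigquery client API (the current client exposes Client.query instead), and a DML DELETE needs neither a destination table nor a write disposition. A minimal sketch of the same cleanup against the current client API, reusing the dataset.table1 name from the question, purely for comparison:

from google.cloud import bigquery

def delete_bigquery():
    """Delete all rows from dataset.table1 using the current BigQuery client API."""
    client = bigquery.Client()
    query_job = client.query("DELETE FROM dataset.table1 WHERE ID IS NOT NULL")
    query_job.result()  # block until the DML statement finishes
    return query_job.state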

cleanup_task = PythonOperator(task_id="cleanup_task",
                              python_callable=delete_bigquery,
                              trigger_rule=TriggerRule.ONE_FAILED,
                              dag=dag)

[gcs_to_bq.set_upstream(import), task_3.set_upstream(gcs_to_bq)] >> cleanup_task

Now, when I upload the DAG again, I get this error:

Broken DAG: [dag.py] Relationships can only be set between Operators; received NoneType

1 answer:

Answer 0 (score: 0):

# refer code here https://github.com/apache/airflow/blob/master/airflow/utils/trigger_rule.py#L28
from airflow.utils.trigger_rule import TriggerRule
..
cleanup_task = PythonOperator(dag=dag,
                              task_id="cleanup_task",
                              ..
                              trigger_rule=TriggerRule.ONE_FAILED,
                              ..)
..
# all tasks that must be cleaned up should have `cleanup_task` in their downstream
[my_task_1, my_task_2, my_task_3] >> cleanup_task
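
As for the error in the question itself: set_upstream() returns None, so the list on the offending line is a list of None values and the >> operator has nothing to shift into cleanup_task, which is exactly what "Relationships can only be set between Operators; received NoneType" is complaining about. Set the pipeline dependencies and the fan-in to the cleanup task as separate statements instead. A minimal sketch using the operator variables from the question (MsSql, Google, task_3, cleanup_task); this illustrates the pattern rather than reproducing the answerer's exact code:

# Chain the pipeline tasks in order: import -> load to BigQuery -> final query.
MsSql >> Google >> task_3

# Fan every pipeline task into cleanup_task; with trigger_rule=ONE_FAILED,
# cleanup_task runs as soon as any one of its upstream tasks fails.
[MsSql, Google, task_3] >> cleanup_task

The same trigger_rule=TriggerRule.ONE_FAILED argument could just as well be placed on the drop_bq_op BigQueryOperator defined at the top of the question instead of a separate PythonOperator, if the reset should stay in BigQuery SQL.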