Ooops ... AttributeError when clearing failed task state in Airflow

Date: 2018-01-16 16:29:51

Tags: airflow apache-airflow

I am trying to clear a failed task so that it will run again.

I usually do this using the web GUI from the tree view:

[screenshot: tree view showing the failed task and the Clear popup]

After selecting "Clear", I am directed to an error page:

[screenshot: error page]

The traceback on this page is the same as the error I receive when trying to clear this task using the CLI:

[u@airflow01 ~]# airflow clear -s 2002-07-29T20:25:00 -t coverage_check gom_modis_aqua_coverage_check
[2018-01-16 16:21:04,235] {__init__.py:57} INFO - Using executor CeleryExecutor
[2018-01-16 16:21:05,192] {models.py:167} INFO - Filling up the DagBag from /root/airflow/dags
Traceback (most recent call last):
  File "/usr/bin/airflow", line 28, in <module>
    args.func(args)
  File "/usr/lib/python3.4/site-packages/airflow/bin/cli.py", line 612, in clear
    include_upstream=args.upstream,
  File "/usr/lib/python3.4/site-packages/airflow/models.py", line 3173, in sub_dag
    dag = copy.deepcopy(self)
  File "/usr/lib64/python3.4/copy.py", line 166, in deepcopy
    y = copier(memo)
  File "/usr/lib/python3.4/site-packages/airflow/models.py", line 3159, in __deepcopy__
    setattr(result, k, copy.deepcopy(v, memo))
  File "/usr/lib64/python3.4/copy.py", line 155, in deepcopy
    y = copier(x, memo)
  File "/usr/lib64/python3.4/copy.py", line 246, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib64/python3.4/copy.py", line 166, in deepcopy
    y = copier(memo)
  File "/usr/lib/python3.4/site-packages/airflow/models.py", line 2202, in __deepcopy__
    setattr(result, k, copy.deepcopy(v, memo))
  File "/usr/lib64/python3.4/copy.py", line 155, in deepcopy
    y = copier(x, memo)
  File "/usr/lib64/python3.4/copy.py", line 246, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib64/python3.4/copy.py", line 182, in deepcopy
    y = _reconstruct(x, rv, 1, memo)
  File "/usr/lib64/python3.4/copy.py", line 309, in _reconstruct
    y.__dict__.update(state)
AttributeError: 'NoneType' object has no attribute 'update'

I am looking for ideas on what might be causing this, what I should do to fix it, and how to avoid it in the future.

I was able to work around the issue by deleting the task record using the "Browse > Task Instances" search, but I would still like to get to the bottom of this, as I have seen it several times.
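For reference, the same cleanup can be scripted against Airflow's metadata database. Here is a minimal sketch using Airflow's ORM session (the dag/task IDs are taken from the CLI call above; treat this as illustrative and adjust the filter to your case):

    from datetime import datetime

    from airflow import settings
    from airflow.models import TaskInstance

    session = settings.Session()
    # delete the task-instance record; same effect as the
    # "Browse > Task Instances" delete in the UI
    session.query(TaskInstance).filter(
        TaskInstance.dag_id == 'gom_modis_aqua_coverage_check',
        TaskInstance.task_id == 'coverage_check',
        TaskInstance.execution_date == datetime(2002, 7, 29, 20, 25),
    ).delete(synchronize_session=False)
    session.commit()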

Although my DAG code is getting increasingly complex, here is an excerpt showing the definition of the operator in the dag:

    trigger_granule_dag_id = 'trigger_' + process_pass_dag_name
    coverage_check = BranchPythonOperator(
        task_id='coverage_check',
        python_callable=_coverage_check,
        provide_context=True,
        retries=10,
        retry_delay=timedelta(hours=3),
        queue=QUEUE.PYCMR,
        op_kwargs={
            'roi':region,
            'success_branch_id': trigger_granule_dag_id
        }
    )

The full source can be browsed at github/USF-IMARS/imars_dags. Below are links to the most relevant sections:

2 Answers:

Answer 0 (score: 1)

Below is a sample DAG I created to mimic the error you are facing.

from datetime import datetime, timedelta

import boto3
from airflow import DAG
from airflow import configuration as conf
from airflow.operators import ShortCircuitOperator, DummyOperator


def athena_data_validation(**kwargs):
    pass


start_date = datetime.now()

args = {
    'owner': 'airflow',
    'start_date': start_date,
    'depends_on_past': False,
    'wait_for_downstream': False,
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(seconds=30)
}

dag_name = 'data_validation_dag'

schedule_interval = None  

dag = DAG(
    dag_id=dag_name,
    default_args=args,
    schedule_interval=schedule_interval)

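# NOTE: the client is created at module level and passed via op_kwargs below,
# so it becomes part of the task object that Airflow deep-copies on clear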
athena_client = boto3.client('athena', region_name='us-west-2')

DAG_SCRIPTS_DIR = conf.get('core', 'DAGS_FOLDER') + "/data_validation/"

start_task = DummyOperator(task_id='Start_Task', dag=dag)

end_task = DummyOperator(task_id='End_Task', dag=dag)

data_validation_task = ShortCircuitOperator(
    task_id='Data_Validation',
    provide_context=True,
    python_callable=athena_data_validation,
    op_kwargs={
        'athena_client': athena_client,
        'sql_file': DAG_SCRIPTS_DIR + 'data_validation.sql',
        's3_output_path': 's3://XXX/YYY/'
    },
    dag=dag)
data_validation_task.set_upstream(start_task)
data_validation_task.set_downstream(end_task)

After one successful run, I tried to clear the Data_Validation task and got the same error.

I removed the athena_client object creation from the module level and moved it inside the athena_data_validation function, and then it worked. So when we perform a clear in the Airflow UI, it tries to do a deepcopy and fetch all the objects from the previous run. I am still trying to understand why it cannot obtain a copy of this object type, but I have a workaround that works for me.
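As a sketch of that workaround against the sample DAG above: the client construction moves inside the callable, and athena_client is dropped from op_kwargs, so no boto object ever hangs off the operator that clear() deep-copies:

    import boto3

    def athena_data_validation(**kwargs):
        # built at run time inside the task, never stored on the DAG/operator
        athena_client = boto3.client('athena', region_name='us-west-2')
        # ... run the validation query with athena_client ...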

Answer 1 (score: 1)

During some operations, Airflow deep-copies certain objects. Unfortunately, some objects do not allow this. The boto client is a good example of something that does not deepcopy nicely; thread objects are another, but large objects with nested references (such as a reference back to a parent task) can also cause problems.
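A standalone illustration (nothing Airflow-specific): a thread lock fails the same way under copy.deepcopy, though with a TypeError rather than the AttributeError above:

    import copy
    import threading

    lock = threading.Lock()
    try:
        copy.deepcopy(lock)
    except TypeError as exc:
        # e.g. "can't pickle _thread.lock objects"; wording varies by Python version
        print(exc)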

In general, you don't want to instantiate a client in the DAG code itself. That said, I don't think that is your problem here, although I don't have access to the pyCMR code to check whether it could be the issue.