How to use MySqlOperator with XCom in Airflow?

Asked: 2018-10-04 11:46:54

Tags: airflow

I read "How to use airflow xcoms with MySqlOperator". Although the title is similar, it doesn't really address my problem.

I have the following code:

def branch_func_is_new_records(**kwargs):
    ti = kwargs['ti']
    xcom = ti.xcom_pull(task_ids='query_get_max_order_id')
    string_to_print = 'Value in xcom is: {}'.format(xcom)
    logging.info(string_to_print)
    if int(xcom) > int(LAST_IMPORTED_ORDER_ID):
        return 'import_orders'
    else:
        return 'skip_operation'

query_get_max_order_id = 'SELECT COALESCE(max(orders_id),0) FROM warehouse.orders where orders_id>1 limit 10'
get_max_order_id = MySqlOperator(
    task_id='query_get_max_order_id',
    sql=query_get_max_order_id,
    mysql_conn_id=MyCon,
    xcom_push=True,
    dag=dag)

branch_op_is_new_records = BranchPythonOperator(
    task_id='branch_operation_is_new_records',
    provide_context=True,
    python_callable=branch_func_is_new_records,
    dag=dag)

get_max_order_id >> branch_op_is_new_records >> import_orders
branch_op_is_new_records >> skip_operation

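The MySqlOperator returns a number, and the BranchPythonOperator chooses the next task based on that number. The value returned by the MySqlOperator is guaranteed to be greater than 0.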

My problem is that the MySqlOperator pushes nothing to XCom. When I open XCom in the UI, I see nothing. The BranchPythonOperator therefore reads nothing, and my code fails.

Why doesn't XCom work here?

1 Answer:

Answer 0: (score: 3)

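Currently the MySQL operator (Airflow 1.10.0 at the time of writing) does not support returning anything in XCom, so the workaround for now is to write a small operator yourself. You can do this directly in your DAG file (untested, so there may be some silly mistakes):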

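import logging

from airflow.operators.mysql_operator import MySqlOperator as BaseMySqlOperator
from airflow.operators.python_operator import BranchPythonOperator
from airflow.hooks.mysql_hook import MySqlHook

class ReturningMySqlOperator(BaseMySqlOperator):
    # Same as MySqlOperator, but execute() returns the first result row.
    # Airflow pushes a non-None return value of execute() to XCom.
    def execute(self, context):
        self.log.info('Executing: %s', self.sql)
        hook = MySqlHook(mysql_conn_id=self.mysql_conn_id,
                         schema=self.database)
        return hook.get_first(
            self.sql,
            parameters=self.parameters)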


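def branch_func_is_new_records(**kwargs):
    ti = kwargs['ti']
    # get_first() returns the first result row, so XCom holds a one-element row.
    xcom = ti.xcom_pull(task_ids='query_get_max_order_id')
    string_to_print = 'Value in xcom is: {}'.format(xcom)
    logging.info(string_to_print)
    # LAST_IMPORTED_ORDER_ID is assumed to be defined elsewhere, as in the question.
    if int(xcom[0]) > int(LAST_IMPORTED_ORDER_ID):
        return 'import_orders'
    else:
        return 'skip_operation'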

query_get_max_order_id = 'SELECT COALESCE(max(orders_id),0) FROM warehouse.orders where orders_id>1 limit 10'
get_max_order_id = ReturningMySqlOperator(
    task_id='query_get_max_order_id',
    sql=query_get_max_order_id,
    mysql_conn_id=MyCon,
    # xcom_push=True is not needed: the value returned by execute() is pushed to XCom.
    dag=dag)

branch_op_is_new_records = BranchPythonOperator(
    task_id='branch_operation_is_new_records',
    provide_context=True,
    python_callable=branch_func_is_new_records,
    dag=dag)

get_max_order_id >> branch_op_is_new_records >> import_orders
branch_op_is_new_records >> skip_operation
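
One thing to watch: hook.get_first returns the first result row rather than a bare number, so the value pulled from XCom is most likely a one-element tuple (or a list, depending on how XCom serializes it), not an int. The branching decision can be isolated in a small helper that unwraps the row first. This is only a sketch that runs without Airflow; first_column and choose_branch are hypothetical names, and the task ids are the ones from the question.

```python
def first_column(row, default=0):
    """Return the first column of a fetched row, or `default` if nothing usable is there.

    Hypothetical helper: `row` is whatever xcom_pull returned, assumed to be
    a tuple/list like (42,), a bare scalar, or None when nothing was pushed.
    """
    if row is None:
        return default
    if isinstance(row, (list, tuple)):
        # An empty row or a NULL first column falls back to the default.
        if not row or row[0] is None:
            return default
        return row[0]
    # Already a scalar (for example if an upstream task unwrapped the row).
    return row


def choose_branch(row, last_imported_order_id):
    """Return the task_id to follow, using the task ids from the question."""
    max_order_id = int(first_column(row))
    if max_order_id > int(last_imported_order_id):
        return 'import_orders'
    return 'skip_operation'
```

Inside branch_func_is_new_records this would be called as choose_branch(ti.xcom_pull(task_ids='query_get_max_order_id'), LAST_IMPORTED_ORDER_ID). Keeping the comparison outside the callable also makes it easy to check without running a DAG.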