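Abort an Ansible playbook if a host is unreachable

Date: 2014-09-19 09:18:47

Tags: ansible ansible-playbook

Is there a good way to require that every host a set of tasks will run on is actually reachable?

I am currently trying to make Ansible handle an update that could be painful if the relevant nodes are not all updated in sync.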

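4 answers:

Answer 0: (score: 5)

I was about to post a question of my own when I saw this one. The answer suggested by Duncan does not work, at least in my case, where a host is unreachable. All of my playbooks specify a max_fail_percentage of 0.

However, Ansible will happily run every task on the hosts it can reach and act on. What I really want is for no task to run at all if any host is unreachable.

What I found is a simple solution that may be considered hacky, and I am open to better answers.

As the first step of running a playbook, Ansible gathers facts for all hosts. If a host is unreachable, no facts are gathered for it. So I wrote a simple play at the very beginning of my playbook that uses a fact. If a host is unreachable, the task fails with an "undefined variable" error. The task is only a dummy task, and it always passes when all hosts are reachable.

See my example below: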

- name: Check Ansible connectivity to all hosts
  hosts: host_all
  user: "{{ remote_user }}"
  sudo: "{{ sudo_required }}"
  sudo_user: root
  connection: ssh # or paramiko
  max_fail_percentage: 0
  tasks:
    - name: check connectivity to hosts (Dummy task)
      shell: echo " {{ hostvars[item]['ansible_hostname'] }}"
      with_items: "{{ groups['host_all'] }}"
      register: cmd_output

    - name: debug ...
      debug: var=cmd_output

If a host is unreachable, you will get an error like this:

TASK: [c.. ***************************************************** 
fatal: [172.22.191.160] => One or more undefined variables: 'dict object'    has no attribute 'ansible_hostname' 
fatal: [172.22.191.162] => One or more undefined variables: 'dict object' has no attribute 'ansible_hostname'

FATAL: all hosts have already failed -- aborting

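Note: if your host group is not called host_all, you must change the dummy task to reflect that name.

Answer 1: (score: 3)

You can combine any_errors_fatal: true or max_fail_percentage: 0 with gather_facts: false, and then run a task that will fail if a host is offline. Something like this at the top of your playbook should do what you need: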

- hosts: all
  gather_facts: false
  max_fail_percentage: 0
  tasks:
    - action: ping

As a bonus, this also works with the -l SUBSET option for limiting the matching hosts.
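The answer names any_errors_fatal but only shows the max_fail_percentage variant. A minimal sketch of the other variant might look like the following. It assumes an Ansible release in which an unreachable host triggers any_errors_fatal, which is worth confirming on your own version before relying on it. The task name is made up for illustration.

```yaml
# Sketch only: connectivity gate using any_errors_fatal instead of max_fail_percentage.
- hosts: all
  gather_facts: false      # skip fact gathering so the ping task is the first contact with each host
  any_errors_fatal: true   # assumption: an unreachable host counts as a fatal error on your Ansible version
  tasks:
    - name: Verify that every targeted host responds (illustrative name)
      ping:
```

As with the play above, this would sit at the top of the playbook so that it runs before any play that changes the hosts.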

Answer 2: (score: 1)

You can add max_fail_percentage to your playbook, something like this:

- hosts: all_boxes
  max_fail_percentage: 0
  roles:
    - common
  pre_tasks:
    - include: roles/common/tasks/start-time.yml
    - include: roles/common/tasks/debug.yml

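This way you can decide how many failures you are willing to tolerate. Here is the relevant section from the Ansible Documentation:

By default, Ansible will continue executing actions as long as there are hosts in the group that have not yet failed. In some situations, such as with the rolling updates described above, it may be desirable to abort the play when a certain threshold of failures has been reached. To achieve this, as of version 1.3 you can set a maximum failure percentage on a play as follows:

- hosts: webservers
  max_fail_percentage: 30
  serial: 10

In the above example, if more than 3 of the 10 servers in the group were to fail, the rest of the play would be aborted.

Note: The percentage set must be exceeded, not equaled. For example, if serial were set to 4 and you wanted the task to abort when 2 of the systems failed, the percentage should be set at 49 rather than 50.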
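To make the "exceeded, not equaled" rule concrete, here is a sketch of the serial: 4 case from the note. The webservers group is taken from the quoted documentation; the ping task is only a placeholder for the real work.

```yaml
# Sketch only: abort a batch of 4 as soon as 2 hosts have failed.
- hosts: webservers
  serial: 4                  # process hosts in batches of 4
  max_fail_percentage: 49    # 2 failures out of 4 is 50%, which exceeds 49, so the play aborts
  tasks:
    - name: Placeholder standing in for the real update task
      ping:
```

With max_fail_percentage: 50, the same two failures would only equal the threshold, and the play would carry on.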

Answer 3: (score: 0)

Inspired by other questions and answers: https://stackoverflow.com/a/55219490/457589

Using ansible-playbook 2.7.8.

To me, it is more explicit to check that every required host has ansible_facts.

# my-playbook.yml
- hosts: myservers
  tasks:
    - name: Check ALL hosts are reacheable before doing the release
      fail:
        msg: >
          [REQUIRED] ALL hosts to be reachable, so flagging {{ inventory_hostname }} as failed,
          because host {{ item }} has no facts, meaning it is UNREACHABLE.
      when: "hostvars[item].ansible_facts|list|length == 0"
      with_items: "{{ groups.myservers }}"

    - debug:
        msg: "Will only run if all hosts are reacheable"

$ ansible-playbook -i my-inventory.yml my-playbook.yml

PLAY [myservers] *************************************************************************************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************************************
fatal: [my-host-03]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname my-host-03: Name or service not known", "unreachable": true}
fatal: [my-host-04]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname my-host-04: Name or service not known", "unreachable": true}
ok: [my-host-02]
ok: [my-host-01]

TASK [Check ALL hosts are reacheable before doing the release] ********************************************************************************************************************************************************************************************************************
failed: [my-host-01] (item=my-host-03) => {"changed": false, "item": "my-host-03", "msg": "[REQUIRED] ALL hosts to be reachable, so flagging my-host-01 as failed, because host my-host-03 has no facts, meaning it is UNREACHABLE."}
failed: [my-host-01] (item=my-host-04) => {"changed": false, "item": "my-host-04", "msg": "[REQUIRED] ALL hosts to be reachable, so flagging my-host-01 as failed, because host my-host-04 has no facts, meaning it is UNREACHABLE."}
failed: [my-host-02] (item=my-host-03) => {"changed": false, "item": "my-host-03", "msg": "[REQUIRED] ALL hosts to be reachable, so flagging my-host-02 as failed, because host my-host-03 has no facts, meaning it is UNREACHABLE."}
failed: [my-host-02] (item=my-host-04) => {"changed": false, "item": "my-host-04", "msg": "[REQUIRED] ALL hosts to be reachable, so flagging my-host-02 as failed, because host my-host-04 has no facts, meaning it is UNREACHABLE."}
skipping: [my-host-01] => (item=my-host-01)
skipping: [my-host-01] => (item=my-host-02)
skipping: [my-host-02] => (item=my-host-01)
skipping: [my-host-02] => (item=my-host-02)
        to retry, use: --limit @./my-playbook.retry

PLAY RECAP *********************************************************************************************************************************************************************************************************************
my-host-01 : ok=1    changed=0    unreachable=0    failed=1
my-host-02 : ok=1    changed=0    unreachable=0    failed=1
my-host-03 : ok=0    changed=0    unreachable=1    failed=0
my-host-04 : ok=0    changed=0    unreachable=1    failed=0