monit status shows failure but the process is running

Time: 2016-07-09 17:09:58

Tags: monit bosh

I am trying to deploy gunicorn using a BOSH release. It fails randomly: sometimes the deployment works, and at other times it fails.

monit summary

Process 'gunicorn'                  Execution failed
Process 'nginx'                     running
Process 'consul'                    running

The monit log is:

[UTC Jul  9 11:56:06] error    : 'gunicorn' process is not running
[UTC Jul  9 11:56:06] info     : 'gunicorn' trying to restart
[UTC Jul  9 11:56:06] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  9 11:56:11] info     : start service 'consul' on user request
[UTC Jul  9 11:56:11] info     : monit daemon at 1383 awakened
[UTC Jul  9 11:56:11] info     : start service 'nginx' on user request
[UTC Jul  9 11:56:11] info     : monit daemon at 1383 awakened
[UTC Jul  9 11:56:11] info     : start service 'gunicorn' on user request
[UTC Jul  9 11:56:11] info     : monit daemon at 1383 awakened
[UTC Jul  9 11:56:36] error    : 'gunicorn' failed to start
[UTC Jul  9 11:56:36] info     : 'nginx' start: /var/vcap/jobs/nginx/bin/monit_debugger
[UTC Jul  9 11:56:37] info     : 'nginx' start action done
[UTC Jul  9 11:56:37] info     : 'consul' start: /var/vcap/jobs/consul/bin/monit_debugger
[UTC Jul  9 11:56:38] info     : 'consul' start action done
[UTC Jul  9 11:56:38] info     : Awakened by User defined signal 1
[UTC Jul  9 11:56:38] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  9 11:57:08] error    : 'gunicorn' failed to start
[UTC Jul  9 11:57:08] info     : 'gunicorn' start action done
[UTC Jul  9 11:57:18] error    : 'gunicorn' process is not running
[UTC Jul  9 11:57:18] info     : 'gunicorn' trying to restart
[UTC Jul  9 11:57:18] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  9 11:57:48] error    : 'gunicorn' failed to start
[UTC Jul  9 11:57:58] error    : 'gunicorn' process is not running
[UTC Jul  9 11:57:58] info     : 'gunicorn' trying to restart
[UTC Jul  9 11:57:58] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  9 11:58:28] error    : 'gunicorn' failed to start
[UTC Jul  9 11:58:38] error    : 'gunicorn' process is not running
[UTC Jul  9 11:58:38] info     : 'gunicorn' trying to restart
[UTC Jul  9 11:58:38] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  9 11:59:08] error    : 'gunicorn' failed to start
[UTC Jul  9 11:59:18] info     : 'gunicorn' process is running with pid 5670

The process itself also looks fine in ps -ef:

root      5670     1  0 11:59 ?        00:00:02 /usr/bin/python /usr/local/bin/gunicorn --workers 3 --bind 0.0.0.0:8000 idmapi.wsgi:application
root      5682  5670  0 11:59 ?        00:00:00 /usr/bin/python /usr/local/bin/gunicorn --workers 3 --bind 0.0.0.0:8000 idmapi.wsgi:application
root      5685  5670  0 11:59 ?        00:00:00 /usr/bin/python /usr/local/bin/gunicorn --workers 3 --bind 0.0.0.0:8000 idmapi.wsgi:application
root      5686  5670  0 11:59 ?        00:00:00 /usr/bin/python /usr/local/bin/gunicorn --workers 3 --bind 0.0.0.0:8000 idmapi.wsgi:application

This happens randomly.

When gunicorn succeeds, I get the following log:

[UTC Jul  8 22:32:31] error    : 'gunicorn' process is not running
[UTC Jul  8 22:32:31] info     : 'gunicorn' trying to restart
[UTC Jul  8 22:32:31] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  8 22:32:36] info     : start service 'consul' on user request
[UTC Jul  8 22:32:36] info     : monit daemon at 1375 awakened
[UTC Jul  8 22:32:36] info     : start service 'nginx' on user request
[UTC Jul  8 22:32:36] info     : monit daemon at 1375 awakened
[UTC Jul  8 22:32:36] info     : start service 'gunicorn' on user request
[UTC Jul  8 22:32:36] info     : monit daemon at 1375 awakened
[UTC Jul  8 22:33:01] error    : 'gunicorn' failed to start
[UTC Jul  8 22:33:01] info     : 'nginx' start: /var/vcap/jobs/nginx/bin/monit_debugger
[UTC Jul  8 22:33:02] info     : 'nginx' start action done
[UTC Jul  8 22:33:02] info     : 'consul' start: /var/vcap/jobs/consul/bin/monit_debugger
[UTC Jul  8 22:33:03] info     : 'consul' start action done
[UTC Jul  8 22:33:03] info     : Awakened by User defined signal 1
[UTC Jul  8 22:33:03] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  8 22:33:33] error    : 'gunicorn' failed to start
[UTC Jul  8 22:33:33] info     : 'gunicorn' start action done
[UTC Jul  8 22:33:43] error    : 'gunicorn' process is not running
[UTC Jul  8 22:33:43] info     : 'gunicorn' trying to restart
[UTC Jul  8 22:33:43] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  8 22:34:13] error    : 'gunicorn' failed to start
[UTC Jul  8 22:34:23] error    : 'gunicorn' process is not running
[UTC Jul  8 22:34:23] info     : 'gunicorn' trying to restart
[UTC Jul  8 22:34:23] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  8 22:34:53] error    : 'gunicorn' failed to start
[UTC Jul  8 22:35:03] error    : 'gunicorn' process is not running
[UTC Jul  8 22:35:03] info     : 'gunicorn' trying to restart
[UTC Jul  8 22:35:03] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  8 22:35:33] error    : 'gunicorn' failed to start
[UTC Jul  8 22:35:43] error    : 'gunicorn' process is not running
[UTC Jul  8 22:35:43] info     : 'gunicorn' trying to restart
[UTC Jul  8 22:35:43] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  8 22:36:13] error    : 'gunicorn' failed to start
[UTC Jul  8 22:36:23] error    : 'gunicorn' process is not running
[UTC Jul  8 22:36:23] info     : 'gunicorn' trying to restart
[UTC Jul  8 22:36:23] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  8 22:36:25] info     : 'gunicorn' started
[UTC Jul  8 22:36:35] info     : 'gunicorn' process is running with pid 5780

Update

check process gunicorn
  with pidfile /var/vcap/sys/run/gunicorn/gunicorn.pid
  start program "/var/vcap/jobs/gunicorn/bin/monit_debugger gunicorn_ctl '/var/vcap/jobs/gunicorn/bin/gunicorn_ctl start'"
  stop program "/var/vcap/jobs/gunicorn/bin/monit_debugger gunicorn_ctl '/var/vcap/jobs/gunicorn/bin/gunicorn_ctl stop'"
  group vcap

1 answer:

Answer 0 (score: 0)

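You may want to check whether the pid file for your process exists. It is usually stored in the /var/run/ folder. If the pid file is missing, you should kill the process manually and start it again.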
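
As a rough way to run that check, here is a minimal Python sketch. It assumes the pid file contains a single integer pid. The function name `pidfile_status` is made up for illustration, and the path in the example call is the one from the `check process` block in the question, which may differ on your machine.

```python
import os


def pidfile_status(pidfile):
    """Classify a pid file as 'missing', 'invalid', 'stale' or 'running'.

    Hypothetical helper for illustration; not part of monit or BOSH.
    """
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except FileNotFoundError:
        return "missing"   # no pid file at this path
    except ValueError:
        return "invalid"   # file exists but does not hold an integer
    if pid <= 0:
        return "invalid"   # non-positive values would address process groups
    try:
        os.kill(pid, 0)    # signal 0 delivers nothing; it only checks the pid
    except ProcessLookupError:
        return "stale"     # pid file exists but no such process
    except PermissionError:
        return "running"   # process exists but belongs to another user
    return "running"


if __name__ == "__main__":
    # Path taken from the monit configuration in the question.
    print(pidfile_status("/var/vcap/sys/run/gunicorn/gunicorn.pid"))
```

If this reports "missing" or "stale" while ps -ef still shows gunicorn, then the pid file and the real process disagree, and that is the case where killing the process by hand and starting it again applies.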