Unicorn USR2重启悬挂问题

时间:2014-01-12 21:51:13

标签: ruby-on-rails ruby-on-rails-4 unicorn

尝试使用USR2信号重启Unicorn时,我遇到了这个特殊问题。在干净地重新启动VPS时,我没有遇到向Unicorn发送USR2信号并让它正常重启的问题。然而,如果我再试一次,大约一个小时后,我将留下一个老主人,他们会在新的主人身边停下来阻止新主人开始。然后我被迫杀死了老主人,所以新主人可以开始。如果我重新启动VPS,它会修复它,但一小时后会再次启动问题。我在Rails 4,Ruby 2.0.0上。

unicorn.log

I, [2014-01-07T15:37:37.118523 #19797]  INFO -- : executing ["/srv/rails/current/bin/unicorn", "-c", "/srv/rails/current/config/unicorn.rb", {12=>#<Kgio::UNIXServer:fd 12>}] (in /srv/rails/releases/20140107091945)
I, [2014-01-07T15:37:37.118983 #19797]  INFO -- : forked child re-executing...
I, [2014-01-07T15:37:38.998632 #19797]  INFO -- : inherited addr=/srv/rails/shared/sockets/unicorn.sock fd=12
I, [2014-01-07T15:37:38.999038 #19797]  INFO -- : Refreshing Gem list
I, [2014-01-07T15:37:41.927794 #19967]  INFO -- : Refreshing Gem list
/srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:219:in `pid=': Already running on PID:19967 (or pid=/srv/rails/shared/pids/unicorn.pid is stale) (ArgumentError)
        from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:151:in `start'
        from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/bin/unicorn:126:in `<top (required)>'
        from /srv/rails/current/bin/unicorn:16:in `load'
        from /srv/rails/current/bin/unicorn:16:in `<main>'
/srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:219:in `pid=': Already running on PID:21250 (or pid=/srv/rails/shared/pids/unicorn.pid is stale) (ArgumentError)
        from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:151:in `start'
        from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/bin/unicorn:126:in `<top (required)>'
        from /srv/rails/current/bin/unicorn:16:in `load'
        from /srv/rails/current/bin/unicorn:16:in `<main>'
E, [2014-01-07T15:40:46.720131 #20878] ERROR -- : reaped #<Process::Status: pid 21075 exit 1> exec()-ed
E, [2014-01-07T15:40:46.720870 #20878] ERROR -- : master loop error: Already running on PID:21250 (or pid=/srv/rails/shared/pids/unicorn.pid is stale) (ArgumentError)
E, [2014-01-07T15:40:46.723525 #20878] ERROR -- : /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:219:in `pid='
E, [2014-01-07T15:40:46.723671 #20878] ERROR -- : /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:413:in `reap_all_workers'
E, [2014-01-07T15:40:46.723747 #20878] ERROR -- : /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:292:in `join'
E, [2014-01-07T15:40:46.723815 #20878] ERROR -- : /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/bin/unicorn:126:in `<top (required)>'
E, [2014-01-07T15:40:46.723880 #20878] ERROR -- : /srv/rails/current/bin/unicorn:16:in `load'
E, [2014-01-07T15:40:46.723930 #20878] ERROR -- : /srv/rails/current/bin/unicorn:16:in `<main>'
E, [2014-01-07T15:41:13.704700 #21250] ERROR -- : adding listener failed addr=/srv/rails/shared/sockets/unicorn.sock (in use)
E, [2014-01-07T15:41:13.704901 #21250] ERROR -- : retrying in 0.5 seconds (4 tries left)
E, [2014-01-07T15:41:14.205452 #21250] ERROR -- : adding listener failed addr=/srv/rails/shared/sockets/unicorn.sock (in use)
E, [2014-01-07T15:41:14.205597 #21250] ERROR -- : retrying in 0.5 seconds (3 tries left)
78.40.124.16, 173.245.49.122 - - [07/Jan/2014 15:41:14] "GET / HTTP/1.0" 200 28697 0.8345
E, [2014-01-07T15:41:14.706179 #21250] ERROR -- : adding listener failed addr=/srv/rails/shared/sockets/unicorn.sock (in use)
E, [2014-01-07T15:41:14.706335 #21250] ERROR -- : retrying in 0.5 seconds (2 tries left)
E, [2014-01-07T15:41:15.206834 #21250] ERROR -- : adding listener failed addr=/srv/rails/shared/sockets/unicorn.sock (in use)
E, [2014-01-07T15:41:15.206987 #21250] ERROR -- : retrying in 0.5 seconds (1 tries left)
E, [2014-01-07T15:41:15.707431 #21250] ERROR -- : adding listener failed addr=/srv/rails/shared/sockets/unicorn.sock (in use)
E, [2014-01-07T15:41:15.707563 #21250] ERROR -- : retrying in 0.5 seconds (0 tries left)
78.40.124.16, 149.154.158.74 - - [07/Jan/2014 15:41:15] "GET / HTTP/1.0" 200 32866 0.4528
E, [2014-01-07T15:41:16.208055 #21250] ERROR -- : adding listener failed addr=/srv/rails/shared/sockets/unicorn.sock (in use)
/srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/socket_helper.rb:158:in `initialize': Address already in use - "/srv/rails/shared/sockets/unicorn.sock" (Errno::EADDRINUSE)
        from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/socket_helper.rb:158:in `new'
        from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/socket_helper.rb:158:in `bind_listen'
        from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:255:in `listen'
        from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:801:in `block in bind_new_listeners!'
        from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:801:in `each'
        from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:801:in `bind_new_listeners!'
        from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:146:in `start'
        from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/bin/unicorn:126:in `<top (required)>'
        from /srv/rails/current/bin/unicorn:16:in `load'
        from /srv/rails/current/bin/unicorn:16:in `<main>'

unicorn.rb

deploy_path = "/srv/rails"
RAILS_ENV = ENV['RAILS_ENV'] || "production"

working_directory "#{deploy_path}/current"
pid "#{deploy_path}/shared/pids/unicorn.pid"
stderr_path "#{deploy_path}/shared/log/unicorn.log"

# Listen on a UNIX data socket
listen "#{deploy_path}/shared/sockets/unicorn.sock"
worker_processes 4

# Preload application before forking worker processes
preload_app true

# Restart any workers that haven't responded in 30 seconds
timeout 30

before_fork do |server, worker|
  ##
  # When sent a USR2, Unicorn will suffix its pidfile with .oldbin and
  # immediately start loading up a new version of itself (loaded with a new
  # version of our app). When this new Unicorn is completely loaded
  # it will begin spawning workers. The first worker spawned will check to
  # see if an .oldbin pidfile exists. If so, this means we've just booted up
  # a new Unicorn and need to tell the old one that it can now die. To do so
  # we send it a QUIT.
  #
  # Using this method we get 0 downtime deploys.

  old_pid = "#{server.config[:pid]}.oldbin"

  if File.exists?(old_pid) && server.pid != old_pid
    begin
      Process.kill("QUIT", File.read(old_pid).to_i)
    rescue Errno::ENOENT, Errno::ESRCH => e
      log = File.open(Rails.root.join('log/unicorn.log'), "a")
      log.puts "Error encountered when killing process:\n"
      log.puts "#{e.message}"
      log.close
    end
  end

  # the following is recomended for Rails + "preload_app true"
  # as there's no need for the master process to hold a connection
  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.connection.disconnect!
  end
end

after_fork do |server, worker|
  ##
  # Unicorn master loads the app then forks off workers - because of the way
  # Unix forking works, we need to make sure we aren't using any of the parent's
  # sockets, e.g. db connection

  ActiveRecord::Base.establish_connection
  # Redis and Memcached would go here but their connections are established
  # on demand, so the master never opens a socket

  ##
  # Unicorn master is started as root, which is fine, but let's
  # drop the workers to deployer
  begin
    uid, gid = Process.euid, Process.egid
    user, group = 'deployer', 'deployer'
    target_uid = Etc.getpwnam(user).uid
    target_gid = Etc.getgrnam(group).gid
    worker.tmp.chown(target_uid, target_gid)
    if uid != target_uid || gid != target_gid
      Process.initgroups(user, target_gid)
      Process::GID.change_privilege(target_gid)
      Process::UID.change_privilege(target_uid)
    end
  rescue => e
    if RAILS_ENV == 'development'
      STDERR.puts "couldn't change user, oh well"
    else
      raise e
    end
  end
end

deploy.rb

require 'bundler/capistrano'    # runs a bundle install --deployment

# https://github.com/sstephenson/rbenv/issues/101
set :keep_releases, 10
set :shared_children, shared_children + %w(public/images public/uploads)

# Multistage extension
set :stages, ["production", "staging"]
set :default_stage, "staging"
require 'capistrano/ext/multistage'
require 'underglow/capistrano'

# Whenever crontab updates
set :whenever_environment, defer { stage }
set :whenever_command, "bin/whenever"
require 'whenever/capistrano'

set :application, "rails"
set :user, "deployer"

default_run_options[:pty] = true
default_run_options[:shell] = '/bin/zsh'
set :use_sudo, false

# repository
set :repository,      "XXXXXXXXXXXXXXXXX"
set :branch,          fetch(:branch, "master")  # can specify a branch from `cap -S branch="<branch_name>"`
set :scm,             :git
set :scm_verbose,     true

set :ssh_options, forward_agent: true

set :deploy_to,       "/srv/rails"
set :deploy_via,      :remote_cache

# We're using a rbenv user install, setup the PATH we need to access the rbenv shims
set :default_environment, {
  'PATH' => "$HOME/.rbenv/shims:$HOME/.rbenv/bin:$PATH"
}

有没有人见过这个?

2 个答案:

答案 0 :(得分:4)

您应该检查独角兽stdout / stderr日志,以获取更多证据,说明为什么旧的独角兽可能会挂起或新的麒麟未正确杀死它。

一个问题是,如果在部署新版本期间删除了较旧的capistrano版本目录,则在热交换切换期间可能会出现捆绑器错误。人们建议添加以下内容以绑定到Gemfile与特定于发布的路径的永久路径:

before_exec do |server|
  ENV['BUNDLE_GEMFILE'] = "#{deploy_path}/current/Gemfile"
end

如果这是您遇到的问题,您应该看到捆绑器错误或独角兽日志失败。

答案 1 :(得分:3)

这可能对您没有帮助,但这是我为解决问题而采取的措施。

我从Unicorn 4.7.0的发布开始遇到这个问题。在4.7.0中,pid文件的写入行为发生了变化,并破坏了我的重启脚本。旧的4.7.0之前的行为是:将pid文件移动到oldpid,写入新的pid,启动worker,关闭master。最后一步当然是在我的unicorn.rb文件中。新的行为是迅速删除旧的pid,并且只在发生一些繁重的提升之后才写出新的pid。这打破了我的脚本,因为它不能相信事情正确地重新启动。这导致我的sh脚本尝试重新启动它,导致与现在刚刚开始的独角兽进程混淆,并且eh-script开始“全面启动”。两者都以各种方式丢失,因此两人都离开了,留下一位老主人仍在服务请求。

我的unicorn.rb文件中也有一个缺陷,就像有人提到的那样,没有正确设置捆绑包。

升级到最近发布的Unicorn 4.8.1解决了这个问题,因为pid文件是在4.7.0天之前编写的。