Handling faults in Akka actors

Time: 2014-04-16 12:19:38

Tags: scala akka fault-tolerance error-kernel

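I have a very simple example with an actor (SimpleActor) that performs a periodic task by sending a message to itself. The message is scheduled in the actor's constructor. In the normal case (i.e., without faults) everything works fine.

But what if the actor has to deal with faults? I have another actor (SimpleActorWithFault) that can fail. Here I generate the fault myself by throwing an exception. When the fault happens (i.e., SimpleActorWithFault throws an exception), the actor is automatically restarted. However, the restart messes up the scheduler inside the actor, which no longer works as expected. If the faults happen rapidly enough, the behavior becomes even more unexpected.

My question is: what is the preferred way of dealing with faults in such cases? I know I can use Try blocks to handle exceptions. But what if I am extending another actor and cannot put a Try in the superclass, or an unexpected fault happens in the actor?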

import akka.actor.{Props, ActorLogging}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import akka.actor.Actor

case object MessageA

case object MessageToSelf


class SimpleActor extends Actor with ActorLogging {

  //schedule a message to self every second
  context.system.scheduler.schedule(0 seconds, 1 seconds, self, MessageToSelf)

  //keeps track of some internal state
  var count: Int = 0

  def receive: Receive = {
    case MessageA => {
      log.info("[SimpleActor] Got MessageA at %d".format(count))
    }
    case MessageToSelf => {
      //update state and tell the world about its current state 
      count = count + 1
      log.info("[SimpleActor] Got scheduled message at %d".format(count))

    }
  }

}


class SimpleActorWithFault extends Actor with ActorLogging {

  //schedule a message to self every second
  context.system.scheduler.schedule(0 seconds, 1 seconds, self, MessageToSelf)

  var count: Int = 0

  def receive: Receive = {
    case MessageA => {
      log.info("[SimpleActorWithFault] Got MessageA at %d".format(count))
    }
    case MessageToSelf => {
      count = count + 1
      log.info("[SimpleActorWithFault] Got scheduled message at %d".format(count))

      //at some point generate a fault
      if (count > 5) {
        log.info("[SimpleActorWithFault] Going to throw an exception now %d".format(count))
        throw new Exception("Excepttttttiooooooon")
      }
    }
  }

}


object MainApp extends App {
  implicit val akkaSystem = akka.actor.ActorSystem()
  //Run the Actor without any faults or exceptions 
  akkaSystem.actorOf(Props(classOf[SimpleActor]))

  //comment the above line and uncomment the following to run the actor with faults  
  //akkaSystem.actorOf(Props(classOf[SimpleActorWithFault]))

}

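2 answers:

Answer 0 (score: 8)

The right way is to push the risky behavior down into its own actor. This pattern is called the Error Kernel pattern (see Akka Concurrency, Section 8.5):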

  

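"This pattern describes a very common-sense approach to supervision that differentiates actors from one another based on any volatile state that they may hold.

In a nutshell, it means that actors whose state is precious should not be allowed to fail or restart. Any actor that holds precious data is protected such that any risky operations are relegated to a slave actor who, if restarted, only causes good things to happen.

The error kernel pattern implies pushing levels of risk further down the tree."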

See also the tutorial here.

So in your case, it would look something like this:

SimpleActor 
 |- ActorWithFault

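Here SimpleActor acts as the supervisor of ActorWithFault. The default supervision strategy of any actor is to restart a child on Exception and to escalate other Throwables (with a few special cases, such as ActorInitializationException and ActorKilledException, for which the child is stopped): http://doc.akka.io/docs/akka/snapshot/scala/fault-tolerance.html

Escalating means that the actor itself may get restarted. Since you really don't want SimpleActor to restart, you can make it always restart ActorWithFault rather than escalate, by overriding the supervisor strategy: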

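import akka.actor.{Actor, ActorInitializationException, ActorKilledException, OneForOneStrategy, Props}
import akka.actor.SupervisorStrategy.{Escalate, Restart}

class SimpleActor extends Actor {
  override def preStart(): Unit = {
    // our faulty actor; we will supervise it from now on
    context.actorOf(Props[ActorWithFault], "FaultyActor")
  }
  ...

  // decides what happens to a child when it throws
  override val supervisorStrategy = OneForOneStrategy() {
    case _: ActorKilledException => Escalate
    case _: ActorInitializationException => Escalate
    case _ => Restart // keep restarting faulty actor
  }

}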
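
To make the pattern concrete, here is a minimal sketch of how the counter and the timer from the question could stay in the parent while only the risky step moves to a child. It is one possible arrangement under stated assumptions, not the definitive design: the names CountingSupervisor, RiskyWorker, Tick, DoRiskyWork and GetCount are hypothetical and do not appear in the question, and scheduler.schedule is used only because the question uses it (newer Akka versions deprecate it).

```scala
import akka.actor.{Actor, ActorLogging, ActorRef, OneForOneStrategy, Props, SupervisorStrategy}
import akka.actor.SupervisorStrategy.Restart
import scala.concurrent.duration._

case object Tick        // periodic message the parent sends to itself (hypothetical name)
case object DoRiskyWork // asks the child to run the risky step (hypothetical name)
case object GetCount    // query used only to observe the parent's state (hypothetical name)

// Child: holds no precious state, so restarting it loses nothing.
class RiskyWorker extends Actor with ActorLogging {
  def receive: Receive = {
    case DoRiskyWork =>
      log.info("[RiskyWorker] about to fail")
      throw new Exception("simulated fault")
  }
}

// Parent: owns the counter and the timer and never runs the risky code itself.
class CountingSupervisor extends Actor with ActorLogging {
  import context.dispatcher // execution context required by the scheduler

  var count: Int = 0
  val worker: ActorRef = context.actorOf(Props(classOf[RiskyWorker]), "worker")
  val tick = context.system.scheduler.schedule(0.seconds, 1.second, self, Tick)

  // Restart the child on any Exception; other Throwables are escalated.
  override val supervisorStrategy: SupervisorStrategy =
    OneForOneStrategy() { case _: Exception => Restart }

  // Cancel the timer when this actor stops so no ticks are left behind.
  override def postStop(): Unit = tick.cancel()

  def receive: Receive = {
    case Tick =>
      count += 1
      log.info("[CountingSupervisor] tick {}", count)
      if (count > 5) worker ! DoRiskyWork // the fault now happens in the child
    case GetCount =>
      sender() ! count
  }
}
```

Because the exception is thrown inside RiskyWorker, only that child is restarted; the parent instance, its count and its single timer are left untouched.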

Answer 1 (score: 3)

To avoid messing up the scheduler:

class SimpleActor extends Actor with ActorLogging {

  private var cancellable: Option[Cancellable] = None

  override def preStart() = {
    //schedule a message to self every second
    cancellable = Option(context.system.scheduler.schedule(0 seconds, 1 seconds, self, MessageToSelf))
  }

  override def postStop() = {
    cancellable.foreach(_.cancel())
    cancellable = None
  }
...
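
As an alternative to cancelling by hand, newer Akka versions provide the akka.actor.Timers trait, whose timers are bound to the actor's lifecycle and are cancelled automatically when the actor is restarted or stopped. The sketch below is not part of the answer above and assumes Akka 2.6 or later (on 2.5 the corresponding call is startPeriodicTimer); SimpleActorWithTimers and TickKey are hypothetical names. Note that with this call the first message arrives after one interval rather than immediately.

```scala
import akka.actor.{Actor, ActorLogging, Timers}
import scala.concurrent.duration._

case object MessageToSelf
case object TickKey // hypothetical key that identifies the timer

class SimpleActorWithTimers extends Actor with ActorLogging with Timers {

  // keeps track of some internal state (reset on restart, like any actor field)
  var count: Int = 0

  // preStart also runs after a restart (via the default postRestart),
  // so the timer is started again for the new instance.
  override def preStart(): Unit =
    timers.startTimerWithFixedDelay(TickKey, MessageToSelf, 1.second)

  def receive: Receive = {
    case MessageToSelf =>
      count += 1
      log.info("[SimpleActorWithTimers] Got scheduled message at {}", count)
  }
}
```

No postStop override is needed here, because the Timers trait takes care of cancelling the timer on restart and stop.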

To handle the exception properly (akka.actor.Status.Failure is used so that the sender gets a proper answer when it uses the Ask pattern):

...
def receive: Receive = {
    case MessageA => {
      try {
        log.info("[SimpleActor] Got MessageA at %d".format(count))
      } catch {
        case e: Exception =>
          sender ! akka.actor.Status.Failure(e)
          log.error(e.getMessage, e)
      }
    }
...
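
The following sketch shows what the asking side sees when an actor replies with Status.Failure: the future returned by ask fails with the wrapped exception instead of timing out. FailingResponder and AskDemo are hypothetical names introduced only for this illustration, and system.terminate() assumes Akka 2.4 or later.

```scala
import akka.actor.{Actor, ActorSystem, Props, Status}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Await
import scala.concurrent.duration._
import scala.util.Try

case object MessageA

// Hypothetical actor that always fails while handling MessageA.
class FailingResponder extends Actor {
  def receive: Receive = {
    case MessageA =>
      try {
        throw new IllegalStateException("simulated fault")
      } catch {
        // Reply with the failure instead of letting the actor crash.
        case e: Exception => sender() ! Status.Failure(e)
      }
  }
}

object AskDemo {
  // Sends MessageA with the Ask pattern and returns the outcome as a Try.
  def run(): Try[Any] = {
    val system = ActorSystem("ask-demo")
    implicit val timeout: Timeout = Timeout(3.seconds)
    try {
      val ref = system.actorOf(Props(classOf[FailingResponder]))
      // A Status.Failure reply completes the future with the wrapped exception,
      // which Await.result rethrows and Try captures.
      Try(Await.result(ref ? MessageA, timeout.duration))
    } finally {
      system.terminate()
    }
  }
}
```

Without the Status.Failure reply, the asker would receive nothing and its future would only fail once the timeout expires.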