Question

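I have a very simple example with an actor (SimpleActor) that performs a periodic task by sending a message to itself. The message is scheduled in the actor's constructor. In the normal case (i.e., without faults) everything works fine.

But what if the actor has to deal with faults? I have another actor (SimpleActorWithFault) that can fail. In this example I generate the fault myself by throwing an exception. When a fault happens (i.e., SimpleActorWithFault throws an exception), the actor is restarted automatically. However, this restart messes up the scheduler inside the actor, which no longer works as expected. And if the faults happen rapidly enough, the behavior becomes even more unexpected.

My question is: what is the preferred way of dealing with faults in such cases? I know I can use Try blocks to handle exceptions. But what if I am extending another actor and cannot put a Try in the superclass, or what if an unexpected fault happens in the actor?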

import akka.actor.{Props, ActorLogging}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import akka.actor.Actor

case object MessageA

case object MessageToSelf


class SimpleActor extends Actor with ActorLogging {

  //schedule a message to self every second
  context.system.scheduler.schedule(0 seconds, 1 seconds, self, MessageToSelf)

  //keeps track of some internal state
  var count: Int = 0

  def receive: Receive = {
    case MessageA => {
      log.info("[SimpleActor] Got MessageA at %d".format(count))
    }
    case MessageToSelf => {
      //update state and tell the world about its current state 
      count = count + 1
      log.info("[SimpleActor] Got scheduled message at %d".format(count))

    }
  }

}


class SimpleActorWithFault extends Actor with ActorLogging {

  //schedule a message to self every second
  context.system.scheduler.schedule(0 seconds, 1 seconds, self, MessageToSelf)

  var count: Int = 0

  def receive: Receive = {
    case MessageA => {
      log.info("[SimpleActorWithFault] Got MessageA at %d".format(count))
    }
    case MessageToSelf => {
      count = count + 1
      log.info("[SimpleActorWithFault] Got scheduled message at %d".format(count))

      //at some point generate a fault
      if (count > 5) {
        log.info("[SimpleActorWithFault] Going to throw an exception now %d".format(count))
        throw new Exception("Excepttttttiooooooon")
      }
    }
  }

}


object MainApp extends App {
  implicit val akkaSystem = akka.actor.ActorSystem()
  //Run the Actor without any faults or exceptions 
  akkaSystem.actorOf(Props(classOf[SimpleActor]))

  //comment the above line and uncomment the following to run the actor with faults  
  //akkaSystem.actorOf(Props(classOf[SimpleActorWithFault]))

}

Answer 1

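The right way is to push the risky behavior down into its own actor. This pattern is called the Error Kernel pattern (see Akka Concurrency, Section 8.5):

"This pattern describes a very common-sense approach to supervision that differentiates actors from one another based on any volatile state that they may hold.

In a nutshell, it means that actors whose state is precious should not be allowed to fail or restart. Any actor that holds precious data is protected such that any risky operations are relegated to a slave actor who, if restarted, only causes good things to happen.

The error kernel pattern implies pushing levels of risk further down the tree."

See also the tutorial here.

So in your case it would be something like this: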

SimpleActor 
 |- ActorWithFault

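Here SimpleActor acts as the supervisor of ActorWithFault. The default supervisor strategy of any actor is to restart a child on Exception (apart from a few special cases such as ActorInitializationException and ActorKilledException, which stop the child) and to escalate any other Throwable: http://doc.akka.io/docs/akka/snapshot/scala/fault-tolerance.html

Escalating means that the supervising actor itself may get restarted. Since you really do not want SimpleActor to be restarted, you can override its supervisor strategy so that it keeps restarting ActorWithFault instead of escalating ordinary failures: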

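import akka.actor.{Actor, ActorInitializationException, ActorKilledException, OneForOneStrategy, Props}
import akka.actor.SupervisorStrategy.{Escalate, Restart}

class SimpleActor extends Actor {
  override def preStart(): Unit = {
    // our faulty actor --- we will supervise it from now on
    context.actorOf(Props[ActorWithFault], "FaultyActor")
  }
  ...

  override val supervisorStrategy = OneForOneStrategy() {
    case _: ActorKilledException => Escalate
    case _: ActorInitializationException => Escalate
    case _ => Restart // keep restarting faulty actor
  }

}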
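
For illustration, here is a fuller sketch of that parent/child split, assuming classic Akka actors (2.4 or later) and the same scheduler.schedule call used in the question. The DoRiskyWork message and the point at which the parent hands work to the child are hypothetical and not part of the original post. The parent keeps the counter and the schedule; only the child ever throws.

```scala
import akka.actor.{Actor, ActorLogging, ActorSystem, OneForOneStrategy, Props}
import akka.actor.SupervisorStrategy.Restart
import scala.concurrent.duration._

case object MessageToSelf
case object DoRiskyWork // hypothetical message, introduced only for this sketch

// Child: holds no precious state, so restarting it loses nothing.
class ActorWithFault extends Actor with ActorLogging {
  def receive: Receive = {
    case DoRiskyWork =>
      log.info("[ActorWithFault] Going to throw an exception now")
      throw new Exception("simulated fault")
  }
}

// Parent: owns the counter and the schedule, and never runs risky code itself.
class SimpleActor extends Actor with ActorLogging {
  import context.dispatcher // execution context required by the scheduler

  var count: Int = 0

  private val worker =
    context.actorOf(Props(classOf[ActorWithFault]), "FaultyActor")

  private val tick =
    context.system.scheduler.schedule(0.seconds, 1.second, self, MessageToSelf)

  // Restart the child on any Exception; anything unmatched is escalated.
  override val supervisorStrategy = OneForOneStrategy() {
    case _: Exception => Restart
  }

  // Cancel the recurring task when this actor stops.
  override def postStop(): Unit = tick.cancel()

  def receive: Receive = {
    case MessageToSelf =>
      count += 1
      log.info("[SimpleActor] Got scheduled message at {}", count)
      // Delegate the dangerous part instead of doing it here.
      if (count > 5) worker ! DoRiskyWork
  }
}
```

With this layout the child is restarted after each failure, while the parent's count and its single scheduled task are unaffected because the parent itself never fails.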

Answer 2

To avoid messing up the scheduler:

class SimpleActor extends Actor with ActorLogging {

  private var cancellable: Option[Cancellable] = None

  override def preStart() = {
    //schedule a message to self every second
    cancellable = Option(context.system.scheduler.schedule(0 seconds, 1 seconds, self, MessageToSelf))
  }

  override def postStop() = {
    cancellable.foreach(_.cancel())
    cancellable = None
  }
...
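
Some background on why moving the call helps: when the schedule is created in the constructor, every restart builds a new actor instance that schedules another recurring task, while the earlier task is never cancelled and keeps delivering to the same ActorRef. With the default lifecycle hooks, preRestart calls postStop on the failing instance and postRestart calls preStart on the fresh one, so the old task is cancelled before a new one is created. Below is a minimal sketch of the faulty actor from the question rewritten this way, assuming classic Akka 2.4 or later. Note that count still starts from 0 again after each restart, because a restart replaces the actor instance.

```scala
import akka.actor.{Actor, ActorLogging, ActorSystem, Cancellable, Props}
import scala.concurrent.duration._

case object MessageToSelf

class SimpleActorWithFault extends Actor with ActorLogging {
  import context.dispatcher // execution context required by the scheduler

  private var cancellable: Option[Cancellable] = None

  // Lost on restart: the new instance starts again from 0.
  var count: Int = 0

  // Runs on first start and, via the default postRestart, after every restart.
  override def preStart(): Unit = {
    cancellable = Some(
      context.system.scheduler.schedule(0.seconds, 1.second, self, MessageToSelf))
  }

  // Runs on stop and, via the default preRestart, before every restart.
  override def postStop(): Unit = {
    cancellable.foreach(_.cancel())
    cancellable = None
  }

  def receive: Receive = {
    case MessageToSelf =>
      count += 1
      log.info("[SimpleActorWithFault] Got scheduled message at {}", count)
      if (count > 5) throw new Exception("simulated fault")
  }
}
```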

To handle exceptions properly (akka.actor.Status.Failure makes sure the sender gets a proper reply if it used the Ask pattern):

...
def receive: Receive = {
    case MessageA => {
      try {
        log.info("[SimpleActor] Got MessageA at %d".format(count))
      } catch {
        case e: Exception =>
          sender ! akka.actor.Status.Failure(e)
          log.error(e, e.getMessage) // cause first, so the stack trace is logged
      }
    }
...
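
To show what the Status.Failure reply does on the asking side, here is a small sketch, assuming classic Akka 2.4 or later. FailingResponder and AskDemo are hypothetical names made up for this example. When the reply to an ask is a Status.Failure, the returned Future is completed as failed with the wrapped exception rather than timing out.

```scala
import akka.actor.{Actor, ActorSystem, Props, Status}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Await
import scala.concurrent.duration._
import scala.util.Try

case object MessageA

// Hypothetical actor that always fails while handling MessageA.
class FailingResponder extends Actor {
  def receive: Receive = {
    case MessageA =>
      try {
        throw new IllegalStateException("simulated failure")
      } catch {
        case e: Exception =>
          // Tell the asker about the failure instead of leaving it waiting.
          sender() ! Status.Failure(e)
      }
  }
}

object AskDemo extends App {
  val system = ActorSystem("ask-demo")
  implicit val timeout: Timeout = Timeout(3.seconds)

  val responder = system.actorOf(Props(classOf[FailingResponder]))

  // The Future fails with the exception carried by Status.Failure.
  val result = Try(Await.result(responder ? MessageA, timeout.duration))
  println(s"ask failed: ${result.isFailure}")

  system.terminate()
}
```

Because the exception is caught inside receive, the actor is not restarted and keeps its state; only the caller sees the failure.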

Handling faults in Akka actors
