Question

I have a problem with ActiveMQ: when the master ZooKeeper node goes offline, the entire cluster fails.

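We have a 3-node ActiveMQ cluster set up in our development environment. Each node runs ActiveMQ 5.12.0 and ZooKeeper 3.4.6 (*note: we have done some testing with ZooKeeper 3.4.7, but this did not resolve the issue. Time constraints have so far prevented us from testing ActiveMQ 5.13).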

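We have found that when we stop the master ZooKeeper process (via the "End process tree" command in Task Manager), the remaining two ZooKeeper nodes continue to function normally. Sometimes the ActiveMQ cluster is able to handle this, but sometimes it is not.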

When the cluster fails, we typically see this in the ActiveMQ log:

2015-12-18 09:08:45,157 | WARN  | Too many cluster members are connected.  Expected at most 3 members but there are 4 connected. | org.apache.activemq.leveldb.replicated.MasterElector | WrapperSimpleAppMain-EventThread
...
...
2015-12-18 09:27:09,722 | WARN  | Session 0x351b43b4a560016 for server null, unexpected error, closing socket connection and attempting reconnect | org.apache.zookeeper.ClientCnxn | WrapperSimpleAppMain-SendThread(192.168.0.10:2181)
java.net.ConnectException: Connection refused: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)[:1.7.0_79]
at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)[:1.7.0_79]
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)[zookeeper-3.4.6.jar:3.4.6-1569965]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)[zookeeper-3.4.6.jar:3.4.6-1569965]

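Our immediate concerns are that (A) ActiveMQ seems to think there are 4 members in the cluster when it is configured for 3, and (B) the server appears to be null when the exception is raised. We then increased ActiveMQ's logging level to DEBUG in order to display the list of members: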

2015-12-18 09:33:04,236 | DEBUG | ZooKeeper group changed: Map(localhost -> ListBuffer((0000000156,{"id":"localhost","container":null,"address":null,"position":-1,"weight":5,"elected":null}), (0000000157,{"id":"localhost","container":null,"address":null,"position":-1,"weight":1,"elected":null}), (0000000158,{"id":"localhost","container":null,"address":"tcp://192.168.0.11:61619","position":-1,"weight":10,"elected":null}), (0000000159,{"id":"localhost","container":null,"address":null,"position":-1,"weight":10,"elected":null}))) | org.apache.activemq.leveldb.replicated.MasterElector | ActiveMQ BrokerService[localhost] Task-14
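To make the DEBUG line easier to read, here is a rough Python sketch that tallies the entries it contains. The tuples are transcribed by hand from the log line above; the `summarize` helper and its field names are our own illustration and not part of ActiveMQ.

```python
# Member entries transcribed from the DEBUG line above:
# (ZooKeeper sequence node, advertised address, weight).
members = [
    ("0000000156", None, 5),
    ("0000000157", None, 1),
    ("0000000158", "tcp://192.168.0.11:61619", 10),
    ("0000000159", None, 10),
]

REPLICAS = 3  # replicas="3" in our replicatedLevelDB configuration


def summarize(members, replicas):
    """Count registered members and group sequence nodes that share a weight."""
    by_weight = {}
    for seq, _address, weight in members:
        by_weight.setdefault(weight, []).append(seq)
    duplicates = {w: seqs for w, seqs in by_weight.items() if len(seqs) > 1}
    return {
        "count": len(members),
        "extra": max(0, len(members) - replicas),
        "duplicate_weights": duplicates,
    }


summary = summarize(members, REPLICAS)
print(summary)
```

Each of our brokers has a distinct weight (5, 10 and 1), so the two entries with weight 10 may mean that server.2 is registered twice, for example through a leftover entry from an earlier session. We have not confirmed this.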

Can anyone suggest why this may be happening and/or a way to resolve it? Our configuration is shown below:

ZooKeeper:

tickTime=2000
dataDir=C:\\zookeeper-3.4.7\\data
clientPort=2181
initLimit=5
syncLimit=2
server.1=192.168.0.10:2888:3888
server.2=192.168.0.11:2888:3888
server.3=192.168.0.12:2888:3888
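As a sanity check on the ensemble above: ZooKeeper needs a majority of its servers to stay up, so a three-server ensemble should tolerate losing one. That matches the two remaining ZooKeeper nodes continuing to run. A minimal Python sketch of the arithmetic (the function names are ours, purely illustrative):

```python
def majority(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def has_quorum(ensemble_size, alive):
    """True if the surviving servers can still form a quorum."""
    return alive >= majority(ensemble_size)


# Three servers configured (server.1 to server.3); one is killed.
print(majority(3))
print(has_quorum(3, alive=2))
print(has_quorum(3, alive=1))
```

So the loss of a single ZooKeeper node should not, by itself, take the ensemble down, which is why the ActiveMQ failure surprises us.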

ActiveMQ (server.1):

<persistenceAdapter>
    <replicatedLevelDB
        directory="activemq-data"
        replicas="3"
        bind="tcp://0.0.0.0:61619"
        zkAddress="192.168.0.11:2181,192.168.0.10:2181,192.168.0.12:2181"
        zkPath="/activemq/leveldb-stores"
        hostname="192.168.0.10"
        weight="5"/>
    <!-- server.2 has a weight of 10, server.3 has a weight of 1 -->
</persistenceAdapter>

Why does the ActiveMQ cluster fail with "server null" when the ZooKeeper master node goes offline?

0 answers: