ZooKeeper connection timeout issue

Time: 2016-02-04 12:40:54

Tags: apache apache-zookeeper apache-curator

We are using version 2.3.0 of curator-framework to connect to ZooKeeper. The dependencies are declared in our pom file as follows.

    <dependency>
        <groupId>org.apache.curator</groupId>
        <artifactId>curator-framework</artifactId>
        <version>2.3.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.curator</groupId>
        <artifactId>curator-recipes</artifactId>
        <version>2.3.0</version>
    </dependency>

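This runs on a heavily loaded server that pumps data into Kafka very frequently, and we occasionally get the error below. I tried searching on Google but could not find the exact cause of the problem or a solution. I am looking for ideas on how to resolve it.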

ERROR org.apache.curator.ConnectionState: Connection timed out for connection string (xxx.xx.xx.xx:2181, yy.yy.y.y:2181) and timeout (15000) / elapsed (37893)
org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
        at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:191)
        at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:86)
        at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:113)
        at org.apache.curator.framework.imps.CuratorFrameworkImpl.getZooKeeper(CuratorFrameworkImpl.java:456)
        at org.apache.curator.framework.imps.BackgroundSyncImpl.performBackgroundOperation(BackgroundSyncImpl.java:40)
        at org.apache.curator.framework.imps.OperationAndData.callPerformBackgroundOperation(OperationAndData.java:65)
        at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:672)
        at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:664)
        at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:55)
        at org.apache.curator.framework.imps.CuratorFrameworkImpl$3.call(CuratorFrameworkImpl.java:243)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

1 answer:

Answer 0: (score: 0)

You may need to tune the session and connection timeouts for the Curator client. Check the connectionTimeoutMs and SessionTimeoutMs settings of your CuratorFramework.

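As a rule of thumb, your connectionTimeoutMs should be SessionTimeoutMs divided by the number of ZooKeeper nodes in the cluster. If the client cannot connect to one of the nodes within connectionTimeoutMs, it will try to connect to another node, and it keeps doing so until the session times out.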
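The rule of thumb above can be sketched as a small calculation. This is only an illustration: the class and method names are hypothetical, the 60000 ms session timeout is an assumed value, and the node count of 2 is taken from the two hosts in the connection string in the question.

```java
// Hypothetical helper illustrating the rule of thumb:
// connection timeout = session timeout / number of ZooKeeper nodes.
final class TimeoutRule {
    private TimeoutRule() {}

    /** Returns a suggested connection timeout in milliseconds. */
    static int suggestedConnectionTimeoutMs(int sessionTimeoutMs, int zkNodeCount) {
        if (sessionTimeoutMs <= 0 || zkNodeCount <= 0) {
            throw new IllegalArgumentException("timeout and node count must be positive");
        }
        // Integer division: any remainder is dropped.
        return sessionTimeoutMs / zkNodeCount;
    }

    public static void main(String[] args) {
        int sessionTimeoutMs = 60_000; // assumed value, not taken from the question
        int zkNodeCount = 2;           // two hosts appear in the connection string
        int connectionTimeoutMs = suggestedConnectionTimeoutMs(sessionTimeoutMs, zkNodeCount);
        System.out.println("sessionTimeoutMs    = " + sessionTimeoutMs);
        System.out.println("connectionTimeoutMs = " + connectionTimeoutMs);
        // With Curator on the classpath, these values would typically be passed to
        // CuratorFrameworkFactory.builder()
        //     .sessionTimeoutMs(sessionTimeoutMs)
        //     .connectionTimeoutMs(connectionTimeoutMs)
    }
}
```

The result is a starting point rather than a guarantee. On a heavily loaded host, long GC pauses or network stalls can still exceed whatever value is chosen.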

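Other settings to watch are the retry policy and the interval between retries (if you use the RetryNTimes policy, it is worth checking that the retry count is not too large, along with the sleep interval between retries).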
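One possible way to sanity-check a RetryNTimes configuration is to compute the worst-case time spent sleeping between retries and compare it with a time budget. The sketch below assumes that the session timeout is a reasonable budget, which the answer does not state. The class name, method names, and sample values are hypothetical.

```java
// Hypothetical helper: estimates how long a fixed-interval retry policy
// (such as Curator's RetryNTimes(n, sleepMsBetweenRetries)) may sleep in total.
final class RetryBudget {
    private RetryBudget() {}

    /** Worst-case total sleep time in milliseconds across all retries. */
    static long totalSleepMs(int retries, int sleepMsBetweenRetries) {
        if (retries < 0 || sleepMsBetweenRetries < 0) {
            throw new IllegalArgumentException("values must be non-negative");
        }
        // Widen to long before multiplying to avoid int overflow.
        return (long) retries * sleepMsBetweenRetries;
    }

    /** True if the total sleep time stays within the given budget. */
    static boolean fitsWithin(int retries, int sleepMsBetweenRetries, long budgetMs) {
        return totalSleepMs(retries, sleepMsBetweenRetries) <= budgetMs;
    }

    public static void main(String[] args) {
        int retries = 5;                 // assumed sample value
        int sleepMsBetweenRetries = 1_000; // assumed sample value
        long sessionTimeoutMs = 60_000;  // assumed budget, not from the question
        System.out.println("total sleep ms = " + totalSleepMs(retries, sleepMsBetweenRetries));
        System.out.println("fits in budget = "
                + fitsWithin(retries, sleepMsBetweenRetries, sessionTimeoutMs));
    }
}
```

This counts only the sleep between attempts. Each attempt can also block for up to the connection timeout, so the real elapsed time may be noticeably longer.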