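Client node and master node

Time: 2015-10-01 01:54:27

Tags: elasticsearch

In short, I have a standalone ES master instance and a client node created inside my Java application. If the standalone ES instance is started before the client node, the client node discovers it correctly.

The problem I am facing: if for some reason the client node starts before the standalone ES instance, I see a "MasterNotDiscoveredException", which is expected. However, I keep seeing the same exception even after the standalone ES instance has been started. Is there some configuration I should change to fix this?

I am using ES 1.7.1 with unicast discovery.

Edit

Cluster info: the standalone ES instance and the client node together form the cluster.

Client node stack trace: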

11:29:35,634 INFO  http [496648366, id=7BCBFQLCTWOO2, ide=tcp://172.17.78.80:61616] [Squidboy] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/172.17.78.80:9200]}
11:29:35,635 INFO  node [496648366, id=7BCBFQLCTWOO2, ide=tcp://172.17.78.80:61616] [Squidboy] started
11:30:10,279 ERROR ApplicationLifeCycle [299961584] System startup not complete after 120 seconds ...
11:30:14,706 WARN  ElasticSearchStatus [278792216] An Exception occurred during cluster health status update - java.util.concurrent.ExecutionException: org.elasticsearch.discovery.MasterNotDiscoveredException: waited for [30s]
        at org.elasticsearch.common.util.concurrent.BaseFuture$Sync.getValue(BaseFuture.java:292)
        at org.elasticsearch.common.util.concurrent.BaseFuture$Sync.get(BaseFuture.java:279)
        at org.elasticsearch.common.util.concurrent.BaseFuture.get(BaseFuture.java:117)
        at com.harry.elastic.node.ElasticSearchStatus.updateClusterHealth(ElasticSearchStatus.java:90)
        at com.harry.elastic.node.ElasticSearchStatus.access$000(ElasticSearchStatus.java:37)
        at com.harry.elastic.node.ElasticSearchStatus$1.run(ElasticSearchStatus.java:62)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.discovery.MasterNotDiscoveredException: waited for [30s]
        at org.elasticsearch.action.support.master.TransportMasterNodeOperationAction$4.onTimeout(TransportMasterNodeOperationAction.java:164)
        at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:231)
        at org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:560)
        ... 3 more

Client creation code

// Creates an embedded client node (client(true), data(false)) that joins
// the "harryService" cluster using unicast discovery only.
private Node createEmbeddedClientNode() {
    ImmutableSettings.Builder settingsBuilder = ImmutableSettings.settingsBuilder()
            .put("discovery.zen.ping.multicast.enabled", false)
            .put("discovery.zen.ping.unicast.hosts", "localhost[9300-9400]");
    return nodeBuilder().settings(settingsBuilder).clusterName("harryService")
            .client(true).data(false).node();
}

Master discovery configuration

"discovery": {
    "zen": {
      "ping": {
        "multicast": {
          "enabled": false
        }
      }
    }
}

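2 answers:

Answer 0: (score: 2)

By default, your client node retries pinging the master node 3 times, waiting 30 seconds each time, and then gives up. So if your master node is started after that time has elapsed, the client node will not discover it.

Try increasing the number of retries and/or the timeout; that should help.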

.put("discovery.zen.fd.ping_timeout", "1m")
.put("discovery.zen.fd.ping_retries", 5)

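With these settings, your client node will keep trying for 5 minutes (5 retries of 1 minute each) instead of just 1.5 minutes (3 retries of 30 seconds each). That said, your master node is expected to be up already when you start your application.

Another setting that might help is the following one. It defaults to true, which means your master ignores pings from client nodes during master election. Since you have a single master node it may not make any difference, but it is worth a try: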
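
The retry-window arithmetic above can be checked with a few lines of plain Java. This is only a sketch: it assumes, as the answer does, that the total waiting time is simply the ping timeout multiplied by the number of retries. The class and method names (DiscoveryRetryWindow, retryWindow) are made up for illustration and are not part of Elasticsearch; java.time requires Java 8 or later.

```java
import java.time.Duration;

public class DiscoveryRetryWindow {

    // Upper bound on the waiting time, assuming each of `retries` pings
    // is allowed to run for the full `pingTimeout` (hypothetical helper).
    public static Duration retryWindow(Duration pingTimeout, int retries) {
        return pingTimeout.multipliedBy(retries);
    }

    public static void main(String[] args) {
        // Defaults quoted in the answer: 30s timeout, 3 retries.
        Duration defaults = retryWindow(Duration.ofSeconds(30), 3);
        // Values proposed in the answer: 1m timeout, 5 retries.
        Duration proposed = retryWindow(Duration.ofMinutes(1), 5);

        System.out.println("default window:  " + defaults.getSeconds() + "s");  // 90s
        System.out.println("proposed window: " + proposed.getSeconds() + "s");  // 300s
    }
}
```

Whatever values are chosen, the window stays finite, so a master that comes up later than that would still be missed under the answer's reasoning.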

.put("discovery.zen.master_election.filter_client", false)
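
Since this answer expects the master to be up before the application starts, one application-side option is to poll until the cluster responds before doing any real work. The sketch below is not Elasticsearch API: WaitForMaster and waitUntilReady are hypothetical names, and the check passed in is a stand-in for whatever health call the application already makes (for example the cluster health request visible in the stack trace). It uses only the JDK and assumes Java 8 or later.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;

public class WaitForMaster {

    /**
     * Polls `check` until it returns true or `maxAttempts` is exhausted.
     * Any exception thrown by the check (for example a wrapped
     * MasterNotDiscoveredException) is treated as "not ready yet".
     */
    public static boolean waitUntilReady(Callable<Boolean> check, int maxAttempts, long delayMillis)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (Boolean.TRUE.equals(check.call())) {
                    return true;
                }
            } catch (InterruptedException e) {
                throw e; // let the caller handle interruption
            } catch (Exception e) {
                // Not ready yet; fall through and retry.
            }
            if (attempt < maxAttempts) {
                TimeUnit.MILLISECONDS.sleep(delayMillis);
            }
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in check that only succeeds on the third call.
        final int[] calls = {0};
        boolean ready = waitUntilReady(() -> ++calls[0] >= 3, 5, 10L);
        System.out.println("ready=" + ready + " after " + calls[0] + " checks");
    }
}
```

In a real application the check would wrap the existing health request, and the attempt count and delay would be chosen to cover the longest expected gap between starting the client and starting the master.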

Answer 1: (score: 1)

I solved this by explicitly adding the unicast configuration to the master node as well.

"discovery": {
    "zen": {
      "ping": {
        "multicast": {
          "enabled": false
        },
        "unicast": {
            "hosts": "localhost[9300-9400]"
        }
      }
    }
}