Elasticsearch server returns status 503, discovery fails

Date: 2015-01-09 14:49:28

Tags: elasticsearch

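I am building a cluster of nodes. Two of them work fine (they have joined into one cluster). I am trying to add a third (called eu5), but when it starts it does not join the cluster: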

[root@eu5:/etc/elasticsearch]# curl eu5:9200
{
  "status" : 503,
  "name" : "eu5",
  "cluster_name" : "security",
  "version" : {
    "number" : "1.4.2",
    "build_hash" : "927caff6f05403e936c20bf4529f144f0c89fd8c",
    "build_timestamp" : "2014-12-16T14:11:12Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.2"
  },
  "tagline" : "You Know, for Search"
}

The logs mention a discovery problem:

[2015-01-09 15:35:23,399][INFO ][node                     ] [eu5] starting ...
[2015-01-09 15:35:23,468][INFO ][transport                ] [eu5] bound_address {inet[/10.81.147.186:9300]}, publish_address {inet[/10.81.147.186:9300]}
[2015-01-09 15:35:23,475][INFO ][discovery                ] [eu5] security/FdjfWCWgT-mQtipLdi9BFA
[2015-01-09 15:35:53,476][WARN ][discovery                ] [eu5] waited for 30s and no initial state was set by the discovery
[2015-01-09 15:35:53,493][INFO ][http                     ] [eu5] bound_address {inet[/10.81.147.186:9200]}, publish_address {inet[/10.81.147.186:9200]}
[2015-01-09 15:35:53,494][INFO ][node                     ] [eu5] started

The configuration forces unicast:

cluster.name: security
node.name: eu5
network.host: 10.81.147.186
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast: ["elk.example.com"]

and the host given as the unicast hint is reachable from the server I want to join to the cluster:

[root@eu5:/etc/elasticsearch]# curl elk.example.com:9200
{
  "status" : 200,
  "name" : "eu4",
  "cluster_name" : "security",
  "version" : {
    "number" : "1.4.2",
    "build_hash" : "927caff6f05403e936c20bf4529f144f0c89fd8c",
    "build_timestamp" : "2014-12-16T14:11:12Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.2"
  },
  "tagline" : "You Know, for Search"
}

Ports 9200 and 9300 are reachable in both directions, from the server I want to join:

[root@eu5:/etc/elasticsearch]# nmap -p9200,9300 elk.example.com
(...)
PORT     STATE SERVICE
9200/tcp open  wap-wsp
9300/tcp open  vrace

as well as from the master to this server:

[root@eu4:/etc/elasticsearch]#  nmap -p9200,9300 eu5.example.com
(...)
PORT     STATE SERVICE
9200/tcp open  wap-wsp
9300/tcp open  vrace

Is there anything else I should check?

Update: following Andrei Stefan's comment, I switched logging to DEBUG. During the discovery phase (until the timeout after 30 seconds) I get lines such as:

[2015-01-12 11:14:41,609][DEBUG][discovery.zen            ] [eu5] filtered ping responses: (filter_client[true], filter_data[false]) {none}
[2015-01-12 11:14:44,615][DEBUG][discovery.zen            ] [eu5] filtered ping responses: (filter_client[true], filter_data[false]) {none}

A quick look at the code (I do not know Java) seems to suggest that {none} means the pings failed.
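
For anyone wanting to reproduce the DEBUG output above: the question does not say how the log level was changed. One way in Elasticsearch 1.x is to raise the discovery logger in logging.yml (the file that sits next to elasticsearch.yml). The fragment below is a sketch that assumes the stock logging.yml layout:

```yaml
# logging.yml (Elasticsearch 1.x); assumption: the stock file layout is in use
logger:
  # raise only the discovery module, which emits the
  # "filtered ping responses" lines shown above
  discovery: DEBUG
```

Raising only this one logger keeps the rest of the log at its normal level, so the ping lines are easier to find.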

The tests I ran above show that, from the operating system's point of view, connectivity is fine.

Update 2: here is a tcpdump corresponding to the events above. eu5, the machine that wants to join, is 10.81.144.186

Full image: http://i.stack.imgur.com/vLi7r.png

Update 3: I filed a bug report.

1 answer:

Answer 0 (score: 1)

There was an error in the configuration. The setting name should be

discovery.zen.ping.unicast.hosts

The trailing hosts was missing.
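
As a sketch, this is the elasticsearch.yml from the question with only the key name corrected. Every value is copied from the question as posted. The comment about the misspelled key being ignored is an inference from the logs shown there, which contain no warning about it:

```yaml
cluster.name: security
node.name: eu5
network.host: 10.81.147.186
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false
# The key must end in ".hosts". With the shorter key "discovery.zen.ping.unicast"
# the node apparently had no seed host to ping (inferred from the
# "filtered ping responses ... {none}" lines), so discovery timed out.
discovery.zen.ping.unicast.hosts: ["elk.example.com"]
```

After restarting the node with this file, curl eu5:9200 should report status 200 once the node has joined the cluster.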