Why can't I connect to Gremlin-Server?

Time: 2017-01-21 19:52:53

Tags: titan gremlin-server

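Summary

I am trying to set up a Titan / Cassandra / Gremlin-Server stack in Docker (v1.13.0). The problem I am facing is that applications attempting to connect to Gremlin-Server on the default port 8182 report errors (details below).

First, here is some relevant version information: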

  • Cassandra v2.2.8
  • Titan v1.0.0(Hadoop 1)
  • Gremlin 3.2.3

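Setup

The setup is done in a Dockerfile so that it is reproducible. It assumes that a Cassandra container already exists and is running with a cassandra.yaml in which start_rpc has been set to true.

The Dockerfile is as follows: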

FROM openjdk:alpine

ENV TITAN 'titan-1.0.0-hadoop1'

RUN apk update && apk add bash unzip && rm -rf /var/cache/apk/* \
    && adduser -S -s /bin/bash -D srg \
    && wget -O /tmp/$TITAN.zip http://s3.thinkaurelius.com/downloads/titan/$TITAN.zip \
    && unzip /tmp/$TITAN.zip -d /opt && ln -s /opt/$TITAN /opt/titan \
    && rm /tmp/*.zip \
    && chown -R srg /opt/$TITAN/ \
    && /opt/titan/bin/gremlin-server.sh -i org.apache.tinkerpop gremlin-python 3.2.3

COPY conf/gremlin-server/* /opt/$TITAN/conf/gremlin-server/

USER srg
WORKDIR /opt/titan
EXPOSE 8182

CMD ["bin/gremlin-server.sh", "conf/gremlin-server/srg.yaml"]

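Astute readers will notice that I am copying custom configuration files into the container, namely the Gremlin-Server configuration file (srg.yaml) and the Titan graph properties file (srg.properties).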

srg.yaml

host: localhost
port: 8182
threadPoolWorker: 1
gremlinPool: 8
scriptEvaluationTimeout: 30000
serializedResponseTimeout: 30000
channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
graphs: {
  graph: conf/gremlin-server/srg.properties
  }
plugins:
  - aurelius.titan
scriptEngines: {
  gremlin-groovy: {
    imports: [java.lang.Math],
    staticImports: [java.lang.Math.PI],
    scripts: [scripts/empty-sample.groovy]},
  gremlin-jython: {},
  gremlin-python: {},
  nashorn: {
      imports: [java.lang.Math],
      staticImports: [java.lang.Math.PI]}}
serializers:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { useMapperFromGraph: graph }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { useMapperFromGraph: graph }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { useMapperFromGraph: graph }}
processors:
  - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
metrics: {
  consoleReporter: {enabled: true, interval: 180000},
  csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
  jmxReporter: {enabled: true},
  slf4jReporter: {enabled: true, interval: 180000},
  gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
  graphiteReporter: {enabled: false, interval: 180000}}
threadPoolBoss: 1
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536
ssl: {
  enabled: false}

srg.properties

gremlin.graph=com.thinkaurelius.titan.core.TitanFactory
storage.backend=cassandrathrift
# refers to the linked container
storage.hostname=cassandra
cache.db-cache = true
cache.db-cache-clean-wait = 20
cache.db-cache-time = 180000
cache.db-cache-size = 0.25

# Start elasticsearch inside the Titan JVM
index.search.backend=elasticsearch
index.search.directory=db/es
index.search.elasticsearch.client-only=false
index.search.elasticsearch.local-mode=true

Execution

The container is run with the following command: docker run -ti --rm=true --link test.cassandra:cassandra -p 8182:8182 titan

Here is the log output from Gremlin-Server:

0    [main] INFO  org.apache.tinkerpop.gremlin.server.GremlinServer  - 
         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----

297  [main] INFO  org.apache.tinkerpop.gremlin.server.GremlinServer  - Configuring Gremlin Server from conf/gremlin-server/srg.yaml
439  [main] INFO  org.apache.tinkerpop.gremlin.server.util.MetricManager  - Configured Metrics ConsoleReporter configured with report interval=180000ms
448  [main] INFO  org.apache.tinkerpop.gremlin.server.util.MetricManager  - Configured Metrics CsvReporter configured with report interval=180000ms to fileName=/tmp/gremlin-server-metrics.csv
557  [main] INFO  org.apache.tinkerpop.gremlin.server.util.MetricManager  - Configured Metrics JmxReporter configured with domain= and agentId=
561  [main] INFO  org.apache.tinkerpop.gremlin.server.util.MetricManager  - Configured Metrics Slf4jReporter configured with interval=180000ms and loggerName=org.apache.tinkerpop.gremlin.server.Settings$Slf4jReporterMetrics
1750 [main] INFO  com.thinkaurelius.titan.core.util.ReflectiveConfigOptionLoader  - Loaded and initialized config classes: 12 OK out of 12 attempts in PT0.148S
1972 [main] INFO  com.thinkaurelius.titan.diskstorage.cassandra.thrift.CassandraThriftStoreManager  - Closed Thrift connection pooler.
1990 [main] INFO  com.thinkaurelius.titan.graphdb.configuration.GraphDatabaseConfiguration  - Generated unique-instance-id=ac1100031-ad2d5ffa52e81
2026 [main] INFO  com.thinkaurelius.titan.diskstorage.Backend  - Configuring index [search]
2386 [main] INFO  org.elasticsearch.node  - [Lunatik] version[1.5.1], pid[1], build[5e38401/2015-04-09T13:41:35Z]
2387 [main] INFO  org.elasticsearch.node  - [Lunatik] initializing ...
2399 [main] INFO  org.elasticsearch.plugins  - [Lunatik] loaded [], sites []
6471 [main] INFO  org.elasticsearch.node  - [Lunatik] initialized
6472 [main] INFO  org.elasticsearch.node  - [Lunatik] starting ...
6477 [main] INFO  org.elasticsearch.transport  - [Lunatik] bound_address {local[1]}, publish_address {local[1]}
6507 [main] INFO  org.elasticsearch.discovery  - [Lunatik] elasticsearch/u2StmRW1RsyEHw561yoNFw
6519 [elasticsearch[Lunatik][clusterService#updateTask][T#1]] INFO  org.elasticsearch.cluster.service  - [Lunatik] master {new [Lunatik][u2StmRW1RsyEHw561yoNFw][ad2d5ffa52e8][local[1]]{local=true}}, removed {[Lunatik][kKyL9UE-R123LLZTTrsVCw][ad2d5ffa52e8][local[1]]{local=true},}, reason: local-disco-initial_connect(master)
6908 [main] INFO  org.elasticsearch.http  - [Lunatik] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/172.17.0.3:9200]}
6909 [main] INFO  org.elasticsearch.node  - [Lunatik] started
6923 [elasticsearch[Lunatik][clusterService#updateTask][T#1]] INFO  org.elasticsearch.gateway  - [Lunatik] recovered [0] indices into cluster_state
7486 [elasticsearch[Lunatik][clusterService#updateTask][T#1]] INFO  org.elasticsearch.cluster.metadata  - [Lunatik] [titan] creating index, cause [api], templates [], shards [5]/[1], mappings []
8075 [main] INFO  com.thinkaurelius.titan.diskstorage.Backend  - Initiated backend operations thread pool of size 4
8241 [main] INFO  com.thinkaurelius.titan.diskstorage.Backend  - Configuring total store cache size: 94787290
8641 [main] INFO  com.thinkaurelius.titan.diskstorage.log.kcvs.KCVSLog  - Loaded unidentified ReadMarker start time 2017-01-21T16:31:28.750Z into com.thinkaurelius.titan.diskstorage.log.kcvs.KCVSLog$MessagePuller@3520958b
8642 [main] INFO  org.apache.tinkerpop.gremlin.server.GremlinServer  - Graph [graph] was successfully configured via [conf/gremlin-server/srg.properties].
8643 [main] INFO  org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor  - Initialized Gremlin thread pool.  Threads in pool named with pattern gremlin-*
14187 [main] INFO  com.jcabi.manifests.Manifests  - 108 attributes loaded from 264 stream(s) in 185ms, 108 saved, 3371 ignored: ["Agent-Class", "Ant-Version", "Archiver-Version", "Bnd-LastModified", "Boot-Class-Path", "Build-Date", "Build-Host", "Build-Id", "Build-Java-Version", "Build-Jdk", "Build-Job", "Build-Number", "Build-Time", "Build-Timestamp", "Build-Version", "Built-At", "Built-By", "Built-OS", "Built-On", "Built-Status", "Bundle-ActivationPolicy", "Bundle-Activator", "Bundle-BuddyPolicy", "Bundle-Category", "Bundle-ClassPath", "Bundle-Classpath", "Bundle-Copyright", "Bundle-Description", "Bundle-DocURL", "Bundle-License", "Bundle-Localization", "Bundle-ManifestVersion", "Bundle-Name", "Bundle-NativeCode", "Bundle-RequiredExecutionEnvironment", "Bundle-SymbolicName", "Bundle-Vendor", "Bundle-Version", "Can-Redefine-Classes", "Change", "Class-Path", "Created-By", "DynamicImport-Package", "Eclipse-AutoStart", "Eclipse-BuddyPolicy", "Eclipse-SourceReferences", "Embed-Dependency", "Embedded-Artifacts", "Export-Package", "Extension-Name", "Extension-name", "Fragment-Host", "Git-Commit-Branch", "Git-Commit-Date", "Git-Commit-Hash", "Git-Committer-Email", "Git-Committer-Name", "Gradle-Version", "Gremlin-Lib-Paths", "Gremlin-Plugin-Dependencies", "Gremlin-Plugin-Paths", "Ignore-Package", "Implementation-Build", "Implementation-Build-Date", "Implementation-Title", "Implementation-URL", "Implementation-Vendor", "Implementation-Vendor-Id", "Implementation-Version", "Import-Package", "Include-Resource", "JCabi-Build", "JCabi-Date", "JCabi-Version", "Java-Vendor", "Java-Version", "Main-Class", "Main-class", "Manifest-Version", "Maven-Version", "Module-Email", "Module-Origin", "Module-Owner", "Module-Source", "Originally-Created-By", "Os-Arch", "Os-Name", "Os-Version", "Package", "Premain-Class", "Private-Package", "Require-Bundle", "Require-Capability", "Scm-Connection", "Scm-Revision", "Scm-Url", "Specification-Title", "Specification-Vendor", "Specification-Version", 
"Tool", "X-Compile-Source-JDK", "X-Compile-Target-JDK", "hash", "implementation-version", "mode", "package", "url", "version"]
14842 [main] INFO  org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines  - Loaded gremlin-jython ScriptEngine
15540 [main] INFO  org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines  - Loaded nashorn ScriptEngine
16076 [main] INFO  org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines  - Loaded gremlin-python ScriptEngine
16553 [main] INFO  org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines  - Loaded gremlin-groovy ScriptEngine
17410 [main] INFO  org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor  - Initialized gremlin-groovy ScriptEngine with scripts/empty-sample.groovy
17410 [main] INFO  org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor  - Initialized GremlinExecutor and configured ScriptEngines.
17419 [main] INFO  org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor  - A GraphTraversalSource is now bound to [g] with graphtraversalsource[standardtitangraph[cassandrathrift:[cassandra]], standard]
17565 [main] INFO  org.apache.tinkerpop.gremlin.server.AbstractChannelizer  - Configured application/vnd.gremlin-v1.0+gryo with org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0
17566 [main] INFO  org.apache.tinkerpop.gremlin.server.AbstractChannelizer  - Configured application/vnd.gremlin-v1.0+gryo-stringd with org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0
17808 [main] INFO  org.apache.tinkerpop.gremlin.server.AbstractChannelizer  - Configured application/vnd.gremlin-v1.0+json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0
17811 [main] INFO  org.apache.tinkerpop.gremlin.server.AbstractChannelizer  - Configured application/json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0
17958 [gremlin-server-boss-1] INFO  org.apache.tinkerpop.gremlin.server.GremlinServer  - Gremlin Server configured with worker thread pool of 1, gremlin pool of 8 and boss thread pool of 1.
17959 [gremlin-server-boss-1] INFO  org.apache.tinkerpop.gremlin.server.GremlinServer  - Channel started at port 8182.
1/21/17 4:34:20 PM =============================================================

-- Meters ----------------------------------------------------------------------
org.apache.tinkerpop.gremlin.server.GremlinServer.errors
             count = 0
         mean rate = 0.00 events/second
     1-minute rate = 0.00 events/second
     5-minute rate = 0.00 events/second
    15-minute rate = 0.00 events/second


180564 [metrics-logger-reporter-thread-1] INFO  org.apache.tinkerpop.gremlin.server.Settings$Slf4jReporterMetrics  - type=METER, name=org.apache.tinkerpop.gremlin.server.GremlinServer.errors, count=0, mean_rate=0.0, m1=0.0, m5=0.0, m15=0.0, rate_unit=events/second

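Symptoms

Up to this point, everything appears to work as expected. The logs indicate that I was able to load srg.properties and bind the data structure to a variable named graph.

The problem appears when I try to connect to the Gremlin-Server instance through the exposed port 8182, for example using gremlin-python: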

# executed via python 3.6.0 on the host machine, i.e. not inside of Docker
from gremlin_python import statics
from gremlin_python.structure.graph import Graph
from gremlin_python.process.graph_traversal import __
from gremlin_python.process.strategies import *
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

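# Graph() is only a local handle; traversals are sent to the remote server.
graph = Graph()
# 'g' is the GraphTraversalSource name that the server log reports as bound.
g = graph.traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/gremlin','g'))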

This produces the following exception:

---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
<ipython-input-10-59ad504f29b4> in <module>()
----> 1 g = graph.traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/','g'))

/Users/lthibault/.pyenv/versions/3.6.0/lib/python3.6/site-packages/gremlin_python/driver/driver_remote_connection.py in __init__(self, url, traversal_source, username, password, loop, graphson_reader, graphson_writer)
     41         self._password = password
     42         if loop is None: self._loop = ioloop.IOLoop.current()
---> 43         self._websocket = self._loop.run_sync(lambda: websocket.websocket_connect(self.url))
     44         self._graphson_reader = graphson_reader or GraphSONReader()
     45         self._graphson_writer = graphson_writer or GraphSONWriter()

/Users/lthibault/.pyenv/versions/3.6.0/lib/python3.6/site-packages/tornado/ioloop.py in run_sync(self, func, timeout)
    455         if not future_cell[0].done():
    456             raise TimeoutError('Operation timed out after %s seconds' % timeout)
--> 457         return future_cell[0].result()
    458 
    459     def time(self):

/Users/lthibault/.pyenv/versions/3.6.0/lib/python3.6/site-packages/tornado/concurrent.py in result(self, timeout)
    235             return self._result
    236         if self._exc_info is not None:
--> 237             raise_exc_info(self._exc_info)
    238         self._check_done()
    239         return self._result

/Users/lthibault/.pyenv/versions/3.6.0/lib/python3.6/site-packages/tornado/util.py in raise_exc_info(exc_info)

HTTPError: HTTP 599: Stream closed

Suspecting that the problem was specific to this library, I ran two checks:

1) Tried connecting to the websocket port using nc

$ nc -z -v localhost 8182
found 0 associations
found 1 connections:
     1: flags=82<CONNECTED,PREFERRED>
    outif lo0
    src ::1 port 58627
    dst ::1 port 8182
    rank info not available
    TCP aux info available

Connection to localhost port 8182 [tcp/*] succeeded!

2) Tried connecting to Gremlin-Server using a different client library, namely go-gremlin

Test case:

package main

import (
    "fmt"
    "log"

    "github.com/go-gremlin/gremlin"
)

func main() {
    if err := gremlin.NewCluster("ws://localhost:8182/gremlin"); err != nil {
        log.Fatal(err)
    }

    data, err := gremlin.Query(`graph.V()`).Exec()
    if err != nil {
        log.Fatalf("Query error: %s", err)
    }

    fmt.Println(string(data))
}

Output:

$ go run cmd/test/main.go 
2017/01/21 14:47:42 Query error: unexpected EOF
exit status 1

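Current conclusions & questions

From the tests above, I conclude that this is an application-level problem (i.e. a problem at the websocket or ws protocol level, rather than in the host or container network stack). Indeed, nc reports that the socket connection succeeds, but the Python and Go client libraries both appear to complain about an improper (empty) response from the server.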

I tried removing the /gremlin path from the websocket URL in both gremlin-python and go-gremlin, to no avail.
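
The gap between "nc connects" and "the client library fails" can be narrowed with a raw handshake probe. The following is a minimal sketch, not part of the original post: it opens a TCP connection, sends a bare websocket upgrade request, and reports whether the peer answered with HTTP 101, answered with something else, or closed the stream without replying. The function names (build_upgrade_request, classify, probe) are hypothetical, and the defaults simply mirror the URL used above. One plausible reading of the symptoms, offered here as an assumption, is that Docker's port publishing accepts the TCP connection on the host even though nothing inside the container is listening on a reachable address.

```python
import base64
import os
import socket


def build_upgrade_request(host: str, port: int, path: str = "/gremlin") -> bytes:
    """Build a minimal websocket client handshake (an HTTP/1.1 Upgrade request)."""
    # The key is 16 random bytes, base64-encoded, as the handshake requires.
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}:{port}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def classify(response: bytes) -> str:
    """Interpret the first bytes received after sending the handshake."""
    if not response:
        # The peer closed the stream without sending anything back.
        return "closed-without-reply"
    parts = response.split(b"\r\n", 1)[0].split()
    if len(parts) >= 2 and parts[0].startswith(b"HTTP/") and parts[1] == b"101":
        return "websocket-upgraded"
    return "unexpected-reply"


def probe(host: str = "localhost", port: int = 8182,
          path: str = "/gremlin", timeout: float = 5.0) -> str:
    """Connect over TCP, send the handshake, and classify whatever comes back."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_upgrade_request(host, port, path))
        try:
            data = sock.recv(4096)
        except ConnectionResetError:
            # A reset is treated the same as an empty reply in this sketch.
            data = b""
    return classify(data)
```

Calling probe() from the host while the container is running would return "closed-without-reply" if the connection is accepted but dropped before any websocket server sees it, which would match both the "Stream closed" and "unexpected EOF" errors above.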

My question is: where do I go from here? Any suggestions or diagnostic paths would be greatly appreciated!

1 answer:

Answer 0: (score: 7)

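The main problem is that host in the Gremlin Server configuration is set to its default value, localhost. This only allows connections from the server itself (here, from inside the container). You need to change the value to the server's external IP or to 0.0.0.0.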
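
As a rough illustration of that fix, the sketch below checks the top-level host entry of a Gremlin Server YAML file and rewrites it to 0.0.0.0. It is an assumption-laden helper, not anything shipped with Titan or TinkerPop: the function names are hypothetical, and it uses a regular expression on a simple "host: value" line rather than a real YAML parser so that it needs only the standard library.

```python
import ipaddress
import re
from typing import Optional

# Matches a top-level "host: value" line; a trailing comment is left untouched.
_HOST_LINE = re.compile(r"^host:[ \t]*([^\s#]+)", re.MULTILINE)


def configured_host(yaml_text: str) -> Optional[str]:
    """Return the top-level host value from the YAML text, or None if absent."""
    match = _HOST_LINE.search(yaml_text)
    return match.group(1) if match else None


def is_loopback_only(host: str) -> bool:
    """True when the bind address accepts connections only from the same machine."""
    if host.lower() == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Some other hostname; this sketch does not try to resolve it.
        return False


def with_host(yaml_text: str, new_host: str = "0.0.0.0") -> str:
    """Return the YAML text with the top-level host value replaced."""
    return _HOST_LINE.sub(f"host: {new_host}", yaml_text, count=1)
```

Applied to the srg.yaml in the question, configured_host would return localhost, is_loopback_only would flag it, and with_host would produce a file whose first line reads host: 0.0.0.0. The image would then need to be rebuilt so that the updated file is copied into the container.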

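The other problem is that the gremlin-python server plugin was first made available with Apache TinkerPop 3.2.2, whereas Titan 1.0.0 uses TinkerPop 3.0.1. I doubt that the gremlin-python 3.2.3 plugin will work with Titan 1.0.0.

Update: consider using JanusGraph 0.1.1, which uses TinkerPop 3.2.3. JanusGraph was forked from Titan, so the code is largely the same, with updated dependencies.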