Question

I have this Gremlin query, which is issued against a CosmosDB database using the Gremlin.Net driver:

g
  .V('Alice')
  .as('v')
  .V('Bob')
  .coalesce(
     __.inE('spokeWith')
    .where(
       outV()
      .as('v')),
    addE('spokeWith')
    .property('date', '10.02.2019 20:16:38').from('v'))

The idea is to add an edge between the two vertices only if it does not already exist.

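The problem: this query appears to be very expensive, as Azure charges me roughly 600-1600 request units (RUs) for it. At that rate I hit my throughput limit very quickly.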

Is there a better way to express this query so that it costs fewer request units?

Answer 1

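I don't know how CosmosDB comes up with its request units, but I don't think there is a more efficient way to create an edge if it doesn't already exist. In both cases, whether the edge already exists or not, the traversal really only uses a minimum of resources: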

gremlin> g.addV().property(id, 'Alice').addV().property(id, 'Bob').iterate()
gremlin> g.V('Alice').as('v'). /* CREATE */
           V('Bob').
           coalesce(inE('spokeWith').where(outV().as('v')),
                    addE('spokeWith').property('date', 'xyz').from('v')).
           profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
TinkerGraphStep(vertex,[Alice])@[v]                                    1           1           0.056    13.06
TinkerGraphStep(vertex,[Bob])                                          1           1           0.054    12.74
CoalesceStep([[VertexStep(IN,[spokeWith],edge),...                     1           1           0.319    74.20
  VertexStep(IN,[spokeWith],edge)                                                              0.041
  WhereTraversalStep([WhereStartStep, ProfileSt...                                             0.009
  AddEdgeStep({date=[xyz], ~from=[[SelectOneSte...                     1           1           0.105
    SelectOneStep(last,v)                                              1           1           0.016
                                            >TOTAL                     -           -           0.430        -
gremlin> g.V('Alice').as('v'). /* GET */
           V('Bob').
           coalesce(inE('spokeWith').where(outV().as('v')),
                    addE('spokeWith').property('date', 'xyz').from('v')).
           profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
TinkerGraphStep(vertex,[Alice])@[v]                                    1           1           0.056     9.68
TinkerGraphStep(vertex,[Bob])                                          1           1           0.018     3.09
CoalesceStep([[VertexStep(IN,[spokeWith],edge),...                     1           1           0.509    87.23
  VertexStep(IN,[spokeWith],edge)                                      1           1           0.044
  WhereTraversalStep([WhereStartStep, ProfileSt...                     1           1           0.414
    WhereStartStep                                                     1           1           0.036
    EdgeVertexStep(OUT)                                                1           1           0.331
    WhereEndStep(v)                                                                            0.012
                                            >TOTAL                     -           -           0.583        -

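As you can see, there is only 1 traverser in either case; it can't get any cheaper than that. Well, unless Bob has other incoming spokeWith edges and CosmosDB can't optimize the adjacent-vertex filter (where(outV().as('v'))); in that case you'll end up with more traversers. However, I don't know how CosmosDB handles this case under the hood. Maybe try running the query with .profile() directly against CosmosDB and keep an eye on the traverser counts. From a TinkerPop perspective, though, the traversal is as optimal as it can be.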
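To make the point about extra traversers concrete, here is a minimal sketch in plain Python. It is a toy model only: `ToyGraph` and `get_or_create_edge` are hypothetical names invented for this illustration, and nothing here reflects how CosmosDB or TinkerPop is actually implemented. It assumes the worst case described above, where the adjacent-vertex filter is not optimized, so every incoming spokeWith edge of Bob becomes one traverser that has to be checked.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGraph:
    """Hypothetical in-memory graph; each edge is an (out_id, label, in_id) tuple."""
    edges: list = field(default_factory=list)

    def in_edges(self, vertex_id, label):
        # Rough analogue of inE(label): every incoming edge with that label.
        return [e for e in self.edges if e[1] == label and e[2] == vertex_id]


def get_or_create_edge(graph, out_id, in_id, label):
    """Toy analogue of coalesce(inE(label).where(outV().as('v')), addE(label).from('v')).

    Returns (edges, created, examined), where `examined` is the number of
    incoming edges the unoptimized filter had to look at.
    """
    candidates = graph.in_edges(in_id, label)             # one "traverser" per incoming edge
    matches = [e for e in candidates if e[0] == out_id]   # where(outV().as('v'))
    if matches:
        return matches, False, len(candidates)
    edge = (out_id, label, in_id)                         # addE(label).from('v')
    graph.edges.append(edge)
    return [edge], True, len(candidates)


if __name__ == "__main__":
    g = ToyGraph()
    # Bob has already spoken with many other people.
    for i in range(1000):
        g.edges.append((f"person{i}", "spokeWith", "Bob"))
    for _ in range(2):
        _, created, examined = get_or_create_edge(g, "Alice", "Bob", "spokeWith")
        print(f"created={created} examined={examined}")
```

In this model the work grows with the number of incoming spokeWith edges on Bob, not with the size of the graph. If Bob's vertex is heavily connected, that alone could plausibly explain a high charge, which is why comparing the traverser counts reported by the database itself is the more reliable check.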

Expensive Gremlin query: can it be made more efficient?
