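Neo4j query with multiple OPTIONAL MATCH clauses is very slow

Time: 2016-09-30 17:59:21

Tags: neo4j

I'm new to neo4j and I'm having a hard time optimizing a query that returns a large number of nodes/relationships.

The following query: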

MATCH (u:User)-[:CAN_ADMINISTER]->(cs:CustomerSite)
WHERE u.id="1234" WITH cs
MATCH r1=(cs)<-[:AFFECTS_SITE]-(t:Ticket)
WHERE not(t.status = "COMPLETE")
OPTIONAL MATCH r2=(t)-[:HAS_EVENTS]->(te:TicketEvent)
OPTIONAL MATCH r3=(t)-[:CREATED_BY]->(u:User)
OPTIONAL MATCH r4=(te)<-[:HAS_EVENTS]-(u2:User)
OPTIONAL MATCH r5=(t)-[:AFFECTS_SITE]->(cs)<-[:HAS_SITE]-(c:Customer)
RETURN r1, r2, r3, r4, r5

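takes nearly a minute to run for a user for whom it produces ~7,000 rows. I've tried reorganizing it, without much benefit. Below is the current profile.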

[Image: query profile, not reproduced here]

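Any suggestions on what might help?

1 Answer:

Answer 0: (score: 2)

I highly recommend collecting results from your OPTIONAL MATCHes as you need them, and using WITH to break up your query and narrow down the columns you're interested in, so the row count stays low between sub-queries. As explained in the comments on the question, MATCHes and OPTIONAL MATCHes build up rows of results, which can make queries that SEEM like they should be quick turn out to be costly.

For example, here is the original query with inline comments analyzing it: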

MATCH (u:User)-[:CAN_ADMINISTER]->(cs:CustomerSite)
WHERE u.id="1234" WITH cs
MATCH r1=(cs)<-[:AFFECTS_SITE]-(t:Ticket)
WHERE not(t.status = "COMPLETE")
// we have 1 row per incomplete Ticket at each CustomerSite the User can administer
OPTIONAL MATCH r2=(t)-[:HAS_EVENTS]->(te:TicketEvent)
// now, 1 row per Ticket @ CustomerSite per TicketEvent
OPTIONAL MATCH r3=(t)-[:CREATED_BY]->(u:User)
// the above OPTIONAL MATCH had to iterate over each CS/Ticket/TE row instead of just each distinct Ticket
OPTIONAL MATCH r4=(te)<-[:HAS_EVENTS]-(u2:User)
// now, 1 row per Ticket @ CustomerSite per User on each TicketEvent
OPTIONAL MATCH r5=(t)-[:AFFECTS_SITE]->(cs)<-[:HAS_SITE]-(c:Customer)
// now, 1 row per Ticket @ CustomerSite per User on each TicketEvent per Customer at each CustomerSite
RETURN r1, r2, r3, r4, r5
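
To make the row build-up concrete, here is a rough back-of-the-envelope sketch in Python. The fan-out figures are made up (picked only so the product lands near the ~7,000 rows mentioned in the question, not measured from the asker's graph), and the function names are hypothetical:

```python
# Hypothetical fan-out figures, chosen only for illustration.
open_tickets = 500        # (cs, t) rows after the first two MATCHes
events_per_ticket = 7     # TicketEvents per Ticket
users_per_event = 2       # Users attached to each TicketEvent
customers_per_site = 1    # Customers per CustomerSite


def flat_row_count(tickets, events, users, customers):
    """Rows produced when each OPTIONAL MATCH multiplies the rows before it."""
    rows = tickets               # 1 row per Ticket @ CustomerSite
    rows *= max(events, 1)       # OPTIONAL MATCH keeps the row even with 0 matches
    rows *= max(users, 1)
    rows *= max(customers, 1)
    return rows


def creator_lookups(tickets, events, creator_first):
    """How many input rows the CREATED_BY OPTIONAL MATCH has to expand."""
    return tickets if creator_first else tickets * max(events, 1)


print(flat_row_count(open_tickets, events_per_ticket, users_per_event, customers_per_site))
print(creator_lookups(open_tickets, events_per_ticket, creator_first=False))
print(creator_lookups(open_tickets, events_per_ticket, creator_first=True))
```

With these assumed numbers, the flat query ends up with 7,000 rows and runs the creator lookup on 3,500 intermediate rows, where 500 would have been enough if that match had come first.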

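Although it changes the format of your returned data, doing COLLECTs along the way, and ordering your OPTIONAL MATCHes better, should improve the speed of your query. Here's one way you might do this: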

MATCH (u:User)-[:CAN_ADMINISTER]->(cs:CustomerSite)
WHERE u.id="1234" WITH cs
MATCH (cs)<-[:AFFECTS_SITE]-(t:Ticket)
WHERE not(t.status = "COMPLETE")
// should be 1 creator per ticket, so best to do this first
OPTIONAL MATCH (t)-[:CREATED_BY]->(creator:User)
OPTIONAL MATCH (cs)<-[:HAS_SITE]-(affectedCustomer:Customer)
// collection of affected customers for each ticket (and their creator) affecting a customer site
WITH cs, t, creator, COLLECT(affectedCustomer) as affectedCustomers
OPTIONAL MATCH (t)-[:HAS_EVENTS]->(te:TicketEvent)<-[:HAS_EVENTS]-(userOnEvent:User)
WITH cs, t, creator, affectedCustomers, te, COLLECT(userOnEvent) as usersOnEvent
RETURN cs, t, creator, affectedCustomers, COLLECT({ticketEvent:te, usersOnEvent:usersOnEvent}) as ticketEventsAndUsers

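Each row will correspond to a ticket at a customer site, along with the ticket's creator, the collection of affected customers at the site, and a collection of pairs, each holding a ticket event and the users on that event.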
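
If the grouping behaviour of WITH ... COLLECT is unfamiliar: the non-aggregated columns act as the grouping key, and COLLECT gathers the remaining column into a list, skipping nulls. A minimal Python sketch of the same idea, using made-up values and a hypothetical helper name:

```python
from collections import defaultdict

# Flat rows as they might look just before the first WITH ... COLLECT:
# one row per (site, ticket, creator, customer). All values are made up.
flat_rows = [
    ("site-A", "ticket-1", "alice", "customer-X"),
    ("site-A", "ticket-1", "alice", "customer-Y"),
    ("site-A", "ticket-2", "bob", "customer-X"),
    ("site-A", "ticket-2", "bob", "customer-Y"),
]


def collect_last_column(rows):
    """Mimic `WITH a, b, c, COLLECT(d)`: group on the leading columns, gather the last."""
    grouped = defaultdict(list)
    for *key, value in rows:
        bucket = grouped[tuple(key)]   # creates the group even if nothing is collected
        if value is not None:          # COLLECT skips nulls
            bucket.append(value)
    return [(*key, values) for key, values in grouped.items()]


collapsed = collect_last_column(flat_rows)
for row in collapsed:
    print(row)
```

Four flat rows collapse into two, one per ticket, each carrying its list of customers. That is the reduction the later OPTIONAL MATCHes benefit from.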

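Try it out and see how the performance compares. If it looks better, you'll have to change the way you parse the returned data, but it should be nothing that a for loop or two can't handle.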
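
As a rough idea of what that parsing could look like, here is a sketch in Python. It assumes each returned row has already been converted to a plain dict keyed by the RETURN column names; the property names (`id`, `name`, `status`) and all values are made up for illustration:

```python
# One row shaped like the RETURN clause of the rewritten query.
# Property names and values are hypothetical.
rows = [
    {
        "cs": {"name": "site-A"},
        "t": {"id": "ticket-1", "status": "OPEN"},
        "creator": {"id": "alice"},
        "affectedCustomers": [{"name": "customer-X"}],
        "ticketEventsAndUsers": [
            {"ticketEvent": {"id": "event-1"},
             "usersOnEvent": [{"id": "bob"}, {"id": "carol"}]},
            {"ticketEvent": {"id": "event-2"}, "usersOnEvent": []},
        ],
    },
]


def flatten_events(rows):
    """Return (ticket id, event id, user id) triples from the nested rows."""
    triples = []
    for row in rows:
        for entry in row["ticketEventsAndUsers"]:
            event = entry["ticketEvent"]
            if event is None:  # the OPTIONAL MATCH found no events for this ticket
                continue
            for user in entry["usersOnEvent"]:
                triples.append((row["t"]["id"], event["id"], user["id"]))
    return triples


for triple in flatten_events(rows):
    print(triple)
```

The `None` check is there because a ticket without events still yields a single entry whose `ticketEvent` is null and whose `usersOnEvent` is empty.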