Neo4j: how to aggregate and count identical query results?

Time: 2018-09-08 13:37:21

Tags: neo4j

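I'm trying to do some text analysis in Neo4j, and I'd like to write a query that sorts its results by how often they occur, in descending order. My data is structured like this:

(Word)-[:NEXT]->(Word)-[:NEXT]->...

I want a query that tells me which 3-word combinations, 4-word combinations, and so on are the most popular. I tried the following, but it always returns a count of 1 for every word combination: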

MATCH p = (w1:Word)-[r:NEXT]->(w2:Word)-[r2:NEXT]->(w3:Word)
WITH [w1.name,w2.name,w3.name] AS word_pair 
RETURN COUNT(word_pair) as frequency, word_pair
ORDER BY frequency DESC
LIMIT 50

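1 answer:

Answer 0 (score: 0)

The frequency of a pattern is always 1 because the information about how often it occurs is already packed into the count property of the relationships: each pair of words is connected by a single NEXT relationship, so a given path matches only once. Instead of counting occurrences of the pattern, you only need to find the minimum value of this property along the path:

Sample data: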

UNWIND ["My cat eats fish on Saturday",
        "My Cat eats cat food on Saturdays"] AS text
WITH split(toLower(text), " ") AS words
UNWIND range(0,size(words)-2) AS i 
  MERGE (w1:Word {name: words[i]}) 
    ON CREATE SET w1.count = 1 
    ON MATCH SET w1.count=w1.count+1 
  MERGE (w2:Word {name: words[i+1]}) 
    ON CREATE SET w2.count = 1 
    ON MATCH SET w2.count=w2.count+1 
  MERGE (w1)-[r:NEXT]->(w2) 
    ON CREATE SET r.count = 1 
    ON MATCH SET r.count=r.count+1;
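
To see what "packed into the count property" means in practice, a query along the following lines lists every adjacent word pair together with the count stored on its relationship. It is a minimal sketch that assumes the sample data above has been loaded into an otherwise empty database; the aliases first_word, second_word and pair_count are chosen here only for illustration.

```cypher
// Inspect the per-pair counts stored on the NEXT relationships.
MATCH (a:Word)-[r:NEXT]->(b:Word)
RETURN a.name  AS first_word,
       b.name  AS second_word,
       r.count AS pair_count
ORDER BY pair_count DESC, first_word, second_word;
```

Because both sample sentences begin with "my cat eats" once lower-cased, the pairs (my, cat) and (cat, eats) would be expected to carry a count of 2, and the remaining pairs a count of 1.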

Query:

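// Match every path of two NEXT hops, i.e. every 3-word combination.
MATCH p = (:Word)-[:NEXT*2]->(:Word)
// List comprehensions replace extract(), which newer Neo4j versions no longer support.
WITH [n IN nodes(p) | n.name] AS word_pair,
     [r IN relationships(p) | r.count] AS counts
UNWIND counts AS c
// The smallest relationship count along the path is taken as the frequency.
RETURN word_pair,
       min(c) AS frequency
ORDER BY frequency DESC
LIMIT 50;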
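
The question also asks about 4-word combinations. The same idea can be sketched by lengthening the path to three hops; this variant, which is an illustration rather than part of the original answer, uses reduce() to take the minimum without the UNWIND step. It assumes, as the answer does, that the smallest r.count along the path is an acceptable frequency. Since the graph only stores counts for adjacent pairs, that value is strictly an upper bound on how often the full sequence appeared in the text.

```cypher
// 4-word combinations: three NEXT hops between four Word nodes.
MATCH p = (:Word)-[:NEXT*3]->(:Word)
WITH [n IN nodes(p) | n.name] AS words,
     [r IN relationships(p) | r.count] AS counts
RETURN words,
       // Fold over the list, keeping the smaller value at each step.
       reduce(m = head(counts), c IN tail(counts) |
              CASE WHEN c < m THEN c ELSE m END) AS frequency
ORDER BY frequency DESC
LIMIT 50;
```

Changing the 3 in [:NEXT*3] to n - 1 should give the corresponding query for n-word combinations.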