Optimizing a SPARQL query

Time: 2015-06-10 17:20:10

Tags: sparql

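I have a query that finds users with similar taste in movies. It returns pairs of users for whom the absolute difference between their average ratings within the same genre is less than 1: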

SELECT ?p ?p1 ?genre
WHERE{
?p movies:hasRated ?rate.
?p1 foaf:knows ?p.
?rate movies:ratedMovie ?mov.
?rate movies:hasRating ?rating.
?mov movies:hasGenre ?genre.
?p1 movies:hasRated ?ratep1.
?ratep1 movies:ratedMovie ?movp1.
?ratep1 movies:hasRating ?ratingp1.
?movp1 movies:hasGenre ?genre.
FILTER (?p=movies:user1)
}
GROUP BY ?p ?p1 ?genre
HAVING (abs (AVG(?rating)-AVG(?ratingp1))<1.0)

I would like to ask whether it can be optimized, because it looks pretty bad :(

Here is part of the dataset it will be run against:

movies:Man_of_steel movies:hasGenre "action", "thriller" .

movies:Elysium movies:hasGenre "drama", "sci-fi" .

movies:Gravity movies:hasGenre "sci-fi", "drama" .

movies:Django_Unchained movies:hasGenre "thriller", "action" .

movies:user1 movies:hasGender "male" ;
           movies:hasAge "30"^^xsd:float ;
           movies:hasRated movies:Rating1, movies:Rating2 .

movies:Rating1 movies:ratedMovie movies:Gravity ;
               movies:hasRating "4.0"^^xsd:float .

movies:Rating2 movies:ratedMovie movies:Django_Unchained ;
               movies:hasRating "9.0"^^xsd:float .

movies:user2 movies:hasGender "female" ;
             movies:hasAge "27"^^xsd:float ;
             movies:hasRated movies:Rating3, movies:Rating4 ;
             foaf:knows movies:user1 .

movies:Rating3 movies:ratedMovie movies:Elysium ;
               movies:hasRating "3.0"^^xsd:float .

movies:Rating4 movies:ratedMovie movies:Gravity ;
               movies:hasRating "5.0"^^xsd:float .
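
As a sanity check, the result the query should give on this sample can be worked out by hand. The following is a minimal Python sketch, not part of the original question: the dictionaries are transcribed from the Turtle above, and the helper names (`genre_averages`, `similar_tastes`) are made up for illustration. It computes each user's average rating per genre and keeps the genres where the two averages differ by less than 1.

```python
from collections import defaultdict
from statistics import mean

# Sample data transcribed from the Turtle snippet above.
GENRES = {
    "Man_of_steel": {"action", "thriller"},
    "Elysium": {"drama", "sci-fi"},
    "Gravity": {"sci-fi", "drama"},
    "Django_Unchained": {"thriller", "action"},
}
RATINGS = {
    "user1": [("Gravity", 4.0), ("Django_Unchained", 9.0)],
    "user2": [("Elysium", 3.0), ("Gravity", 5.0)],
}
KNOWS = [("user2", "user1")]  # user2 foaf:knows user1


def genre_averages(user):
    """Return the user's average rating for each genre they have rated."""
    by_genre = defaultdict(list)
    for movie, rating in RATINGS[user]:
        for genre in GENRES[movie]:
            by_genre[genre].append(rating)
    return {genre: mean(values) for genre, values in by_genre.items()}


def similar_tastes(target, threshold=1.0):
    """Return (p, p1, genre) rows where p1 knows p and the averages are close."""
    target_avg = genre_averages(target)
    rows = []
    for p1, p in KNOWS:
        if p != target:
            continue
        other_avg = genre_averages(p1)
        # Only genres rated by both users can match, as in the SPARQL join.
        for genre in sorted(target_avg.keys() & other_avg.keys()):
            if abs(target_avg[genre] - other_avg[genre]) < threshold:
                rows.append((p, p1, genre))
    return rows


print(similar_tastes("user1"))
# [('user1', 'user2', 'drama'), ('user1', 'user2', 'sci-fi')]
```

For user1 both "drama" and "sci-fi" average 4.0 (only Gravity), and for user2 both average (3.0 + 5.0) / 2 = 4.0, so the difference is 0 in each case. "action" and "thriller" drop out because user2 has not rated a movie in those genres. Note that the SPARQL join pairs every rating of ?p with every rating of ?p1 inside a group, but since each rating is repeated equally often, the averages come out the same as in this sketch.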

2 Answers:

Answer 0 (score: 3)

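I don't see anything particularly bad about your query, but since you mention that it looks bad, I assume you are asking about formatting. It is fine as it stands, but you can drop some of the variables by using blank nodes and property paths. E.g.: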

SELECT ?p ?p1 ?genre WHERE {
  values ?p { movies:user1 }

  ?p  movies:hasRated [ movies:ratedMovie/movies:hasGenre ?genre ;
                        movies:hasRating ?rating ].

  ?p1 foaf:knows ?p ;
      movies:hasRated [ movies:ratedMovie/movies:hasGenre ?genre ;
                        movies:hasRating ?ratingp1 ].
}
GROUP BY ?p ?p1 ?genre
HAVING (abs (AVG(?rating)-AVG(?ratingp1))<1.0)

Answer 1 (score: 3)

A small alternative to Joshua's query, which should work on your Sesame database (an old version with bugs in property path evaluation):

SELECT ?p ?p1 ?genre WHERE {

  ?p  movies:hasRated [ movies:ratedMovie [ movies:hasGenre ?genre ];
                        movies:hasRating ?rating ].

  ?p1 foaf:knows ?p ;
      movies:hasRated [ movies:ratedMovie [ movies:hasGenre ?genre ];
                        movies:hasRating ?ratingp1 ].
  FILTER (?p = movies:user1 )
}
GROUP BY ?p ?p1 ?genre
HAVING (abs (AVG(?rating)-AVG(?ratingp1))<1.0)

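As you can see, it is similar to Joshua's query, except that here we use another blank node instead of a property path, and we also avoid the values clause (which is buggy in 2.7.8 as well).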

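I really recommend that you upgrade your Sesame database: 2.7.8 was released in 2013, and we have fixed a large number of bugs since then (not to mention significantly improved the query editor in the Workbench, which now has nice syntax coloring and autocompletion).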