SPARQL result includes a resource that does not show the specified property

Date: 2016-09-01 04:45:28

Tags: sparql dbpedia virtuoso

I have this query

import argparse
import requests
import json

from pytineye import TinEyeAPIRequest

tineye = TinEyeAPIRequest('http://api.tineye.com/rest/','PUBLICKEY','PRIVATEKEY')

youtube_key = "APIKEY"  # your YouTube Data API key (placeholder)

ap = argparse.ArgumentParser()
ap.add_argument("-v","--videoID",    required=True,help="The videoID of the YouTube video. For example: https://www.youtube.com/watch?v=VIDEOID")
args = vars(ap.parse_args())

video_id    = args['videoID']

#
# Retrieve the video details based on videoID
#
def youtube_video_details(video_id):

    api_url  = "https://www.googleapis.com/youtube/v3/videos?part=snippet%2CrecordingDetails&"
    api_url += "id=%s&" % video_id
    api_url += "key=%s" % youtube_key

    response = requests.get(api_url)

    if response.status_code == 200:

        results = json.loads(response.content)

        return results

    return None


print("[*] Retrieving video ID: %s" % video_id)
video_data = youtube_video_details(video_id)

# Bail out early if the API call failed or returned no items.
if video_data is None or not video_data.get('items'):
    raise SystemExit("[!] Could not retrieve video details.")

thumbnails = video_data['items'][0]['snippet']['thumbnails']

print("[*] Thumbnails retrieved. Now submitting to TinEye.")

url_list = []

# add the thumbnails from the API to the list
for thumbnail in thumbnails:

    url_list.append(thumbnails[thumbnail]['url'])


# build the manual URLS
for count in range(4):

    url = "http://img.youtube.com/vi/%s/%d.jpg" % (video_id,count)

    url_list.append(url)


results = []

# now walk over the list of URLs and search TinEye
for url in url_list:

    print("[*] Searching TinEye for: %s" % url)

    try:
        result = tineye.search_url(url)
    except Exception as e:
        # A bare "except: pass" here would reuse the previous iteration's
        # result (or crash on the first failure), so skip this URL instead.
        print("[!] TinEye search failed for %s: %s" % (url, e))
        continue

    if result.total_results:
        results.extend(result.matches)

result_urls = []
dates       = {}

for match in results:

    for link in match.backlinks:

        if link.url not in result_urls:

            result_urls.append(link.url)
            dates[link.crawl_date] = link.url

print()
print("[*] Discovered %d unique URLs with image matches." % len(result_urls))

for url in result_urls:

    print(url)


sorted_dates = sorted(dates.keys())

print()
print("[*] Oldest match was crawled on %s at %s" % (sorted_dates[0], dates[sorted_dates[0]]))

Don't mind the extra braces; this is part of a larger query. What I want to know is why http://dbpedia.org/page/Academic_structure_of_the_Australian_National_University is included in the results, given that I specified:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbpedia: <http://dbpedia.org/>
PREFIX dbpedia_property: <http://dbpedia.org/property/>
PREFIX dbpedia_ontology: <http://dbpedia.org/ontology/>
PREFIX yago: <http://dbpedia.org/class/yago/>
PREFIX schema: <http://schema.org/>

SELECT * WHERE {
  {
    SELECT ?school WHERE {
      ?school rdf:type yago:EducationalInstitution108276342 .
      FILTER ( contains(str(?school), "Australia") )
    }
    ORDER BY ?school
  }
}

This property is not shown on the resource page. I am using this endpoint: http://dbpedia.org/sparql
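As a side note, the `contains(str(?school), "Australia")` filter operates on the stringified IRI, not on a label, so any matching resource only needs "Australia" somewhere in its IRI. A minimal Python sketch mirroring that substring test shows why this particular IRI qualifies:

```python
# The SPARQL filter is: FILTER ( contains(str(?school), "Australia") )
# It tests the IRI string itself. Mirroring the same substring check
# locally shows why this resource passes the filter:
iri = "http://dbpedia.org/resource/Academic_structure_of_the_Australian_National_University"

# "Australian" contains "Australia", so the contains() filter matches.
matches = "Australia" in iri
print(matches)  # True
```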

1 answer:

Answer 0: (score: 3)

This looks like a bug in the Pubby web interface, or in the query it uses to fetch the data that gets displayed.

The query

SELECT * WHERE{
<http://dbpedia.org/resource/Academic_Structure_of_the_Australian_National_University> ?p ?o
}

returns the required rdf:type statement.
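You can check this programmatically against the public endpoint. A minimal sketch, assuming the SPARQL 1.1 Protocol's GET binding (the query goes in the `query` parameter; `format` is a Virtuoso convenience parameter for choosing the result serialization) and a hypothetical `build_request_url` helper:

```python
from urllib.parse import urlencode

def build_request_url(endpoint, query):
    """Build a SPARQL Protocol GET URL, asking for JSON results."""
    params = urlencode({
        "query": query,
        "format": "application/sparql-results+json",
    })
    return endpoint + "?" + params

query = """SELECT * WHERE {
  <http://dbpedia.org/resource/Academic_Structure_of_the_Australian_National_University> ?p ?o
}"""

url = build_request_url("http://dbpedia.org/sparql", query)
print(url)
# Fetching this URL (e.g. with requests.get) should return a JSON result
# set that includes the rdf:type bindings for the resource.
```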

Another odd thing is that even a SPARQL DESCRIBE query does not return the rdf:type triples:

DESCRIBE <http://dbpedia.org/resource/Academic_Structure_of_the_Australian_National_University>

Although DESCRIBE is not really defined in the specification, users would certainly expect those triples. Perhaps this kind of query is what is used to retrieve the data for the resource's web page.
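Because DESCRIBE output is implementation-defined, one explicit workaround (my suggestion, not part of the answer above) is a CONSTRUCT query, which pins down exactly which triples come back. A sketch building such a query as a Python string:

```python
# DESCRIBE leaves the returned triples up to the endpoint; CONSTRUCT
# makes them explicit, so the rdf:type triples cannot be silently omitted.
resource = "http://dbpedia.org/resource/Academic_Structure_of_the_Australian_National_University"

construct_query = (
    "CONSTRUCT { <%s> ?p ?o }\n"
    "WHERE { <%s> ?p ?o }"
) % (resource, resource)

print(construct_query)
```

Submitting this to http://dbpedia.org/sparql returns every outgoing triple of the resource as RDF, including the rdf:type statements.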