Using rdflib.Graph to get the types of a resource whose name contains special characters, e.g. "François Hollande"

Time: 2016-12-15 16:33:31

Tags: python-2.7 url-encoding rdflib

I am trying to get the DBpedia types of several people. I tried rdflib. Here is my code:

from rdflib import Graph, URIRef, RDFS
from rdflib.namespace import RDF
import urllib
import re

name = u'François Hollande'

g = Graph()
uriref = URIRef("http://dbpedia.org/resource/%s" % urllib.quote(re.sub(re.compile('\s', re.U), '_', name).encode('utf-8')))
g.parse(uriref)
for s,p,o in g:
    if (p in [
         URIRef("http://www.w3.org/2002/07/owl#sameAs"),
         RDFS.seeAlso
     ]) and (uriref != o):
        g.parse(location=o)
for s,p,o in g.triples((None, RDF.type, None)):
    print o

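As you can see, I also check whether the database lists any "synonyms" for the resource. I do this because the graph for "François Hollande" returns only the following (with status 303):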

<rdf:RDF
   xmlns:owl="http://www.w3.org/2002/07/owl#"
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>
  <rdf:Description rdf:about="http://dbpedia.org/resource/Fran%C3%A7ois_Hollande">
    <owl:sameAs rdf:resource="http://dbpedia.org/resource/Fran%C3%A7ois_Hollande"/>
    <owl:sameAs rdf:resource="http://dbpedia.org/resource/François_Hollande"/>
  </rdf:Description>
</rdf:RDF>

However, my code raises a UnicodeEncodeError. I don't know how to request the second resource without triggering it. If I call it manually in a Python shell, I get this:

>>> g.parse('http://dbpedia.org/resource/François_Hollande')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/rdflib/graph.py", line 1029, in parse
    data=data, format=format)
  File "/usr/local/lib/python2.7/dist-packages/rdflib/parser.py", line 171, in create_input_source
    input_source = URLInputSource(absolute_location, format)
  File "/usr/local/lib/python2.7/dist-packages/rdflib/parser.py", line 100, in __init__
    file = urlopen(req)
  File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 400, in open
    response = self._open(req, data)
  File "/usr/lib/python2.7/urllib2.py", line 418, in _open
    '_open', req)
  File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 1207, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.7/urllib2.py", line 1174, in do_open
    h.request(req.get_method(), req.get_selector(), req.data, headers)
  File "/usr/lib/python2.7/httplib.py", line 1004, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.7/httplib.py", line 1038, in _send_request
    self.endheaders(body)
  File "/usr/lib/python2.7/httplib.py", line 1000, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 851, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 827, in send
    self.sock.sendall(data)
  File "/usr/lib/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe7' in position 18: ordinal not in range(128)
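
The traceback shows the non-ASCII character u'\xe7' reaching the socket unencoded. One possible workaround, sketched here under assumptions, is to percent-encode the UTF-8 bytes of each IRI before passing it to g.parse. The helper name iri_to_uri and the SAFE_CHARS set are hypothetical and not part of rdflib; "%" is kept safe so that escapes already present are not encoded twice.

```python
# -*- coding: utf-8 -*-
try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2.7, as used in the question

# Characters left untouched: URI delimiters plus "%", so that escapes
# already present (e.g. "%C3%A7") are not encoded a second time.
# This set is an assumption chosen for illustration.
SAFE_CHARS = "/:#?&=%@+,;!$'()*~[]"


def iri_to_uri(iri):
    """Hypothetical helper: percent-encode the non-ASCII characters of an IRI."""
    if not isinstance(iri, bytes):
        # Unicode text (including rdflib's URIRef) becomes UTF-8 bytes first.
        iri = iri.encode('utf-8')
    return quote(iri, safe=SAFE_CHARS)


encoded = iri_to_uri(u'http://dbpedia.org/resource/François_Hollande')
print(encoded)
```

In the loop above this would presumably be used as g.parse(location=iri_to_uri(o)). Note that for this resource the encoded form is identical to the uriref that was already parsed, so comparing iri_to_uri(o) with uriref instead of o would skip a redundant request. Whether the types are then returned depends on what DBpedia serves at that address, which this sketch does not address.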

Thanks for any advice!

0 answers:

No answers