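I need to extract the correct subject, or subject phrase, from a sentence. Please give me a suggestion or a link to a guide on this. I tried googling, but the material is too complicated :(
What I did: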
import spacy

x = "The Titanic could have been saved if it wasn't for a 30-second delay in giving the order to change course after spotting the iceberg."

def get_subject(x):
    nlp = spacy.load('en_core_web_trf')
    doc = nlp(x)
    # Keep only the tokens whose dependency label is "nsubj" (nominal subject)
    sub_toks = [tok for tok in doc if (tok.dep_ == "nsubj")]
    return sub_toks
I get "it" as the subject. That is grammatically correct, but I want to get "The Titanic".
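A possible direction, sketched under assumptions rather than as a confirmed fix: "Titanic" is the subject of a passive verb, and I assume the English spaCy models label such subjects `nsubjpass` instead of `nsubj`, which would explain why the filter above skips it. The helper name `get_main_subject_phrases` is hypothetical. It accepts both labels, keeps only subjects attached to the sentence root, and expands each one to the span covered by its subtree.

```python
import spacy
from spacy.tokens import Doc

# Assumption: the model uses the labels "nsubj" and "nsubjpass" for subjects
# and "ROOT" for the main verb of the sentence.
SUBJECT_DEPS = {"nsubj", "nsubjpass"}


def get_main_subject_phrases(doc: Doc) -> list:
    """Return the full phrase of every subject attached to a sentence root."""
    phrases = []
    for tok in doc:
        # Skip subjects of subordinate clauses (e.g. "it" in "if it wasn't ...")
        if tok.dep_ in SUBJECT_DEPS and tok.head.dep_ == "ROOT":
            # left_edge / right_edge are the outermost tokens of the subtree
            phrases.append(doc[tok.left_edge.i : tok.right_edge.i + 1])
    return phrases


# Usage with a trained pipeline (requires the model to be installed):
# nlp = spacy.load("en_core_web_trf")
# print(get_main_subject_phrases(nlp(x)))
```

If the parser labels the sentence as assumed, this should return the span "The Titanic" and leave out "it", because "it" belongs to the "if" clause and not to the main verb.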
I also made another attempt:
import spacy

x = "The Queen of England owns a McDonald's near Buckingham Palace."

def extract_proper_nouns(x):
    nlp = spacy.load('en_core_web_trf')
    doc = nlp(x)
    # Indices of all tokens tagged as proper nouns
    pos = [tok.i for tok in doc if tok.pos_ == "PROPN"]
    consecutives = []
    current = []
    # Group adjacent indices into runs
    for elt in pos:
        if len(current) == 0:
            current.append(elt)
        else:
            if current[-1] == elt - 1:
                current.append(elt)
            else:
                consecutives.append(current)
                current = [elt]
    if len(current) != 0:
        consecutives.append(current)
    # Turn each run of indices into a Span
    return [doc[consecutive[0]:consecutive[-1] + 1] for consecutive in consecutives]
I get [Queen, England, McDonald, Buckingham Palace], but I want to get "Queen of England". I have read about phrase merging, but I don't understand how to use it.
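For reference, this is my rough understanding of how phrase merging could apply here, as a sketch under assumptions and not a verified solution. It takes the subtree of each subject token as the phrase and merges it into a single token with `Doc.retokenize`. The helper name `merge_subject_phrases` is hypothetical, and the labels `nsubj` / `nsubjpass` are my assumption about what the model emits. Note that the subtree includes the determiner, so the merged token would read "The Queen of England".

```python
import spacy
from spacy.tokens import Doc
from spacy.util import filter_spans


def merge_subject_phrases(doc: Doc) -> Doc:
    """Merge the full phrase of each subject token into one token, in place."""
    spans = [
        doc[tok.left_edge.i : tok.right_edge.i + 1]
        for tok in doc
        if tok.dep_ in ("nsubj", "nsubjpass")  # assumed subject labels
    ]
    # Overlapping spans cannot be merged together; keep the longest ones
    spans = filter_spans(spans)
    with doc.retokenize() as retokenizer:
        for span in spans:
            retokenizer.merge(span)
    return doc


# Usage with a trained pipeline (requires the model to be installed):
# nlp = spacy.load("en_core_web_trf")
# doc = merge_subject_phrases(nlp(x))
# print([tok.text for tok in doc])
```

The merge changes the `Doc` itself, so any token indices collected before the call no longer point at the same tokens afterwards.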