How can I identify nouns in a string and capitalize them?

Time: 2020-03-29 16:02:35

Tags: python nlp text-processing

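I am working with simple lowercase plain text in English, without punctuation. Is there a library that can restore capital letters where they are needed, for example on nouns, or on a name that follows a title such as "mr."? Names appear in several places in the text, and several of them need to be capitalized. Any solution or pointers would be very helpful. For example, the input
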

mr. john is living in canada

Mr. John is living in Canada

1 answer:

Answer 0 (score: 1)

Here is a workaround that uses the pos_tag function from the nltk library to identify nouns:

#Import nltk modules

import nltk
from nltk.tokenize import word_tokenize
from nltk.tag import pos_tag

text = "mr. john is living in canada"

#Define a function that tokenizes the string and tags each token with its part of speech

def ExtractNoun(sentence):
    tokens = word_tokenize(sentence)
    return pos_tag(tokens)

sent = ExtractNoun(text)

#This will return a list of (token, tag) tuples

print(sent)
[('mr.', 'NN'), ('john', 'NN'), ('is', 'VBZ'), ('living', 'VBG'), ('in', 'IN'), ('canada', 'NN')]

#Create a list of nouns (the tags NN, NNS, NNP and NNPS all start with 'NN')

nn = [token for token, tag in sent if tag.startswith('NN')]

#Capitalize the words that appear in the list of nouns

text_cap = " ".join([x.capitalize() if x in nn else x for x in text.split()])
print(text_cap)

Mr. John is living in Canada
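One limitation of matching words against a list is that a word tagged as a noun once gets capitalized everywhere it occurs, even where it is used as another part of speech. A minimal sketch of a per-token alternative follows. It assumes the tagged output has the same shape as the list printed above; the function name `capitalize_nouns` is hypothetical and not part of nltk. To keep it runnable without nltk installed, it reuses the tagged list from the answer rather than calling the tagger.

```python
def capitalize_nouns(tagged_tokens):
    """Capitalize each token whose POS tag starts with 'NN'.

    tagged_tokens is assumed to be a list of (token, tag) pairs,
    like the list printed in the answer above.
    """
    words = []
    for token, tag in tagged_tokens:
        # Decide per token, based on that token's own tag
        words.append(token.capitalize() if tag.startswith("NN") else token)
    return " ".join(words)


# Tagged output copied from the answer above
tagged = [('mr.', 'NN'), ('john', 'NN'), ('is', 'VBZ'),
          ('living', 'VBG'), ('in', 'IN'), ('canada', 'NN')]

print(capitalize_nouns(tagged))  # Mr. John is living in Canada
```

Joining the tokens with single spaces works for text without punctuation, as in the question. For punctuated text the tokenizer may split punctuation into separate tokens, so the original spacing would not be preserved by this sketch.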

Hope this works!
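The question also mentions names that follow a title such as "mr.". A tagger may not always label lowercase names as nouns, so a small rule-based pass could complement it. The sketch below is only an illustration under stated assumptions: the `TITLES` set and the function name `capitalize_after_titles` are hypothetical, and the rule capitalizes only the title and the single word right after it.

```python
# Hypothetical, incomplete set of titles; extend as needed
TITLES = {"mr", "mrs", "ms", "dr", "prof"}


def capitalize_after_titles(text):
    """Capitalize each title and the word immediately following it."""
    result = []
    capitalize_next = False
    for word in text.split():
        # Treat "mr" and "mr." alike by ignoring trailing periods
        is_title = word.rstrip(".").lower() in TITLES
        if is_title or capitalize_next:
            result.append(word.capitalize())
        else:
            result.append(word)
        # Only the word right after a title is treated as a name
        capitalize_next = is_title
    return " ".join(result)


print(capitalize_after_titles("mr. john is living in canada"))
# Mr. John is living in canada
```

This rule leaves "canada" untouched, so it would need to be combined with the tagging approach (or a list of place names) to produce the full result from the question.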
