How do I get similar words related to a given word?

Time: 2018-01-01 12:22:42

Tags: python nlp nltk gensim spacy

I am trying to solve an NLP problem in which I have a dictionary of words, such as:

list_1 = {'phone': 'android', 'chair': 'netflit', 'charger': 'macbook', 'laptop': 'sony'}

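Now, if the input is 'phone', I can easily use the `in` operator to look up the key and get the description of the phone and its data. The problem arises when the input is something like 'phones' or 'Phones'.

What I want is that when I input 'phone', I get words like: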
'phone' ==> 'Phones','phones','Phone','Phone's','phone's' 

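I don't know which word2vec model I could use, or which NLP module provides a solution like this.

My second question: if I give the word 'dog', can I get words like 'Puppy', 'Kitty', 'Doggy', 'puppy', and so on?

I tried something like this, but it gives synonyms: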

from nltk.corpus import wordnet as wn
for ss in wn.synsets('phone'): # Each synset represents a diff concept.
    print(ss)

But it returns:

Synset('telephone.n.01')
Synset('phone.n.02')
Synset('earphone.n.01')
Synset('call.v.03')

Instead, I want:

'phone' ==> 'Phones','phones','Phone','Phone's','phone's' 

1 answer:

Answer 0: (score: 4)

WordNet indexes concepts (aka synsets), not words.

Use lemma_names() to access the root words (aka lemmas) in WordNet.

>>> from nltk.corpus import wordnet as wn
>>> for ss in wn.synsets('phone'): # Each synset represents a diff concept.
...     print(ss.lemma_names())
... 
['telephone', 'phone', 'telephone_set']
['phone', 'speech_sound', 'sound']
['earphone', 'earpiece', 'headphone', 'phone']
['call', 'telephone', 'call_up', 'phone', 'ring']
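
If one flat list of candidate words is more convenient than one list per synset, the lemma lists can be merged. The following is only a minimal sketch: it assumes the lists are the ones printed above (in practice they would come from ss.lemma_names()), and the helper name flatten_lemmas is made up for illustration.

```python
def flatten_lemmas(lemma_lists, query):
    """Merge per-synset lemma lists into one ordered list without duplicates.

    `flatten_lemmas` is a hypothetical helper, not part of NLTK.
    """
    seen = {query}  # skip the query word itself
    merged = []
    for names in lemma_lists:
        for name in names:
            # WordNet joins multiword lemmas with '_', e.g. 'call_up'
            word = name.replace('_', ' ')
            if word not in seen:
                seen.add(word)
                merged.append(word)
    return merged


# Lemma lists copied from the output above.
phone_lemmas = [
    ['telephone', 'phone', 'telephone_set'],
    ['phone', 'speech_sound', 'sound'],
    ['earphone', 'earpiece', 'headphone', 'phone'],
    ['call', 'telephone', 'call_up', 'phone', 'ring'],
]

print(flatten_lemmas(phone_lemmas, 'phone'))
```

Note that the merged list mixes senses (the device, the speech sound, the verb), so it is a pool of candidates rather than a list of true synonyms.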

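A lemma, being the root form of a word, should not carry additional affixes, so you will not find the plurals or other inflected forms that you listed among the words you wanted.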
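
Since WordNet will not produce 'Phones' or "phone's" from 'phone', one way around it is to go in the opposite direction and reduce the input to its base form before the dictionary lookup. The sketch below uses a deliberately naive rule (lowercase, drop a possessive, drop a trailing "s"); the function name normalize and its rules are assumptions for illustration, and a real lemmatizer such as NLTK's WordNetLemmatizer would be the better choice for irregular forms.

```python
def normalize(word):
    """Reduce a surface form such as "Phone's" or "phones" to a lookup key.

    Naive heuristic for illustration only; it is not a real lemmatizer.
    """
    key = word.strip().lower()
    # Drop a possessive suffix: "phone's" -> "phone"
    if key.endswith("'s"):
        key = key[:-2]
    # Drop a simple plural "s": "phones" -> "phone" (leaves "glass" alone)
    if key.endswith('s') and not key.endswith('ss') and len(key) > 3:
        key = key[:-1]
    return key


# Dictionary from the question, with the last pair written as key: value.
list_1 = {'phone': 'android', 'chair': 'netflit',
          'charger': 'macbook', 'laptop': 'sony'}

for variant in ['Phones', 'phones', 'Phone', "Phone's", "phone's"]:
    print(variant, '->', list_1.get(normalize(variant)))
```

Each of the five variants from the question maps to the key 'phone', so the ordinary dictionary lookup works again.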

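Also, words are ambiguous and may need to be disambiguated by context or by part of speech (POS) before you can get "similar" words. For example, "phone" as a verb does not mean exactly the same thing as "phone" as a noun.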

>>> for ss in wn.synsets('phone'): # Each synset represents a diff concept.
...     print(ss.lemma_names(), '\t', ss.definition())
... 
['telephone', 'phone', 'telephone_set']      electronic equipment that converts sound into electrical signals that can be transmitted over distances and then converts received signals back into sounds
['phone', 'speech_sound', 'sound']   (phonetics) an individual sound unit of speech without concern as to whether or not it is a phoneme of some language
['earphone', 'earpiece', 'headphone', 'phone']   electro-acoustic transducer for converting electric signals into sounds; it is held over or inserted into the ear
['call', 'telephone', 'call_up', 'phone', 'ring']    get or try to get into communication (with someone) by telephone
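
To illustrate the POS point, the senses can be filtered by the part-of-speech letter embedded in each synset name ('n' for noun, 'v' for verb). This is a sketch over data copied from the outputs in this thread, with a hypothetical helper filter_by_pos; with NLTK loaded, passing pos to the query, as in wn.synsets('phone', pos=wn.VERB), should achieve the same restriction directly.

```python
def filter_by_pos(synsets, pos):
    """Keep only the entries whose synset name carries the given POS tag.

    Synset names look like 'lemma.pos.nn', e.g. 'call.v.03'.
    `filter_by_pos` is a hypothetical helper, not part of NLTK.
    """
    return [(name, lemmas) for name, lemmas in synsets
            if name.rsplit('.', 2)[1] == pos]


# Synset names and lemma lists copied from the outputs above.
phone_synsets = [
    ('telephone.n.01', ['telephone', 'phone', 'telephone_set']),
    ('phone.n.02', ['phone', 'speech_sound', 'sound']),
    ('earphone.n.01', ['earphone', 'earpiece', 'headphone', 'phone']),
    ('call.v.03', ['call', 'telephone', 'call_up', 'phone', 'ring']),
]

# Only the verb sense ("to call someone") survives this filter.
for name, lemmas in filter_by_pos(phone_synsets, 'v'):
    print(name, lemmas)
```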