Returning sentences that contain exact words in Python

Time: 2018-10-16 06:53:44

Tags: python pandas word

I want to return the sentences that contain an exact word from a search list.

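import pandas as pd

df = pd.read_excel('C:/Test 1012/UOI.xlsx')
a = df['Content']
searchfor = ['hot', 'yes']  # plus about 200 more words
# Plain alternation matches substrings, so 'hot' also matches inside 'photo'
b = a[a.str.contains('|'.join(searchfor))]
print(b)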

For example:

Content = ['the photo is good', 'nice picture', ...]

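The result should not print any sentence. However, "photo" contains the word "hot", so the result gives me "the photo is good". Does anyone know how to fix this? I only want results that contain the exact words in the searchfor list.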

1 answer:

Answer 0: (score: 1)

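Add word boundaries (\b) around each searchfor value, so that a word only matches when it stands alone and not when it appears inside a longer word: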

import pandas as pd

df = pd.DataFrame({'Content':['the photo is good','nice picture']})
print (df)
             Content
0  the photo is good
1       nice picture

searchfor =['hot','yes','nice']
# Wrap each word in \b so that only whole words match
pat = '|'.join(r"\b{}\b".format(x) for x in searchfor)

b = df.loc[df['Content'].str.contains(pat), 'Content']
# Equivalent for the Series `a` from the question:
#b = a[a.str.contains(pat)]
print (b)
1    nice picture
Name: Content, dtype: object
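
If the real list of 200 words might contain regex metacharacters (such as "." or "+"), or the column has empty cells, a slightly more defensive variant may help. This is only a sketch under those assumptions: the sample rows and the extra search term '3.5' are made up for illustration and are not from the question. It escapes each word with re.escape, wraps the alternation in a single non-capturing group, and passes na=False so that missing values count as non-matches.

```python
import re

import pandas as pd

# Hypothetical sample data: includes a missing value and a row that an
# unescaped '3.5' pattern would match by accident ('.' matches any character)
df = pd.DataFrame({'Content': ['the photo is good',
                               'nice picture',
                               'build 345 is out',
                               None]})

# Hypothetical search list; '3.5' contains a regex metacharacter
searchfor = ['hot', 'yes', 'nice', '3.5']

# Escape every word, then put one pair of word boundaries around the group
pat = r"\b(?:{})\b".format('|'.join(re.escape(x) for x in searchfor))

# na=False treats missing cells as "no match" so the mask stays boolean
b = df.loc[df['Content'].str.contains(pat, na=False), 'Content']
print(b)
```

Note that \b only marks a boundary next to a letter, digit, or underscore, so a search word that starts or ends with punctuation would need different handling.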