I would like to do a small sentiment analysis on hotel reviews.
Example (stop_words_filter):
['even', 'though', 'pictures', 'show', 'clean', 'rooms', 'actual', 'rooms', 'quit', 'dirty', 'outdated', 'also', 'check', '15', 'clock', 'room', 'not', 'ready', 'time']
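I have already worked out that this review is about "hotel room" and "cleanliness". I now want to link the closest negative word to the word "clean", which here should be "dirty". I have put together a list of negative words, but I am stuck on the implementation...
Code: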
bullets = []  # output
distances = []
bad_word_locations = []
rubrik_word_location = []  # category_word
# if there is a category word in the review
if len(rubrik_uR_list) == 1:
    # get the index of that one
    rubrik_word_location = stop_words_filter.index(rubrik_uR_list[0])
    # go through all negative words and, if one of them is in my sentence, get the index of that word
    for w in negativ_words_list:
        if w in stop_words_filter:
            bad_word_locations.append(stop_words_filter.index(w))
    # NOW IT'S GETTING CRUCIAL
    # if we found one
    if len(bad_word_locations) > 0:
        # I need to somehow catch the closest word now; my code is not doing this
        distances.append(abs(rubrik_word_location - bad_word_locations[0]))
        bullets.append(stop_words_filter[min(distances)])
        # if I get more categories in one review I need to remember that somehow...
        blacklist.append(stop_words_filter[min(distances)])
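I know my programming skills are poor. Thank you very much for your help.
Thanks in advance, Niklas
Answer 0: (score: 0)
I was able to figure it out myself.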
def getBulletPoint(pos_filter, stop_words_filter, rubrik_uR_list):
    # pos_filter is not used here; it is kept so the signature stays the same.
    # negativ_words_list is expected to be defined at module level.
    bullets = []     # output: closest negative word for each category word
    bad_words = []   # negative words that occur in the review
    word_index = []  # positions of those negative words
    # Find all negative words
    for w in negativ_words_list:
        if w in stop_words_filter:
            bad_words.append(w)
    # Find the indices of the negative words (first occurrence of each)
    for w in bad_words:
        word_index.append(stop_words_filter.index(w))
    if len(rubrik_uR_list) > 0 and len(bad_words) > 0:  # only if we have a category word at all
        # -------------Loop-------------------
        for w in rubrik_uR_list:
            saved_distance = float("inf")
            index_of_category_word = stop_words_filter.index(w)
            for i in word_index:
                distance = abs(i - index_of_category_word)
                if distance < saved_distance:
                    current_bullet = stop_words_filter[i]
                    saved_distance = distance
            bullets.append(current_bullet)
    return bullets
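One limitation of the approach above is that list.index() only returns the first occurrence of a word, so repeated words are not handled. A minimal sketch of a variant that considers every occurrence is shown below. The function name closest_negative_words, the returned dictionary, and the contents of the negative word list are assumptions made for illustration; the token list is an English rendering of the example review from the question.

```python
def closest_negative_words(tokens, category_words, negative_words):
    """Map each category word found in tokens to its nearest negative word."""
    negative_set = set(negative_words)
    # Positions of every negative token, including repeated ones.
    negative_positions = [i for i, tok in enumerate(tokens) if tok in negative_set]
    result = {}
    if not negative_positions:
        return result
    for category in category_words:
        # All positions where this category word occurs.
        category_positions = [i for i, tok in enumerate(tokens) if tok == category]
        if not category_positions:
            continue
        # Pick the (category, negative) position pair with the smallest gap.
        _, best_neg = min(
            ((c, n) for c in category_positions for n in negative_positions),
            key=lambda pair: abs(pair[0] - pair[1]),
        )
        result[category] = tokens[best_neg]
    return result


tokens = ['even', 'though', 'pictures', 'show', 'clean', 'rooms', 'actual',
          'rooms', 'quit', 'dirty', 'outdated', 'also', 'check', '15',
          'clock', 'room', 'not', 'ready', 'time']
# Hypothetical negative word list, for illustration only.
negatives = ['dirty', 'outdated', 'not']
print(closest_negative_words(tokens, ['clean', 'room'], negatives))
# prints {'clean': 'dirty', 'room': 'not'}
```

Returning a dictionary keeps track of which negative word belongs to which category, which may help when a review mentions more than one category.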
Best regards, Niklas