Calculate the average of the values stored under the same key in several different dictionaries?

Time: 2018-07-19 18:06:22

Tags: python dictionary nltk

Initially I have a list of lists like this:

mult_sentences = [['Sounds like he was bound by some ridiculous systems that '
                   'the company at large needs to address.',
                   'But he didn’t do anything to go above and beyond.'],
                  ['He did absolutely nothing to help me.',
                   'He submitted a report and my problem was never resolved.'],
                  ["I really don't care now.", 'Very disappointed']]

I want to analyze the sentiment of every sentence in each document. For that I used NLTK's VADER sentiment analyzer and did something like this:

from nltk.sentiment.vader import SentimentIntensityAnalyzer

for i,sents in enumerate(mult_sentences):
    sia = SentimentIntensityAnalyzer()

    for sent in sents:
        print(sent)
        ss = sia.polarity_scores(sent) #SS IS A DICTIONARY,STORING ALL THE SCORES IN A DICTIONARY.
        print(ss)

    print('*'*50)

Here is the output:

 Sounds like he was bound by some ridiculous systems that the company at 
 large needs to address.        
 {'neg': 0.125, 'neu': 0.75, 'pos': 0.125, 'compound': 0.0}
 But he didn’t do anything to go above and beyond.
 {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}
 **************************************************
 He did absolutely nothing to help me.
 {'neg': 0.296, 'neu': 0.704, 'pos': 0.0, 'compound': -0.3657}
 He submitted a report and my problem was never resolved.
 {'neg': 0.376, 'neu': 0.624, 'pos': 0.0, 'compound': -0.497}
 **************************************************
 I really don't care now.
 {'neg': 0.492, 'neu': 0.508, 'pos': 0.0, 'compound': -0.4416}
 Very disappointed
 {'neg': 0.772, 'neu': 0.228, 'pos': 0.0, 'compound': -0.5256}

The output is stored in the dictionary ss. From this output, I only want to calculate the average of the compound scores for each document.

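For example, to calculate the average score of the last document, I have to add the compound scores of its first and second sentences and then divide by the number of sentences in the document, i.e. (-0.4416 + -0.5256) / 2 = -0.4836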
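As a quick check of that arithmetic, here is a minimal sketch that averages the two compound scores copied from the output above. The variable names are only illustrative, and `statistics.mean` is just one convenient way to take the mean.

```python
from statistics import mean

# Compound scores of the last document's two sentences, copied from the output above.
last_doc_compounds = [-0.4416, -0.5256]

# Sum of the scores divided by the number of sentences.
average_compound = mean(last_doc_compounds)
print(round(average_compound, 4))  # -0.4836
```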

How can I do this?

2 answers:

Answer 0: (score: 0)

from nltk.sentiment.vader import SentimentIntensityAnalyzer

# Create the analyzer once; it does not need to be rebuilt for every document.
sia = SentimentIntensityAnalyzer()

for sents in mult_sentences:
    compound_sum = 0
    for sent in sents:
        print(sent)
        ss = sia.polarity_scores(sent)  # ss is a dict holding all the scores
        compound_sum += ss['compound']
        print(ss)
    # Mean compound score over the sentences in this document
    average_score = compound_sum / len(sents)
    print('average_score: ', average_score)
    print('*'*50)
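If the averages are needed later rather than only printed, the same loop could be wrapped in a function that returns one value per document. This is a rough sketch, not part of the original answer: `average_compound_per_document` and its `score_fn` parameter are hypothetical names, and the scorer is passed in so the function can be tried without NLTK installed.

```python
from statistics import mean

def average_compound_per_document(documents, score_fn):
    """Return one mean compound score per document.

    documents: list of lists of sentences (shaped like mult_sentences).
    score_fn:  callable that maps a sentence to a dict with a 'compound' key,
               for example SentimentIntensityAnalyzer().polarity_scores.
    """
    return [
        mean(score_fn(sentence)['compound'] for sentence in sentences)
        for sentences in documents
    ]

# Usage with NLTK (assumes the vader_lexicon resource is already downloaded):
# from nltk.sentiment.vader import SentimentIntensityAnalyzer
# sia = SentimentIntensityAnalyzer()
# averages = average_compound_per_document(mult_sentences, sia.polarity_scores)
```

Note that `statistics.mean` raises `StatisticsError` for an empty document, so empty sentence lists would need to be filtered out or handled first.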

Answer 1: (score: 0)

For my approach, you should first put all of the output dictionaries into a list; hopefully that isn't too hard.

# Takes multiple dictionaries with the same keys and averages their numeric values
def averageDict(listOfDicts):
    avgDict = {}
    for keyname in listOfDicts[0]:  # for each key, average the values across all dicts
        totalValue = 0
        totalItems = 0
        for d in listOfDicts:  # accumulate the values and count the dicts
            totalItems += 1
            totalValue += d[keyname]
        avgDict[keyname] = totalValue / totalItems
    return avgDict

# Example
locA = {"rabbit": 12, "tortoise": 2}
locB = {"rabbit": 23.4, "tortoise": 24}  # there can't be .4 of a rabbit; this just shows floats work too
locC = {"rabbit": 39, "tortoise": 45.4283}

wildLife = [locA, locB, locC]
print(averageDict(wildLife))
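Applied to the question's data, the same idea might look like the sketch below. It is self-contained and uses the two score dictionaries of the last document, copied from the question's output. `average_scores` is a hypothetical compact variant of the function above, not the answerer's code, and it assumes every dictionary has the same keys as the first one.

```python
from statistics import mean

# Score dictionaries for the last document, copied from the question's output.
doc_scores = [
    {'neg': 0.492, 'neu': 0.508, 'pos': 0.0, 'compound': -0.4416},
    {'neg': 0.772, 'neu': 0.228, 'pos': 0.0, 'compound': -0.5256},
]

def average_scores(score_dicts):
    """Hypothetical helper: average each key across dicts that share the same keys."""
    return {key: mean(d[key] for d in score_dicts) for key in score_dicts[0]}

averaged = average_scores(doc_scores)
print(round(averaged['compound'], 4))  # -0.4836
```

This averages every score, not just the compound one, so it does slightly more work than the question asks for; picking out `averaged['compound']` gives the per-document value.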