Counting word frequency in Python

Time: 2014-11-11 08:59:30

Tags: python arrays string list replace

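I have to count the frequency of each word in a text file, but only for words that match an entry in an array. When I run the code below I get this error: TypeError: unhashable type: 'list'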

import string
from collections import Counter
from array import *
cnt=Counter()
word =[ ]
word_count = [ ]
new_array =['CC','CD','DT','EX','FW','IN','JJ','JJR','JJS','LS','MD','NN','NNS','NNP','NNPS','PDT',
                       'POS','PRP','PRP','RB','RBR','RBS','RP','SYM','TO','UH','VB','VBD','VBZ','WDT','WP','WP','WRB']
file = open('output.txt', 'rU')
for line in file:
      new_line = line.replace("_"," ")
      words = new_line.split()
      word.append(words)

[(w, word.count(w)) for w in set(word) if w in new_array]

1 answer:

Answer 0 (score: 1)

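When you call word.append(words), you append a whole list as a single element of word, so word becomes a list of lists. Lists are not hashable, so set(word) cannot build a set from a list of lists, and that is where the TypeError comes from.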
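A minimal sketch that reproduces the problem without the file. The sample string is made up for illustration and is not taken from output.txt:

```python
# Hypothetical sample line standing in for one line of output.txt
words = "the_DT cat_NN".replace("_", " ").split()

word = []
word.append(words)   # adds the whole list as ONE element
nested = list(word)  # [['the', 'DT', 'cat', 'NN']]

try:
    set(word)        # every element must be hashable; a list is not
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(nested)
print(error_message)
```

The set() call fails because the only element of word is itself a list.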

I think you meant word += words instead, which adds each item of words to word individually.
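Here is one way the corrected loop might look, as a sketch only. The sample lines and the shortened tag set are assumptions made up for illustration; in the real script the lines would come from output.txt and the tags from new_array. It also uses the Counter the question already imports in place of repeated word.count() calls, which is a substitution on my part, not something the original code did:

```python
from collections import Counter

# Hypothetical data standing in for output.txt and new_array
sample_lines = ["the_DT cat_NN sat_VBD", "on_IN the_DT mat_NN"]
tags = {"DT", "NN", "VBD", "IN"}

word = []
for line in sample_lines:
    words = line.replace("_", " ").split()
    word += words  # extends the flat list instead of nesting a list

# Count only the tokens that appear in the tag set
counts = Counter(w for w in word if w in tags)
print(counts)
```

Because word is now a flat list of strings, the original set(word) comprehension would also work unchanged; Counter just avoids rescanning the list once per distinct word.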