How do I write a function in Python

Time: 2016-06-09 13:51:58

Tags: python function

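I have this script that reads a file (the file consists of collected tweets), cleans it, computes the frequency distribution, and creates a plot. Right now it only works with a single file. What I need is to turn it into a function so that I can pass in more files. That way I could build a DataFrame with the FreqDist results from several files and plot it.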

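# Imports assumed from the names used below
from string import punctuation

import matplotlib.pyplot as plt
import nltk
import pandas as pd
from nltk.corpus import stopwords

f = open(.......)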
text = f.read()
text = text.lower()
for p in list(punctuation):
    text = (text.replace(p, ''))

allWords = nltk.tokenize.word_tokenize(text)
allWordDist = nltk.FreqDist(w.lower() for w in allWords)
stopwords = set(stopwords.words('english'))

allWordExceptStopDist = nltk.FreqDist(w.lower() for w in allWords if w not in stopwords)
mostCommon = allWordExceptStopDist.most_common(25)

frame = pd.DataFrame(mostCommon, columns=['word', 'frequency'])
frame.set_index('word', inplace=True)
print(frame)
histog = frame.plot(kind='barh')
plt.show()

Thank you very much for your help!

1 answer:

Answer 0: (score: -1)

Is this what you mean?

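# Imports assumed from the names used in the question's script
from string import punctuation

import matplotlib.pyplot as plt
import nltk
import pandas as pd
from nltk.corpus import stopwords

def readStuff(filename):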
    with open(filename) as f:
        text = f.read()
    text = text.lower()
    for p in list(punctuation):
        text = (text.replace(p, ''))

    allWords = nltk.tokenize.word_tokenize(text)
    allWordDist = nltk.FreqDist(w.lower() for w in allWords)
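    # Use a new name: assigning to `stopwords` inside the function would make it
    # a local variable and raise UnboundLocalError on this line
    stop_words = set(stopwords.words('english'))

    allWordExceptStopDist = nltk.FreqDist(w.lower() for w in allWords if w not in stop_words)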
    mostCommon = allWordExceptStopDist.most_common(25)

    frame = pd.DataFrame(mostCommon, columns=['word', 'frequency'])
    frame.set_index('word', inplace=True)
    print(frame)
    histog = frame.plot(kind='barh')
    plt.show()
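
The function above prints and plots one file at a time, but it returns nothing, so the results from several files cannot be combined. One possible way to get there is to return the counts and collect them per file. The sketch below is not the asker's exact pipeline: to stay runnable without NLTK data it uses `str.split` and `collections.Counter` in place of `nltk.tokenize.word_tokenize` and `nltk.FreqDist`. The names `word_frequencies`, `frequency_table`, and `DEMO_STOP_WORDS` are hypothetical, and the tiny stop word set is only a stand-in for `set(stopwords.words('english'))`.

```python
from collections import Counter
from string import punctuation

# Hypothetical stand-in for set(stopwords.words('english')); pass the real set in practice.
DEMO_STOP_WORDS = frozenset({"the", "a", "is", "and", "to", "of"})


def word_frequencies(path, stop_words=DEMO_STOP_WORDS, tokenize=str.split, top_n=25):
    """Return the top_n (word, count) pairs for one file, stop words excluded."""
    with open(path, encoding="utf-8") as f:
        text = f.read().lower()
    # Delete every punctuation character in a single pass.
    text = text.translate(str.maketrans("", "", punctuation))
    # Counter plays the role of nltk.FreqDist in this sketch.
    counts = Counter(w for w in tokenize(text) if w not in stop_words)
    return counts.most_common(top_n)


def frequency_table(paths, **kwargs):
    """Map each path to a {word: count} dict, one entry per file."""
    return {path: dict(word_frequencies(path, **kwargs)) for path in paths}
```

Passing the result to `pd.DataFrame(frequency_table(paths)).fillna(0)` should give one column per file and one row per word, which can then be drawn with `.plot(kind='barh')` as in the original script. To keep the NLTK tokenizer, pass `tokenize=nltk.tokenize.word_tokenize` and the real stop word set as `stop_words`.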