Question

I used this code

unclassified_df['COUNT'] = unclassified_df.tweet.str.count('mulcair')

to count the number of times mulcair appears in each row of my pandas DataFrame. I am trying to do the same thing, but for a set of words, such as

Liberal = ['lpc','ptlib','justin','trudeau','realchange','liberal', 'liberals', "liberal2015",'lib2015','justin2015', 'trudeau2015', 'lpc2015']

I saw somewhere that I could use collections.Counter(data) for this. Can anyone please help me?

Answer 1

from collections import Counter
import pandas as pd

# Words whose frequency we want to check in the tweets
Liberal = ['lpc', 'justin', 'trudeau', 'realchange', 'liberal', 'liberals',
           'liberal2015', 'lib2015', 'justin2015', 'trudeau2015', 'lpc2015']

# Sample data
data = {'tweet': ['lpc living dream camerama',
                  'jsutingnasndsa dnsadnsadnsa dsalpcdnsa',
                  'but',
                  'mulcair suggests thereslcp bad lpc blood']}

# The data frame with one column, tweet
df = pd.DataFrame(data, columns=['tweet'])

# Number of rows containing each word (a row counts at most once per word)
print([(df.tweet.str.contains(word).sum(), word) for word in Liberal])

# Total occurrences of each word over all rows (repeats within a row are counted)
print(pd.Series({w: df.tweet.str.count(w).sum() for w in Liberal}))
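
The snippets above give totals per word, while the question asks for a per-row count over the whole list. A minimal sketch of one way to get that, assuming substring matches are acceptable (as they are with str.count('mulcair')): join the words into a single regular-expression alternation and pass it to str.count. The shortened word list, the extra sample tweet, and the name `pattern` are illustrative assumptions, not part of the original code.

```python
import re
import pandas as pd

# Shortened word list, for illustration only
Liberal = ['lpc', 'justin', 'trudeau', 'liberal']

df = pd.DataFrame({'tweet': ['lpc living dream camerama',
                             'but',
                             'mulcair suggests thereslcp bad lpc blood',
                             'justin trudeau lpc']})

# Build one alternation pattern: 'lpc|justin|trudeau|liberal'.
# re.escape guards against words that contain regex metacharacters.
pattern = '|'.join(re.escape(w) for w in Liberal)

# str.count counts the non-overlapping regex matches in each row
df['COUNT'] = df.tweet.str.count(pattern)
print(df)
```

Because the matches are non-overlapping, a token such as liberals would be counted once here, whereas summing separate counts for 'liberal' and 'liberals' would count it twice.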

References: Contains & match
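
Since the question mentions collections.Counter, here is a rough sketch of that route. It assumes tweets can be split on whitespace and that only whole-word matches should count, which differs from the substring matching of str.count. The helper name `count_liberal_words`, the shortened word list, and the sample tweets are hypothetical.

```python
from collections import Counter
import pandas as pd

# Shortened word list, for illustration only
Liberal = ['lpc', 'justin', 'trudeau', 'liberal']
liberal_set = set(Liberal)

df = pd.DataFrame({'tweet': ['lpc living dream camerama',
                             'but',
                             'mulcair suggests thereslcp bad lpc blood',
                             'justin trudeau lpc lpc']})

def count_liberal_words(tweet):
    """Hypothetical helper: Counter of whole-word matches in one tweet."""
    return Counter(tok for tok in tweet.lower().split() if tok in liberal_set)

# One Counter per row
counters = [count_liberal_words(t) for t in df.tweet]

# Per-row total of listed words
df['COUNT'] = [sum(c.values()) for c in counters]

# Totals across all rows; Counter objects can be added together
totals = sum(counters, Counter())

print(df)
print(totals.most_common(1))
```

Splitting into tokens means thereslcp or dsalpcdnsa would not match lpc, so this variant may be preferable when partial matches inside longer words are unwanted.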

Counting the frequency of multiple words