Question

I am trying to remove several words from every value in a column, but nothing happens.

stop_words = ["and","lang","naman","the","sa","ko","na",
              "yan","n","yang","mo","ung","ang","ako","ng",
              "ndi","pag","ba","on","un","Me","at","to",
              "is","sia","kaya","I","s","sla","dun","po","b","pro"
             ]

newdata['Verbatim'] = newdata['Verbatim'].replace(stop_words,'', inplace = True)

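I am trying to generate a word cloud from the result of the replacement, but I keep getting the same words (they carry no meaning, but they occur very often).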

Answer 1

To do this with a regular expression, use str.replace with word boundaries \b around each value, joined by | (pass regex=True, because since pandas 2.0 the pattern is treated as a literal string by default):

pat = '|'.join(r"\b{}\b".format(x) for x in stop_words)
newdata['Verbatim'] = newdata['Verbatim'].str.replace(pat, '', regex=True)

Another solution is to split the values, remove the stop words, and join what remains back together with spaces in a lambda function.

Example:

stop_words = set(stop_words)
f = lambda x: ' '.join(w for w in x.split() if w not in stop_words)
newdata['Verbatim'] = newdata['Verbatim'].apply(f)
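
For comparison, here is a minimal, self-contained sketch that runs both approaches on a small made-up frame. The sample sentences and the shortened stop word list are assumptions for illustration only; they are not from the question's data. It also hints at why the original call changes nothing useful: Series.replace with a list only matches cells whose whole value equals one of the listed words, and with inplace=True it returns None, so assigning the result back does not give a cleaned column.

```python
import re
import pandas as pd

# Hypothetical sample data; the real `newdata` comes from the question.
newdata = pd.DataFrame(
    {"Verbatim": ["ako and the team", "sa bahay lang po", "island and sand"]}
)
stop_words = ["and", "lang", "the", "sa", "ako", "po"]

# Approach 1: regex with word boundaries.
# re.escape is an extra precaution in case a stop word contains regex characters.
pat = "|".join(r"\b{}\b".format(re.escape(w)) for w in stop_words)
by_regex = (
    newdata["Verbatim"]
    .str.replace(pat, "", regex=True)
    .str.replace(r"\s+", " ", regex=True)  # collapse gaps left by removed words
    .str.strip()
)

# Approach 2: split on whitespace, drop stop words, join with a space.
stop_set = set(stop_words)
by_split = newdata["Verbatim"].apply(
    lambda x: " ".join(w for w in x.split() if w not in stop_set)
)

print(by_regex.tolist())
print(by_split.tolist())
```

Both approaches leave "island" and "sand" untouched, because the word boundaries (or the whitespace split) only match whole words. The split version is case-sensitive and treats punctuation as part of a word, so it may need lowercasing or extra cleanup on real text.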

Remove specific words from strings in pandas

1 answer: