Question

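I have a pandas DataFrame (a mapping) with many rows. The category column contains a recurring misspelling, for example "A0lytics": in almost every case the "0" should be replaced with "na". The single exception in the entire dataset is "Enterprise 2.0", which is correct as written. How can I fix this? I have tried several approaches with the replace method, but none of them worked.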

Answer 1

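Use a regular expression with lookbehind and lookahead assertions, so that a "0" is replaced only when it has a word character on both sides.

For example: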

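import pandas as pd

# Sample data; only "Enterprise 2.0" contains a legitimate "0"
df = pd.DataFrame({"category": ["All Students", "Alter0tive Medicine", "A0lytics", "Enterprise 2.0"]})
# Replace a "0" only when a word character sits on both sides of it.
# regex=True is required on pandas 2.0+, where str.replace defaults to a literal match.
df["category"] = df["category"].str.replace(r"(?<=\w)0(?=\w)", "na", regex=True)
print(df)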

Output:

               category
0          All Students
1  Alternative Medicine
2             Analytics
3        Enterprise 2.0
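
To see why "Enterprise 2.0" is left alone, the same pattern can be tried on plain strings with Python's re module. This is only a minimal sketch: the helper name fix_category and the value "Top 100" are made up for illustration and do not come from the question's data. It also shows one caveat: digits count as word characters, so a "0" inside a real number would be rewritten as well.

```python
import re

# Same pattern as in the answer: a "0" with a word character on both sides.
PATTERN = re.compile(r"(?<=\w)0(?=\w)")

def fix_category(text):
    """Replace each '0' that sits between two word characters with 'na'."""
    return PATTERN.sub("na", text)

# Values taken from the answer's sample frame.
samples = ["All Students", "Alter0tive Medicine", "A0lytics", "Enterprise 2.0"]
fixed = [fix_category(s) for s in samples]
print(fixed)

# In "2.0" the zero follows "." and ends the string, so neither assertion holds.
# Hypothetical counter-example: the first "0" in "100" sits between two digits.
print(fix_category("Top 100"))
```

If the real column contains genuine numbers of that kind, a stricter pattern such as (?<=[A-Za-z])0(?=[A-Za-z]) might be a safer choice, assuming the misspelling only ever occurs between letters.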
