How do I extract a specific string from a pandas column?

Time: 2019-06-08 23:11:55

Tags: python pandas dataframe

This is my DataFrame df:

    Repository
0   ParaskP7/android-dev-sources
1   uholeschak/ediabaslib
2   t3hk0d3/ruby_faceapp
3   prateekbh/hopon
4   c0i/cocos2d-x-v2
5   risk1996/ctg-cheat
6   GiacomoPignoni/undercover_discord_bot
7   vyasishanatc194/Crowdbotics-React-Native-Test

For example, for the first row I need to drop everything up to and including the '/' and extract 'android-dev-sources'.

import pandas as pd

df = pd.read_csv('result_refactorings.csv', sep=';')
repo_name = df['Repository']                # the column shown above
a = repo_name.to_frame(name='Repository')   # back to a one-column DataFrame
a.Repository.str.extract(r'')               # which pattern goes here?

The problem is in the call to extract: I can't work out which pattern to use.

Please help. Thanks!

1 answer:

Answer 0: (score: 0)

Method 1: str.split

df['Repository'].str.split(r'/').str[1]

0              android-dev-sources
1                       ediabaslib
2                     ruby_faceapp
3                            hopon
4                     cocos2d-x-v2
5                        ctg-cheat
6           undercover_discord_bot
7    Crowdbotics-React-Native-Test
Name: Repository, dtype: object
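
As a self-contained sketch of the split approach, the snippet below rebuilds a few rows of the frame by hand instead of reading the CSV. The column names "owner" and "name" are made up for illustration, and n=1 is an added assumption that only the first '/' separates the owner from the repository name.

```python
import pandas as pd

# Three rows copied from the question, so the example runs without the CSV file.
df = pd.DataFrame({"Repository": [
    "ParaskP7/android-dev-sources",
    "uholeschak/ediabaslib",
    "t3hk0d3/ruby_faceapp",
]})

# Split on the first '/' only (n=1) and keep the second piece.
df["name"] = df["Repository"].str.split("/", n=1).str[1]

# expand=True returns the pieces as separate columns instead of lists.
parts = df["Repository"].str.split("/", n=1, expand=True)
parts.columns = ["owner", "name"]  # hypothetical column names

print(df["name"].tolist())
print(parts)
```

Limiting the split with n=1 keeps any later '/' inside the second piece, which may or may not be what you want for your data.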

Method 2: str.extract

Using a regular expression:

df['Repository'].str.extract(r'/(.*)')

                               0
0            android-dev-sources
1                     ediabaslib
2                   ruby_faceapp
3                          hopon
4                   cocos2d-x-v2
5                      ctg-cheat
6         undercover_discord_bot
7  Crowdbotics-React-Native-Test
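
If a Series is more convenient than a one-column DataFrame, str.extract accepts expand=False. The sketch below also shows named groups, which become column names. The sample rows are copied from the question, and the names "owner" and "name" are assumptions chosen for illustration.

```python
import pandas as pd

# Three rows copied from the question, so the example runs without the CSV file.
df = pd.DataFrame({"Repository": [
    "ParaskP7/android-dev-sources",
    "uholeschak/ediabaslib",
    "t3hk0d3/ruby_faceapp",
]})

# With a single capture group, expand=False returns a Series,
# which can be assigned directly to a new column.
df["name"] = df["Repository"].str.extract(r"/(.*)", expand=False)

# Named groups (?P<...>) become the column names of the result.
both = df["Repository"].str.extract(r"(?P<owner>[^/]+)/(?P<name>.+)")

print(df["name"].tolist())
print(both)
```

Rows that do not match the pattern come back as NaN, so it may be worth checking for missing values afterwards.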