Pandas DataFrame: remove everything up to the last "\" from the "Parent" and "Child" columns

Time: 2018-07-29 18:55:44

Tags: python regex pandas dataframe

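I have a pandas DataFrame and want to remove everything up to and including the last \ so that only the executable name is left. This is what I am trying to achieve: C:\Windows\System32\services.exe -> services.exe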

    Parent                              Child                           PID     PID System_Or_User
0   C:\Windows\System32\services.exe    C:\Windows\System32\svchost.exe 10396   752 System
1   C:\Windows\System32\services.exe    C:\Windows\System32\svchost.exe 11688   752 System
2   C:\Windows\System32\services.exe    C:\Windows\System32\svchost.exe 11624   752 System

I have tried a few things like the following, but I can't get them to work, probably because of the \ that Windows uses and Python doesn't like:

PID['Parent'] = PID['Parent'].apply(lambda x: x[0].split('\ ')[-1])

PID['Parent'] = PID['Parent'].apply(lambda x: x[0].split(' \ ')[+1])

4 Answers:

Answer 0: (score: 3)

Use str.split together with str.get, and escape the backslash by typing \\.
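The code that accompanied this answer is not preserved here. As a minimal sketch of what the description suggests (assuming the DataFrame is named PID as in the question, with sample rows modeled on its table), it might look like this:

```python
import pandas as pd

# Sample data modeled on the question's table (only the two path columns).
PID = pd.DataFrame({
    "Parent": [r"C:\Windows\System32\services.exe"] * 3,
    "Child": [r"C:\Windows\System32\svchost.exe"] * 3,
})

for col in ["Parent", "Child"]:
    # "\\" in Python source is a single literal backslash.
    # split() yields a list per row; get(-1) picks its last element.
    PID[col] = PID[col].str.split("\\").str.get(-1)

print(PID)
```

The attempts in the question fail for two reasons: x[0] takes only the first character of each path, and the separators '\ ' and ' \ ' contain spaces that never occur in the data.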

Answer 1: (score: 3)

Use str.split, escaping the \ as '\\', and select the last value of each list by indexing with str[-1]:

PID['Parent'] = PID['Parent'].str.split('\\').str[-1]
PID['Child'] = PID['Child'].str.split('\\').str[-1]

Another similar idea: use str.rsplit with n=1 to split only at the last \, for better performance:

PID['Parent'] = PID['Parent'].str.rsplit('\\', n=1).str[-1]
PID['Child'] = PID['Child'].str.rsplit('\\', n=1).str[-1]
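To see why limiting the split can help, it may be useful to look at the intermediate lists each method builds. The following is a small illustration on a single sample path taken from the question; the variable names are chosen here for the example only:

```python
import pandas as pd

s = pd.Series([r"C:\Windows\System32\services.exe"])

# split breaks the path at every backslash.
full = s.str.split("\\")
# rsplit with n=1 breaks it once, starting from the right.
last_only = s.str.rsplit("\\", n=1)

print(full[0])       # ['C:', 'Windows', 'System32', 'services.exe']
print(last_only[0])  # ['C:\\Windows\\System32', 'services.exe']

# Either way, the last element is the file name.
print(full.str[-1][0], last_only.str[-1][0])
```

With rsplit and n=1 each row produces a two-element list instead of one element per path component, which is less work on deep paths.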

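Answer 2: (score: 3)

You are dealing with paths. If you need a cross-platform solution, I recommend leaving the splitting to os.path.

This should be as fast as (or faster than) the str methods.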

import os
df['Parent'] = [os.path.basename(v) for v in df['Parent']]
df['Child'] = [os.path.basename(v) for v in df['Child']]

Alternatively, you can use os.path.split:

df['Parent'] = [os.path.split(v)[-1] for v in df['Parent']]
df['Child'] = [os.path.split(v)[-1] for v in df['Child']]
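One caveat: os.path follows the conventions of the operating system the code runs on, so on Linux or macOS it does not treat \ as a separator and os.path.basename returns a Windows path unchanged. If the column holds Windows paths but the script may run elsewhere, a sketch using the standard library's ntpath module or pathlib.PureWindowsPath (both always apply Windows rules) could look like this; the sample DataFrame is an assumption modeled on the question:

```python
import ntpath
from pathlib import PureWindowsPath

import pandas as pd

# Sample data modeled on the question's table.
df = pd.DataFrame({
    "Parent": [r"C:\Windows\System32\services.exe"] * 3,
    "Child": [r"C:\Windows\System32\svchost.exe"] * 3,
})

# ntpath applies Windows path rules regardless of the host OS.
df["Parent"] = [ntpath.basename(v) for v in df["Parent"]]
# PureWindowsPath(...).name is the pathlib equivalent.
df["Child"] = [PureWindowsPath(v).name for v in df["Child"]]

print(df)
```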

Answer 3: (score: 2)

Or use str.replace. The pattern is a regular expression, so pandas 2.0 and later need regex=True:

df.Parent.str.replace(r"C:\\Windows\\System32\\", "", regex=True)
Out[25]: 
0    services.exe
1    services.exe
2    services.exe
Name: Parent, dtype: object
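Replacing a fixed prefix only works while every path lives in the same directory. A possible generalization, sketched here with a regular expression that matches everything up to the last backslash, handles arbitrary directories; the second and third sample values are hypothetical and were added only to show the behavior:

```python
import pandas as pd

df = pd.DataFrame({
    "Parent": [
        r"C:\Windows\System32\services.exe",  # from the question
        r"D:\Tools\bin\agent.exe",            # hypothetical path
        "plain.exe",                          # hypothetical: no directory part
    ]
})

# ^.*\\ greedily matches from the start through the LAST backslash,
# so only the file name remains; values without a backslash are unchanged.
names = df["Parent"].str.replace(r"^.*\\", "", regex=True)

print(names.tolist())  # ['services.exe', 'agent.exe', 'plain.exe']
```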