Question

Suppose I have a dataframe like this:

import pandas as pd
df = pd.DataFrame({'foo':[1, 2], 'bar': [3, 4], 'xyz': [5, 6]})

   bar  foo  xyz
0    3    1    5
1    4    2    6

I now want to move the column whose name contains oo to the first position (i.e. index 0); exactly one column ever matches this pattern.

I currently do this with two calls to filter and a concat:


pd.concat([df.filter(like='oo'),  df.filter(regex='^((?!(oo)).)*$')], axis=1)

which gives the desired output:

   foo  bar  xyz
0    1    3    5
1    2    4    6

I am wondering whether there is a more efficient way to do this.

Answer 1

Use only list comprehensions, join the two lists together, and select by subset:

a = [x for x in df.columns if 'oo' in x]
b = [x for x in df.columns if 'oo' not in x]

df = df[a + b]
print(df)
   foo  bar  xyz
0    1    3    5
1    2    4    6
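
If this is needed in more than one place, the same idea could be wrapped in a small helper. This is only a sketch: the name `move_matching_first` is made up for illustration (it is not a pandas function), and it assumes the column labels are strings and the pattern is a plain substring.

```python
import pandas as pd

def move_matching_first(df, substring):
    """Return df with the columns whose name contains `substring` moved first.

    Hypothetical helper for illustration; not part of pandas.
    """
    first = [c for c in df.columns if substring in c]
    rest = [c for c in df.columns if substring not in c]
    # Selecting with a list of labels returns the columns in that order.
    return df[first + rest]

# Columns are built in bar/foo/xyz order to mirror the frame shown in the question.
df = pd.DataFrame({'bar': [3, 4], 'foo': [1, 2], 'xyz': [5, 6]})
result = move_matching_first(df, 'oo')
print(list(result.columns))  # ['foo', 'bar', 'xyz']
```

The relative order of the non-matching columns is preserved, because both lists are built by walking `df.columns` from left to right.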

Answer 2

How about:

df[sorted(df, key = lambda x: x not in df.filter(like="oo").columns)]
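
A note on why this works, with a slightly expanded sketch: iterating a DataFrame yields its column labels, the key is `False` for the matching column and `True` for the others, `False` sorts before `True`, and Python's sort is stable, so the remaining columns keep their original order. The version below computes the matching labels once rather than on every key call; the variable names `matching` and `ordered` are just for illustration.

```python
import pandas as pd

# Columns are built in bar/foo/xyz order to mirror the frame shown in the question.
df = pd.DataFrame({'bar': [3, 4], 'foo': [1, 2], 'xyz': [5, 6]})

# Compute the matching labels once instead of calling filter for every column.
matching = set(df.filter(like='oo').columns)

# Key is False for matching columns and True for the rest; False sorts first,
# and the stable sort keeps the original order within each group.
ordered = sorted(df.columns, key=lambda c: c not in matching)
result = df[ordered]
print(list(result.columns))  # ['foo', 'bar', 'xyz']
```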

Answer 3

Using pop:

cols = list(df)
col_oo = [col for col in df.columns if 'oo' in col]
cols.insert(0, cols.pop(cols.index(col_oo[0])))
df = df.loc[:, cols]  # .ix was removed in pandas 1.0; .loc does label-based selection

Or using a regex:

import re
col_oo = [col for col in cols if re.search('oo', col)]
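
As a possible alternative for the regex case, the vectorized string accessor on the column index can do the matching without importing `re`. This is a sketch under the assumption that all column labels are strings; the names `mask`, `first` and `rest` are illustrative only.

```python
import pandas as pd

# Columns are built in bar/foo/xyz order to mirror the frame shown in the question.
df = pd.DataFrame({'bar': [3, 4], 'foo': [1, 2], 'xyz': [5, 6]})

# One boolean per column label: True where the regex matches somewhere in the name.
mask = list(df.columns.str.contains(r'oo', regex=True))

# Split the labels into matching and non-matching, keeping the original order.
first = [c for c, m in zip(df.columns, mask) if m]
rest = [c for c, m in zip(df.columns, mask) if not m]

result = df[first + rest]
print(list(result.columns))  # ['foo', 'bar', 'xyz']
```

Unlike the pop variant, this also handles the case where several columns match, moving all of them to the front in their original order.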

How to reorder columns based on a regex?
