Question

My DataFrame looks like this:

import pandas as pd

a = {'A': {0: 40.1, 1: 40.1, 2: 40.1, 3: 45.45, 4: 41.6, 5: 39.6},
     'B': {0: 41.0, 1: 43.6, 2: 41.65, 3: 47.7, 4: 46.0, 5: 42.95},
     'C': {0: 826.0, 1: 835.0, 2: 815.0, 3: 169.5, 4: 170.0, 5: 165.5},
     'D': {0: 889.0, 1: 837.0, 2: 863.3, 3: 178.8, 4: 172.9, 5: 170.0}}

a = pd.DataFrame(a)

#a
       A      B      C      D
0  40.10  41.00  826.0  889.0
1  40.10  43.60  835.0  837.0
2  40.10  41.65  815.0  863.3
3  45.45  47.70  169.5  178.8
4  41.60  46.00  170.0  172.9
5  39.60  42.95  165.5  170.0

I want to divide columns C and D by 5, but only up to the second index (rows 0 through 2).

With the help of this, I came up with:

a.apply(lambda x: x/5 if 'C' in x.name or 'D' in x.name else x)

As you would expect, it applies to the entire column.

Is there any way to apply it only up to the second index and keep the change in place?

Answer 1

With a default index, select with loc:

a.loc[:2, ['C','D']] /= 5

Detail (the selection before dividing):

print (a.loc[:2, ['C','D']])
       C      D
0  826.0  889.0
1  835.0  837.0
2  815.0  863.3
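
To make the loc behaviour concrete, here is a minimal sketch on a cut-down frame (the name df and the four-row data are just for illustration, taken from the question's C and D columns). It assumes the default integer index, where loc slices by label and includes the end label:

```python
import pandas as pd

# Cut-down illustration frame (first four rows of columns C and D).
df = pd.DataFrame({"C": [826.0, 835.0, 815.0, 169.5],
                   "D": [889.0, 837.0, 863.3, 178.8]})

# .loc slices by label and includes the end label, so :2 covers labels 0, 1 and 2.
selected = df.loc[:2, ["C", "D"]]
print(selected.shape)  # (3, 2)

# Divide the same selection in place; the row labelled 3 is left untouched.
df.loc[:2, ["C", "D"]] /= 5
print(df)
```

Because the whole selection happens in one loc call, the assignment writes straight back into df.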

A general solution for any index values (for example a DatetimeIndex): use get_indexer to turn the column names into positions, then select with iloc:

a.iloc[:3, a.columns.get_indexer(['C','D'])] /= 5
print (a)
       A      B      C       D
0  40.10  41.00  165.2  177.80
1  40.10  43.60  167.0  167.40
2  40.10  41.65  163.0  172.66
3  45.45  47.70  169.5  178.80
4  41.60  46.00  170.0  172.90
5  39.60  42.95  165.5  170.00

Detail (the selection before dividing):

print (a.iloc[:3, a.columns.get_indexer(['C','D'])])
       C      D
0  826.0  889.0
1  835.0  837.0
2  815.0  863.3
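
As a rough sketch of the DatetimeIndex case mentioned above: the dates and the cut-down frame df below are made up for illustration, they are not from the question. get_indexer returns integer column positions, so the row slice can stay purely positional:

```python
import pandas as pd

# Hypothetical daily index, only to show a non-default index.
idx = pd.date_range("2020-01-01", periods=4, freq="D")
df = pd.DataFrame({"A": [40.1, 40.1, 40.1, 45.45],
                   "C": [826.0, 835.0, 815.0, 169.5],
                   "D": [889.0, 837.0, 863.3, 178.8]}, index=idx)

# Integer positions of the named columns.
col_pos = df.columns.get_indexer(["C", "D"])
print(col_pos)  # [1 2]

# First three rows by position, columns C and D by position.
df.iloc[:3, col_pos] /= 5
print(df)
```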

Answer 2

IIUC, to divide only columns C and D up to (and including) index 2, you can do this:

a.iloc[:3][["C", "D"]] /= 5

The result is:

       A      B      C       D
0  40.10  41.00  165.2  177.80
1  40.10  43.60  167.0  167.40
2  40.10  41.65  163.0  172.66
3  45.45  47.70  169.5  178.80
4  41.60  46.00  170.0  172.90
5  39.60  42.95  165.5  170.00
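
One caveat, as far as I understand it: a.iloc[:3][["C", "D"]] /= 5 is chained indexing, so whether it updates a depends on whether the intermediate a.iloc[:3] is a view or a copy. pandas may emit a warning for it, and with copy-on-write enabled the original frame is not modified. A minimal sketch of a single-step equivalent, assuming the index labels are unique:

```python
import pandas as pd

a = pd.DataFrame({"A": [40.1, 40.1, 40.1, 45.45, 41.6, 39.6],
                  "B": [41.0, 43.6, 41.65, 47.7, 46.0, 42.95],
                  "C": [826.0, 835.0, 815.0, 169.5, 170.0, 165.5],
                  "D": [889.0, 837.0, 863.3, 178.8, 172.9, 170.0]})

# a.index[:3] picks the first three labels by position;
# a single .loc call then reads and writes the same block of `a`.
a.loc[a.index[:3], ["C", "D"]] /= 5
print(a)
```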

The approach above is faster than using apply, but here is how to modify your existing code to get the same result:

a.iloc[:3] = a.iloc[:3].apply(lambda x: x/5 if x.name in {"C", "D"} else x)

The difference is that it runs apply on only a slice of the DataFrame and assigns the output back to the same slice.

Note that we slice with [:3] because the end index is not included in the slice. More in Understanding Python's slice notation.
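
A quick sketch of that end-exclusive behaviour, first on a plain list and then with iloc (the names labels, df and first_three are only for illustration):

```python
import pandas as pd

# Plain Python slicing stops before the end position.
labels = ["r0", "r1", "r2", "r3", "r4", "r5"]
print(labels[:3])  # ['r0', 'r1', 'r2']

# iloc follows the same positional rule: [:3] gives rows 0, 1 and 2.
df = pd.DataFrame({"C": [826.0, 835.0, 815.0, 169.5, 170.0, 165.5]})
first_three = df.iloc[:3]
print(len(first_three))            # 3
print(first_three.index.tolist())  # [0, 1, 2]
```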

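Also, you don't have to check the two conditions separately: you can use x.name in {..} to test whether x.name is contained in a set. Testing membership with a set is faster than with a list: Python Sets vs Lists.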
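
Beyond speed, the two checks can also give different results: 'C' in x.name is a substring test on the column name, while x.name in {"C", "D"} is an exact match. A small sketch, where the column "Cost" is a hypothetical name added only to show the difference:

```python
import pandas as pd

# "Cost" is a made-up column whose name merely contains the letter C.
df = pd.DataFrame({"C": [10.0, 20.0],
                   "Cost": [5.0, 15.0],
                   "D": [30.0, 40.0]})

# Substring test: "C" in "Cost" is True, so Cost gets divided as well.
substring = df.apply(lambda x: x / 5 if "C" in x.name or "D" in x.name else x)

# Exact membership test: only the columns named C and D are divided.
exact = df.apply(lambda x: x / 5 if x.name in {"C", "D"} else x)

print(substring["Cost"].tolist())  # [1.0, 3.0]
print(exact["Cost"].tolist())      # [5.0, 15.0]
```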

Apply a lambda to a DataFrame, but only for a certain number of rows