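I have a very large dataset, XposMay (125, 800 000). To make things easier, I made a smaller version below. I want to find which values in SomAprilMay are less than 3 and set the corresponding columns of XposMay to zero. My code is below; it doesn't change anything to zero.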
SomAprilMay=[0,0,0,1,0,1,2,3,4,15,12,14,1,10]
XposMay=[[50,51,52,53,54,55,56,57,58,59,60,61,62,63],
[50,51,52,53,54,55,56,57,58,59,60,61,62,63],
[50,51,52,53,54,55,56,57,58,59,60,61,62,63]]
Xpos1May=XposMay
a=[]
b=[]
for k in range (0,len(SomAprilMay)):
    if SomAprilMay[k] < 3:
        a.append(SomAprilMay[k])
        b.append(k)
for m in range (0,len(XposMay)):
    Xpos1May[:][b[m]]=0
Since the first 7 elements and the last element of SomAprilMay are < 3, the desired result is:
Xpos1May = [[0,0,0,0,0,0,0,57,58,59,60,61,62,0],
[0,0,0,0,0,0,0,57,58,59,60,61,62,0],
[0,0,0,0,0,0,0,57,58,59,60,61,62,0]]
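How can I do this?
Answer 0 (score: 3)
I'd suggest using numpy arrays for this task, since that is faster than looping over the whole thing. However, your SomAprilMay and XposMay lists don't have matching dimensions, so I'm assuming you made a typo there, and I added another 14 before the final 1. This code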
import numpy as np
SomAprilMay=np.array([0,0,0,1,0,1,2,3,4,15,12,14,14,1])
XposMay=np.array([[50,51,52,53,54,55,56,57,58,59,60,61,62,63],
[50,51,52,53,54,55,56,57,58,59,60,61,62,63],
[50,51,52,53,54,55,56,57,58,59,60,61,62,63]])
XposMay.T[SomAprilMay < 3] = 0
XposMay
then produces the desired
array([[ 0, 0, 0, 0, 0, 0, 0, 57, 58, 59, 60, 61, 62, 0],
[ 0, 0, 0, 0, 0, 0, 0, 57, 58, 59, 60, 61, 62, 0],
[ 0, 0, 0, 0, 0, 0, 0, 57, 58, 59, 60, 61, 62, 0]])
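The transpose assignment above works because `XposMay.T` is a view onto the same data, so writing through it changes `XposMay` itself. As a minimal sketch, here are two other ways to express the same operation, assuming the corrected 14-element `SomAprilMay` from this answer. The names `mask`, `in_place`, and `zeroed_copy` are introduced here for illustration only.

```python
import numpy as np

SomAprilMay = np.array([0, 0, 0, 1, 0, 1, 2, 3, 4, 15, 12, 14, 14, 1])
XposMay = np.array([[50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63]] * 3)

# Boolean mask with one entry per column: True where the value is below 3.
mask = SomAprilMay < 3

# Option 1: index the columns directly and zero them in place.
# A copy is used here only so XposMay stays available for option 2.
in_place = XposMay.copy()
in_place[:, mask] = 0

# Option 2: build a new array and leave XposMay unmodified;
# the 1-D mask broadcasts across every row.
zeroed_copy = np.where(mask, 0, XposMay)

print(in_place)
```

For an array as large as the full dataset, the in-place form is probably the better fit, since `np.where` allocates a second array of the same shape.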
Answer 1 (score: 1)
Your SomAprilMay has 13 elements, so I added one.
Here is a one-liner:
SomAprilMay=[0,0,0,1,0,1,2,3,4,5,15,12,14,1]
XposMay=[[50,51,52,53,54,55,56,57,58,59,60,61,62,63],
[50,51,52,53,54,55,56,57,58,59,60,61,62,63],
[50,51,52,53,54,55,56,57,58,59,60,61,62,63]]
mask = [e < 3 for e in SomAprilMay]
Xpos1May = [[0 if mask[i] else item for i, item in enumerate(sub) ] for sub in XposMay]
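For comparison, here is a sketch of a loop-based version that stays close to the code in the question. The original loop appears to have had no effect because `Xpos1May[:]` builds a temporary copy of the outer list, so the assignment lands in that copy and is then discarded. Separately, `Xpos1May=XposMay` only creates a second name for the same list, not a copy. The sketch assumes the 14-element `SomAprilMay` used in this answer, and the name `cols_to_zero` is introduced here for illustration.

```python
SomAprilMay = [0, 0, 0, 1, 0, 1, 2, 3, 4, 5, 15, 12, 14, 1]
XposMay = [[50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63] for _ in range(3)]

# Copy each row so that XposMay itself is left untouched.
Xpos1May = [row[:] for row in XposMay]

# Indices of the columns whose SomAprilMay value is below 3.
cols_to_zero = [k for k, value in enumerate(SomAprilMay) if value < 3]

# Zero those columns in every row.
for row in Xpos1May:
    for k in cols_to_zero:
        row[k] = 0

print(Xpos1May[0])
```

This gives the same result as the list comprehension above, written as explicit loops.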