Question

I have a problem with creating subsets in a loop that I would like to solve:

I have a data frame named df with more than 300 columns.

I want to create a subset df_21 from df col1~20 + col21, a subset df_22 from df col1~20 + col22, and so on for the remaining 200+ subsets.
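
A minimal sketch of that selection for an arbitrary column, assuming positional indexing with `iloc` is acceptable. The toy frame, its column names, and the helper name `make_subset` are made up for illustration and are not part of the real data.

```python
import pandas as pd

N_BASE = 20  # number of shared leading columns (col1..col20)

def make_subset(df: pd.DataFrame, pos: int, n_base: int = N_BASE) -> pd.DataFrame:
    """Return the first n_base columns plus the column at 0-based position pos."""
    cols = list(range(n_base)) + [pos]
    return df.iloc[:, cols].copy()  # copy so later edits do not touch df

# Toy data: 20 base columns and 3 extra columns (hypothetical names).
toy = pd.DataFrame({f"col{i}": ["Y", "N"] for i in range(1, 24)})

df_22 = make_subset(toy, 21)  # 0-based position 21 is col22
print(df_22.shape)            # (2, 21)
print(df_22.columns[-1])      # col22
```

Because positions are 0-based, col21 sits at position 20 and col22 at position 21.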

df_21 = df.iloc[:, :21].copy()  # I know this code only works for df_21

So each subset will have 21 columns.

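Then I want to create a new column named 'newcol' in each subset, whose values come from the name of the 21st column in that subset (so col21 of df for df_21, and col22 of df for df_22).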

colname = df_21.columns[20]  # name of col21 (0-based position 20), already a string

df_21['newcol'] = np.where(df_21.iloc[:, 20] == "Y", colname[7:], "")

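↑ Then I set the condition for newcol (the 22nd column): when the value in the 21st column == "Y", put the name of column 21 into the corresponding cell of newcol, excluding the first 8 characters of the name.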

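I want to repeat the steps above for the 200+ columns after column 20, so there will be more than 200 subsets (e.g. df_21~df_250). I tried to write a loop for this, but I failed.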
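
The steps above could be looped roughly as follows. This is only a sketch under stated assumptions: column names are unique strings, the subsets are stored in a dict keyed by column name rather than in 200+ separate variables, and the helper `build_subsets`, the toy frame, and the `prefix_` column names are hypothetical. `prefix_len` defaults to 7 to match the `colname[7:]` slice; set it to 8 if eight characters should be dropped.

```python
import numpy as np
import pandas as pd

def build_subsets(df, n_base=20, prefix_len=7):
    """Map each extra column name to its 21-column subset plus 'newcol'."""
    subsets = {}
    for pos in range(n_base, df.shape[1]):
        name = df.columns[pos]
        # First n_base columns plus the current extra column.
        sub = df.iloc[:, list(range(n_base)) + [pos]].copy()
        # Where the extra column equals "Y", store its name minus the prefix.
        sub["newcol"] = np.where(sub.iloc[:, n_base] == "Y", name[prefix_len:], "")
        subsets[name] = sub
    return subsets

# Toy data (hypothetical): 20 base columns and 2 extra columns.
toy = pd.DataFrame({f"base{i}": [0, 1, 2] for i in range(1, 21)})
toy["prefix_apple"] = ["Y", "N", "Y"]
toy["prefix_pear"] = ["N", "N", "Y"]

subsets = build_subsets(toy)
print(subsets["prefix_apple"]["newcol"].tolist())  # ['apple', '', 'apple']
```

A dict keeps every subset reachable by name (for example `subsets["prefix_pear"]`) without creating variables dynamically.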

If you have a better solution, please let me know! Thanks!

How to create multiple subsets from a data frame using a for loop in Python

0 Answers: