Question

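I am trying to generate a column that is zero everywhere, except where a specific condition is met.

Right now, I have an existing series of 0s and 1s saved as a Series object. Let's call it Series A. I have created another series of the same size filled with zeros; let's call it Series B. What I'd like to do is this: whenever I hit the last 1 in a run of 1s in Series A, the next six rows of Series B should have their 0s replaced with 1s.

For example:

Series A

0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 ...

should produce Series B

0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 1 1 1 1 ...

Here is what I have tried so far: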

for row in SeriesA:
    if row == 1:
        continue
    if SeriesA[row] == 1 and SeriesA[row  + 1] == 0:
        SeriesB[row]=1
        SeriesB[row+1]=1
        SeriesB[row+2]=1
        SeriesB[row+3]=1
        SeriesB[row+4]=1
        SeriesB[row+5]=1

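However, this only produces a Series B that is entirely zeros, apart from the first five rows, which become 1. (Series A is all zeros until at least row 50.)

I think I don't understand how Pandas iterates, so any help would be appreciated!

Edit: full(ish) code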

import os
import numpy as np
import pandas as pd
df = pd.read_csv("Python_Datafile.csv", names = fields) #fields is a list with names for each column, the first column is called "Date".
df["Date"] = pd.to_datetime(df["Date"], format = "%m/%Y")
df.set_index("Date", inplace = True)

Recession = df["NBER"] # This is series A

Rin6 = Recession*0 # This is series B

gps = Recession.ne(Recession.shift(1)).where(Recession.astype(bool)).cumsum()
idx = Recession[::-1].groupby(gps).idxmax()
to_one = np.hstack(pd.date_range(start=x+pd.offsets.DateOffset(months=1), freq='M', periods=6) for x in idx)
Rin6[Rin6.index.isin(to_one)]= 1

Rin6.unique() # Returns -> array([0], dtype=int64)

Answer 1

You can create IDs for the consecutive groups of 1s using .shift + .cumsum:

gps = s.ne(s.shift(1)).where(s.astype(bool)).cumsum()
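
To make the intermediate result easier to see, here is a minimal sketch on the example data from the question. The default integer index is an assumption, and the cast to integers before .where is my own addition so that the result is a plain float column; the grouping logic is otherwise the same as the line above.

```python
import pandas as pd

# Series A from the question, with a default integer index (assumption).
s = pd.Series([0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0])

# 1 wherever the value differs from the previous row, i.e. where a new run starts.
run_start = s.ne(s.shift(1)).astype(int)

# Keep the flags only on rows where s == 1 (all other rows become NaN),
# then accumulate them so that every run of 1s gets its own ID.
gps = run_start.where(s.eq(1)).cumsum()

# Rows that belong to a run of 1s, with their group ID.
print(gps.dropna())
```

Rows where s is 0 end up as NaN, and groupby drops NaN keys by default, so only the runs of 1s form groups.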

Then you can get the last index of each group with:

idx = s[::-1].groupby(gps).idxmax()

#0
#1.0     5
#2.0    18
#Name: 0, dtype: int64

Create the list of all indices with np.hstack:

import numpy as np

# Pass a list, not a generator: recent NumPy versions reject generators here.
np.hstack([np.arange(x+1, x+7, 1) for x in idx])
#array([ 6,  7,  8,  9, 10, 11, 19, 20, 21, 22, 23, 24])

Set those indices to 1 in the second Series:

s2[np.hstack([np.arange(x+1, x+7, 1) for x in idx])] = 1

s2.ravel()
# array([0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 1., 1., 0., 0., 0., 0., 0.,..
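
Putting the steps together, a self-contained sketch might look like the following. The helper name flag_rows_after_runs and the parameter n are hypothetical, and the code assumes a default integer index and at least one run of 1s. It also clips positions that run past the end of the series, which the assignment above does not do.

```python
import numpy as np
import pandas as pd


def flag_rows_after_runs(s: pd.Series, n: int = 6) -> pd.Series:
    """Mark the n rows that follow each run of 1s in s (hypothetical helper).

    Assumes s has a default integer index (0, 1, 2, ...) and contains
    at least one run of 1s.
    """
    out = pd.Series(0, index=s.index)

    # ID for each consecutive run of 1s; NaN on rows where s == 0.
    gps = s.ne(s.shift(1)).astype(int).where(s.eq(1)).cumsum()

    # On the reversed series, idxmax returns the first maximum it meets,
    # which is the last row of each run in the original order.
    idx = s[::-1].groupby(gps).idxmax()

    # The n positions after the end of every run, dropping any past the end.
    positions = np.hstack([np.arange(x + 1, x + 1 + n) for x in idx])
    positions = positions[positions < len(s)]

    out.iloc[positions] = 1
    return out


# Series A from the question.
series_a = pd.Series([0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0])
series_b = flag_rows_after_runs(series_a)
print(series_b.tolist())
```

On the example data, the second run ends at position 18, so only four rows remain to be flagged before the series ends.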

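Update based on your comment: assuming you have a Series s whose index is datetimes, and another Series s2 with the same index whose values are all 0, and both have a MonthStart frequency, you can proceed in a similar way.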
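
The code for this update did not survive on this page, so what follows is only a sketch of how the datetime case could be handled, not necessarily the original snippet. The sample dates and values are made up for illustration. Note that "%m/%Y" parses to the first day of each month, while freq='M' in the question's attempt generates month-end dates; those never match a month-start index, which is one likely reason Rin6 stayed all zeros.

```python
import pandas as pd

# Hypothetical monthly data indexed by month-start dates (freq "MS").
index = pd.date_range("2000-01-01", periods=12, freq="MS")
s = pd.Series([0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0], index=index)
s2 = pd.Series(0, index=index)

# Same grouping as in the integer-index case.
gps = s.ne(s.shift(1)).astype(int).where(s.eq(1)).cumsum()

# Last timestamp of each run of 1s.
idx = s[::-1].groupby(gps).idxmax()

for last in idx:
    # Six month-start dates, beginning with the month after the run ends.
    months = pd.date_range(start=last + pd.offsets.MonthBegin(1), periods=6, freq="MS")
    # isin ignores generated dates that fall outside the index.
    s2[s2.index.isin(months)] = 1

print(s2)
```

Using a boolean mask from isin, as the question's own attempt does, avoids errors when some of the six generated dates lie beyond the end of the data.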

Problem iterating over a pandas Series with conditional statements
