Question

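I have two dataframes. my_index contains the data for further analysis; it is based on minute-level data, with my_index['TIME'] in the format yyyy-mm-dd hh:mm:ss (100,000 rows in total). The other dataframe, release_plain, contains specific datetimes (same time format) within a different time range (70 rows). Both DateTime columns are stored as strings.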

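Now I want to match the datetimes in release_plain against those in my_index and, where they match, write a 1 into a new column my_index['Dummy'] for the range from 5 minutes before to 5 minutes after the match (11 ones in total).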

What I have so far:

release_plain = pd.read_csv(infile)
my_index = pd.read_csv(index_file)

datetime = release_plain['Date'].astype(str) + ' ' + release_plain['Time'].astype(str)
list_datetime = list(datetime)


for date_of_interest in list_datetime:
    if my_index.loc[my_index['TIME']==date_of_interest]:
        my_index['Dummy'] == 1
    else:
        my_index['Dummy'] == 0

But this returns:

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

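Also, as my condition is written, it would create only a single dummy for the exact DateTime, not a range of dummies covering the 5 minutes before and after the event.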

Answer 1

if my_index.loc[my_index['TIME']==date_of_interest]

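The brackets here don't seem to make sense: the comparison you pass is used as the key, so it reads as if my_index.loc[True]: or if my_index.loc[False]:. I'm not sure whether you have rows whose keys are True and False, respectively, but I hope you don't. Perhaps you meant something like this: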

False
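For reference, here is a minimal sketch of one way to get the behaviour the question describes. It is not the snippet originally posted in this answer. The comparison `my_index['TIME'] == date_of_interest` produces a boolean Series, and `.loc` with that Series returns a filtered DataFrame, which `if` cannot evaluate directly, so the sketch reduces the Series with `.any()` instead. The helper name `add_window_dummy`, its `minutes` parameter, and the sample data are hypothetical and only serve the illustration; the column names `TIME`, `Date`, `Time`, and `Dummy` are taken from the question.

```python
import pandas as pd


def add_window_dummy(my_index, release_times, minutes=5):
    """Return a copy of my_index with a 0/1 'Dummy' column.

    A row gets 1 when a release time matches a TIME value exactly and the
    row lies within `minutes` of that release time (hypothetical helper).
    """
    out = my_index.copy()
    # Parse the string timestamps once so they can be compared as datetimes.
    times = pd.to_datetime(out["TIME"])
    window = pd.Timedelta(minutes=minutes)
    dummy = pd.Series(0, index=out.index)

    for release in pd.to_datetime(pd.Series(release_times)):
        # A boolean Series cannot be used in `if`; reduce it with .any().
        if (times == release).any():
            in_window = (times >= release - window) & (times <= release + window)
            dummy[in_window] = 1

    out["Dummy"] = dummy
    return out


# Made-up sample data: 30 one-minute rows and two release datetimes,
# only the first of which falls inside the index.
my_index = pd.DataFrame({
    "TIME": pd.date_range("2020-01-01 09:00:00", periods=30, freq="min")
              .strftime("%Y-%m-%d %H:%M:%S")
})
release_plain = pd.DataFrame({
    "Date": ["2020-01-01", "2020-01-02"],
    "Time": ["09:10:00", "09:10:00"],
})
release_times = release_plain["Date"].astype(str) + " " + release_plain["Time"].astype(str)

result = add_window_dummy(my_index, release_times)
print(result["Dummy"].sum())  # 11 rows flagged: 09:05 through 09:15
```

Converting the strings with `pd.to_datetime` up front means the window is computed with real time arithmetic rather than row offsets, so it should still behave sensibly if some minutes are missing from the index. With only about 70 release times, a plain loop over them is likely fast enough for 100,000 rows.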

Python: Create a time series dummy variable based on a list of DateTimes
