Date Time_GMT Time_IST Current
11/15/2016 5:12:27 10:42:27 26.61
11/15/2016 5:12:28 10:42:28 42.27
11/15/2016 5:12:29 10:42:29 25.48
11/15/2016 5:12:30 10:42:30 24.24
11/15/2016 5:12:31 10:42:31 25.91
11/15/2016 5:12:32 10:42:32 27.75
11/15/2016 5:12:33 10:42:33 24.46
11/15/2016 5:12:34 10:42:34 24.32
11/15/2016 5:12:35 10:42:35 24.81
11/15/2016 5:12:36 10:42:36 27.36
11/15/2016 5:12:37 10:42:37 28.2
11/15/2016 5:12:38 10:42:38 28.29
11/15/2016 5:12:39 10:42:39 26.52
11/15/2016 5:12:40 10:42:40 32.58
11/15/2016 5:12:41 10:42:41 24.24
11/15/2016 5:12:42 10:42:42 24.36
11/15/2016 5:12:43 10:42:43 26.48
11/15/2016 5:12:44 10:42:44 28.76
11/15/2016 5:12:45 10:42:45 24.51
11/15/2016 5:12:46 10:42:46 23.93
11/15/2016 5:12:47 10:42:47 25.23
11/15/2016 5:12:48 10:42:48 27.9
11/15/2016 5:12:49 10:42:49 27.84
11/15/2016 5:12:50 10:42:50 27.31
11/15/2016 5:12:51 10:42:51 29.17
11/15/2016 5:12:52 10:42:52 24
11/15/2016 5:12:53 10:42:53 32.51
11/15/2016 5:12:54 10:42:54 26.63
11/15/2016 5:12:55 10:42:55 22.34
11/15/2016 5:12:56 10:42:56 29.14
11/15/2016 5:12:57 10:42:57 46.62
11/15/2016 5:12:58 10:42:58 48.85
11/15/2016 5:12:59 10:42:59 30.59
11/15/2016 5:13:00 10:43:00 30.68
11/15/2016 5:13:01 10:43:01 30.82
11/15/2016 5:13:02 10:43:02 31.64
11/15/2016 5:13:03 10:43:03 43.91
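Above is a sample of the data; the full data set spans several days. I need to find the dips in the current, as shown in the image. If the current stays below 30 amperes for a long time, I have to detect that valley-like dip. I have been working on this for a while and cannot come up with logic that finds the dips accurately. Any suggestions are appreciated. Machine learning approaches are also welcome.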
Answer 0 (score: 4)
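You can use a moving-window average approach:
Choose a suitable window width (in your case the entries are 1 second apart, so the width is measured in seconds)
Iterate over your Current column and compute the mean of the currents over the chosen window width
Check when the mean drops below or rises above a threshold, depending on its previous state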
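With your sample data, it could look like the following. In this plot the raw current data is shown as a blue dashed line, the moving average as a thick green line, and state changes are marked with red vertical lines.
The code I used to generate that image is: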
import matplotlib.pyplot as plt

# Sample current readings in amperes, one per second
c = [26.61, 42.27, 25.48, 24.24, 25.91, 27.75, 24.46, 24.32, 24.81, 27.36, 28.2, 28.29, 26.52, 32.58, 24.24, 24.36, 26.48, 28.76, 24.51, 23.93, 25.23, 27.9, 27.84, 27.31, 29.17, 24, 32.51, 26.63, 22.34, 29.14, 46.62, 48.85, 30.59, 30.68, 30.82, 31.64, 43.91]

if __name__ == '__main__':
    # Choose window width and threshold
    window = 5
    thres = 27.0

    # Iterate and collect state changes with regard to previous state
    changes = []
    rolling = [float('nan')] * window  # no mean for the first `window` samples
    old_state = None
    for i in range(window, len(c)):
        slc = c[i - window:i + 1]  # trailing slice of window + 1 samples
        mean = sum(slc) / float(len(slc))
        state = 'good' if mean > thres else 'bad'
        rolling.append(mean)
        if old_state != state:
            print('Changed to {:>4s} at position {:>3d} ({:5.3f})'.format(state, i, mean))
            changes.append((i, state))
            old_state = state

    # Plot results and state changes
    plt.figure(frameon=False, figsize=(10, 8))
    currents, = plt.plot(c, ls='--', label='Current')
    rollwndw, = plt.plot(rolling, lw=2, label='Rolling Mean')
    plt.axhline(thres, xmin=.0, xmax=1.0, c='grey', ls='-')
    plt.text(40, thres, 'Threshold: {:.1f}'.format(thres), horizontalalignment='right')
    # Use a separate loop variable so the data list `c` is not overwritten
    for pos, s in changes:
        plt.axvline(pos, ymin=.0, ymax=.7, c='red', ls='-')
        plt.text(pos, 41.5, s, color='red', rotation=90, verticalalignment='bottom')
    plt.legend(handles=[currents, rollwndw], fontsize=11)
    plt.grid(True)
    # The 'local' directory must already exist
    plt.savefig('local/plot.png', dpi=72, bbox_inches='tight')
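The question asks specifically for dips where the current stays below 30 A "for a long time", so the state changes could be reduced to runs that last at least a minimum number of samples. The following is only a rough sketch of that idea, not part of the code above: the function name `find_low_runs` and the parameters `window` and `min_len` are made up for illustration, and the values you would pick for them depend on your data.

```python
def find_low_runs(values, threshold=30.0, window=5, min_len=10):
    """Return (start, end) index pairs, both inclusive, of runs where the
    trailing mean stays below `threshold` for at least `min_len` samples.

    `window` and `min_len` are counted in samples (1 sample = 1 second
    for data recorded once per second).
    """
    runs = []
    start = None  # index where the current low run began, or None
    for i in range(len(values)):
        lo = max(0, i - window + 1)  # shorter window at the very beginning
        mean = sum(values[lo:i + 1]) / (i + 1 - lo)
        if mean < threshold:
            if start is None:
                start = i
        else:
            # The run ended at i - 1; keep it only if it was long enough
            if start is not None and i - start >= min_len:
                runs.append((start, i - 1))
            start = None
    # Close a run that lasts until the end of the data
    if start is not None and len(values) - start >= min_len:
        runs.append((start, len(values) - 1))
    return runs


if __name__ == '__main__':
    # Synthetic example: 5 s at 40 A, 12 s at 20 A, 5 s at 40 A
    demo = [40.0] * 5 + [20.0] * 12 + [40.0] * 5
    print(find_low_runs(demo, threshold=30.0, window=3, min_len=10))
```

A larger `window` smooths away short spikes inside a valley (such as the 32.58 reading in the sample) but delays the detected start and end by a few samples.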
Answer 1 (score: 0)
We can try to find the valleys with a similar idea, but using numpy convolution:
Valley points are consecutive points whose residual is small.
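import numpy as np
import pandas as pd  # read the data into a DataFrame named df (loading code not shown)
import matplotlib.pyplot as plt

w_sz = 3  # window size
# Moving average via convolution with a uniform kernel
ma = np.convolve(df.Current, np.ones(w_sz) / w_sz, mode='same')
resid = df.Current - ma  # residual: raw value minus smoothed value

threshold = 1  # 0.1
# Row indices whose residual is small in magnitude
prob_val = np.where(abs(resid) <= threshold)
# Positions where the sequence of those indices stops being consecutive.
# Note: these are offsets into prob_val[0]; prob_val[0][val_indices] maps them back to rows of df.
val_indices = np.where(np.diff(prob_val) != 1)[1] + 1

plt.plot(df.Current)
plt.plot(ma)
plt.plot(resid)
plt.axhline(0)
plt.plot(val_indices, np.zeros(len(val_indices)), 'o', color='red')
plt.legend(['Current', 'MA-smoothed', 'Residual'], loc='upper center')
plt.show()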
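The plot shows 3 valleys, each lying between 2 consecutive red dots. The first valley appears to have only one red dot, but there are actually two consecutive points and the valley has length one. We can also filter out valleys of short length.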
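One possible way to filter out short valleys is to turn the small-residual test into a boolean mask and keep only runs of `True` that reach a minimum length. The sketch below is an assumption about how that filtering might be done, not the code behind the plot; `runs_of_true` and `min_len` are hypothetical names, and the residual values are made up.

```python
import numpy as np


def runs_of_true(mask, min_len=1):
    """Return (start, stop) pairs, stop exclusive, for every run of True
    values in `mask` that is at least `min_len` samples long."""
    mask = np.asarray(mask, dtype=bool)
    # Pad with False on both sides so every run has a rising and a falling edge
    padded = np.concatenate(([False], mask, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)   # first index of each run
    stops = np.flatnonzero(edges == -1)   # one past the last index of each run
    keep = (stops - starts) >= min_len
    return list(zip(starts[keep].tolist(), stops[keep].tolist()))


if __name__ == '__main__':
    # Made-up residuals; in the answer above this would be `resid`
    resid = np.array([0.2, 3.1, -0.4, 0.1, 0.3, -2.5, 0.0])
    mask = np.abs(resid) <= 1.0
    print(runs_of_true(mask, min_len=2))
```

With `min_len=1` every run is kept, including the length-one valley mentioned above; raising `min_len` drops the shortest ones first.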