Question

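If I collect some experimental data and load it into Python, what is the most efficient way to remove the "stationary" data? Below is a graphical example of what I have. I'd like to remove the elements of the z array where the gradient is almost 0 (i.e. roughly the first 60 elements).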

That would leave me with the noisy sine curve to analyze later.

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0,5,60)

z = np.zeros(120)
z[0:60] = 1e-2*x + 10
z[60:120] = np.sin(x) + 0.1*np.random.randn(len(x)) + 10

# plt.figure()
# plt.plot(z)
# plt.show()

Edit: I tried paradiso's solution: z = z[ np.gradient(z) > 1E-1 ]

Setting the threshold to > 1e-2, > 1e-5, etc. gives similar results.

Original data:

before

After applying the solution:

after

Answer 1

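One option is to compute the gradient explicitly with numpy (np.gradient uses central differences at interior points and one-sided differences at the two ends), then use numpy's boolean indexing (also known as index arrays) to filter out the indices where the derivative is small: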

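import numpy as np

x = np.linspace(0, 5, 60)  # same grid as in the question

z = np.zeros(120)
z[0:60] = 1e-2*x + 10
z[60:120] = np.sin(x) + 0.1*np.random.randn(len(x)) + 10

# Keep only the samples whose gradient exceeds the threshold
z = z[ np.gradient(z) > 1E-1 ]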
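To see what this filter actually keeps, here is a minimal sketch on a tiny hand-made array. The toy values and the threshold of 0.5 are assumptions chosen only for illustration, not part of the original data. It checks that np.gradient matches a central difference at interior points, and it compares the signed test used above with one on the absolute value of the gradient. The signed test also discards steeply falling samples, which may be one reason the result in the question looks wrong.

```python
import numpy as np

# Hypothetical toy signal: flat, then rising, then falling.
y = np.array([1.0, 1.0, 1.0, 2.0, 3.0, 2.0, 1.0])

# Gradient with unit sample spacing (the np.gradient default).
g = np.gradient(y)

# At interior points np.gradient is the central difference (y[i+1] - y[i-1]) / 2.
central = (y[2:] - y[:-2]) / 2.0
interior_matches = np.allclose(g[1:-1], central)

# Signed threshold: keeps only steeply rising samples.
signed_mask = g > 0.5
# Absolute threshold: keeps steeply rising and steeply falling samples.
abs_mask = np.abs(g) > 0.5

kept_signed = y[signed_mask]
kept_abs = y[abs_mask]

print("central difference matches:", interior_matches)
print("kept with signed test:  ", kept_signed)
print("kept with absolute test:", kept_abs)
```

Note that the peak of the toy signal has a central-difference gradient of zero, so a gradient-only test drops it under either variant. Extrema of a real signal are affected the same way, which is why a second criterion can help.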

Edit:

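I'm a bit puzzled by the failure shown above, as I can't reproduce it. However, you can make the filter more robust by adding a constraint so that it only removes data close to the mean: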

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0,5,60)

z = np.zeros(120)
z[0:60] = 1e-2*x + 10
z[60:120] = np.sin(x) + 0.1*np.random.randn(len(x)) + 10

# Take some markers of the data - first and second derivatives should be plenty to filter out flat signals
z0 = np.mean(z)
dz = np.gradient(z)
ddz = np.gradient(dz)

plt.figure(figsize=(6, 2))
plt.subplot(1, 3, 1)

# Plot the original signal
plt.plot(z)
plt.xticks([ 30*i for i in range(5) ])

thresh = 1E-3
# First try filtering on the 1st derivative
bool_result = np.abs(dz) > thresh

plt.subplot(1, 3, 2)
plt.plot(z0+bool_result)
plt.plot(z[bool_result])
plt.yticks([]); plt.xticks([ 30*i for i in range(5) ])

# Now try filtering on both the first derivative and proximity to the mean
bool_result = np.logical_not(np.logical_and(np.abs(dz) < thresh, np.abs(z-np.mean(z)) < .2))

plt.subplot(1, 3, 3)
plt.plot(z0+bool_result)
plt.plot(z[bool_result])
plt.yticks([]); plt.xticks([ 30*i for i in range(5) ])
plt.savefig("FilterResults.png")

Here are the filter results (the blue curve shows the filter mask in the last two panels): Filter results
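The combined criterion can also be wrapped in a small reusable function. The following is only a sketch: the name drop_flat_samples and its parameters grad_thresh and mean_tol are hypothetical, with defaults taken from the values used in the script above. It removes a sample only if its gradient is small and its value lies close to the overall mean, so the extrema of the sine (small gradient, but far from the mean) are kept.

```python
import numpy as np

def drop_flat_samples(z, grad_thresh=1e-3, mean_tol=0.2):
    """Hypothetical helper: return (filtered array, boolean keep-mask)."""
    z = np.asarray(z, dtype=float)
    dz = np.gradient(z)
    is_flat = np.abs(dz) < grad_thresh            # nearly zero slope
    near_mean = np.abs(z - z.mean()) < mean_tol   # close to the overall mean
    keep = ~(is_flat & near_mean)                 # drop only if both hold
    return z[keep], keep

# Rebuild the test signal from the question, with a seeded generator
# (the seed is an assumption, used here only for reproducibility).
rng = np.random.default_rng(0)
x = np.linspace(0, 5, 60)
z = np.empty(120)
z[:60] = 1e-2 * x + 10
z[60:] = np.sin(x) + 0.1 * rng.standard_normal(x.size) + 10

filtered, keep = drop_flat_samples(z)
print("kept in flat half: ", int(keep[:60].sum()))
print("kept in noisy half:", int(keep[60:].sum()))
```

Both thresholds depend on the sample spacing and the noise level, so they would need tuning for real experimental data.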

Python: removing "stationary" data from an array
