Question

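I have a long-running function in MATLAB that I tried to speed up by adding caching, and I ended up significantly degrading its performance instead. My code basically searches for continuous "horizontal" lines in an edge-detected image, and the original code looks something like this: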

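function lineLength = getLineLength(img, startRow, startCol)
    % Length of the line that starts at (startRow, startCol) and runs to the
    % right, stepping one row up or down wherever the current row breaks.
    [nRows, nCols] = size(img);
    lineLength = 0;
    if startRow < 1 || startRow > nRows
        return; % Off the top or bottom of the image
    end

    for curCol = startCol:nCols
        if img(startRow, curCol)
            lineLength = lineLength + 1;
            continue;
        elseif lineLength > 0
            % The run on this row has ended; continue with the longer of
            % the lines in the rows directly above and below.
            lengths = zeros(2,1);
            lengths(1) = getLineLength(img, startRow - 1, curCol);
            lengths(2) = getLineLength(img, startRow + 1, curCol);
            increment = max(lengths);
            lineLength = lineLength + increment;
        end
        break; % At this point the end of the current line has been reached
    end
end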

Since the performance of this function is not what I want, I thought I would add a cache of the length from any given point, like so:

function lineLength = getLineLength(img, startRow, startCol)
    % pointCache(r, c, 1) holds the cached length from (r, c);
    % pointCache(r, c, 2) is 1 once that entry has been filled in.
    persistent pointCache;
    % Calling with (0, 0) resets the cache; it is also created on first use.
    if isempty(pointCache) || (startRow == 0 && startCol == 0)
        pointCache = zeros(size(img, 1), size(img, 2), 2);
    end
    [nRows, nCols] = size(img);
    lineLength = 0;
    if startRow < 1 || startRow > nRows
        return; % Off the top or bottom of the image
    end

    for curCol = startCol:nCols
        % Reuse the cached remainder of the line if this point was seen before.
        if pointCache(startRow, curCol, 2)
            lineLength = lineLength + pointCache(startRow, curCol, 1);
            break;
        end
        if img(startRow, curCol)
            lineLength = lineLength + 1;
            continue;
        elseif lineLength > 0
            lengths = zeros(2,1);
            lengths(1) = getLineLength(img, startRow - 1, curCol);
            lengths(2) = getLineLength(img, startRow + 1, curCol);
            increment = max(lengths);
            lineLength = lineLength + increment;
        end
        break; % At this point the end of the current line has been reached
    end
    % Remember the result for this starting point.
    pointCache(startRow, startCol, 1) = lineLength;
    pointCache(startRow, startCol, 2) = 1;
end

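What surprises me is that implementing this cache actually made my performance worse rather than better. My best guesses are that the global variable is getting me into trouble, or that it's the extra memory use, but I don't have enough MATLAB experience to know.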

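Edit...

As Gautam correctly pointed out, there was a bug in the original code that ignored the results of the recursion. The code above now shows what the actual code does. I'm sure it's obvious, but MATLAB is not my native language, so if there's a more MATLABy way to do this, I'd love the suggestions.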

Answer 1

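I'm pretty sure the global is not the problem, but as a matter of style you should use persistent, which keeps its value from call to call but is local to the function.

Any time you have a performance problem, profile. Call profile on, then call your function, then call profile report. It will point you to your real performance problems. Intuition is seldom good at pinpointing performance problems, especially in MATLAB. You can read the help on it, but it's pretty self-explanatory.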
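A minimal sketch of that workflow, in case it helps. The `workload` handle is a stand-in I made up so the snippet runs on its own; swap in a call to your own function, for example `@() getLineLength(img, 1, 1)`. `profile('info')` is the programmatic route to the same data as the report: it returns a struct whose `FunctionTable` lists the call count and time per function.

```matlab
% Stand-in workload (hypothetical); replace with a call to your own function.
workload = @() cumsum(rand(500) > 0.5, 2);

profile clear          % discard statistics from earlier runs
profile on             % start collecting
for k = 1:20
    workload();
end
profile off            % stop collecting

% Pull the numbers out programmatically and list the most expensive functions.
stats = profile('info');
[~, order] = sort([stats.FunctionTable.TotalTime], 'descend');
for k = order(1:min(5, numel(order)))
    f = stats.FunctionTable(k);
    fprintf('%-40s calls=%6d  time=%.3f s\n', f.FunctionName, f.NumCalls, f.TotalTime);
end
```

Running `profile viewer` afterwards opens the same data in the interactive report, which also breaks the time down line by line.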

Answer 2

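It's not clear to me what this function does. In particular, why do you recursively call getLineLength and then effectively throw away the result (you only test whether increment is greater than zero)?

My guess as to why pointCache doesn't help: your function probably doesn't call itself repeatedly with the same arguments (startRow, startCol). Have you tried logging how many times getLineLength is called for a particular startRow and startCol?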
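One way to do that logging, as a rough sketch: a copy of the uncached function that also keeps a persistent tally per starting point. The name `getLineLengthCounted`, the second output `counts`, and the small test image are all made up for illustration; the `(0, 0)` reset call just mirrors the convention used in the question, and `img(startRow, curCol)` assumes the row is meant to be indexed explicitly.

```matlab
% Script with a local function at the end (needs R2016b or later).
img = logical([0 1 1 0 1 1;
               1 1 0 1 1 0;
               0 1 1 0 1 1]);

getLineLengthCounted(img, 0, 0);      % (0, 0) resets the tally
for r = 1:size(img, 1)
    for c = 1:size(img, 2)
        [~, counts] = getLineLengthCounted(img, r, c);
    end
end
disp(counts);                          % requests per (startRow, startCol)
fprintf('most requested point was asked for %d times\n', max(counts(:)));

function [lineLength, counts] = getLineLengthCounted(img, startRow, startCol)
    % Uncached getLineLength plus a tally of how often each point is requested.
    persistent callCounts
    if isempty(callCounts) || (startRow == 0 && startCol == 0)
        callCounts = zeros(size(img));
    end
    [nRows, nCols] = size(img);
    lineLength = 0;
    if startRow >= 1 && startRow <= nRows && startCol >= 1 && startCol <= nCols
        callCounts(startRow, startCol) = callCounts(startRow, startCol) + 1;
        for curCol = startCol:nCols
            if img(startRow, curCol)
                lineLength = lineLength + 1;
                continue;
            elseif lineLength > 0
                up   = getLineLengthCounted(img, startRow - 1, curCol);
                down = getLineLengthCounted(img, startRow + 1, curCol);
                lineLength = lineLength + max(up, down);
            end
            break;
        end
    end
    counts = callCounts;
end
```

If most entries of `counts` on your real images stay at 1 or 2, there is little for a cache to save, and its bookkeeping is pure overhead.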

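Whatever your algorithm is, iterating over an image with recursion does not play to MATLAB's strengths at all. If you want high performance:

Set up your algorithm to use iteration rather than recursion,
Figure out how to vectorize parts of the iteration.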
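To make those two steps concrete, here is a sketch of how the recursion in the question could be unrolled into a single right-to-left sweep over the columns, with each step vectorized across rows. It rests on my reading of getLineLength (with the row indexed explicitly): a set pixel counts 1 plus the rest of the run on its row, and where that run stops the line continues with the longer of the lines one row up or one row down. The names `lineLen`, `right`, `up`, `down`, `gap` and `rest`, and the test image, are mine.

```matlab
img = logical([0 1 1 0 1;
               1 1 0 1 1;
               0 1 1 1 0]);
[nRows, nCols] = size(img);

% lineLen(r, c) is meant to equal getLineLength(img, r, c).
lineLen = zeros(nRows, nCols);
lineLen(:, nCols) = img(:, nCols);      % last column: 1 if set, else 0

for c = nCols-1:-1:1
    right = lineLen(:, c + 1);          % lengths starting at column c + 1
    up    = [0; right(1:end-1)];        % column c + 1, one row up (0 off the image)
    down  = [right(2:end); 0];          % column c + 1, one row down
    gap   = ~img(:, c + 1);             % rows whose run stops after column c
    % Keep following the row where it continues, else take the longer neighbour.
    rest  = right + gap .* max(up, down);
    lineLen(:, c) = img(:, c) .* (1 + rest);
end

disp(lineLen);
```

Every entry is computed exactly once, so there is nothing left to cache, and the only loop is over columns.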

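Some tips on vectorization:

Use built-in functions such as sum, cumsum, diff, bsxfun and accumarray to operate directly on the image matrix.
Complex doubly-iterated computations over an image can sometimes be re-expressed as matrix multiplications.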
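As an illustration of the first tip, here is a small sketch that finds every purely horizontal run in an image with diff and find, then reduces them per row with accumarray. It only handles runs within a single row (no stepping up or down), and the variable names and test image are invented for the example.

```matlab
img = logical([0 1 1 0 1;
               1 1 0 1 1;
               0 1 1 1 0]);
nRows = size(img, 1);

% Pad each row with zeros so runs touching the border still have two edges.
padded = [zeros(nRows, 1), double(img), zeros(nRows, 1)];
d = diff(padded, 1, 2);                 % +1 where a run starts, -1 just after it ends

% Transposing makes find list the edges row by row, left to right,
% so the k-th start pairs with the k-th end.
[startCol, startRow] = find(d.' == 1);
[endCol, ~]          = find(d.' == -1);
runLen = endCol - startCol;

runs = [startRow, startCol, runLen];    % one line per run: row, first column, length
longestPerRow = accumarray(startRow, runLen, [nRows 1], @max);

disp(runs);
disp(longestPerRow);
```

Rows without any set pixel get a 0 in `longestPerRow`, since that is accumarray's default fill value.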

Answer 3

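My guess is that you are caching the wrong part of the code. The recursion in the elseif part seems to be the real bottleneck. The whole algorithm looks a bit odd to me; maybe you'd be better off trying something like this (though I'm not sure this is what you want):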

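% Walk the pixels in linear (column-major) order, measuring each run once.
nRows = size(img, 1);
lineLengths = [];
p = 1;
while p <= numel(img)
    if img(p)
        lineLength = 0;
        col = ceil(p / nRows);   % column in which this run starts
        % Extend the run while the pixel is set and in the same column.
        while p <= numel(img) && img(p) && ceil(p / nRows) == col
            p = p + 1;   % don't check lines twice
            lineLength = lineLength + 1;
        end
        lineLengths(end + 1) = lineLength;
    else
        p = p + 1;
    end
end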

Answer 4

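As far as I can tell, you are trying to find the number of nonzero elements in each column, although the code doesn't seem to quite achieve that. Would something like the following work: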

lineLengths = max(cumsum(img ~= 0, 1), [], 1)

If you are trying to extract blobs from the image, consider using the BWLABEL function.
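A small sketch of the blob route, assuming the Image Processing Toolbox is installed (bwlabel lives there). I picked 8-connectivity so that a line which steps diagonally to the next row still counts as one blob; that choice, the `widths` measure and the test image are my assumptions about what "line" should mean here.

```matlab
% Requires the Image Processing Toolbox for bwlabel.
bw = logical([1 1 0 0 0;
              0 0 1 0 0;
              0 0 0 0 1;
              0 0 0 0 1]);

[labels, numBlobs] = bwlabel(bw, 8);   % 8-connectivity joins diagonal neighbours

% Horizontal extent of each blob, in pixels.
widths = zeros(numBlobs, 1);
for k = 1:numBlobs
    [~, cols] = find(labels == k);
    widths(k) = max(cols) - min(cols) + 1;
end
disp(widths);
```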

I second Gautam's remarks about what generally works well in MATLAB.

Why does caching answers take LONGER in MATLAB?
