Question

我在背景上有一张粘滞便笺的图像（比如墙壁或笔记本电脑），我想检测粘滞便笺的边缘（粗略检测也可以正常工作），这样我就可以对其进行裁剪

我计划使用ImageMagick进行实际裁剪，但我仍坚持检测边缘。

理想情况下，我的输出应该为4个边界点提供4个坐标，这样我就可以在其上运行裁剪。

我该如何处理？

stickynote

Answer 1

你可以用ImageMagick做到这一点。

可以提出不同的IM方法。这是我想到的第一个算法。它假定了＆＃34;粘滞便笺＆＃34;不会在较大的图像上倾斜或旋转：

第一阶段：使用 canny边缘检测来显示便签的边缘。
第二阶段：确定边缘的坐标。

Canny边缘检测

此命令将创建描绘原始图像中所有边缘的黑白+图像：

convert                              \
  http://i.stack.imgur.com/SxrwG.png \
 -canny 0x1+10%+30%                  \
  canny-edges.png

确定边的坐标

假设图像的大小为XxY像素。然后，您可以将图片调整为1xY列和Xx1像素行，其中每个像素的颜色值是同一行中所有像素的相应像素的平均值或与相应的列/行像素相同的列。

作为下面可以看到的示例，我首先将新 canny-edges.png 的大小调整为4xY和{{1图片：

Xx4

`identify -format " %W x %H\n" canny-edges.png 400x300 convert canny-edges.png -resize 400x4\! canny-4cols.png convert canny-edges.png -resize 4x300\! canny-4rows.png`

`canny-4cols.png`

现在之前的图像可视化了将图像压缩大小调整为几列或几行像素的内容，让我们用一列和一行来完成。同时我们将输出格式更改为 text ，而不是PNG，以获取这些像素为白色的坐标：

canny-4rows.png

以下是convert canny-edges.png -resize 400x1\! canny-1col.txt convert canny-edges.png -resize 1x300\! canny-1row.txt输出的一部分：

canny-1col.txt

如您所见，文本中检测到的边缘也会影响像素的灰度值。因此，我们可以在命令中引入额外的# ImageMagick pixel enumeration: 400,1,255,gray 0,0: (0,0,0) #000000 gray(0) 1,0: (0,0,0) #000000 gray(0) 2,0: (0,0,0) #000000 gray(0) [....] 73,0: (0,0,0) #000000 gray(0) 74,0: (0,0,0) #000000 gray(0) 75,0: (10,10,10) #0A0A0A gray(10) 76,0: (159,159,159) #9F9F9F gray(159) 77,0: (21,21,21) #151515 gray(21) 78,0: (156,156,156) #9C9C9C gray(156) 79,0: (14,14,14) #0E0E0E gray(14) 80,0: (3,3,3) #030303 gray(3) 81,0: (3,3,3) #030303 gray(3) [....] 162,0: (3,3,3) #030303 gray(3) 163,0: (4,4,4) #040404 gray(4) 164,0: (10,10,10) #0A0A0A gray(10) 165,0: (7,7,7) #070707 gray(7) 166,0: (8,8,8) #080808 gray(8) 167,0: (8,8,8) #080808 gray(8) 168,0: (8,8,8) #080808 gray(8) 169,0: (9,9,9) #090909 gray(9) 170,0: (7,7,7) #070707 gray(7) 171,0: (10,10,10) #0A0A0A gray(10) 172,0: (5,5,5) #050505 gray(5) 173,0: (13,13,13) #0D0D0D gray(13) 174,0: (6,6,6) #060606 gray(6) 175,0: (10,10,10) #0A0A0A gray(10) 176,0: (10,10,10) #0A0A0A gray(10) 177,0: (7,7,7) #070707 gray(7) 178,0: (8,8,8) #080808 gray(8) [....] 319,0: (3,3,3) #030303 gray(3) 320,0: (3,3,3) #030303 gray(3) 321,0: (14,14,14) #0E0E0E gray(14) 322,0: (156,156,156) #9C9C9C gray(156) 323,0: (21,21,21) #151515 gray(21) 324,0: (159,159,159) #9F9F9F gray(159) 325,0: (10,10,10) #0A0A0A gray(10) 326,0: (0,0,0) #000000 gray(0) 327,0: (0,0,0) #000000 gray(0) [....] 397,0: (0,0,0) #000000 gray(0) 398,0: (0,0,0) #000000 gray(0) 399,0: (0,0,0) #000000 gray(0)操作，以获得纯黑+白输出：

-threshold 50%

我不会在这里引用新文本文件的内容，如果您有兴趣，可以尝试一下自己查找。相反，我会做一个快捷方式：我将像素颜色值的文本表示输出到convert canny-edges.png -resize 400x1\! -threshold 50% canny-1col.txt convert canny-edges.png -resize 1x300\! -threshold 50% canny-1row.txt，然后直接将其映射到所有非黑色像素：

<stdout>

从上面的结果你可以得出结论，四个像素坐标另一张图片里面的贴纸是：

左下角：convert canny-edges.png -resize 400x1\! -threshold 50% txt:- \ | grep -v black # ImageMagick pixel enumeration: 400,1,255,srgb 76,0: (255,255,255) #FFFFFF white 78,0: (255,255,255) #FFFFFF white 322,0: (255,255,255) #FFFFFF white 324,0: (255,255,255) #FFFFFF white convert canny-edges.png -resize 1x300\! -threshold 50% txt:- \ | grep -v black # ImageMagick pixel enumeration: 1,300,255,srgb 0,39: (255,255,255) #FFFFFF white 0,41: (255,255,255) #FFFFFF white 0,229: (255,255,255) #FFFFFF white 0,231: (255,255,255) #FFFFFF white
右上角：(323|40)

区域宽度为246像素，高度为190像素。

（ImageMagick假定其坐标系的原点位于图像的左上角。）

现在可以剪切原始图像中的便签：

(77|230)

Hough Line 检测

一旦你拥有＆＃34; canny＆＃34;你也可以对它们应用Hough Line Detection算法：

autotrace

请注意，convert \ canny-edges.png \ -background black \ -stroke red \ -hough-lines 5x5+20 \ lines.png运算符会将检测到的线从一条边（具有浮点值）扩展并绘制到原始图像的另一条边。

当前一个命令最终将行转换为PNG时，-hough-lines运算符确实生成 MVG 文件（ Magick Vector图形）内部。这意味着您实际上可以读取MVG文件的源代码，并确定每行的数学参数，这些参数在＆＃34;红线＆＃34;图像：

-hough-lines

这是更复杂的，也适用于非严格水平和/或垂直的边缘。

但您的示例图片使用水平和垂直边缘，因此您甚至可以使用简单的shell命令来发现这些。

生成的MVG文件中共有80行描述。您可以在该文件中识别所有 水平线 ：

convert              \
  canny-edges.png    \
 -hough-lines 5x5+20 \
  lines.mvg

现在识别所有 垂直线 ：

cat lines.mvg                              \
 | while read a b c d e ; do               \
     if [ x${b/0,/} == x${c/400,/} ]; then \
       echo "$a    $b    $c   $d    $e" ;  \
     fi;                                   \
   done

    line     0,39.5    400,39.5    # 249
    line     0,62.5    400,62.5    # 48
    line     0,71.5    400,71.5    # 52
    line     0,231.5   400,231.5   # 249

Answer 2

上周我遇到了类似的检测图像边界（空白）的问题，花了很多时间尝试各种方法和工具，之后最终使用熵差计算方法解决了它，所以JFYI这里是算法。

假设您要检测200x100px图像的顶部是否有边框：

获得25％高度（25px）（0：25,0：200）的上部图像
从上部末端开始，以相同的高度获得较低的部分，然后深入到图像中心（25：50,0：200）

upper and lower pieces depicted

计算两件套的熵
查找熵差并将其与当前块高度一起存储
使上部1px减去（24 px）并从p.2重复直到我们到达图像边缘（高度0） - 每次迭代调整扫描区域的大小，从而向上滑动到图像边缘
查找存储的熵差及其块高的最大值 - 如果它靠近边缘而不是图像的中心，则这是边界的中心，并且最大熵差高于预设阈值（例如0.5））

将此算法应用于图像的每一侧。

这是一段代码，用于检测图像是否具有顶部边框并找到其近似坐标（从顶部偏移），将灰度（'L'模式）枕头图像传递给扫描功能：< / p>

import numpy as np


MEDIAN = 0.5


def scan(im):
    w, h = im.size
    array = np.array(im)

    center_ = None
    diff_ = None
    for center in reversed(range(1, h // 4 + 1)):
        upper = entropy(array[0: center, 0: w].flatten())
        lower = entropy(array[center: 2 * center, 0: w].flatten())
        diff = upper / lower if lower != 0.0 else MEDIAN
        if center_ is None or diff_ is None:
            center_ = center
            diff_ = diff
        if diff < diff_:
            center_ = center
            diff_ = diff
    top = diff_ < MEDIAN and center_ < h // 4, center_, diff_

完整的源代码包含处理边框和清晰（无边框）图像的示例：https://github.com/embali/enimda/

在图像中查找边（矩形边框）

2 个答案:

Canny边缘检测

确定边的坐标

`identify -format " %W x %H\n" canny-edges.png 400x300 convert canny-edges.png -resize 400x4\! canny-4cols.png convert canny-edges.png -resize 4x300\! canny-4rows.png`

`canny-4cols.png`

更多探索选项

`convert http://i.stack.imgur.com/SxrwG.png[246x190+77+40] sticky-note.png`

Hough Line 检测