Matching an image to an image collection

Time: 2014-08-08 07:56:08

Tags: image-processing cbir

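I have a large collection of card images and a photo of one particular card. What tools can I use to find which images in the collection are most similar to mine?

Sample of the collection here:

This is what I am trying to find:

5 Answers: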

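Answer 0 (score: 5)

New method!

It seems that the following ImageMagick command, or possibly a variation of it depending on what a larger selection of your images looks like, will extract the wording at the top of your cards: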

convert aggressiveurge.jpg -crop 80%x10%+10%+10% crop.png

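That takes the top 10% of the image height and 80% of its width, starting a little in from the top left corner, and stores the result in crop.png. Be aware that ImageMagick's geometry documentation says offsets are always measured in pixels, so the +10%+10% part most likely acts as a 10-pixel offset rather than a 10% offset.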
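If you want the crop region to be truly proportional to the image, one option is to work out the pixel geometry yourself and pass that to ImageMagick. Here is a minimal sketch; the function name `crop_geometry` and the 200x285 image size are assumptions for illustration only.

```python
def crop_geometry(width, height, w_pct=80, h_pct=10, x_pct=10, y_pct=10):
    """Build an ImageMagick-style 'WxH+X+Y' geometry string in whole pixels.

    All *_pct arguments are integer percentages of the image size.
    """
    w = width * w_pct // 100
    h = height * h_pct // 100
    x = width * x_pct // 100
    y = height * y_pct // 100
    return f"{w}x{h}+{x}+{y}"


# Hypothetical 200x285 card scan
print(crop_geometry(200, 285))  # 160x28+20+28
```

The resulting string can then be used in place of the percentage geometry, e.g. `-crop 160x28+20+28`.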

If you run that through tesseract OCR like this:

tesseract crop.png agg

you get a file called agg.txt containing:

E‘ Aggressive Urge \L® E

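You can clean that up with grep, keeping only words made of a capital letter followed by lower-case letters (so stray single letters from the OCR are dropped), and join the matches onto one line with xargs:

grep -Eo "\<[A-Z][a-z]+\>" agg.txt | xargs

to get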

Aggressive Urge

: - )
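The same clean-up can be done in a script, and the cleaned title can then be matched against a list of known card names. This is only a sketch: the helper names and the `known` list are made up for illustration, and `difflib` is used as one convenient way to tolerate small OCR mistakes.

```python
import difflib
import re


def clean_ocr_title(raw):
    """Keep only capitalised words: one capital letter followed by lower-case letters."""
    return " ".join(re.findall(r"\b[A-Z][a-z]+\b", raw))


def closest_card_name(title, known_names, cutoff=0.6):
    """Return the known name most similar to the OCR title, or None if nothing is close."""
    matches = difflib.get_close_matches(title, known_names, n=1, cutoff=cutoff)
    return matches[0] if matches else None


raw = "E‘ Aggressive Urge \\L® E"                       # OCR output quoted above
known = ["Abundance", "Aggressive Urge", "Demystify"]  # hypothetical name list
title = clean_ocr_title(raw)
print(title)                            # Aggressive Urge
print(closest_card_name(title, known))  # Aggressive Urge
```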

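Answer 1 (score: 2)

Thanks for posting some photos.

I have coded up an algorithm called Perceptual Hashing, which I found in an article by Dr Neal Krawetz. Comparing your images with the card, I get the following percentage similarities: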

Card vs. Abundance 79%
Card vs. Aggressive 83%
Card vs. Demystify 85%

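So it is not an ideal discriminator for your type of images, but it kind of works. You may wish to play around with it to tailor it to your use case.

I would calculate a hash for each of the images in your collection, one at a time, and store each hash just once. Then, when you get a new card, calculate its hash and compare it against the stored ones.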
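To make the store-once, compare-later idea concrete, here is a small Python sketch of the hashing and lookup logic. It assumes the 8x8 grey values have already been obtained somehow, the stored hashes are invented, and the bit order is not guaranteed to match what the script that follows produces.

```python
def average_hash(gray_8x8):
    """Return a 16-hex-digit hash from an 8x8 grid of grey values (0..255)."""
    pixels = [p for row in gray_8x8 for p in row]
    if len(pixels) != 64:
        raise ValueError("expected an 8x8 grid")
    mean = sum(pixels) / len(pixels)
    bits = "".join("1" if p > mean else "0" for p in pixels)
    return f"{int(bits, 2):016X}"


def hamming_distance(hash_a, hash_b):
    """Count differing bits between two hex hash strings."""
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def similarity_percent(hash_a, hash_b, bits=64):
    """Integer percentage: 100 - hamming*100/bits, small distance means similar."""
    return 100 - hamming_distance(hash_a, hash_b) * 100 // bits


def best_match(new_hash, stored_hashes):
    """Return (name, similarity) of the stored hash closest to new_hash."""
    name = max(stored_hashes, key=lambda n: similarity_percent(new_hash, stored_hashes[n]))
    return name, similarity_percent(new_hash, stored_hashes[name])


# Hypothetical stored hashes, computed once per collection image
stored = {
    "abundance.jpg": "00FF00FF00FF00FF",
    "demystify.jpg": "FFFFFFFF00000000",
}
top_half_bright = [[255] * 8] * 4 + [[0] * 8] * 4   # toy 8x8 "image"
new_hash = average_hash(top_half_bright)
print(new_hash)                      # FFFFFFFF00000000
print(best_match(new_hash, stored))  # ('demystify.jpg', 100)
```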

#!/bin/bash
################################################################################
# Similarity
# Mark Setchell
#
# Calculate percentage similarity of two images using Perceptual Hashing
# See article by Dr Neal Krawetz entitled "Looks Like It" - www.hackerfactor.com
#
# Method:
# 1) Resize image to black and white 8x8 pixel square regardless
# 2) Calculate mean brightness of those 64 pixels
# 3) For each pixel, store "1" if pixel>mean else store "0" if less than mean
# 4) Convert resulting 64bit string of 1's and 0's, 16 hex digit "Perceptual Hash"
#
# If finding difference between Perceptual Hashes, simply total up number of bits
# that differ between the two strings - this is the Hamming distance.
#
# Requires ImageMagick - www.imagemagick.org
#
# Usage:
#
# Similarity image|imageHash [image|imageHash]
# If you pass one image filename, it will tell you the Perceptual hash as a 16
# character hex string that you may want to store in an alternate stream or as
# an attribute or tag in filesystems that support such things. Do this in order
# to just calculate the hash once for each image.
#
# If you pass in two images, or two hashes, or an image and a hash, it will try
# to compare them and give a percentage similarity between them.
################################################################################
function PerceptualHash(){

   TEMP="tmp$$.png"

   # Force image to 8x8 pixels and greyscale
   convert "$1" -colorspace gray -quality 80 -resize 8x8! PNG8:"$TEMP"

   # Calculate mean brightness and correct to range 0..255
   MEAN=$(convert "$TEMP" -format "%[fx:int(mean*255)]" info:)

   # Now extract all 64 pixels and build string containing "1" where pixel > mean else "0"
   hash=""
   for i in {0..7}; do
      for j in {0..7}; do
         # Read one pixel; use [0-9] because \d is not portable in grep -E
         pixel=$(convert "${TEMP}"[1x1+${i}+${j}] -colorspace gray text: | grep -Eo "\([0-9]+," | tr -d '(,' )
         bit="0"
         [ "$pixel" -gt "$MEAN" ] && bit="1"
         hash="$hash$bit"
      done
   done
   hex=$(echo "obase=16;ibase=2;$hash" | bc)
   # Left pad with zeros to 16 hex digits ("%016s" pads with spaces on some systems)
   printf "%16s\n" "$hex" | tr ' ' '0'
   rm -f "$TEMP"
}

function HammingDistance(){
   # Convert input hex strings to upper case like bc requires
   STR1=$(tr '[a-z]' '[A-Z]' <<< "$1")
   STR2=$(tr '[a-z]' '[A-Z]' <<< "$2")

   # Convert hex to binary and zero left pad to 64 binary digits
   # ("%064s" pads with spaces on some systems, so pad explicitly)
   STR1=$(printf "%64s" "$(echo "obase=2;ibase=16;$STR1" | bc)" | tr ' ' '0')
   STR2=$(printf "%64s" "$(echo "obase=2;ibase=16;$STR2" | bc)" | tr ' ' '0')

   # Calculate Hamming distance between two strings, each differing bit adds 1
   hamming=0
   for i in {0..63};do
      a=${STR1:i:1}
      b=${STR2:i:1}
      [ $a != $b ] && ((hamming++))
   done

   # Hamming distance is in range 0..64 and small means more similar
   # We want percentage similarity, so we do a little maths
   similarity=$((100-(hamming*100/64)))
   echo $similarity
}

function Usage(){
   echo "Usage: Similarity image|imageHash [image|imageHash]" >&2
   exit 1
}

################################################################################
# Main
################################################################################
if [ $# -eq 1 ]; then
   # Expecting a single image file for which to generate hash
   if [ ! -f "$1" ]; then
      echo "ERROR: File $1 does not exist" >&2
      exit 1
   fi
   PerceptualHash "$1" 
   exit 0
fi

if [ $# -eq 2 ]; then
   # Expecting 2 things, i.e. 2 image files, 2 hashes or one of each
   if [ -f "$1" ]; then
      hash1=$(PerceptualHash "$1")
   else
      hash1=$1
   fi
   if [ -f "$2" ]; then
      hash2=$(PerceptualHash "$2")
   else
      hash2=$2
   fi
   HammingDistance $hash1 $hash2
   exit 0
fi

Usage

Answer 2 (score: 2)

I also tried a normalized cross-correlation of each of your images with the card, like this:

#!/bin/bash
size="300x400!"
convert card.png -colorspace RGB -normalize -resize $size card.jpg
for i in *.jpg
do
   # Skip the reference card itself, which also matches *.jpg
   [ "$i" = "card.jpg" ] && continue
   cc=$(convert "$i" -colorspace RGB -normalize -resize "$size" JPG:- | \
   compare - card.jpg -metric NCC null: 2>&1)
   echo "$cc:$i"
done | sort -n

I got this output (sorted by match quality, best match last):

0.453999:abundance.jpg
0.550696:aggressive.jpg
0.629794:demystify.jpg

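which shows that the card correlates best with demystify.jpg.

Note that I resized all the images to the same size and normalized their contrast so that they can be readily compared and so that effects resulting from differences in contrast are minimized. Making them smaller also reduces the time needed for the correlation.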
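For readers who want to see what the metric measures, here is a sketch of the textbook zero-mean normalized cross-correlation on two flat lists of pixel values. It is not ImageMagick's implementation, which may differ in detail, and the sample values are invented.

```python
import math


def ncc(a, b):
    """Zero-mean normalised cross-correlation of two equal-length pixel sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        return 0.0  # a flat image has no contrast to correlate with
    return sum(x * y for x, y in zip(da, db)) / denom


card = [10, 50, 90, 130]
print(ncc(card, [20, 100, 180, 255]))  # close to 1: same pattern, more contrast
print(ncc(card, [130, 90, 50, 10]))    # -1.0 up to rounding: inverted pattern
```

Because the means are subtracted and the result is divided by both spreads, a uniform change in brightness or contrast leaves the score almost unchanged, which is the property the normalization step above relies on.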

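Answer 3 (score: 1)

I tried this by arranging the image data as a vector and taking the inner product between each collection image vector and the searched image vector. The most similar vectors give the highest inner product (strictly speaking, this holds when the vectors have comparable norms). I resize all the images to the same size to get equal-length vectors so that I can take the inner product. This resizing also reduces the computational cost of the inner product and gives a coarse approximation of the actual image.

You can quickly check this with Matlab or Octave. Below is the Matlab/Octave script; I have added comments to it. I tried varying the variable mult from 1 to 8 (you can try any integer value), and in all those cases the image Demystify gave the highest inner product with the card image. For mult = 8, I get the following ip vector in Matlab: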

ip =

683007892

558305537

604013365

As you can see, it gives the highest inner product, 683007892, for the image Demystify.

% load images
imCardPhoto = imread('0.png');
imDemystify = imread('1.jpg');
imAggressiveUrge = imread('2.jpg');
imAbundance = imread('3.jpg');

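% you can experiment with the size by varying mult
mult = 8;
targetSize = [17 12]*mult;  % [rows cols]; avoids shadowing the built-in size()

% resize to a common size (imresize uses bicubic interpolation by default;
% pass 'nearest' as a third argument for nearest neighbor)
smallCardPhoto = imresize(imCardPhoto, targetSize);
smallDemystify = imresize(imDemystify, targetSize);
smallAggressiveUrge = imresize(imAggressiveUrge, targetSize);
smallAbundance = imresize(imAbundance, targetSize);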

% image collection: each image is vectorized. if we have n images, this
% will be a (size_rows*size_columns*channels) x n matrix
collection = [double(smallDemystify(:)) ...
    double(smallAggressiveUrge(:)) ...
    double(smallAbundance(:))];

% vectorize searched image. this will be a (size_rows*size_columns*channels) x 1
% vector
x = double(smallCardPhoto(:));

% take the inner product of x and each image vector in collection. this
% will result in a n x 1 vector. the higher the inner product is, more similar the
% image and searched image(that is x)
ip = collection' * x;
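For comparison, here is a small Python sketch of the same ranking on toy vectors. It also shows a normalized variant (cosine similarity), which is my addition rather than part of the script above: raw inner products grow with overall brightness, so a uniformly bright image can outrank a genuinely similar one. All names and numbers here are made up.

```python
import math


def inner_product(u, v):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v):
    """Inner product divided by both vector lengths (0.0 for a zero vector)."""
    norm = math.sqrt(inner_product(u, u) * inner_product(v, v))
    return inner_product(u, v) / norm if norm else 0.0


def rank(query, collection, score):
    """Return collection names sorted from best to worst score."""
    return sorted(collection, key=lambda name: score(query, collection[name]), reverse=True)


query = [10, 200, 10, 200]
collection = {                      # toy vectors, not real image data
    "similar": [12, 190, 8, 210],
    "bright": [255, 255, 255, 255],
}
print(rank(query, collection, inner_product))      # ['bright', 'similar']
print(rank(query, collection, cosine_similarity))  # ['similar', 'bright']
```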

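Edit

I tried another approach, basically taking the Euclidean distance (l2 norm) between the reference images and the card image, and it gave me very good results for your test card image with a large collection of reference images (383 images) that I found at this link.

Here, instead of using the whole image, I extracted the upper part that contains the picture and used that for the comparison.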
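As a rough illustration of that extraction step, the sketch below cuts the same proportional region out of an image held as a list of pixel rows. The proportions are the ones used in the C++ code further down in this answer; the function name and the toy image are assumptions.

```python
def extract_roi(image):
    """Return the artwork region of a card given as a list of pixel rows."""
    rows, cols = len(image), len(image[0])
    x, w = cols * 10 // 100, cols * 80 // 100        # 10% in from the left, 80% wide
    y, h = rows * 35 // 285, rows * 125 // 285       # 35/285 down, 125/285 tall
    return [row[x:x + w] for row in image[y:y + h]]


# Toy 40x57 (width x height) image, i.e. a 200x285 card at scale 0.2
image = [[0] * 40 for _ in range(57)]
roi = extract_roi(image)
print(len(roi[0]), len(roi))  # 32 25
```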

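In the following steps, all training images and the test image are resized to a predefined size before any processing is done.

  • Extract the image region from the training images
  • Perform a morphological closing on these regions to get a coarse approximation (this step may not be necessary)
  • Vectorize these regions and store them in a training set (I call it a training set even though there is no training in this approach)
  • Load the test card image, extract the image region of interest (ROI), apply the closing, then vectorize
  • Calculate the Euclidean distance between each reference image vector and the test image vector
  • Choose the item with the minimum distance (or the first k items)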
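The last two steps (distance, then pick the closest items) can be sketched in plain Python as follows. The vectors are toy stand-ins for vectorized ROIs and the helper names are hypothetical.

```python
import math


def euclidean_distance(u, v):
    """l2 norm of the difference between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def top_k(query, references, k=3):
    """Return the k (name, distance) pairs with the smallest distance to query."""
    scored = [(name, euclidean_distance(query, vec)) for name, vec in references.items()]
    return sorted(scored, key=lambda item: item[1])[:k]


references = {          # toy vectors standing in for vectorised ROIs
    "a.jpg": [0, 0, 0],
    "b.jpg": [3, 4, 0],
    "c.jpg": [10, 10, 10],
}
print(top_k([0, 0, 0], references, k=2))  # [('a.jpg', 0.0), ('b.jpg', 5.0)]
```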

I did this in C++ using OpenCV. I am also including some test results obtained with different scales.

#include <opencv2/opencv.hpp>
#include <iostream>
#include <algorithm>
#include <string>
#include <vector>
#include <windows.h>

using namespace cv;
using namespace std;

#define INPUT_FOLDER_PATH       string("Your test image folder path")
#define TRAIN_IMG_FOLDER_PATH   string("Your training image folder path")

void search()
{
    WIN32_FIND_DATA ffd;
    HANDLE hFind = INVALID_HANDLE_VALUE;

    vector<Mat> images;
    vector<string> labelNames;
    int label = 0;
    double scale = .2;  // you can experiment with scale
    Size imgSize(200*scale, 285*scale); // training sample images are all 200 x 285 (width x height)
    Mat kernel = getStructuringElement(MORPH_ELLIPSE, Size(3, 3));

    // get all training samples in the directory
    hFind = FindFirstFile((TRAIN_IMG_FOLDER_PATH + string("*")).c_str(), &ffd);
    if (INVALID_HANDLE_VALUE == hFind) 
    {
        cout << "INVALID_HANDLE_VALUE: " << GetLastError() << endl;
        return;
    } 
    do
    {
        if (!(ffd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY))
        {
            Mat im = imread(TRAIN_IMG_FOLDER_PATH+string(ffd.cFileName));
            Mat re;
            resize(im, re, imgSize, 0, 0);  // resize the image

            // extract only the upper part that contains the image
            Mat roi = re(Rect(re.cols*.1, re.rows*35/285.0, re.cols*.8, re.rows*125/285.0));
            // get a coarse approximation
            morphologyEx(roi, roi, MORPH_CLOSE, kernel);

            images.push_back(roi.reshape(1)); // vectorize the roi
            labelNames.push_back(string(ffd.cFileName));
        }

    }
    while (FindNextFile(hFind, &ffd) != 0);
    FindClose(hFind);  // release the search handle

    // load the test image, apply the same preprocessing done for training images
    Mat test = imread(INPUT_FOLDER_PATH+string("0.png"));
    Mat re;
    resize(test, re, imgSize, 0, 0);
    Mat roi = re(Rect(re.cols*.1, re.rows*35/285.0, re.cols*.8, re.rows*125/285.0));
    morphologyEx(roi, roi, MORPH_CLOSE, kernel);
    Mat testre = roi.reshape(1);

    struct imgnorm2_t
    {
        string name;
        double norm2;
    };
    vector<imgnorm2_t> imgnorm;
    for (size_t i = 0; i < images.size(); i++)
    {
        imgnorm2_t data = {labelNames[i], 
            norm(images[i], testre) /* take the l2-norm (euclidean distance) */};
        imgnorm.push_back(data); // store data
    }

    // sort stored data based on euclidean-distance in the ascending order
    sort(imgnorm.begin(), imgnorm.end(), 
        [] (const imgnorm2_t& first, const imgnorm2_t& second) { return (first.norm2 < second.norm2); });
    for (size_t i = 0; i < imgnorm.size(); i++)
    {
        cout << imgnorm[i].name << " : " << imgnorm[i].norm2 << endl;
    }
}

Results:

scale = 1.0;

demystify.jpg : 10989.6, sylvan_basilisk.jpg : 11990.7, scathe_zombies.jpg : 12307.6

scale = .8;

demystify.jpg : 8572.84, sylvan_basilisk.jpg : 9440.18, steel_golem.jpg : 9445.36

scale = .6;

demystify.jpg : 6226.6, steel_golem.jpg : 6887.96, sylvan_basilisk.jpg : 7013.05

scale = .4;

demystify.jpg : 4185.68, steel_golem.jpg : 4544.64, sylvan_basilisk.jpg : 4699.67

scale = .2;

demystify.jpg : 1903.05, steel_golem.jpg : 2154.64, sylvan_basilisk.jpg : 2277.42

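Answer 4 (score: 1)

If I understand you correctly, you need to compare them as pictures. There is a very simple but effective solution for this: it is called Sikuli.

"What tools can I use to find which images are most similar to my image?"

This tool works very well for image processing. It can not only tell whether your card (image) is similar to one you have already defined as a pattern, but can also search for partial image content (so-called rectangles).

By default you can extend its functionality via Python. Any ImageObject can be set to accept a similarity_pattern as a percentage, which lets you find precisely what you are looking for.

Another big advantage of this tool is that you can learn the basics in one day.

Hope this helps.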