
Template matching: an efficient way to create a mask for minMaxLoc?

opencv numpy python-3.x

Asked by bfris · 7 years ago

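Template matching in OpenCV works well. You can pass a mask to cv2.minMaxLoc so that you search only (sort of) part of the image for the desired template. You can also use a mask in the cv2.matchTemplate operation, but that only masks the template.

I want to find a template, and I want to be sure that the template lies entirely within some other region of my image.

Computing the mask for minMaxLoc seems a bit heavy. That is, computing an exact mask is heavy. If you compute the mask the simple way, it ignores the size of the template.

An example is in order. My input images are shown below. They are somewhat contrived. I want to find the candy bar, but only if it is completely inside the white circle of the clock face.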

[image: clock1]

[image: clock2]

[image: template]

import numpy as np
import cv2

img = cv2.imread('clock1.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

template = cv2.imread('template.png')
t_h, t_w = template.shape[0:2]  # template height and width

# find circle in gray image using Hough transform
circles = cv2.HoughCircles(gray, method = cv2.HOUGH_GRADIENT, dp = 1, 
                           minDist  = 150, param1 = 50, param2 = 70,
                           minRadius = 131, maxRadius = 200)
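# HoughCircles returns float32 values; round them to Python ints because
# recent OpenCV versions reject float coordinates in the drawing functions
x0, y0, r = (int(round(v)) for v in circles[0, 0])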

# display circle on color image
cv2.circle(img,(x0, y0), r,(0,255,0),2)

# do the template match
result = cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED)

# finally, here is the part that gets tricky. we want to find highest
# rated match inside circle and we'd like to use minMaxLoc

# make mask by drawing circle on zero array
mask = np.zeros(result.shape, dtype = np.uint8)  # minMaxLoc will throw
                                                 # error w/o np.uint8
cv2.circle(mask, (x0, y0), r, color = 1, thickness = -1)

# call minMaxLoc
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result, mask = mask)

# draw found rectangle on img
if max_val > 0.4:  # use 0.4 as threshold for finding candy bar
    cv2.rectangle(img, max_loc, (max_loc[0]+t_w, max_loc[1]+t_h), (0,255,0), 4)

cv2.imwrite('output.jpg', img)

Output using clock1:

Output using clock2: the candy bar is found even though part of it lies outside the circle.
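The reason the simple mask lets this match through is that each pixel of result scores the template placed with its top-left corner at that pixel, so masking result with the circle only tests that one corner. A small numeric sketch of the effect follows; the circle and template sizes are made-up values for illustration, not measurements from the clock images.

```python
import math

# All numbers below are made up for illustration.
x0, y0, r = 100, 100, 50      # circle centre and radius
t_w, t_h = 40, 20             # template width and height

def inside(px, py):
    """True if the point (px, py) lies strictly inside the circle."""
    return math.hypot(px - x0, py - y0) < r

# A candidate location in `result` is the template's top-left corner.
x, y = 140, 100

top_left_ok = inside(x, y)                   # what the simple mask checks
whole_box_ok = all(inside(cx, cy)            # what we actually want
                   for cx in (x, x + t_w)
                   for cy in (y, y + t_h))

print(top_left_ok, whole_box_ok)  # True False
```

The top-left corner is 40 pixels from the centre and passes, while the right-hand corners are at least 80 pixels away and fall outside the circle.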

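So to make a proper mask, I used a bunch of NumPy operations. I made four separate masks (one for each corner of the template's bounding box) and then ANDed them together. I'm not aware of any convenience function in OpenCV that would build this mask for me. I'm a bit worried that all the array operations will be expensive. Is there a better way?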

h, w = result.shape[0:2]

# make arrays that hold x,y coords 
grid = np.indices((h, w))
x = grid[1]
y = grid[0]

top_left_mask  = np.hypot(x - x0, y - y0) - r < 0
top_right_mask = np.hypot(x + t_w - x0, y - y0) - r < 0
bot_left_mask  = np.hypot(x - x0, y + t_h - y0) - r < 0
bot_right_mask = np.hypot(x + t_w - x0, y + t_h - y0) - r < 0

mask = np.logical_and.reduce((top_left_mask, top_right_mask, 
                              bot_left_mask, bot_right_mask))
mask = mask.astype(np.uint8)
cv2.imwrite('mask.png', mask*255)

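Here is what the "fancy" mask looks like:

It seems right. Because of the template's shape, it cannot be circular. If I run clock2.jpg with this mask, I get:

It works. No candy bar is found. But I wish I could do it with less code...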
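One possible way to condense the four-corner test, offered as a sketch rather than a benchmarked replacement: all four corners are inside the circle exactly when the farthest corner is, and the farthest corner's squared distance separates into an x part and a y part. That lets the work be done on two 1-D arrays plus one broadcasted comparison. The helper name farthest_corner_mask is hypothetical, and x0, y0, r are assumed to be in pixel units as in the code above.

```python
import numpy as np

def farthest_corner_mask(h, w, x0, y0, r, t_h, t_w):
    """Return a uint8 mask of shape (h, w): 1 where a t_w x t_h box whose
    top-left corner sits at (x, y) has all four corners inside the circle."""
    xs = np.arange(w)
    ys = np.arange(h)
    # Largest horizontal / vertical offset of any box corner from the centre.
    dx = np.maximum(np.abs(xs - x0), np.abs(xs + t_w - x0))
    dy = np.maximum(np.abs(ys - y0), np.abs(ys + t_h - y0))
    # Squared distance of the farthest corner, built by broadcasting.
    far2 = dx[np.newaxis, :] ** 2 + dy[:, np.newaxis] ** 2
    return (far2 < r ** 2).astype(np.uint8)

# Toy usage with made-up numbers.
mask = farthest_corner_mask(h=40, w=50, x0=25, y0=20, r=14, t_h=5, t_w=8)
print(mask.shape, int(mask.sum()))
```

Comparing squared distances also avoids the square root in np.hypot. Whether this is actually faster on real images would need to be measured.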

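Edit: I did some profiling. I ran 100 cycles each of the "simple" way and the "exact" way and computed frames per second (fps):

Simple way: 12.7 fps
Exact way: 7.8 fps

So there is a price to pay for building the mask with NumPy. These tests were done on a relatively powerful workstation. It would get uglier on more modest hardware...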

1 Answer

bfris · 7 years ago

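Method 1: "mask" the image before cv2.matchTemplate

Just for kicks, I tried making my own mask of the image that I pass to cv2.matchTemplate to see what kind of performance I could achieve. To be clear, this isn't a proper mask: I'm setting all of the pixels to be ignored to one color (black or white). This is to get around the fact that only TM_SQDIFF and TM_CCORR_NORMED support a proper mask.

@Alexander Reynolds made a very good point in the comments: you have to be careful if the template image (the thing we're trying to find) contains a lot of black or a lot of white. For many problems we will know a priori what the template looks like, and we can specify a white background or a black background accordingly.

I use cv2.multiply rather than numpy.multiply. cv2.multiply has the added advantage that it automatically clips the result to the range 0 to 255.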
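To illustrate why the clipping matters, here is a small NumPy-only sketch. It does not call OpenCV: the saturating result is emulated with np.clip on a wider dtype, on the assumption that this mirrors what cv2.multiply does for uint8 inputs. The pixel values are made up; the mask values (255 outside the circle, 1 inside) match the white-background case in the code below.

```python
import numpy as np

# Made-up pixel values; the last one stands for a pixel inside the circle.
pixels = np.array([0, 1, 2, 200], dtype=np.uint8)
mask = np.array([255, 255, 255, 1], dtype=np.uint8)

# Plain uint8 multiplication in NumPy wraps around modulo 256.
wrapped = pixels * mask

# Emulated saturation: multiply in a wider type, then clip to 0..255.
saturated = np.clip(pixels.astype(np.uint16) * mask, 0, 255).astype(np.uint8)

print(wrapped.tolist())    # [0, 255, 254, 200]
print(saturated.tolist())  # [0, 255, 255, 200]
```

Note that a pixel that is exactly 0 stays 0 even with the white background, since 0 * 255 is still 0, so the "white" area is not guaranteed to be uniformly white.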

import numpy as np
import cv2
import time

img = cv2.imread('clock1.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

template = cv2.imread('target.jpg')
t_h, t_w = template.shape[0:2]  # template height and width

mask_background = 'WHITE'

start_time = time.time()

for i in range(100):  # do 100 cycles for timing
    # find circle in gray image using Hough transform
    circles = cv2.HoughCircles(gray, method = cv2.HOUGH_GRADIENT, dp = 1, 
                               minDist  = 150, param1 = 50, param2 = 70,
                               minRadius = 131, maxRadius = 200)
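    # HoughCircles returns float32 values; round them to Python ints because
    # recent OpenCV versions reject float coordinates in the drawing functions
    x0, y0, r = (int(round(v)) for v in circles[0, 0])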

    # display circle on color image
    cv2.circle(img,(x0, y0), r,(0,255,0),2)

    if mask_background == 'BLACK':  # black = 0, white = 255 on grayscale
        mask = np.zeros(img.shape, dtype = np.uint8)

    elif mask_background == 'WHITE':
        mask = 255*np.ones(img.shape, dtype = np.uint8)

    cv2.circle(mask, (x0, y0), r, color = (1,1,1), thickness = -1)
    img2 = cv2.multiply(img, mask)  # element wise multiplication
                                    # values > 255 are truncated at 255
    # do the template match
    result = cv2.matchTemplate(img2, template, cv2.TM_CCOEFF_NORMED)

    # call minMaxLoc
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)

    # draw found rectangle on img
    if max_val > 0.4:
        cv2.rectangle(img, max_loc, (max_loc[0]+t_w, max_loc[1]+t_h), (0,255,0), 4)

fps = 100/(time.time()-start_time)
print('fps ', fps)

cv2.imwrite('output.jpg', img)

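Profiling results:

Black background: 12.3 fps
White background: 12.1 fps

This method costs very little performance relative to the 12.7 fps in the original question. However, it has the drawback that it will still find templates that stick out over the edge a little. Depending on the exact nature of the problem, this may be acceptable in many applications.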

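Method 2: use cv2.boxFilter to create the mask for minMaxLoc

In this technique we start with a circular mask (as in the OP), then modify it with cv2.boxFilter. We change the anchor from the default, the center of the kernel, to the top-left corner (0, 0).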

import numpy as np
import cv2
import time

img = cv2.imread('clock1.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

template = cv2.imread('target.jpg')
t_h, t_w = template.shape[0:2]  # template height and width
print('t_h, t_w ', t_h, ' ', t_w)

start_time = time.time()

for i in range(100):
    # find circle in gray image using Hough transform
    circles = cv2.HoughCircles(gray, method = cv2.HOUGH_GRADIENT, dp = 1, 
                               minDist  = 150, param1 = 50, param2 = 70,
                               minRadius = 131, maxRadius = 200)
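    # HoughCircles returns float32 values; round them to Python ints because
    # recent OpenCV versions reject float coordinates in the drawing functions
    x0, y0, r = (int(round(v)) for v in circles[0, 0])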

    # display circle on color image
    cv2.circle(img,(x0, y0), r,(0,255,0),2)

    # do the template match
    result = cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED)

    # finally, here is the part that gets tricky. we want to find highest
    # rated match inside circle and we'd like to use minMaxLoc

    # start to make mask by drawing circle on zero array
    mask = np.zeros(result.shape, dtype = np.float64)  # np.float was removed from NumPy
    cv2.circle(mask, (x0, y0), r, color = 1, thickness = -1)

    mask = cv2.boxFilter(mask, 
                         ddepth = -1, 
                         ksize = (t_w, t_h), 
                         anchor = (0,0),
                         normalize = True,
                         borderType = cv2.BORDER_ISOLATED)
    # mask now contains values from zero to 1. we want to make anything
    # less than 1 equal to zero
    _, mask = cv2.threshold(mask, thresh = 0.9999, 
                        maxval = 1.0, type = cv2.THRESH_BINARY)
    mask = mask.astype(np.uint8)

    # call minMaxLoc
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result, mask = mask)

    # draw found rectangle on img
    if max_val > 0.4:
        cv2.rectangle(img, max_loc, (max_loc[0]+t_w, max_loc[1]+t_h), (0,255,0), 4)

fps = 100/(time.time()-start_time)
print('fps ', fps)

cv2.imwrite('output.jpg', img)

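This code gives a mask identical to the one in the OP, but runs at 11.89 fps. This technique gives more accuracy than Method 1, at the cost of a slightly larger performance hit.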
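The box-filter trick works because a normalized box filter anchored at the top-left corner averages the t_w x t_h window that starts at each pixel, so the average equals 1 only when every pixel in that window is inside the circle. The sketch below reproduces the same idea in plain NumPy with a summed-area table, which may help when checking the mask without OpenCV. The helper name window_fits_mask is hypothetical, and the disc is a made-up toy region. Unlike the code above, it takes an image-sized region and returns only the valid positions, which is the shape cv2.matchTemplate returns.

```python
import numpy as np

def window_fits_mask(region, t_h, t_w):
    """Return a uint8 array of shape (h - t_h + 1, w - t_w + 1) that is 1
    where the t_h x t_w window starting at (y, x) lies entirely in region."""
    region = np.asarray(region, dtype=bool)
    h, w = region.shape
    # Summed-area table with an extra leading row and column of zeros.
    sat = np.zeros((h + 1, w + 1), dtype=np.int64)
    sat[1:, 1:] = region.cumsum(axis=0).cumsum(axis=1)
    # Number of region pixels in each window, from four table lookups.
    window_sum = (sat[t_h:, t_w:] - sat[:-t_h, t_w:]
                  - sat[t_h:, :-t_w] + sat[:-t_h, :-t_w])
    # The window fits only if every one of its pixels is in the region.
    return (window_sum == t_h * t_w).astype(np.uint8)

# Toy example: a disc of radius 20 centred at (x=40, y=30) in a 60 x 80 image.
yy, xx = np.ogrid[:60, :80]
disc = (xx - 40) ** 2 + (yy - 30) ** 2 < 20 ** 2
fits = window_fits_mask(disc, t_h=10, t_w=16)
print(fits.shape)  # (51, 65)
```

Requiring the window sum to equal the window area plays the same role as thresholding the normalized box-filter output at just under 1.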