
How to detect a rectangular box in an image using python-opencv

computer-vision opencv python

Salexes · Tech Community · 1 year ago

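I only recently started using Python and OpenCV, so I am not very experienced with either yet.

I am trying to detect a gray rectangular box in a list of images.

The current version of my code works well most of the time, but for some images the gray box is still detected incorrectly.

I would be very grateful if someone with more experience could help me detect the gray rectangular box correctly and consistently.

Is my approach the best one? Is there a better method that I am not aware of?

(I can currently only use the CPU.)

This is the current version of the code: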

import cv2
import matplotlib.pyplot as plt
import os

def process_and_save_images(input_dir, save_subdir='processed_7'):
    # Ensure the output directory exists
    save_dir = os.path.join(input_dir, save_subdir)
    if not os.path.exists(save_dir):
        os.makedirs(save_dir)

    # List all files in the input directory
    files = [f for f in os.listdir(input_dir) if os.path.isfile(os.path.join(input_dir, f))]

    # Filter out only the PNG files (you can add other formats if needed)
    image_paths = [os.path.join(input_dir, f) for f in files if f.lower().endswith('.png')]

    for image_path in image_paths:
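        # Load the image; cv2.imread returns None if the file cannot be read
        image = cv2.imread(image_path)
        if image is None:
            print(f"Could not read {image_path}. Skipping...")
            continue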
        
        # Crop the bottom third of the image
        height, width, _ = image.shape
        cropped = image[int(2*height/3):, :]
        result_cropped = cropped.copy()

        gray = cv2.cvtColor(cropped, cv2.COLOR_BGR2GRAY)
        thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 51, 9)

        # Fill rectangular contours
        cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        cnts = cnts[0] if len(cnts) == 2 else cnts[1]
        for c in cnts:
            cv2.drawContours(thresh, [c], -1, (255,255,255), -1)

        # Morph open
        kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9,9))
        opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=2)

        # Closing operation
        closing = cv2.morphologyEx(opening, cv2.MORPH_CLOSE, kernel, iterations=2)

        # Find the contour with the largest area
        cnts = cv2.findContours(closing, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        cnts = cnts[0] if len(cnts) == 2 else cnts[1]

        
        if cnts:
            largest_contour = max(cnts, key=cv2.contourArea)
            
            # Draw the bounding rectangle of the largest contour
            x, y, w, h = cv2.boundingRect(largest_contour)
            cv2.rectangle(result_cropped, (x, y), (x + w, y + h), (36, 255, 12), 3)
        else:
            print(f"No contours found in {image_path}. Skipping...")

        # Use matplotlib to display the images and save them
        plt.figure(figsize=(20, 10))

        plt.subplot(1, 4, 1)
        plt.imshow(thresh, cmap='gray')
        plt.title('Thresholded Image')

        plt.subplot(1, 4, 2)
        plt.imshow(opening, cmap='gray')
        plt.title('Morphological Opening')

        plt.subplot(1, 4, 3)
        plt.imshow(closing, cmap='gray')
        plt.title('Morphological Closing')        

        plt.subplot(1, 4, 4)
        plt.imshow(cv2.cvtColor(result_cropped, cv2.COLOR_BGR2RGB))
        plt.title('Image with Largest Rectangle')

        plt.tight_layout()

        # Save the figure
        filename = os.path.basename(image_path).replace('.png', '_processed.png')
        output_path = os.path.join(save_dir, filename)
        plt.savefig(output_path)
        plt.close()  # Close the current figure to release memory

    print("Processing and saving completed.")

# Example usage:
process_and_save_images('./temporal')

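These are the results for a set of images I ran the code on: https://imgur.com/a/7C78U9l

Here are the original images: https://imgur.com/a/XKB6KHR

Here is an example where the current approach fails: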


Odeaxcsh · 1 year ago

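I think you can use cv.grabCut, which is more or less designed for this kind of task. As a note on your code: in the final stage you can use cv.HoughLines to find straight lines instead of contours, because contours are sensitive to any extra regions that leak into your segmentation.

In any case, here is my code. I think it works quite well, but I have only tested it on a few of your images.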

import cv2 as cv
import numpy as np


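# IMAGE is a placeholder: set it to the path of your input image
img = cv.imread(IMAGE)
if img is None:
    raise FileNotFoundError(f"Could not read image: {IMAGE}")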

height, width, _ = img.shape

p0 = (5, int(2*height/3))
p1 = (width - 5, height - 5)

color_mask = cv.inRange(img, (15, 15, 15), (100, 100, 100))
color_mask = color_mask // 255

color_masked = img * color_mask[:, :, None]

mask = np.zeros(img.shape[:2],np.uint8)
bgdModel = np.zeros((1,65),np.float64)
fgdModel = np.zeros((1,65),np.float64)

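# grabCut expects rect as (x, y, width, height), not two corner points
rect = (p0[0], p0[1], p1[0] - p0[0], p1[1] - p0[1])
cv.grabCut(color_masked, mask, rect, bgdModel, fgdModel, 10, cv.GC_INIT_WITH_RECT)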

grab_mask = np.where((mask == 2) | (mask == 0), 0, 1).astype(np.uint8)
masked = color_masked * grab_mask[:,:, np.newaxis]

kernel = cv.getStructuringElement(cv.MORPH_RECT, (5, 5))
closing = cv.morphologyEx(masked, cv.MORPH_CLOSE, kernel, iterations=2)
opening = cv.morphologyEx(closing, cv.MORPH_OPEN, kernel, iterations=2)



canny = cv.Canny(opening, 50, 150)
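lines = cv.HoughLinesP(canny, 0.5, np.pi/360, 50)

# HoughLinesP returns None when no line segment is found
if lines is None:
    raise RuntimeError("No line segments found; try adjusting the Canny/Hough parameters")

for line in lines:
    x1, y1, x2, y2 = map(int, line[0])
    cv.line(img, (x1, y1), (x2, y2), (255, 0, 0))

# Bounding box (x, y, w, h) around all segment endpoints
box = cv.boundingRect(lines.reshape(-1, 2))
cv.rectangle(img, box, (0, 0, 255))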

cv.imshow('Result', img)
cv.waitKey(0)
cv.destroyAllWindows()

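It is mostly your code, but I used cv.grabCut to segment the subtitle region (instead of thresholding), and at the end I used the Hough line transform (cv.HoughLinesP in the code above).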

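You will probably need to adjust some parameters and keep testing and tweaking the code until it works for every image. You might also consider using OCR, as M Ciel suggested; I think that is a more standard approach.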
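When tuning parameters over many images, it can help to flag suspicious detections automatically instead of inspecting every output by eye. The following is a rough sketch under the assumption that the target is a wide, flat, subtitle-style box; the function name is_plausible_box and both thresholds are hypothetical and would need adjusting to the actual images.

```python
def is_plausible_box(box, image_shape, min_width_frac=0.2, min_aspect=2.0):
    """Return True if box = (x, y, w, h) looks like a wide, flat box inside the image.

    The thresholds are illustrative assumptions, not values taken from the thread.
    """
    x, y, w, h = box
    img_h, img_w = image_shape[:2]
    if w <= 0 or h <= 0:
        return False

    inside = x >= 0 and y >= 0 and x + w <= img_w and y + h <= img_h
    wide_enough = w >= min_width_frac * img_w  # spans a fair share of the image width
    flat_enough = w / h >= min_aspect          # clearly wider than tall
    return inside and wide_enough and flat_enough


# A 200x40 box in a 300x200 image passes; a tiny 8x8 blob does not
print(is_plausible_box((50, 140, 200, 40), (200, 300, 3)))
print(is_plausible_box((10, 10, 8, 8), (200, 300, 3)))
```

Images whose detected box fails the check are good candidates to look at first when adjusting the Canny, Hough or grabCut parameters.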