Python OpenCV resize (interpolation)

image-processing opencv python

Wizard · Tech Community · 7 years ago

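I am resizing an image in order to perform object localization within it. I currently resize the bounding box by determining how many pixels will be upsampled and adding half of them to each bounding-box coordinate.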

My code is as follows:

def next_state(init_input, b_prime, g):
    """ 
    Returns the observable region of the next state.

    Formats the next state's observable region, defined
    by b_prime, to be of dimension (224, 224, 3). Adding 16
    additional pixels of context around the original bounding box.
    The ground truth box must be reformatted according to the
    new observable region.

    :param init_input:
        The initial input volume of the current episode.

    :param b_prime:
        The subsequent state's bounding box. RED

    :param g:
        The ground truth box of the target object. YELLOW
    """

    # Determine the pixel coordinates of the observable region for the following state
    context_pixels = 16
    x1 = max(b_prime[0] - context_pixels, 0)
    y1 = max(b_prime[1] - context_pixels, 0)
    x2 = min(b_prime[2] + context_pixels, IMG_SIZE)
    y2 = min(b_prime[3] + context_pixels, IMG_SIZE)

    # Determine observable region
    observable_region = cv2.resize(init_input[y1:y2, x1:x2], (224, 224))

    # Difference between crop region and image dimensions
    x1_diff = x1
    y1_diff = y1
    x2_diff = IMG_SIZE - x2
    y2_diff = IMG_SIZE - y2

    # Resize ground truth box
    g[0] = int(g[0] - 0.5 * x1_diff)  # x1
    g[1] = int(g[1] - 0.5 * y1_diff)  # y1
    g[2] = int(g[2] + 0.5 * x2_diff)  # x2
    g[3] = int(g[3] + 0.5 * y2_diff)  # y2

    return observable_region, g 



1 reply | as of 7 years ago

lenik 7 years ago

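Basically, it is a good idea to keep the same scale in both dimensions, so that round and square shapes do not get squashed. So first you have to find the scale. You can do that by taking the largest dimension of the bounding box and adding 32 (16 pixels on each side), so: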

longest = max( x_size, y_size) + 32
scale = 224.0 / longest

Then you find the original area by computing the center of the bounding box and extending half of longest in every direction:

center_x = (x1 + x2) / 2
center_y = (y1 + y2) / 2

org_x1 = center_x - longest/2
org_x2 = center_x + longest/2

org_y1 = center_y - longest/2
org_y2 = center_y + longest/2

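Then you rescale the rectangle with coordinates (org_x1, org_y1, org_x2, org_y2) into a (224, 224) rectangle. Along the longer dimension of the bounding box, its edges will be offset by 16.0 * scale from the image edges; along the shorter dimension the margin is proportionally larger.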
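The steps above can be put together roughly as follows. This is only a minimal sketch: the function name square_crop_params and its parameters are made up for illustration, and it does not deal with a square that extends past the image border (the caller would have to pad or clip in that case).

```python
def square_crop_params(box, context=16, out_size=224):
    """Return (scale, square) for an aspect-preserving crop around box.

    box is (x1, y1, x2, y2) in original image coordinates.
    square is (org_x1, org_y1, org_x2, org_y2) and may extend
    outside the image; clipping or padding is left to the caller.
    """
    x1, y1, x2, y2 = box
    x_size = x2 - x1
    y_size = y2 - y1

    # Longest side of the box plus the context margin on both sides
    longest = max(x_size, y_size) + 2 * context
    scale = out_size / float(longest)

    # Square of side `longest`, centered on the bounding box
    center_x = (x1 + x2) / 2.0
    center_y = (y1 + y2) / 2.0
    half = longest / 2.0
    square = (center_x - half, center_y - half,
              center_x + half, center_y + half)
    return scale, square


# Hypothetical box: 100 px wide, 40 px tall
scale, square = square_crop_params((50, 60, 150, 100))
```

Because the same scale is applied to x and y, the content keeps its proportions after resizing the square to (224, 224).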

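OK, as far as I can see, you resize init_input[y1:y2, x1:x2] into (224, 224) and want to know where the ground truth region ends up. Well, the original ground truth rectangle was 16 pixels in from the corners of the crop, so you just have to find the new offsets and you are done.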

x_offset = 16.0 * 224.0 / (x2-x1)
y_offset = 16.0 * 224.0 / (y2-y1)

Then the top-left corner of the ground truth rectangle is at (x_offset, y_offset) and the bottom-right corner is at (224 - x_offset, 224 - y_offset).
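As a quick sketch of the offset calculation above (context_offsets is a hypothetical helper name, not something from the question). Note the assumption it relies on: the rectangle sits exactly 16 pixels inside the crop on every side, which is not the case when the crop was clipped at the image border by max/min.

```python
def context_offsets(x1, y1, x2, y2, context=16, out_size=224):
    """Corners of a rectangle lying `context` px inside the crop
    (x1, y1, x2, y2), after the crop is resized to out_size x out_size."""
    x_offset = context * out_size / float(x2 - x1)
    y_offset = context * out_size / float(y2 - y1)
    top_left = (x_offset, y_offset)
    bottom_right = (out_size - x_offset, out_size - y_offset)
    return top_left, bottom_right


# Hypothetical crop: 112 px wide, 224 px tall
top_left, bottom_right = context_offsets(0, 0, 112, 224)
```

With this crop, the 16-pixel margin becomes 32 pixels horizontally (the width is doubled) and stays 16 pixels vertically.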

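You can ignore the code in the first part of this answer (the equal-scale approach); it was written on the assumption that you preserve the x/y aspect ratio, which you don't =)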

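This is my third attempt at figuring out what you are doing... If you scale init_input[y1:y2, x1:x2] into (224, 224), the coordinates of any arbitrary point (x, y) after the transformation can be computed as: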

x_new = (x - x1) * 224.0 / (x2 - x1)
y_new = (y - y1) * 224.0 / (y2 - y1)

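It is probably a good idea to clamp the new values against the image size with min/max, so that they do not fall outside the image bounds: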

x_new = max( 0, min( 224, x_new))
y_new = max( 0, min( 224, y_new))
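Applied to all four corners of the ground truth box, the point mapping and clamping above might look like the sketch below. The name map_box_to_crop and the tuple layout (x1, y1, x2, y2) are assumptions for illustration; the crop is assumed to be the region that was actually passed to cv2.resize.

```python
def map_box_to_crop(box, crop, out_size=224):
    """Map box from original image coordinates into the coordinates of
    `crop` after the crop has been resized to out_size x out_size.

    Both box and crop are (x1, y1, x2, y2). The result is clamped to
    the range [0, out_size].
    """
    x1, y1, x2, y2 = crop
    sx = out_size / float(x2 - x1)  # horizontal scale factor
    sy = out_size / float(y2 - y1)  # vertical scale factor

    def clamp(v):
        return max(0.0, min(float(out_size), v))

    gx1, gy1, gx2, gy2 = box
    return (
        clamp((gx1 - x1) * sx),
        clamp((gy1 - y1) * sy),
        clamp((gx2 - x1) * sx),
        clamp((gy2 - y1) * sy),
    )


# Hypothetical values: a 112x112 crop, and a box that sticks out on the right
mapped = map_box_to_crop((60, 70, 200, 120), (34, 44, 146, 156))
```

In this example the right edge of the box falls outside the crop, so it is clamped to 224. Wrap the results in int() if integer pixel coordinates are needed.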