Python OpenCV resize (interpolation)

  •  2
  • Wizard  · Tech Community  · 7 years ago

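    I am resizing an image in order to perform object localization within it. I currently resize the bounding box by working out how many pixels will be upsampled and adding half of that amount to each bounding box coordinate.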

    My code is as follows:

    def next_state(init_input, b_prime, g):
        """ 
        Returns the observable region of the next state.
    
        Formats the next state's observable region, defined
        by b_prime, to be of dimension (224, 224, 3). Adding 16
        additional pixels of context around the original bounding box.
        The ground truth box must be reformatted according to the
        new observable region.
    
        :param init_input:
            The initial input volume of the current episode.
    
        :param b_prime:
            The subsequent state's bounding box. RED
    
        :param g:
            The ground truth box of the target object. YELLOW
        """
    
        # Determine the pixel coordinates of the observable region for the following state
        context_pixels = 16
        x1 = max(b_prime[0] - context_pixels, 0)
        y1 = max(b_prime[1] - context_pixels, 0)
        x2 = min(b_prime[2] + context_pixels, IMG_SIZE)
        y2 = min(b_prime[3] + context_pixels, IMG_SIZE)
    
        # Determine observable region
        observable_region = cv2.resize(init_input[y1:y2, x1:x2], (224, 224))
    
        # Difference between crop region and image dimensions
        x1_diff = x1
        y1_diff = y1
        x2_diff = IMG_SIZE - x2
        y2_diff = IMG_SIZE - y2
    
        # Resize ground truth box
        g[0] = int(g[0] - 0.5 * x1_diff)  # x1
        g[1] = int(g[1] - 0.5 * y1_diff)  # y1
        g[2] = int(g[2] + 0.5 * x2_diff)  # x2
        g[3] = int(g[3] + 0.5 * y2_diff)  # y2
    
        return observable_region, g 
    


    1 answer  |  as of 7 years ago
        1
  •  1
  •   lenik    7 years ago

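    Basically, it is a good idea to keep the same scale in both dimensions so that round and square shapes do not get squashed. So first you have to find the scale. You can do that by taking the larger dimension of the bounding box and adding 32 (16 pixels on each side):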

    longest = max( x_size, y_size) + 32
    scale = 224.0 / longest
    

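    Then you find the coordinates in the original image by computing the center of the bounding box and adding half of longest in every direction: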

    center_x = (x1 + x2) / 2
    center_y = (y1 + y2) / 2
    
    org_x1 = center_x - longest/2
    org_x2 = center_x + longest/2
    
    org_y1 = center_y - longest/2
    org_y2 = center_y + longest/2
    

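    Then you rescale the rectangle with coordinates (org_x1, org_y1, org_x2, org_y2) to a (224, 224) rectangle, and the corners of the bounding box will be offset by 16.0 * scale from the image corners.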
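
    The steps above can be put together as a small sketch. It is pure arithmetic, so no OpenCV call is involved. The function name square_crop_geometry and the context and out_size parameters are made up for illustration, and the sketch assumes the square source region fits inside the image, so no clipping at the borders is handled.

```python
def square_crop_geometry(x1, y1, x2, y2, context=16, out_size=224):
    """Return (scale, source square, box position in the resized output).

    Hypothetical helper: follows the steps above and assumes the square
    source region does not need clipping at the image borders.
    """
    x_size = x2 - x1
    y_size = y2 - y1

    # Longest side plus `context` pixels on both sides
    longest = max(x_size, y_size) + 2 * context
    scale = float(out_size) / longest

    # Square source region centered on the bounding box
    center_x = (x1 + x2) / 2.0
    center_y = (y1 + y2) / 2.0
    org_x1 = center_x - longest / 2.0
    org_y1 = center_y - longest / 2.0
    org_x2 = center_x + longest / 2.0
    org_y2 = center_y + longest / 2.0

    # Position of the original box once the square is resized to out_size
    box = ((x1 - org_x1) * scale, (y1 - org_y1) * scale,
           (x2 - org_x1) * scale, (y2 - org_y1) * scale)
    return scale, (org_x1, org_y1, org_x2, org_y2), box


# Made-up 100 x 60 bounding box
scale, region, box = square_crop_geometry(50, 60, 150, 120)
print(scale)
print(region)  # (34.0, 24.0, 166.0, 156.0)
print(box)
```

    In this example the box ends up 16.0 * scale from the left and right edges, which is its longer side. Along the shorter side the margin is larger, because the source region is square.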


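    OK, as far as I can see, you resize init_input[y1:y2, x1:x2] into (224, 224) and want to know where the ground truth region ends up. Well, the original ground truth rectangle was 16 pixels from the corners, so you have to find the new offsets and you are done.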

    x_offset = 16.0 * 224.0 / (x2-x1)
    y_offset = 16.0 * 224.0 / (y2-y1)
    

    Then the top-left corner of the ground truth rectangle is at (x_offset, y_offset) and the bottom-right corner is at (224 - x_offset, 224 - y_offset).
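
    As a worked sketch of these offsets, here is the same calculation with made-up coordinates. The helper name box_after_stretch is hypothetical, and it assumes the crop was not clipped by the max/min calls at the image borders, so the inner box sits exactly 16 pixels inside the crop on every side.

```python
def box_after_stretch(x1, y1, x2, y2, context=16, out_size=224):
    """Locate the inner box after stretching the crop (x1, y1, x2, y2)
    to out_size x out_size.

    Hypothetical helper; assumes the box lies exactly `context` pixels
    inside the crop on every side.
    """
    x_offset = context * float(out_size) / (x2 - x1)
    y_offset = context * float(out_size) / (y2 - y1)
    return (x_offset, y_offset, out_size - x_offset, out_size - y_offset)


# Crop of 132 x 92 pixels (a 100 x 60 box plus 16 pixels on each side)
stretched = box_after_stretch(34, 44, 166, 136)
print(stretched)
```

    Because the crop is not square, the horizontal and vertical offsets differ: the shorter side is stretched more, so its 16 pixels of context grow more.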

    You can ignore the rest of the code I wrote above the divider. It was written on the assumption that you keep the x/y ratio, which you do not =)


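    This is the third attempt at figuring out what you are doing... If you scale init_input[y1:y2, x1:x2] into (224, 224), the coordinates of any arbitrary point (x, y) after the transformation can be calculated as: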

    x_new = (x - x1) * 224.0 / (x2 - x1)
    y_new = (y - y1) * 224.0 / (y2 - y1)
    

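    It is probably a good idea to clamp the new values with min/max against the image size, so that you do not go outside the image boundaries: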

    x_new = max( 0, min( 224, x_new))
    y_new = max( 0, min( 224, y_new))
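
    Putting the mapping and the clamping together for a whole box, a minimal sketch could look like this. The names map_box and OUT_SIZE are illustrative and not from the question, and boxes are assumed to be (x1, y1, x2, y2) tuples in the coordinates of the original image.

```python
OUT_SIZE = 224  # side length of the resized observable region


def map_box(box, crop, out_size=OUT_SIZE):
    """Map `box` from original-image coordinates into a crop that is
    resized to out_size x out_size, clamping to the output bounds.

    Both arguments are (x1, y1, x2, y2); the names are illustrative only.
    """
    cx1, cy1, cx2, cy2 = crop
    sx = float(out_size) / (cx2 - cx1)  # horizontal scale factor
    sy = float(out_size) / (cy2 - cy1)  # vertical scale factor

    def clamp(v):
        # Keep the coordinate inside [0, out_size]
        return max(0, min(out_size, v))

    bx1, by1, bx2, by2 = box
    return (clamp((bx1 - cx1) * sx), clamp((by1 - cy1) * sy),
            clamp((bx2 - cx1) * sx), clamp((by2 - cy1) * sy))


# Box that extends past the right edge of a 112 x 56 crop
g_new = map_box((40, 50, 200, 90), (34, 44, 146, 100))
print(g_new)  # (12.0, 24.0, 224, 184.0)
```

    In the question's next_state, crop would correspond to (x1, y1, x2, y2) and box to g. Apply int() to the results if integer pixel coordinates are needed.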