
World-to-pixel transformation in pyrender

pyrender rendering computer-vision 3d


nona · Tech Community · 1 year ago

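I am trying to convert a point in a 3D world rendered with pyrender into pixel coordinates. The world-to-camera frame transformation seems to work, but the camera-to-pixel frame transformation is incorrect and I can't figure out what I am doing wrong. I'd appreciate any hints!

The goal is to get the pixel coordinates uv of the world point UVW. Currently, I do the following:

Create the camera:

I create the camera from an already existing intrinsic matrix (K). I do this mainly for debugging, so that I can be sure K is correct: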

import numpy as np
import pyrender

K = np.array([[415.69219382,   0.        , 320.        ],
   [  0.        , 415.69219382, 240.        ],
   [  0.        ,   0.        ,   1.        ]])
K = np.ascontiguousarray(K, dtype=np.float32)
p_cam = pyrender.camera.IntrinsicsCamera(fx=K[0][0], fy=K[1][1], cx=K[0][2], cy=K[1][2])

scene.add(p_cam, pose=cam_pose.get_transformation_matrix(x=6170., y=4210., z=60., yaw=20, pitch=0, roll=40)) # cam_pose is my own class
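As a quick sanity check on K (not part of the original code): assuming the image is 640×480, which cx = 320 and cy = 240 suggest, the focal length of about 415.69 px corresponds to a 60° vertical field of view under the pinhole relation fy = (H / 2) / tan(yfov / 2). The resolution, the field of view, and the helper name `focal_from_yfov` are assumptions inferred from K, not stated in the post.

```python
import numpy as np

# Assumed image size, inferred from cx=320 and cy=240 (not stated in the post).
WIDTH, HEIGHT = 640, 480


def focal_from_yfov(yfov_deg, height):
    """Pinhole focal length in pixels for a given vertical field of view."""
    return (height / 2.0) / np.tan(np.radians(yfov_deg) / 2.0)


# Hypothetical 60 degree vertical field of view.
fy = focal_from_yfov(60.0, HEIGHT)

# Square pixels assumed, so fx == fy; principal point at the image center.
K_check = np.array([[fy, 0.0, WIDTH / 2.0],
                    [0.0, fy, HEIGHT / 2.0],
                    [0.0, 0.0, 1.0]])
print(K_check)
```

If these assumptions hold, `K_check` agrees with the K above, which supports the view that the intrinsic matrix itself is not the source of the error.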

Create the transformation matrix

I create a transformation matrix using extrinsic rotations.

def get_transformation_matrix(self, x, y, z, yaw, pitch, roll):
    '''
    yaw = rotate around z axis
    pitch = rotate around y axis
    roll = rotate around x axis
    '''
    from scipy.spatial.transform import Rotation as R

    xyz = np.array([[x], [y], [z]])  # translation as a column vector
    # lowercase 'zyx' requests extrinsic rotations in SciPy
    rot = R.from_euler('zyx', [yaw, pitch, roll], degrees=True).as_matrix()
    last_row = np.array([[0, 0, 0, 1]])
    # assemble the 4x4 homogeneous matrix [[rot, xyz], [0, 0, 0, 1]]
    tf_m = np.concatenate((np.concatenate((rot, xyz), axis=1), last_row), axis=0)
    return np.ascontiguousarray(tf_m, dtype=np.float32)
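The pose can be cross-checked without SciPy. The sketch below is a NumPy-only stand-in for the function above, assuming that extrinsic 'zyx' rotations compose as Rx(roll) · Ry(pitch) · Rz(yaw); the helper names (`rot_x`, `rot_y`, `rot_z`, `pose_matrix`, `invert_pose`) are illustrative and not from the post. It also inverts the rigid transform in closed form, [Rᵀ | −Rᵀt], instead of calling a general matrix inverse.

```python
import numpy as np


def rot_x(deg):
    a = np.radians(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])


def rot_y(deg):
    a = np.radians(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


def rot_z(deg):
    a = np.radians(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def pose_matrix(x, y, z, yaw, pitch, roll):
    """4x4 camera-to-world pose; extrinsic z, then y, then x rotations."""
    pose = np.eye(4)
    pose[:3, :3] = rot_x(roll) @ rot_y(pitch) @ rot_z(yaw)
    pose[:3, 3] = [x, y, z]
    return pose


def invert_pose(pose):
    """Closed-form inverse of a rigid transform: [R^T | -R^T t]."""
    rot, t = pose[:3, :3], pose[:3, 3]
    inv = np.eye(4)
    inv[:3, :3] = rot.T
    inv[:3, 3] = -rot.T @ t
    return inv


cam_to_world = pose_matrix(6170.0, 4210.0, 60.0, yaw=20, pitch=0, roll=40)
world_to_cam = invert_pose(cam_to_world)

# World point from the post, in homogeneous coordinates.
point_world = np.array([6184.0, 4245.0, 38.0, 1.0])
point_cam = world_to_cam @ point_world
print(point_cam)
```

Run in float64, this gives a camera-frame point within about 1e-3 of the (17.49, 7.12, -39.35) reported in the post, which suggests the world-to-camera step is behaving as intended.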

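Render the image

Using the created camera, I render the following image. The point I am trying to transform is the tip of the roof, which has approximate pixel coordinates (500, 160). I marked it in the 3D scene with a pink cylinder.

Transform from world to pixel frame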

from icecream import ic
K = np.concatenate((K, [[0],[0],[0]]), axis = 1)
UVW1 = [[6184],[4245],[38],[1]] #the homogeneous coordinates of the pink cylinder in the world frame
world_to_camera = np.linalg.inv(cam_pose.transformation_matrix).astype('float32') @ UVW1
ic(world_to_camera)
camera_to_pixel = K @ world_to_camera
ic(camera_to_pixel/camera_to_pixel[2]) #Transforming the homogeneous coordinates back

Output

ic| world_to_camera: array([[ 17.48892188],
                            [  7.11796755],
                            [-39.35071968],
                            [  1.        ]])

ic| camera_to_pixel/camera_to_pixel[2]: array([[135.25094424],
                                               [164.80738424],
                                               [  1.        ]])

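Result

To me, the world_to_camera pose seems plausible (I might be wrong). However, when transforming from the camera frame to the pixel frame, the x coordinate (135) is wrong (the y coordinate (164) might still make sense).

Attached is a screenshot of the 3D scene. The yellow cylinder plus axes represents the camera, while the blue point is the point I am trying to transform (shown in pink in the rendered image earlier).

So to me, the only possible source of error is the intrinsic matrix, but I defined that matrix myself, so I don't think it is incorrect. Is there something I am overlooking?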
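One possible cause worth checking: pyrender follows the OpenGL camera convention, in which the camera looks along its negative z-axis with +x to the right and +y up, whereas a K matrix of this form is usually applied in the computer-vision convention (+z forward, +y down). The negative z value in world_to_camera (-39.35) is consistent with that. A minimal sketch, assuming this is the only mismatch, flips the y- and z-axes before multiplying by K; the matrix `gl_to_cv` and the other variable names are illustrative and not from the post.

```python
import numpy as np

K = np.array([[415.69219382, 0.0, 320.0],
              [0.0, 415.69219382, 240.0],
              [0.0, 0.0, 1.0]])

# Camera-frame point printed in the post (OpenGL-style axes: -z is forward).
point_cam_gl = np.array([17.48892188, 7.11796755, -39.35071968])

# Assumed fix: convert to the +z-forward, +y-down convention that K expects.
gl_to_cv = np.diag([1.0, -1.0, -1.0])
point_cam_cv = gl_to_cv @ point_cam_gl

# Standard pinhole projection followed by the perspective divide.
uvw = K @ point_cam_cv
u, v = uvw[:2] / uvw[2]
print(u, v)
```

Under this assumption the point lands at roughly (505, 165), close to the approximate (500, 160) read off the rendered image. The y coordinate is unchanged by the flip because the signs of y and z cancel in the division, which would explain why only the x coordinate looked wrong.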
