mikedh / trimesh

First I want to thank @mikedh for an amazing library! Thanks for all the hard work you've put into this.

I am trying to write an app where I can click on the scene and remove the corresponding face. I am able to remove faces, but am having trouble removing the correct faces. After a bit of debugging, I believe I have narrowed down the problem to scene.camera_rays(). It appears that the rays returned do not actually correspond with the pixels that they say they do. I have made a small app to highlight the issue. It displays a 2x2x2 box, and whenever the mouse moves (without pressing a button), it adds a line from the camera to the mouse.

import trimesh
import numpy as np
import pyglet
from typing import Sequence


def create_box(start: Sequence[float], end: Sequence[float]) -> np.ndarray:
    """
    Create an array of line segments that form a box that spans from start to end

    :param start: a 3D numpy vector giving one corner of the box
    :param end: a 3D numpy vector giving the other corner of the box
    :return: a 12x2x3 numpy array giving the start and end points for the 12
        segments in the box, in no particular order
    """
    return np.array([[[start[0], start[1], start[2]], [end[0], start[1], start[2]]],
                     [[start[0], start[1], start[2]], [start[0], end[1], start[2]]],
                     [[start[0], start[1], start[2]], [start[0], start[1], end[2]]],
                     [[end[0], end[1], end[2]], [end[0], end[1], start[2]]],
                     [[end[0], end[1], end[2]], [end[0], start[1], end[2]]],
                     [[end[0], end[1], end[2]], [start[0], end[1], end[2]]],
                     [[start[0], start[1], end[2]], [start[0], end[1], end[2]]],
                     [[start[0], end[1], end[2]], [start[0], end[1], start[2]]],
                     [[start[0], end[1], start[2]], [end[0], end[1], start[2]]],
                     [[end[0], end[1], start[2]], [end[0], start[1], start[2]]],
                     [[end[0], start[1], start[2]], [end[0], start[1], end[2]]],
                     [[end[0], start[1], end[2]], [start[0], start[1], end[2]]]])


def main():
    # Display a 2x2x2 box in the middle of the scene
    box = create_box(np.array([-1, -1, -1]), np.array([1, 1, 1]))
    scene = trimesh.Scene(trimesh.load_path(box))
    viewer = scene.show(start_loop=False, callback=lambda s: None)

    @viewer.event
    def on_mouse_motion(x, y, dx, dy):
        """
        When the mouse moves in the scene, send a 5m vector from the camera towards
        the direction of the mouse
        """
        # Find rays from camera
        origins, drctns, pixels = scene.camera_rays()

        # Get index of ray corresponding to mouse location
        rows = np.where((pixels[:, 0] == x) & (pixels[:, 1] == y))[0]
        if len(rows) == 0:
            # No ray maps exactly to this pixel, so skip this event
            return
        row = rows[0]

        # Pull origin and direction of that ray
        origin = origins[row, :]
        drctn = drctns[row, :]

        # Display 5 meters of the ray in the scene
        ray = np.array([origin, origin + 5 * drctn])
        viewer.scene.add_geometry(trimesh.load_path(ray))

    pyglet.app.run()


if __name__ == '__main__':
    main()

In theory, moving the mouse along an edge of the box should create a line intersecting that edge. It does not. After moving the mouse, you can change the view to see that, especially near the left and right side of the screen, the ray can be something like half a meter off. Of note, the rays seem pretty good for edges near the L/R center (regardless of Y position).

It seems that there were similar issues mentioned before in #700, #1303, and #1584, with no solution yet found (though @russoale has done some good work in #700 narrowing the bug down further).

Anyone have any ideas or work-arounds?

I seem to recall camera_to_rays was originally PR'd for something other than corresponding exactly with the viewer. That being said, looking at the code, it looks like camera_to_rays should probably be using the intrinsic matrix rather than unitizing pixel coordinates. I'd probably start debugging by writing a few tests that compare the output rays with the intrinsic matrix, and maybe check the corners against the FOV.

PRs with tests and fixes super welcome!
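
As a starting point for that kind of test, here is a minimal sketch of the pinhole relationship between an intrinsic matrix and the FOV. The helper names (`intrinsic_matrix`, `pixel_to_ray`, `angle_from_axis_deg`) are made up for illustration and are not trimesh API, and the camera is assumed to look down +Z for simplicity (the OpenGL convention is -Z):

```python
import numpy as np


def intrinsic_matrix(resolution, fov_deg):
    """Pinhole intrinsics from resolution (w, h) in pixels and FOV (x, y) in degrees."""
    resolution = np.asarray(resolution, dtype=np.float64)
    half_fov = np.radians(np.asarray(fov_deg, dtype=np.float64)) / 2.0
    # Focal length in pixels: half the image size over tan(half-FOV)
    focal = (resolution / 2.0) / np.tan(half_fov)
    K = np.eye(3)
    K[0, 0], K[1, 1] = focal
    K[0, 2], K[1, 2] = resolution / 2.0
    return K


def pixel_to_ray(K, pixel):
    """Unit direction in the camera frame through a pixel coordinate."""
    homogeneous = np.array([pixel[0], pixel[1], 1.0])
    direction = np.linalg.inv(K) @ homogeneous
    return direction / np.linalg.norm(direction)


def angle_from_axis_deg(ray):
    """Angle in degrees between a unit ray and the +Z optical axis."""
    return float(np.degrees(np.arccos(np.clip(ray[2], -1.0, 1.0))))


K = intrinsic_matrix((1280, 660), (60.0, 45.0))
# A ray through the middle of the right edge should sit at half the X FOV,
# and a ray through the middle of the top/bottom edge at half the Y FOV.
right_edge = angle_from_axis_deg(pixel_to_ray(K, (1280, 330)))
top_edge = angle_from_axis_deg(pixel_to_ray(K, (640, 660)))
```

A test along these lines could compare the directions returned by `scene.camera_rays()` for edge and corner pixels against the angles implied by the camera's FOV.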

Also this half-pixel offset is suspicious:

    right_top *= 1 - (1.0 / res)
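
For what it's worth, if `right_top` holds the tangent of the half-FOV at that point (an assumption; the snippet above does not show it), this scaling just moves the outermost rays from the image edge to the center of the outermost pixel. A small sketch with hypothetical helper names:

```python
import numpy as np


def pixel_center_extent(fov_deg, res):
    """Tangent-plane coordinate of the outermost pixel center along one axis."""
    # Coordinate of the image edge at unit distance from the camera
    edge = np.tan(np.radians(fov_deg) / 2.0)
    # Each pixel is 2 * edge / res wide, so its center is edge / res inside the edge
    return edge * (1.0 - 1.0 / res)


def pixel_centers(fov_deg, res):
    """Evenly spaced pixel-center coordinates along one axis."""
    extent = pixel_center_extent(fov_deg, res)
    return np.linspace(-extent, extent, res)


# With a 90 degree FOV the edge is at 1.0, so 4 pixels are each 0.5 wide
centers = pixel_centers(90.0, 4)
```

Under that assumption the offset shifts a ray by at most half a pixel, which would not by itself account for an error of half a meter.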

Further testing reveals that the intrinsic matrix is being calculated incorrectly. Shooting out 1 meter rays around the edge of the screen (based on the camera intrinsics) does properly create rays at the very top and bottom of the screen, but they are something like 175 pixels in from the L/R edge of the screen (with a resolution of 1280x660).

Okay, I may have found the bug:

trimesh/trimesh/scene/scene.py, lines 677 to 678 in 3966450:

    if fov is None:
        fov = np.array([60, 45])

Whenever a camera is not passed into the scene, the default FOV is hardcoded. This is fine for the Y (I'm guessing OpenGL or something sets the actual FOV based on the Y), but it is off in the X, because the X value no longer matches the window's aspect ratio.
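
To put rough numbers on this, here is a sketch that assumes the viewer keeps the vertical FOV fixed and derives the horizontal FOV from the window's aspect ratio (my guess above, not something I have confirmed in the viewer code). The function names are made up for illustration:

```python
import numpy as np


def horizontal_fov_deg(fov_y_deg, resolution):
    """Horizontal FOV implied by a vertical FOV and a (width, height) resolution."""
    width, height = resolution
    half_y = np.radians(fov_y_deg) / 2.0
    # The image plane half-width scales with the aspect ratio
    half_x = np.arctan(np.tan(half_y) * width / height)
    return float(np.degrees(2.0 * half_x))


def edge_inset_pixels(assumed_fov_x_deg, actual_fov_x_deg, width):
    """How far inside the L/R screen edge a ray at the assumed half-FOV lands."""
    assumed = np.tan(np.radians(assumed_fov_x_deg) / 2.0)
    actual = np.tan(np.radians(actual_fov_x_deg) / 2.0)
    return float((width / 2.0) * (1.0 - assumed / actual))


resolution = (1280, 660)
fov_x = horizontal_fov_deg(45.0, resolution)
inset = edge_inset_pixels(60.0, fov_x, resolution[0])
```

For a 1280x660 window and a 45 degree vertical FOV this gives a horizontal FOV of about 77.5 degrees instead of the hardcoded 60, and puts the "edge" rays about 180 pixels in from the left and right sides, which is close to the ~175 pixels I measured.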

`scene.camera_rays()` is inaccurate, especially near the left/right edges of the scene.