Camera#

If a mutable has an attribute type of camera, it’s a camera. Typically, a simple pinhole camera model is used.

Required attributes of a camera:

Name

Type

camera_parameters

dict

camera_parameters is a dictionary with the following six required keys:

Name

Type

screen_width

int

screen_height

int

focal_length

numeric

horizontal_aperture

numeric

near_clip

numeric

far_clip

numeric

Pinhole Model#

3D objects are projected onto a 2D plane, like this:

../../_images/camera-1.png

Looking from a top view, towards the negative Y-axis:

../../_images/camera-2.png

In the picture, f, which is the distance from the camera to the projection plane, is focal_length. hA is the distance from the left edge (upper end) to the right edge (lower end) and stands for horizontal_aperture.

near_clip and far_clip define two planes perpendicular to the line of vision between which you can see things.

Assumptions#

The following assumptions are made in frame of reference and conversion from/to other common representations.

If no transform operator is applied on the camera, the Y-axis points upwards and X-axis points to the right. The camera looks towards negative direction of the Z-axis.

There are many standards out there for the pinhole camera. To avoid confusion, we define a value called the \(pinholeRatio\) such that \(pinholeRatio=\frac{2∗focalLength∗aspectRatio}{horizontalAperture}\)

If you are familiar with OpenGL/Direct3D, \(fovy\) is used more often.

\[pinholeRatio=\frac{1}{\tan(fovy/2)}\]

If you are familiar with the intrinsic matrix or the Objectron standard, \(fx, fy, cx, cy\) are used.

\[\begin{split}fx = \frac{screenWidth \cdot pinholeRatio}{2 \cdot aspectRatio}\\ fy = \frac{screenHeight \cdot pinholeRatio}{2}\\ cx = screenWidth / 2\\ cy = screenHeight / 2\end{split}\]

Here, \(aspectRatio = screenWidth/screenHeight\).