Technical Paper

Keywords Camera Tracking // Open Source Tools // Unreal Engine // Optic Flow


Camera Tracking Systems and their Democratisation

Centre for Creative Arts and Technologies (C-CATS), University of Surrey
SMPTE Motion Imaging Journal, January/February 2026. DOI: 10.5594/JMI.2026/9GPF670

Summary: This work analyses camera tracking systems used in virtual production, building an open-source OpenCV tracker and comparing it against industry-standard tools (Blender and After Effects). By varying video noise levels and resolutions, the paper evaluates performance, workflow, and the democratisation of tracking technology for the general public.

  • Builds a markerless OpenCV camera tracker to compete with industry-standard tools.
  • Studies the impact of Gaussian noise and resolution on tracking performance.
  • Assesses accessibility and workflow in Unreal Engine for qualitative evaluation.

Abstract

Computer vision encompasses the analysis, processing, and interpretation of visual data. Tracking is a subset of this field, where systems recognise salient features in a scene to determine their displacement across frames in a video stream. This facilitates automation, increases efficiency, and broadens the functionality of these systems in applications such as surveillance, medicine, and entertainment.

Recent interest in Virtual Reality (VR) and Augmented Reality (AR) has prompted the development of new camera tracking techniques. Many tools are available to analyse and process tracking information, but most are proprietary and not accessible to the general public, limiting democratisation. This paper compares three camera tracking systems: two industry-standard tools (Blender and After Effects) and a tracker built using OpenCV’s open-source tools. By examining tracking accuracy under different video resolutions and Gaussian noise levels, and by importing tracks into Unreal Engine for qualitative assessment, the study evaluates workflow, optimisation, and democratisation of camera tracking technology.

Introduction

Computer vision systems trained to analyse digital images and video enhance workflow automation and efficiency. Reliable tracking of objects or salient features across frames is fundamental to surveillance, security, medicine, and human–computer interaction. Tracking techniques estimate the camera’s geometry and pose in a scene and are central to VR and AR applications, as well as virtual production.

Commercial camera tracking tools increasingly support markerless workflows, yet the cost and proprietary nature of these tools limit their accessibility. Open-source libraries, such as OpenCV, provide building blocks for bespoke tracking systems but require technical expertise. In this context, it is important to understand how far low-cost or free systems can compete with professional solutions and what constraints noise, resolution, and workflow impose on real-world usage.

The aim of this research is to understand the workflow, optimisation, and democratisation of common camera tracking systems by:

  • Creating an OpenCV-based camera tracker that can compete with industry-standard tools.
  • Investigating how different video resolutions and Gaussian noise levels affect tracking accuracy.
  • Evaluating qualitative performance in a virtual environment built in Unreal Engine.

Tracking Workflow Overview

The camera tracking workflow begins with detection of salient features, followed by execution of the tracking algorithm, extraction of translation and rotation vectors, and integration of motion over time to obtain camera coordinates and Euler angles.

Overview of a tracking system

Figure 1. Overview of a tracking system. Salient features are detected and tracked across frames, with extracted motion integrated over time to estimate camera movement.

Methodology

1. Building an OpenCV Camera Tracker

The OpenCV-based tracker detects salient features using SIFT and tracks their motion across the video via Lucas–Kanade optic flow. From the resulting motion vectors, the essential matrix, together with the camera intrinsic matrix, is used to estimate camera pose, with translation and rotation integrated over time to obtain trajectories in $$x$$, $$y$$, and $$z$$. Euler angles describing roll, pitch, and yaw are extracted for each frame, forming a camera path that can be imported into Unreal Engine.
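
A minimal sketch of this pipeline is given below. The intrinsic matrix, input file name, and RANSAC settings are illustrative assumptions rather than the paper's calibration; feature re-detection when tracks are lost is omitted, and monocular translation is recovered only up to scale.

import cv2
import numpy as np

# Assumed intrinsics for 1080p footage (focal length and principal
# point are placeholders, not the paper's calibration).
K = np.array([[1000.0,    0.0, 960.0],
              [   0.0, 1000.0, 540.0],
              [   0.0,    0.0,   1.0]])

cap = cv2.VideoCapture("walk.mp4")  # hypothetical input file
ok, frame = cap.read()
prev_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# Detect salient features with SIFT and keep their 2D locations.
sift = cv2.SIFT_create()
keypoints = sift.detect(prev_gray, None)
pts = np.float32([kp.pt for kp in keypoints]).reshape(-1, 1, 2)

R_acc, t_acc = np.eye(3), np.zeros((3, 1))  # accumulated rotation / translation
path = []                                   # per-frame position and Euler angles
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Track features into the new frame with pyramidal Lucas-Kanade optic flow.
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
    good_old = pts[status.ravel() == 1]
    good_new = nxt[status.ravel() == 1]

    # Estimate relative pose from the essential matrix (translation up to scale).
    E, inliers = cv2.findEssentialMat(good_new, good_old, K, method=cv2.RANSAC)
    _, R, t, _ = cv2.recoverPose(E, good_new, good_old, K, mask=inliers)

    # Integrate motion over time to build the camera trajectory.
    t_acc = t_acc + R_acc @ t
    R_acc = R @ R_acc

    # Euler angles (roll, pitch, yaw) from the accumulated rotation matrix.
    sy = np.hypot(R_acc[0, 0], R_acc[1, 0])
    roll = np.degrees(np.arctan2(R_acc[2, 1], R_acc[2, 2]))
    pitch = np.degrees(np.arctan2(-R_acc[2, 0], sy))
    yaw = np.degrees(np.arctan2(R_acc[1, 0], R_acc[0, 0]))
    path.append((t_acc.ravel().copy(), (roll, pitch, yaw)))

    prev_gray, pts = gray, good_new.reshape(-1, 1, 2)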

2. Experimental Video Dataset

Five 10‑second videos of a hand‑held walk along a path were recorded at 4K resolution (3840 × 2160). Each video was then versioned by adding Gaussian noise to 25 %, 50 %, and 75 % of its pixels and by encoding at multiple resolutions (1080p, 720p, 480p). These versions allowed a systematic study of how noise and resolution influence tracking performance for each system.
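
As an illustration, such versions could be generated with OpenCV as below. Interpreting the noise percentage as the fraction of pixels corrupted, and the choice of noise standard deviation, are assumptions here rather than the paper's stated parameters.

import cv2
import numpy as np

def add_gaussian_noise(frame, pixel_fraction, sigma=25.0):
    """Corrupt a given fraction of pixels with zero-mean Gaussian noise."""
    noisy = frame.astype(np.float32)
    mask = np.random.rand(*frame.shape[:2]) < pixel_fraction
    noise = np.random.normal(0.0, sigma, frame.shape).astype(np.float32)
    noisy[mask] += noise[mask]
    return np.clip(noisy, 0, 255).astype(np.uint8)

def downscale(frame, height):
    """Resize a frame to the target height, preserving aspect ratio."""
    scale = height / frame.shape[0]
    return cv2.resize(frame, None, fx=scale, fy=scale,
                      interpolation=cv2.INTER_AREA)

# Example: corrupt half the pixels, then downscale to 720p.
# version = downscale(add_gaussian_noise(frame, 0.50), 720)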

3. Comparative Tracking Systems

In addition to the OpenCV tracker, the study uses:

  • Blender – open-source 3D software with camera tracking tools based on feature detection and optic flow.
  • Adobe After Effects – proprietary compositing and tracking tool widely used in industry.

All systems estimate camera motion, and the resulting tracks are evaluated using quantitative metrics (mean error per frame and standard deviation) and qualitative inspection in Unreal Engine.

4. Integration into Unreal Engine

Camera paths extracted from each tracking system are transformed to Unreal Engine’s left‑handed coordinate system. A Python API is used to create camera actors and import coordinates as keyframes. Virtual scenes are constructed to match the physical environment, enabling side‑by‑side comparison between video footage and virtual renders.
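
A minimal sketch of this step is shown below, assuming the tracker outputs a right-handed, Y-up trajectory in metres (Unreal Engine uses a left-handed, Z-up frame in centimetres). The axis mapping and the spawned actor class are illustrative choices, and keyframing via the Sequencer API is omitted for brevity.

import unreal

def to_unreal(x, y, z):
    """Map a tracker position in metres to Unreal's coordinate frame (cm)."""
    # Assumed mapping: tracker +z (forward) -> Unreal +X, +x (right) -> +Y,
    # +y (up) -> +Z; this must be verified against the actual footage.
    return unreal.Vector(z * 100.0, x * 100.0, y * 100.0)

# Spawn a CineCamera actor at an illustrative start position
# (editor scripting; requires the Editor Scripting Utilities plugin).
camera = unreal.EditorLevelLibrary.spawn_actor_from_class(
    unreal.CineCameraActor,
    to_unreal(0.0, 1.6, 0.0),
    unreal.Rotator(0.0, 0.0, 0.0))  # roll, pitch, yaw in degrees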

Dataset and Experimental Setup

The paper illustrates salient feature detection in outdoor scenes, versions of the dataset with varying Gaussian noise and resolution, and examples of frames used for qualitative comparison.

Salient features detected using OpenCV

Figure 2. Salient features detected using OpenCV’s SIFT feature detector, marked with red crosses.

Overview of the experimental workflow

Figure 3. Overview of the experimental workflow. Original videos are versioned with added Gaussian noise and different resolutions, tracked using Blender, After Effects, and the OpenCV tracker, and then imported into Unreal Engine for qualitative analyses.

Frames from one of the videos in the dataset

Figure 4. Frames from one of the dataset videos showing the hand‑held walk along a path used in the experiments.

Quantitative Results

Tracking performance is quantified using the mean error per frame and standard deviation of the tracked camera position relative to a reference track. Two main factors are investigated: Gaussian noise percentage and video resolution.
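
One plausible formulation of these metrics, assuming a per-frame error $$e_t$$ in pixels between the tracked and reference positions over $$N$$ frames, is:

$$e_t = \lVert \hat{\mathbf{p}}_t - \mathbf{p}_t \rVert_2, \qquad \bar{e} = \frac{1}{N} \sum_{t=1}^{N} e_t, \qquad \sigma = \sqrt{\frac{1}{N} \sum_{t=1}^{N} \left( e_t - \bar{e} \right)^2}$$

where $$\hat{\mathbf{p}}_t$$ is the tracked position and $$\mathbf{p}_t$$ the reference position at frame $$t$$.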

Impact of Noise on Tracking Performance

Tracker       | % Gaussian Noise Added to Video | Average Mean Error (pixels) Across Dataset | Average Standard Deviation (pixels) Across Dataset
OpenCV        | 25 / 50 / 75                    | Populate from Table 1                      | Populate from Table 1
Blender       | 25 / 50 / 75                    | Populate from Table 1                      | Populate from Table 1
After Effects | 25 / 50 / 75                    | Populate from Table 1                      | Populate from Table 1

Table 1. Average impact of noise on tracking performance for OpenCV, Blender, and After Effects.

Impact of Resolution on Tracking Performance

Tracker       | Compared Video Resolution | Average Mean Error (pixels) Across Dataset | Average Standard Deviation (pixels) Across Dataset
OpenCV        | 480p / 720p / 1080p       | Populate from Table 2                      | Populate from Table 2
Blender       | 480p / 720p / 1080p       | Populate from Table 2                      | Populate from Table 2
After Effects | 480p / 720p / 1080p       | Populate from Table 2                      | Populate from Table 2

Table 2. Average impact of resolution changes on tracking performance across the dataset.

Qualitative Evaluation in Unreal Engine

To assess the perceptual impact of noise and resolution, camera paths from each system are imported into Unreal Engine. Virtual scenes reproduce key elements of the physical environment, enabling visual comparison between the original footage and rendered camera moves.

Example comparison of source and virtual scene

Figure 9. Example comparison between a source video (right) and a virtual scene (left) with matching elements in Unreal Engine, used to assess tracking performance qualitatively.

Discussion and Conclusion

The experiments show that all three tracking systems are sensitive to noise and resolution, but the degree and pattern of degradation differ. The OpenCV tracker generally achieves the lowest mean error per frame across noise levels, suggesting robustness to Gaussian noise. However, it is more affected by lower resolutions than Blender or After Effects, which maintain more consistent error trends as resolution decreases.

Qualitative inspection in Unreal Engine indicates that noise tends to produce jittery yet recognisable camera motion, while low resolution can cause more severe drift away from the original trajectory. At extreme combinations of noise and low resolution, tracks become unsuitable for precise virtual production work.

From a democratisation perspective, the open-source OpenCV tracker demonstrates that low-cost solutions can achieve performance comparable to, and sometimes surpassing, commercial tools, particularly when videos are recorded at sufficiently high resolution. Nevertheless, the complexity of configuration and the need to manage coordinate systems and data import pipelines in Unreal Engine remain barriers for non-expert users.

Future work could explore real-time tracking approaches, alternative open-source libraries, and improved tooling to simplify the integration of low-cost trackers into virtual production workflows, further expanding access to camera tracking technology.


BibTeX

@article{MunozLopez2026CameraTracking,
  author  = {Irene Mu{\~n}oz L{\'o}pez and Andrew Gilbert},
  title   = {Camera Tracking Systems and their Democratisation},
  journal = {SMPTE Motion Imaging Journal},
  year    = {2026},
  volume  = {26},
  number  = {1},
  month   = {January/February},
  doi     = {10.5594/JMI.2026/9GPF670}
}