This copies the video data from the Matroska document into the Track
structure that is exposed to outside users. Because Track can also
represent other media types, the video data is stored in a way that
lets Track hold metadata for those other types once they are needed.
This is needed for LibWeb's HTMLMediaElement implementation.
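
As a rough illustration of the layout described above, here is a minimal
sketch in standard C++. Every name in it (TrackType, VideoData,
MatroskaVideoTrack, make_video_track) is an assumption made for the
example and not the actual LibVideo API. It also uses std::variant only
to keep the sketch self-contained; the real code may store the data
differently.

```cpp
#include <cstdint>
#include <variant>

// Hypothetical sketch: names and layout are assumptions, not the real LibVideo API.
enum class TrackType { Video, Audio, Subtitles };

// Video-specific metadata copied out of the container.
struct VideoData {
    uint64_t pixel_width { 0 };
    uint64_t pixel_height { 0 };
};

class Track {
public:
    Track(TrackType type, uint64_t identifier)
        : m_type(type)
        , m_identifier(identifier)
    {
    }

    TrackType type() const { return m_type; }
    uint64_t identifier() const { return m_identifier; }

    void set_video_data(VideoData data) { m_data = data; }

    // Returns nullptr when the track carries no video metadata.
    VideoData const* video_data() const { return std::get_if<VideoData>(&m_data); }

private:
    TrackType m_type;
    uint64_t m_identifier;
    // std::monostate means "no type-specific metadata". Further alternatives
    // (for example audio metadata) could be added to the variant later.
    std::variant<std::monostate, VideoData> m_data;
};

// Hypothetical stand-in for the video element parsed from a Matroska document.
struct MatroskaVideoTrack {
    uint64_t pixel_width;
    uint64_t pixel_height;
};

// Copies the container-level video fields into the user-facing Track.
Track make_video_track(uint64_t track_number, MatroskaVideoTrack const& video)
{
    Track track(TrackType::Video, track_number);
    track.set_video_data({ video.pixel_width, video.pixel_height });
    return track;
}
```

With this shape, a consumer such as a media element would call
video_data() and check for nullptr before reading the dimensions, so
tracks of other types need no special handling.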