Retiming (General Guide)
See also the current SM64 Retiming Guide, which has similar guidelines to these and some screenshots.
Downloadable retime lab template.
Downloading
We follow a specific procedure when downloading videos to ensure different people get the exact same video, and hence consistent retime results. We use yt-dlp with ffmpeg. These can be set up per the SM64 guide, or by extracting the executables into program folders and adding those folders to the PATH system variable (if you know what that means).
For the script file (called video.cmd in the SM64 guide), the contents should be the following. This file is called _dl.cmd in the downloadable template above.
yt-dlp -a _list.txt --fragment-retries infinite --download-archive _index.txt
cmd /k
You should make the _list.txt file in the same folder as the .cmd file. This file is the input list of videos to download. yt-dlp creates the _index.txt download archive file as it downloads videos, listing the IDs (e.g. YouTube keys) of each source video; this file is scanned when _dl.cmd is run to prevent yt-dlp from re-downloading videos. Thus, you can run yt-dlp without worrying about duplicates in the input list, which is useful when backing up current leaderboard videos for example.
The file _dl-simulation.cmd has the added parameters -s --force-write-archive, which tell yt-dlp to run and build a download archive without actually downloading any videos. This is simulation mode.
The console output of the downloading process can have useful information for investigating problems, e.g. the format downloaded from YouTube. Usually, YouTube downloads will end with an ffmpeg merge; a lack of this may indicate ffmpeg is missing, which you can test by typing ffmpeg into the console and hitting enter. You may optionally save the console output in a _dl-log.txt file.
Underscores at the start of non-video file names just help them to appear at the top of the folder with name sorting.
Command documentation
- -a _list.txt specifies the input list.
- --fragment-retries infinite forces yt-dlp to download the entire video or fail if chunks are unavailable (default behaviour is outputting a video file with chunks missing).
- --download-archive _index.txt specifies the download archive.
- -s enables simulation mode.
- --force-write-archive enables the download archive in simulation mode.
- cmd /k keeps the console open after the yt-dlp command finishes.
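To illustrate what the download archive does, here is a minimal Python sketch of the duplicate-skipping idea. It assumes the archive holds one "extractor id" pair per line (e.g. youtube followed by the video key), which is how yt-dlp writes it. The function names are hypothetical, and the URL handling is a deliberate simplification that only understands watch?v= links; yt-dlp does the real matching itself.

```python
from urllib.parse import urlparse, parse_qs

def read_archive_ids(archive_text):
    """Collect the video IDs from archive lines of the form '<extractor> <id>'."""
    ids = set()
    for line in archive_text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            ids.add(parts[1])
    return ids

def pending_urls(list_text, archive_text):
    """Return input URLs not yet archived, also skipping repeats within the list."""
    seen = read_archive_ids(archive_text)
    pending = []
    for url in list_text.split():
        # Hypothetical simplification: only '...watch?v=<id>' URLs are understood;
        # anything else is ignored by this sketch.
        video_id = parse_qs(urlparse(url).query).get("v", [None])[0]
        if video_id is None or video_id in seen:
            continue
        seen.add(video_id)
        pending.append(url)
    return pending
```

With an archive that already lists one video, and an input list that names that video once and a second video twice, only one URL for the second video comes back.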
See the yt-dlp GitHub readme for full documentation.
Retiming | Comparison of Methods
These two methods are believed to be accurate and stable (meaning they give the exact same results on the same input file).
- ffmpeg (frame dump) is the most transparent and accurate method. Its accuracy comes from it outputting raw timestamps directly from the original video (at least, that’s the intent of the command). It also produces a montage of reference frames from the video that can be used to verify retime results and catch human error. It’s a bit unintuitive to use vs other programs and requires downloaded video, so is recommended for permanent whole retimes of important leaderboard runs.
- avidemux also derives timestamps directly from the original (downloaded) video, but does some processing of the video start offset that can introduce rounding error of 0.002 at most. It’s for practical purposes as accurate as ffmpeg, and while it doesn’t produce documentation like ffmpeg’s frame montages, it does allow for multi-segment spreadsheet retimes, and simpler undocumented whole-run retimes.
All other methods I’ve seen do not derive original timestamps, so should not be used in accurate contexts.
- ytd (YouTube debug info) requires no software or video download, but only approximates timestamps, and so is unstable (variation of 0.02 is possible between runs on the same frame of the same video).
- VirtualDub used to be recommended, but it makes constant frame-rate assumptions that can cause instability.
Retiming | avidemux
This method of retiming should be used in most situations. It requires videos to be downloaded and AviDemux 2.6+ to be installed. The SM64 guide shows how to do it for retiming entire runs (start to finish).
For spreadsheet-based retimes, you should instead copy the timestamps from the bottom-left textbox (next to “Time”). Take care to highlight the whole timestamp (3dp) each time.
Spreadsheet cells can just have times pasted into them, but the cells must be formatted to display time. Common time formats (note that negative durations show up as 59.999 etc.):
- h:mm:ss.000 for full absolute timestamps (with leading zeros);
- [<0.00069444]s.000;[>=0.00069444]m:ss.000 for sub-hour times (no leading zeros if the time is under a minute);
- s.000 for sub-minute times, usually durations.
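The 0.00069444 threshold in the sub-hour format is one minute expressed as a fraction of a day. The Python sketch below shows that arithmetic, assuming the spreadsheet stores times as day fractions (as Excel and Google Sheets do). The function names are made up for illustration.

```python
from fractions import Fraction

SECONDS_PER_DAY = 24 * 60 * 60

def to_day_fraction(timestamp):
    """Convert 'h:mm:ss.000', 'm:ss.000' or 's.000' text to a fraction of a day."""
    seconds = Fraction(0)
    for part in timestamp.split(":"):
        seconds = seconds * 60 + Fraction(part)
    return seconds / SECONDS_PER_DAY

def sub_hour_format(day_fraction):
    """Mimic which branch of the conditional number format would apply."""
    return "s.000" if day_fraction < Fraction("0.00069444") else "m:ss.000"

one_minute = to_day_fraction("0:01:00.000")
print(one_minute)         # 1/1440
print(float(one_minute))  # roughly 0.000694
```

So a time of 59.999 falls under the threshold and displays as s.000, while 1:00.000 sits at or above it and displays as m:ss.000.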
Retiming | ffmpeg
This method should be used for permanent retimes of entire runs. It requires videos to be downloaded and ffmpeg 6+ to be installed. The retime lab template contains a set of scripts for this method, in the retime folder. Start by placing the subject video in the retime folder, and set up ffmpeg, either per the SM64 guide, or by extracting the executables into program folders and putting those in the PATH system variable (if you know what that means).
Instructions:
- Look thru the video to identify roughly where the start and end frames are.
- Run _dump.ps1 (right click > Run with PowerShell), and input a start and end timestamp to specify 1.5s windows (starting at the given times) that catch the start and end frames; these windows will be dumped into images of frames.
- Sift thru the images (with File Explorer preview pane), and delete all but the start/end reference frames and the frames directly before those two.
- Run _montage.ps1, which will tile those 4 images into a “manifest file”. The output filename has the format <vID>.<hash>, where vID is the ID on the video hosting platform (used by the accurate leaderboard as a primary key for retime records) and hash is the MD5 hash of the combined AV tracks in the video file (logged so that anyone can check ey’s downloaded and retimed the exact same video as you).
- This “manifest file” is read as follows:
  - Check the left frames don’t match the retiming reference frames and the right frames do;
  - Check the integers in the printed text, to the left of the |, increase by 1 from left to right – these are the sequence numbers of those frames in their respective dumps, so this confirms no frame is missing;
  - Log the timestamps in the printed text, to the right of the |, for the two frames on the right; the format is a 6dp decimal number.
- Upload and link the manifest in your spreadsheet, and input the logged timestamps. On the accurate leaderboard, the pastable formulas automatically extract the vID and hash, and the vID can then be used to Ctrl+F the player/date/intro-skip fields from the leaderboard. The leaderboard itself looks up retime data by vID and auto-updates an entry if its “deleted” field is set to “r”. Non-SRC run entries must be manually updated.
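The manifest checks can be sketched in Python as below. This is only an illustration of the reading procedure: the printed labels are typed in by hand here, their values are made up, and the final subtraction stands in for whatever the spreadsheet does with the two logged timestamps.

```python
from decimal import Decimal

def parse_label(label):
    """Split printed text like '12 | 4.316667' into (sequence number, timestamp)."""
    n_text, pts_text = label.split("|")
    return int(n_text), Decimal(pts_text.strip())

def read_pair(left_label, right_label):
    """Check two frames are consecutive, then return the right frame's timestamp."""
    left_n, _ = parse_label(left_label)
    right_n, right_pts = parse_label(right_label)
    if right_n != left_n + 1:
        raise ValueError("sequence numbers are not consecutive; a frame may be missing")
    return right_pts

# Made-up labels for the top row (start) and bottom row (end) of a manifest.
start = read_pair("12 | 4.316667", "13 | 4.333333")
end = read_pair("40 | 95.783333", "41 | 95.800000")
print(end - start)  # 91.466667
```

Decimal is used so the 6dp values are subtracted exactly as printed, without binary floating-point noise.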
Demo:
These demos were made on older versions but work similarly: full, fast.
Scripts + documentation:
Script (_dump.ps1)
Write-Host "FFMPEG Frame Dump + Timestamp" -ForegroundColor Black -BackgroundColor Green
$video = Get-ChildItem . | where {$_.extension -in (".mp4",".mkv",".webm",".mov")}
If (!$video -or $video[1]) {
    Write-Host "Error | There must be exactly 1 video file in this folder." -ForegroundColor Red
    Read-Host -Prompt "(press enter to exit)"; Return
}
Write-Host ("Video file: $($video.Name)") -ForegroundColor Yellow
Write-Host "Will dump 1.5s segments starting from the following timestamps:" -ForegroundColor Yellow
$ss1 = Read-Host -Prompt "Input start timestamp"
$ss2 = Read-Host -Prompt "Input end timestamp"
Write-Host "Dumping frames...`n" -ForegroundColor Yellow
$drawTextParams = "fontfile=_LeelawUI.ttf: fontcolor=yellow: fontsize=36: x=3: y=3: text='%{n} | %{pts}'"
ffmpeg -ss $ss1 -t 1.5 -i $video.Name -vf drawtext=$drawTextParams -copyts -fps_mode passthrough -enc_time_base 0.001 -frame_pts 1 %08d.png
ffmpeg -ss $ss2 -t 1.5 -i $video.Name -vf drawtext=$drawTextParams -copyts -fps_mode passthrough -enc_time_base 0.001 -frame_pts 1 %08d.png
Write-Host "Frames dumped.`n" -ForegroundColor Green
Read-Host -Prompt "(press enter to exit)"
Script (_montage.ps1)
Write-Host "Hash + Montage" -ForegroundColor Black -BackgroundColor Green
$imgs = Get-ChildItem . -Filter *.png
If (!$imgs -or !$imgs[3] -or $imgs[4]) {
    Write-Host "Error | There must be exactly 4 PNG files in this folder." -ForegroundColor Red
    Read-Host -Prompt "(press enter to exit)"; Return
}
$video = Get-ChildItem . | where {$_.extension -in (".mp4",".mkv",".webm",".mov")}
If (!$video -or $video[1]) {
    Write-Host "Error | There must be exactly 1 video file in this folder." -ForegroundColor Red
    Read-Host -Prompt "(press enter to exit)"; Return
}
Write-Host ("Video file: $($video.Name)") -ForegroundColor Yellow
Write-Host "Calculating hash...`n" -ForegroundColor Yellow
$vID = [regex]::match($video.Name, '\[([^\]]+)\]').Groups[1].Value
$vID = $vID -replace '^v(\d+)$','$1' # remove preceding "v" from twitch decimal video id
$hash = ffmpeg -i $video.Name -map 0 -c copy -f md5 -v error -
$outName = "$($vID).$($hash.split('=')[1])"
ffmpeg -i $imgs[0].Name -i $imgs[1].Name -i $imgs[2].Name -i $imgs[3].Name -lavfi "xstack=inputs=4:layout=0_0|w0_0|0_h0|w0_h0" -update 1 "$($outName).png"
Write-Host "Created montage of first 4 PNGs in this folder." -ForegroundColor Green
Write-Host "Output file: $($outName)" -ForegroundColor Green
Read-Host -Prompt "(press enter to exit)"
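For readers less familiar with PowerShell, the naming step of the montage script can be restated as a short Python sketch. It assumes the video filename carries the ID in square brackets (as in yt-dlp's default output names) and that ffmpeg's md5 output is a single MD5=<hex> line; the function name and the sample values are hypothetical.

```python
import re

def manifest_name(video_filename, md5_output):
    """Build '<vID>.<hash>' from a '[<id>]' filename and an 'MD5=<hex>' line."""
    match = re.search(r"\[([^\]]+)\]", video_filename)
    v_id = match.group(1) if match else ""
    # Drop the leading "v" from a Twitch decimal video ID, as the script does.
    v_id = re.sub(r"^v(\d+)$", r"\1", v_id)
    digest = md5_output.strip().split("=")[1]
    return f"{v_id}.{digest}"

print(manifest_name("Example run [dQw4w9WgXcQ].mp4",
                    "MD5=0123456789abcdef0123456789abcdef"))
# dQw4w9WgXcQ.0123456789abcdef0123456789abcdef
```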
Command documentation (hash)
- -i: input file;
- -map 0: selects all streams in the file (i.e. doesn’t drop any streams);
- -c copy: copies the streams without decoding them (decoding would be slow);
- -f md5: output format: md5 hash;
- -v error: suppresses all but output and errors;
- - (a lone hyphen): outputs result to command window.
Command documentation (dump)
- -ss before -i: seeks to timestamp of input file;
- -t: duration of video to process;
- -i: input file;
- -vf drawtext: process video with text addition filter (parameters listed under $drawTextParams, but it basically imprints sequence numbers and timestamps);
- -copyts: copies timestamps verbatim from original video;
- -fps_mode passthrough: each frame is processed, along with its timestamp;
- -enc_time_base 0.001: timestamps are rescaled by 1/0.001 = 1000 (i.e. to milliseconds);
- -frame_pts 1: timestamps are printed in the filename;
- %08d.png: selects image encoder and filename (8-digit integer including leading zeros).
Derived from here. See also the ffmpeg documentation for more details.
Accuracy theory
The FFMPEG method extracts original timestamps from the input video. These are stored, in modern containers, per-frame as a pts field, the timestamp at which the frame is meant to be presented to the video viewer. Each pts is an integer coefficient of a rational timebase (in seconds, constant for a given video). The timebase is the reciprocal of an integer timescale (in Hz), denoted tbn by FFMPEG: the frequency of the grid that the video snaps frames to, typically something like 1000 or 90000. The timestamps we want are thus pts_time = pts × timebase = pts / tbn.
There are command-line parameters that tell FFMPEG to pass thru unmodified timestamps per each source frame, and the source code shows it stores pts and the timebase as integers/rational numbers. The drawtext video filter, which imprints pts on the dumped frames, converts both to doubles, multiplies them, then outputs them in (C99-compliant iirc) printf %.6f format. This (microseconds) is FFMPEG’s default format, used because it’s higher-precision than all common timebases, the finest of which is 1/90000 seconds.
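As a worked example of the pts arithmetic, the Python sketch below computes the exact rational timestamp and an approximation of the printed 6dp text. The pts values are made up, and the second function only imitates the double multiplication and %.6f formatting described above rather than reproducing FFMPEG's code.

```python
from fractions import Fraction

def pts_time(pts, tbn):
    """Exact presentation time in seconds: pts x timebase, where timebase = 1/tbn."""
    return pts * Fraction(1, tbn)

def printed(pts, tbn):
    """Imitate the imprinted text: multiply as doubles, then format as %.6f."""
    return "%.6f" % (float(pts) * (1.0 / tbn))

print(pts_time(7510500, 90000))  # 1669/20, i.e. exactly 83.45 s
print(printed(7510500, 90000))   # 83.450000
print(printed(4333, 1000))       # 4.333000

# One tick of a 90000 Hz grid (about 11.1 microseconds) is coarser than the
# 1 microsecond step of the 6dp output, so adjacent ticks stay distinguishable.
print(Fraction(1, 90000) > Fraction(1, 10**6))  # True
```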
Unfortunately, the -frame_pts 1 option, which puts pts in the filename, is a hack that replaces what would usually be int32 sequence numbers of dumped frames with an int32-encoded pts (scaled to milliseconds via -enc_time_base 0.001). This cannot be used with microseconds, else the integers overflow for timestamps greater than 1h11m35s (assuming the integers are unsigned, half that otherwise).
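The overflow limits follow from simple arithmetic, sketched here in Python. The helper names are made up; the only inputs are the integer width and the number of timestamp units per second.

```python
def overflow_limit_seconds(bits, units_per_second):
    """Timestamp (in seconds) at which an integer with this many value bits wraps."""
    return 2 ** bits / units_per_second

def as_hms(seconds):
    """Format a number of seconds as h/m/s, truncating the fractional part."""
    whole = int(seconds)
    return f"{whole // 3600}h{whole % 3600 // 60:02d}m{whole % 60:02d}s"

print(as_hms(overflow_limit_seconds(32, 1_000_000)))  # 1h11m34s (unsigned, microseconds)
print(as_hms(overflow_limit_seconds(31, 1_000_000)))  # 0h35m47s (signed, microseconds)
print(as_hms(overflow_limit_seconds(31, 1_000)))      # 596h31m23s (signed, milliseconds)
```

The unsigned microsecond limit is 4294.967296 s, which truncates to 1h11m34s and rounds to the 1h11m35s quoted above. In milliseconds, even a signed 32-bit integer lasts for hundreds of hours, which is why the millisecond scaling is safe for filenames.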
So the most accurate available retiming output timestamps are the numbers visually printed on the manifests. If the overflow bug were fixed (or rounding were done to 3dp rather than 6dp) then the outputs could be encoded into the manifest filenames and read in automatically by spreadsheets.
Retiming | ytd
This method (YouTube Debug Data) should only be used for ruff retimes where accuracy is not critical. It doesn’t require videos to be downloaded, but only works for YouTube videos.
Navigate to the start and end reference frames (via frame advance, using the , and . keys), then right-click and select Copy Debug Info. Select the cmt field (around 5th in the list), and delete the rest of the data. The 3dp cmt number is the timestamp of that frame. This method is unstable, with variation of up to 0.05 typically between different attempts at retiming the same frame in the same video.
Retiming | Technique Notes
There are a few practices I recommend to deal with ambiguous situations. We are normally trying to identify exact reference frames, but may need to count from other frames if the reference frame is not visible or is duplicated for any reason. So the below will talk about matching, and counting forwards and backwards. Target frames are the frames we log: for in-transitions the target frame is the reference frame itself, and for out-transitions it is the frame after the reference frame.
Whenever a reference frame is dropped in a capture, make a note over the relevant cell explaining the decision-making behind the time you chose. In very rare instances, it might be necessary to deviate from the following guidelines.
- 59.94/60 fps: to avoid bias, the convention is to treat the first of the two copies of each frame as the only matching frame. So always log the first one, and always count forwards/backwards from the first one. Always count unique frames, so the first of each pair.
- Duplicated frames: if a frame is duplicated for reasons other than double frame-rate, then the first copy is still matched, and the first copy is counted backwards from, but the last copy is counted forwards from. That’s because the closer copy is a more accurate estimate of what the frame was at that (closer) timestamp.
- Invisible frames: if a reference frame is not visible in (almost) every instance of that type of reference frame (e.g. fadein/fadeout, circle-out with very dark background), then the closest identifiable frame is matched and counted forwards/backwards from.
- Dropped frames: if a reference frame is dropped, but is otherwise visible in (almost) every instance of that type of reference frame (e.g. pinhole, sliver), then the earliest frame matching or following the target frame is always used. In practice, this means the first visible frame of an in-transition, and the first blackout/whiteout of an out-transition, is always used. This upholds the principle of always identifying the first frame something happens.
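The 59.94/60 fps convention (log and count only the first copy of each frame) can be sketched in Python as follows. This is an illustration only: the content field is a hypothetical stand-in for whatever makes two captured frames look identical, the timestamps are made up, and the sketch does not model the separate forward-counting rule for frames duplicated for other reasons.

```python
from itertools import groupby

def unique_frames(frames):
    """Collapse runs of identical consecutive frames, keeping the first copy of each.

    `frames` is a list of (timestamp, content) pairs in capture order.
    """
    return [next(run) for _, run in groupby(frames, key=lambda frame: frame[1])]

# A 30 fps game captured at 60 fps: each game frame appears twice.
capture = [(0.000, "A"), (0.017, "A"), (0.033, "B"), (0.050, "B"), (0.067, "C")]
firsts = unique_frames(capture)
print(firsts)        # [(0.0, 'A'), (0.033, 'B'), (0.067, 'C')]
print(firsts[0 + 2]) # counting 2 unique frames forwards from A lands on (0.067, 'C')
```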