The ffmpeg manifesto
Creating a blurred overlay
2023 Oct 29
We believe video editing software should be free and open without freemium limits on resolution.
We believe GUIs are bloatware which do not lend themselves to reusable scriptable workflows.
We believe CLIs are the ideal type of interface.
Just kidding, I’m not going to write a whole blog post in this style.
The gargantuan one-liner
Let’s say you want to take a widescreen video in 16:9
aspect ratio and post it
somewhere like TikTok or Reels. The only problem is that
those are mobile-friendly vertical video platforms, so a widescreen video isn’t
ideal. Instead of a short fat widescreen video, you need a tall skinny vertical
video.
One common hack is to overlay the original widescreen video centered overtop of
a blurred and cropped background in the correct vertical aspect ratio, 9:16
instead of 16:9
. Here’s what that looks like with some content from Fallout
4:
Figure 1a: 16:9 input
Figure 1b: blurred overlay 9:16 output
You can do this with a single gargantuan one-liner ffmpeg
command:
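A sketch of that one-liner, assembled from the pieces explained below (input.mp4 and output.mp4 are placeholder filenames):

```sh
ffmpeg -i input.mp4 -filter_complex \
    "split [bg][fg]; \
     [bg] crop=ceil(ih*9/16/2)*2:ih, boxblur=30 [blurred]; \
     [fg] scale=ceil(ih*9/16/2)*2:-2 [scaled]; \
     [blurred][scaled] overlay=y=(main_h-overlay_h)/2" \
    output.mp4
```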
We’ll break this down step by step below.
The deceptive allure of ffmpeg
ffmpeg
is intimidating. You get lured in with straightforward conversion
commands with just a couple arguments, e.g. avi
to mp4
:
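```sh
ffmpeg -i input.avi output.mp4
```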
And then there are moderately more advanced commands to apply video filters,
e.g. scaling an input resolution down or up to HD resolution 1920:1080
:
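```sh
ffmpeg -i input.mp4 -vf scale=1920:1080 output.mp4
```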
And finally there are gargantuan one-liners like ours, similar to what you might
find on stack
overflow
posts from ffmpeg
wizards.
How does anyone come up with these? What seemed like a problem of crafting a
few command line arguments has evolved into a task of writing a weird
interpreted programming language, complete with variables like [fg]
and
[bg]
, mathematical operations and the ceil()
function, and bizarrely
abbreviated constant variables like ih
and main_h
.
But before we dive into the complexity, let’s help you help yourself with
ffmpeg
.
Getting help with ffmpeg
To get top-level help, use ffmpeg --help
or ffmpeg -h
:
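Trimmed down to the part that matters here (the exact text varies by version):

```
usage: ffmpeg [options] [[infile options] -i infile]... {[outfile options] outfile}...
```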
There’s a long verbose banner with version and build info before the actual help
starts, and a lot of other “basic” options which I’ve omitted. For someone used
to keyword arguments which don’t depend on position, it can be confusing that
there are separate [infile options]
and [outfile options]
which do depend
on where they are positioned in the argument list.
To get help with filters, use ffmpeg -filters
:
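Trimmed to the four filters we need, with the flag columns elided as `...`:

```
 ... boxblur          V->V       Blur the input.
 ... crop             V->V       Crop the input video.
 ... overlay          VV->V      Overlay a video source on top of the input.
 ... scale            V->V       Scale the input video size and/or convert the image format.
```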
It feels inconsistent, but notice there’s no -h
in the above help command for
filters. Again, I’ve omitted everything that’s irrelevant to this blog.
Note that most of these filters map a single video input to a single video
output V->V
. The exception is overlay
, which maps two inputs to one output
VV->V
.
Finally, to get help with a specific filter, e.g. crop
, use ffmpeg -h filter=crop
:
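Abridged (the option descriptions and defaults are paraphrased; exact formatting varies by version):

```
Filter crop
  Crop the input video.
    Inputs:
       #0: default (video)
    Outputs:
       #0: default (video)
crop AVOptions:
   w ... width expression (default "iw")
   h ... height expression (default "ih")
   x ... x position expression (default "(in_w-out_w)/2")
   y ... y position expression (default "(in_h-out_h)/2")
```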
The online ffmpeg help can be better at providing examples for filters like this. Putting all the help together, here’s an example crop command:
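```sh
ffmpeg -i input.mp4 -vf crop=608:1080 output.mp4
```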
This crops the input to a width of 608 and a height of 1080.
Breaking down the one-liner
I’ll be honest, it took me a couple hours of trial and error to settle on this one-liner. I started step by step with a workflow that used 4 separate commands:
- Crop the background to a vertical aspect ratio
- Blur the background
- Scale the foreground down to match the background’s width
- Overlay the foreground onto the background
Let’s first translate those steps in the most straightforward way to ffmpeg
commands, trading elegance for simplicity. We’ll get it working before we get
it working elegantly, efficiently, and generally.
Fixed 1920:1080 resolution
Here are those steps translated into ffmpeg
commands, assuming the input has
dimensions 1920:1080
:
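A sketch of those four commands, assuming a 1920:1080 input named input.mp4:

```sh
# 1. Crop the middle 608:1080 (9:16) out of the 1920:1080 (16:9) input
ffmpeg -i input.mp4 -vf crop=608:1080 crop.mp4
# 2. Blur the cropped background
ffmpeg -i crop.mp4 -vf boxblur=30 blur.mp4
# 3. Scale the original down to the background's width (608:342 keeps 16:9)
ffmpeg -i input.mp4 -vf scale=608:342 scale.mp4
# 4. Overlay the foreground onto the background, centered vertically:
#    0.5 * 1080 - 0.5 * 342 = 369
ffmpeg -i blur.mp4 -i scale.mp4 -filter_complex overlay=y=369 overlay.mp4
```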
The short option -vf
is an alias of the long options -filter
, -filter:v
,
or -filter:video
. Why specify video
? Because ffmpeg
can filter audio
too!
Every video filter has arguments. For example, the crop
filter takes width
and height
arguments with the syntax crop=width:height
, e.g.
crop=608:1080
. This filter takes the 1920:1080
input (16:9
aspect) and
crops to the middle 608:1080
(9:16
aspect). The crop
filter has other
optional arguments listed in the help above that can crop other areas besides
the middle. We’ll figure out how to generalize this to other input dimensions
with the same aspect ratio below.
The boxblur
filter has an argument to control the blurriness.
The scale
filter downsizes the input to 608:342
, the same width as the
crop (608
) with the original 16:9
aspect ratio.
Finally, the overlay
filter lays the scaled video over the middle of the
blurred video, at a vertical y
position 369
. We have to do a little math to
figure out the middle position. Here it is half the blur height minus half the
overlay height, i.e. 0.5 * 1080 - 0.5 * 342 = 369
. If you’re worried about
whether y
is measured from the top or from the bottom, it doesn’t matter.
General 16:9 aspect ratio
Instead of hard-coding dimensions for a 1920:1080
input, these can be
generalized to work with any 16:9
aspect ratio, e.g. 1280:720
, HD
1920:1080
, 4K 3840:2160
, etc.
This can be achieved by using the named variables accepted by many filters in
ffmpeg
, e.g. ih
for the height of the input frame. I might call these
constants, as they don’t really change on the fly and they’re distinct from
labels which we’ll get to later, but ffmpeg
calls them variables anyway
instead of constants.
Many filters accept ih
for the input height and iw
for the input width. The
overlay
filter is slightly more complicated because there are two inputs and
thus two heights: main_h
and overlay_h
. If you get the main and overlay
inputs mixed up like I do, just remember that the overlay input gets laid over
the main input.
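Rewritten with those variables, the four commands no longer hard-code the input dimensions. This is a sketch: the scale step here uses -2 for the height, which tells ffmpeg to pick an even value that preserves the aspect ratio, as an alternative to spelling out a second ceil() expression.

```sh
# Crop the background to 9:16, forcing an even width with ceil()
ffmpeg -i input.mp4 -vf "crop=ceil(ih*9/16/2)*2:ih" crop.mp4
# Blur it
ffmpeg -i crop.mp4 -vf boxblur=30 blur.mp4
# Scale the original to the same width; -2 picks an even height
ffmpeg -i input.mp4 -vf "scale=ceil(ih*9/16/2)*2:-2" scale.mp4
# Overlay, centered vertically using main_h and overlay_h
ffmpeg -i blur.mp4 -i scale.mp4 -filter_complex "overlay=y=(main_h-overlay_h)/2" overlay.mp4
```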
I should explain the funny ceil()
function in the crop
and scale
filters, e.g. crop=ceil(ih*9/16/2)*2:ih
. Ideally we would want a cropped
width of just ih*9/16
, but some formats and codecs will error out if you try
to use an odd numbered dimension. To guarantee an even number, we can divide by
2, round up, and multiply by 2 again, hence ceil(ih*9/16/2)*2
. The scale
filter is similar.
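To sanity-check that arithmetic for a 1080-pixel-tall input, here's the same halve, round up, double sequence evaluated with awk (awk is just standing in for ffmpeg's expression evaluator here):

```sh
# ih*9/16 = 607.5, so: 607.5/2 = 303.75, round up to 304, double to 608
awk 'BEGIN { ih = 1080; x = ih*9/16/2; c = (x == int(x)) ? x : int(x) + 1; print c * 2 }'
```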
Further generalization to other aspect ratio inputs and outputs like 4:3
is
left as an exercise for the reader.
Chaining steps together into a filtergraph
The workflow above leaves a few temporary files crop.mp4
, blur.mp4
, and
scale.mp4
which can be cleaned up. The gargantuan one-liner is better because
it does not waste any time reading, writing, and transcoding temporary files.
In a benchmark that I ran, the whole process takes 1m57s as four separate
commands but just 0m44s as a one-liner. If you want the details of that
benchmark, the input video was a 1 minute long, 60 fps, 1920:1080
, 20 MB mp4
file.
Again, take things one step at a time when chaining together these gargantuan one-liners. To start, the first two steps of cropping and blurring can be combined:
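As two separate commands, those steps look like this (filenames are placeholders):

```sh
ffmpeg -i input.mp4 -vf "crop=ceil(ih*9/16/2)*2:ih" crop.mp4
ffmpeg -i crop.mp4 -vf boxblur=30 blur.mp4
```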
The same filters in a single step look like this:
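```sh
ffmpeg -i input.mp4 -vf "crop=ceil(ih*9/16/2)*2:ih, boxblur=30" blur.mp4
```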
The video filters -vf
are quoted in a single string, separated by commas, e.g.
"filter1, filter2"
, which becomes "crop=ceil(ih*9/16/2)*2:ih, boxblur=30"
.
The ffmpeg
filtering
introduction
explains this excellently:
Filters in the same linear chain are separated by commas, and distinct linear chains of filters are separated by semicolons.
Again, the overlay
filter is more complicated, because it takes two inputs.
To apply this filter without any intermediate files, we have to split
the
input stream into two streams: (1) the cropped and blurred background [bg]
and (2) the scaled foreground [fg]
. The labels [bg]
and [fg]
are dummy
variables. We can name them anything we want using the [bracket]
syntax,
unlike the named constants such as ih
or main_h
.
In two distinct linear chains, the background [bg]
is cropped and blurred,
while the foreground [fg]
is scaled. Finally, the foreground is laid over the
background.
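Putting the whole filtergraph together gives the gargantuan one-liner, sketched here with placeholder filenames:

```sh
ffmpeg -i input.mp4 -filter_complex \
    "split [bg][fg]; \
     [bg] crop=ceil(ih*9/16/2)*2:ih, boxblur=30 [blurred]; \
     [fg] scale=ceil(ih*9/16/2)*2:-2 [scaled]; \
     [blurred][scaled] overlay=y=(main_h-overlay_h)/2" \
    output.mp4
```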
Et voilà! Here’s what it looks like in action: https://www.tiktok.com/@jeff.irwin/video/7294441886876618030