Donald Feury

So recently I decided to try adding some background music to my videos so it isn't just me blathering the whole time. Naturally, I turned to my weapon of choice in these cases: ffmpeg.

The music I use for this is from a project called StreamBeats; you should check it out.

Select Random Music

For the first part, I wanted to pick a random music file from a directory and use it later. For this, I wrote a simple script:

#!/usr/bin/env sh

dir=$1
find "$dir" | shuf -n 1 > tracklist.txt

head tracklist.txt
  1. I pass in the directory to use
  2. I list out the contents of that directory using find
  3. I pipe that output to shuf, which randomizes the list of files. With the -n 1 flag, it will output only the first line
  4. I write the output of all that to a text file for later reference, and also use head to print the chosen file to stdout
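
The script above gets called as random_music by the next script, so it's assumed to be saved under that name somewhere on your PATH. Running it by hand looks something like this (the directory is just an example):

# prints the chosen track and also records it in tracklist.txt
random_music ~/Music/streambeats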

Add Music

Time for the ffmpeg magic:

#!/usr/bin/env sh

video=$1
bgm_dir=$2
output=$3
bgm="$(random_music "$bgm_dir")"

ffmpeg -i "$video" -filter_complex \
    "amovie=$bgm:loop=0,volume=0.03[bgm];
    [0:a][bgm]amix[audio]" \
    -map 0:v -map "[audio]" -shortest -c:v copy "$output"
  1. I have three arguments here
    • The video to add the music to
    • The directory you want to pick the random music from
    • The path to write the new file to
  2. We get the music file to load in using the random_music script and save it for later
  3. I'll talk about the important parts of this ffmpeg command
    • amovie=$bgm:loop=0,volume=0.03[bgm]; – this loads the randomly chosen music file so its audio stream is available and, with the loop argument set to 0, loops it indefinitely. The volume filter is used to adjust the volume of the music to be more “background music” appropriate
    • [0:a][bgm]amix[audio] – combines the audio from the video and the newly loaded background music into one audio stream
    • -shortest tells ffmpeg to stop writing data to the file when the shortest stream ends, which, in this case is our video stream. The audio stream technically never ends since it loops forever.
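
If you save that as a script as well (I'll call it add_bgm purely for the example), a run looks something like:

# video to edit, directory of music to pick from, output path
add_bgm recording.mp4 ~/Music/streambeats recording-with-bgm.mp4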

Tada, you should have a new version of your video with the randomly chosen music looping for the duration of the video.

#ffmpeg #videoediting

YouTube Video

I recently updated my python script for automatically removing the silence parts of a video.

Previously, I had to call the shell script separately to generate the silence timestamps.

Now, the Python script grabs the output of the shell script directly using subprocess.run.

Script

#!/usr/bin/env python

import sys
import subprocess
import os
from moviepy.editor import VideoFileClip, concatenate_videoclips

input_path = sys.argv[1]
out_path = sys.argv[2]
threshold = sys.argv[3]
duration = sys.argv[4]

try:
    ease = float(sys.argv[5])
except IndexError:
    ease = 0.2

minimum_duration = 1.0

def generate_timestamps(path, threshold, duration):
    command = "detect_silence {} {} {}".format(input_path, threshold, duration)
    output = subprocess.run(command, shell=True, capture_output=True, text=True)
    return output.stdout.split('\n')[:-1]


def main():
    count = 0
    last = 0
    timestamps = generate_timestamps(input_path, threshold, duration)
    print("Timestamps: {}".format(timestamps))
    video = VideoFileClip(input_path)
    full_duration = video.duration
    clips = []

    for times in timestamps:
        end,dur = times.strip().split()
        print("End: {}, Duration: {}".format(end, dur))

        to = float(end) - float(dur) + ease

        start = float(last)
        clip_duration = float(to) - start
        # Clips less than one second don't seem to work
        print("Clip Duration: {} seconds".format(clip_duration))

        if clip_duration < minimum_duration:
            continue

        if full_duration - to < minimum_duration:
            continue


        print("Clip {} (Start: {}, End: {})".format(count, start, to))
        clip = video.subclip(start, to)
        clips.append(clip)
        last = end
        count += 1

    if not clips:
        print("No silence detected, exiting...")
        return


    if full_duration - float(last) > minimum_duration:
        print("Clip {} (Start: {}, End: {})".format(count, last, 'EOF'))
        clips.append(video.subclip(last))

    processed_video = concatenate_videoclips(clips)
    processed_video.write_videofile(
        out_path,
        fps=60,
        preset='ultrafast',
        codec='libx264',
        audio_codec='aac'
    )

    video.close()


main()
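
Running it looks something like this (the script name and file names here are just examples; the threshold and duration get passed straight through to the detect_silence script):

# input, output, silence threshold (dB), minimum silence duration (s), optional ease (s)
python remove_silence.py recording.mp4 trimmed.mp4 -30 1 0.2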

I won't go over this in full detail, as I did that in the last post about the silence trimming script. I will break down the changes I made.

For a breakdown of the scripts in more detail, check out the last post I made about them.

{% link https://dev.to/dak425/automatically-trim-silence-from-video-with-ffmpeg-and-python-2kol %}

Changes

def generate_timestamps(path, threshold, duration):
    command = "detect_silence {} {} {}".format(input_path, threshold, duration)
    output = subprocess.run(command, shell=True, capture_output=True, text=True)
    return output.stdout.split('\n')[:-1]

Here I created a function that passes the needed arguments to the detect_silence script and executes it using subprocess.run.

It needs capture_output=True to actually save the output, and text=True to get the output as a string; otherwise it's returned as raw bytes.

I then split on the newlines and remove the last entry, as it's an empty string that is not needed.

Since we are grabbing the script output straight from stdout, we no longer have to open and read an arbitrary text file to get the timestamps.

One last change: before, I was adding padding to the start of the next clip to make the transitions less abrupt. Now I add it to the end of the last clip, as it seems more natural.
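
Roughly, using the variable names from the two versions of the script, the change looks like this:

# before: pad the start of the next clip
if start > ease:
    start -= ease

# now: pad the end of the current clip instead
to = float(end) - float(dur) + ease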

if not clips:
    print("No silence detected, exiting...")
    return

I also added this sanity check to make sure there were actually clips generated; you can't concatenate clips that don't exist.

That's it! Now I can remove the silent parts of a video by calling only one script! It also avoids having to create the intermediate timestamp file.

#ffmpeg #python #videoediting

YouTube Video

I had someone email me asking how to solve the following problem:

I would like to take video A, video B, and replace the audio in video A with the audio from video B

The approach they were trying was as follows:

  1. Extract only the video from video A
  2. Extract only the audio from video B, while also transcoding it to a codec he needed it to be in
  3. Merge these two files together

Now, this approach is fine, but he encountered an issue. He claimed to need the audio in a WAV file, but the WAV format wasn't compatible with the codec he needed to transcode the audio into.

So what does he do?

I showed him you can do all this in one command, avoiding the file format issue, while also not creating the intermediate files.

Let me show you the example I showed him and I will break it down.

VIDEO=$1
AUDIO=$2
OUTPUT=$3

ffmpeg -an -i "$VIDEO" -vn -i "$AUDIO" -map 0:v -map 1:a -c:v copy -c:a pcm_s8_planar "$OUTPUT"

VIDEO=$1

This is the file he wants to use the video stream from, so in his case it's video A.

AUDIO=$2

This is the file he wants to use the audio from, making this video B.

OUTPUT=$3

The file path to save the combined result to.

-an -i $VIDEO

The -an option before the input means ignore the audio stream. This will give us only the video stream for this file. It also speeds up the command by avoiding having to read the audio stream.

-vn -i $AUDIO

The -vn option before the input means ignore the video stream. This will give us only the audio stream for this file. It also speeds up the command by avoiding having to read the video stream.

-map 0:v -map 1:a

The -map options make it so we explicitly tell ffmpeg which streams of data to write to the output (the first input's video and the second input's audio), instead of letting it figure it out. This may not have been needed, but I'd rather be explicit when I need to be.

-c:v copy -c:a pcm_s8_planar "$OUTPUT"

The -c:v copy option makes it so ffmpeg just copies the video stream over, avoiding a decode and re-encode. This makes it really fast.

The -c:a pcm_s8_planar option transcodes the audio stream to the codec he needed it to be in.

Lastly, we just tell ffmpeg to write to the output path given.

aaannnddddd...

Drum roll please...

It worked like a charm! He was very happy to be able to continue with his project.
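
As a more everyday variant of the same pattern (not his exact codec), replacing a video's audio with another file's audio and encoding it to AAC would look something like this:

ffmpeg -an -i videoA.mp4 -vn -i musicB.mp4 -map 0:v -map 1:a -c:v copy -c:a aac replaced.mp4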

#ffmpeg #videoediting

YouTube Video

Even though I'm only about one month into my YouTube channel, I thought some people might be interested in how I go about creating and uploading the videos and thumbnails.

To summarize the video above, the process goes like this:

  1. Record the video using OBS. This usually only involves one file, but a few times I've needed to stop and start again.

  2. If multiple recordings were done, concatenate them using the concat demuxer method (sketched after this list).

  3. Find the timestamp in seconds in my video for the plug. (a.k.a like, sub, blah blah)

  4. Run the video through my finalize video script to add in the fade in and out, overlay the sub animation, and append the outro.

  5. While it's processing, take a dumb snapshot with the webcam.

  6. Edit snapshot in Gimp

    • Do a little color correction and probably brighten the image.
    • Crop it down to 1280x720, keeping my face to the right side of the image.
    • Add text to the left side of the image, usually some variant of the video title.
    • Put a box behind the text to give them some contrast.
    • Apply any images if applicable. (ex. ffmpeg logo, YouTube logo)
    • ... That's it, usually takes like five minutes.
  7. Quickly check over the video once it's done processing.

  8. Upload if it looks good.

  9. Do the usual SEO stuff (tags, description, title)

  10. Add thumbnail exported from Gimp

  11. Once the video is processed on YouTube, add the end screen where the fade out starts.

  12. Add any cards if applicable, such as references to other videos. I always add the relevant playlist to the start of the video.

  13. Content!
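
For step 2, a minimal concat demuxer run looks something like this (file names are just for illustration), with recordings.txt listing each recording in order:

file 'part1.mkv'
file 'part2.mkv'

ffmpeg -f concat -i recordings.txt -c copy full-recording.mkv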

That's it for now. Once I get consistent lighting in my office, I'm gonna add proper color correction to the finalization script.

#videoediting #productivity

Odysee YouTube


I finally did it: I managed to figure out a little process to automatically remove the silent parts from a video.

Let me show y'all the process and the two main scripts I use to accomplish this.

Process

  1. Use ffmpeg's silencedetect filter to generate output of sections of the video's audio with silence
  2. Pipe that output through a few programs to get the output in the format that I want
  3. Save the output into a text file
  4. Use that text file in a python script that sections out the parts of the video with audio, and save the new version with the silence removed

Now, with the process laid out, let's look at the scripts doing the heavy lifting.

Scripts

Here is the script for generating the silence timestamp data:

#!/usr/bin/env sh

IN=$1
THRESH=$2
DURATION=$3

ffmpeg -hide_banner -vn -i $IN -af "silencedetect=n=${THRESH}dB:d=${DURATION}" -f null - 2>&1 | grep "silence_end" | awk '{print $5 " " $8}' > silence.txt

I'm passing in three arguments to this script:

  • IN – the file path to the video I want to analyze

  • THRESH – the volume threshold the filter uses to determine what counts as silence

  • DURATION – the length of time in seconds the audio needs to stay below the threshold to count as a section of silence
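
Assuming the script above is saved as detect_silence and is on your PATH, a run looks something like this (the threshold and duration values are just examples):

# video to analyze, threshold in dB, minimum silence length in seconds
detect_silence recording.mp4 -30 1
# the end/duration pairs land in silence.txt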

That leaves us with the actual ffmpeg command:

ffmpeg -hide_banner -vn -i $IN -af "silencedetect=n=${THRESH}dB:d=${DURATION}" -f null - 2>&1 | grep "silence_end" | awk '{print $5 " " $8}' > silence.txt

  • -hide_banner – hides the initial dump of info ffmpeg shows when you run it

  • -vn – ignore the input file's video stream. We only need the audio, and skipping the video stream speeds up the process a lot since ffmpeg doesn't need to demux and decode it.

  • -af "silencedetect=n=${THRESH}dB:d=${DURATION}" – detects the silence in the audio and displays the ouput in stdout, which I pipe to other programs

The output of silencedetect looks something like this:
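
(The pointer address and numbers here are just illustrative.)

[silencedetect @ 0x5631f0a2c700] silence_start: 81.4199
[silencedetect @ 0x5631f0a2c700] silence_end: 86.7141 | silence_duration: 5.29422

On the silence_end lines, the fifth and eighth whitespace-separated fields are the end timestamp and the duration, which is exactly what the grep and awk below pull out.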

  • -f null - 2>&1 – don't write any output streams, and redirect stderr to stdout so the silencedetect log lines can be piped to the programs that follow

  • grep "silence_end" – we first pipe the output to grep, I only want the lines that have that part that says “silence_end”

  • awk '{print $5 " " $8}' > silence.txt – Lastly, we pipe that output to awk and print the fifth and eighth values to a text file

The final output looks like this:

86.7141 5.29422
108.398 5.57798
135.61 1.0805
165.077 1.06485
251.877 1.11594
283.377 5.21286
350.709 1.12472
362.749 1.24295
419.726 4.42077
467.997 5.4622
476.31 1.02338
546.918 1.35986

You might ask, why did I not grab the silence start timestamp? That is because the two numbers I grabbed are the ending timestamp and the duration, so if I just subtract the duration from the ending timestamp, I get the starting timestamp. For example, the first line above works out to 86.7141 - 5.29422 ≈ 81.42 seconds as the start of that stretch of silence.

So finally we get to the python script that processes the timestamps. The script makes use of a Python library called moviepy; you should check it out!

#!/usr/bin/env python

import sys
import subprocess
import os
import shutil
from moviepy.editor import VideoFileClip, concatenate_videoclips

# Input file path
file_in = sys.argv[1]
# Output file path
file_out = sys.argv[2]
# Silence timestamps
silence_file = sys.argv[3]

# Ease in duration between cuts
try:
    ease = float(sys.argv[4])
except IndexError:
    ease = 0.0

minimum_duration = 1.0

def main():
    # number of clips generated
    count = 0
    # start of next clip
    last = 0

    in_handle = open(silence_file, "r", errors='replace')
    video = VideoFileClip(file_in)
    full_duration = video.duration
    clips = []
    while True:
        line = in_handle.readline()

        if not line:
            break

        end,duration = line.strip().split()

        to = float(end) - float(duration)

        start = float(last)
        clip_duration = float(to) - start
        # Clips less than one second don't seem to work
        print("Clip Duration: {} seconds".format(clip_duration))

        if clip_duration < minimum_duration:
            continue

        if full_duration - to < minimum_duration:
            continue

        if start > ease:
            start -= ease

        print("Clip {} (Start: {}, End: {})".format(count, start, to))
        clip = video.subclip(start, to)
        clips.append(clip)
        last = end
        count += 1

    if full_duration - float(last) > minimum_duration:
        print("Clip {} (Start: {}, End: {})".format(count, last, 'EOF'))
        clips.append(video.subclip(float(last)-ease))

    processed_video = concatenate_videoclips(clips)
    processed_video.write_videofile(
        file_out,
        fps=60,
        preset='ultrafast',
        codec='libx264'
    )

    in_handle.close()
    video.close()

main()

Here I pass in 3 required and 1 optional argument:

  • file_in – the input file to work on, should be the same as the one passed into the silence detection script

  • file_out – the file path to save the final version to

  • silence_file – the file path to the file generated by the silence detection

  • ease_in – a work-in-progress concept. I noticed the jumps between the clips are kinda sudden and too abrupt, so I add about half a second of padding to where the next clip is supposed to start to make it less abrupt.

You will see there is a minimum_duration; that is because I found in testing that moviepy will crash when trying to write out a clip that is less than a second long. There are a few sanity checks using it to determine whether a clip should be extracted yet or not. That part is still very rough though.

I track where the next clip to be written out should start in the last variable, which records where the last section of silence ended.

The logic for writing out clips works like so:

  • Get the starting timestamp of silence

  • Write out a clip from the end of the last section of silence, until the start of the next section of silence, and store it in a list

  • Store the end of the next section of silence in a variable

  • Repeat until all sections of silence are exhausted

Lastly, we write the remainder of the video into a final clip, pass the list of clips to moviepy's concatenate_videoclips function to combine them into one video clip, and call the write_videofile method of the VideoClip class to save the final output to the out path I passed into the script.
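
Putting the two scripts together, a full run of the process looks something like this (the script and file names are just examples):

# 1. generate silence.txt from the recording
detect_silence recording.mp4 -30 1
# 2. cut the video on those timestamps, with half a second of ease
python trim_silence.py recording.mp4 trimmed.mp4 silence.txt 0.5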

Tada! You got a new version of the video with the silent parts removed!

I will try to show a before and after video of the process soon.

#ffmpeg #python #videoediting

Odysee YouTube


Something I had to figure out how to do recently, to make a Fiverr gig promo video, was stacking videos together.

This is when you see two or more videos playing side by side at the same time. It is often used to compare before and after results, such as what I did.

Horizontal Stacking

So what does it look like to stack two videos together horizontally? It looks something like this:

Horizontal Stack Example

Let's see how we achieve this result:

ffmpeg -i video1.mp4 -i video2.mp4 -filter_complex \
"[0:v][1:v]hstack=inputs=2:shortest=1[outv]" \
-map "[outv]" hstacked.mp4

You'll see we are passing in two videos as inputs with the -i option, video1.mp4 and video2.mp4.

For the filter graph we have:

[0:v][1:v]hstack=inputs=2:shortest=1[outv]

We are taking the video streams from the two inputs and passing them into the hstack filter. The inputs option indicates how many video streams are being used as inputs (defaults to 2) and the shortest option indicates how long the output video stream will be. By default, it will be the length of the longest video stream. Setting shortest=1 will make it the length of the shortest video stream instead.

After that, we just map the video stream created from hstack to the output file and you're good to go.

One thing about using hstack taken from the ffmpeg filter documentation:

All streams must be of same pixel format and of same height

If I recall correctly, this means the videos have to be the same height and use the same pixel format (for example yuv420p); otherwise the output just doesn't work.
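
If your inputs don't already match, you can normalize them inside the same filtergraph before stacking. Here's a sketch, assuming you want both inputs 720 pixels tall and in yuv420p:

ffmpeg -i video1.mp4 -i video2.mp4 -filter_complex \
"[0:v]scale=-2:720,format=yuv420p[v0];
[1:v]scale=-2:720,format=yuv420p[v1];
[v0][v1]hstack=inputs=2:shortest=1[outv]" \
-map "[outv]" hstacked.mp4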

Vertical Stacking

Let's see what vertically stacked videos look like:

Vertically Stacked Videos

Let's see how we achieve this result:

ffmpeg -i video1.mp4 -i video2.mp4 -filter_complex \
"[0:v][1:v]vstack=inputs=2:shortest=1[outv]" \
-map "[outv]" hstacked.mp4

This is almost exactly the same as horizontal stacking, but we use vstack instead of hstack; even the arguments are the same.

The vstack filter has the same conditions as hstack, except the inputs have to be the same width instead of the same height.

Combining Stacks

A weird idea I had after playing around with these was to combine them. After doing so I got this result:

Double Stacked Videos

That's pretty interesting; it looks like a 2x2 grid of videos playing.

Now, how did we achieve this effect?

ffmpeg -i video1.mp4 -i video2.mp4 -i video3.mp4 -filter_complex \
"[0:v][1:v]hstack=inputs=2:shortest=1[row1];
[0:v][2:v]hstack=inputs=2:shortest=1[row2];
[row1][row2]vstack=inputs=2:shortest=1[outv]" \
-map "[outv]" ow-creation-double-stack.mp4

Let's go over the filtergraph:

[0:v][1:v]hstack=inputs=2:shortest=1[row1]

Here we are horizontally stacking the first video stream and the second video stream and calling the new stream [row1].

[0:v][2:v]hstack=inputs=2:shortest=1[row2]

Next, we horizontally stack the first video stream with the third video stream and call that new stream [row2].

[row1][row2]vstack=inputs=2:shortest=1[outv]

Finally, we take the two horizontally stacked video streams, and vertically stack them on top of each other! That is pretty neat.

With that, you should be able to horizontally and vertically stack videos with ffmpeg!

Thank you for reading!

#ffmpeg #videoediting

YouTube Video

If you've been watching my videos from the posts I make on here (thank you if you have), I got a dirty little secret...

I haven't done any manual editing on the past three videos.

Now, that doesn't mean there isn't any editing; there is a little, and that will grow in time.

No, instead a single ffmpeg command does all the editing for me.

So what exactly is it doing you ask? So far it does the following:

  • Delaying and overlaying a sub animation over the main video (usually just a recording)
  • Adding a fade in effect to the start of the video
  • Adding a fade out effect before the outro
  • Appending the outro to the end of the video

Oh, what's that? You want to see the magic? I got you, fam.

Script

#!/usr/bin/env sh

IN=$1
OUT=$2
OVER=$3
OVER_START=$4
OUTRO=$5
DURATION=$(get_vid_duration $IN)
FADE_OUT_DURATION=$6
FADE_IN_DURATION=$7
FADE_OUT_START=$(bc -l <<< "$DURATION - $FADE_OUT_DURATION")
MILLI=${OVER_START}000

ffmpeg -i $IN -i $OUTRO -filter_complex \
    "[0:v]setpts=PTS-STARTPTS[v0];
    movie=$OVER:s=dv+da[overv][overa];
    [overv]setpts=PTS-STARTPTS+$OVER_START/TB[v1];
    [v0][v1]overlay=-600:0:eof_action=pass,fade=t=in:st=0:d=$FADE_IN_DURATION,fade=t=out:st=$FADE_OUT_START:d=$FADE_OUT_DURATION[mainv];
    [overa]adelay=$MILLI|$MILLI,volume=0.5[a1];
    [0:a:0][0:a:1][a1]amix=inputs=3:duration=longest:dropout_transition=0:weights=3 3 1[maina];
    [mainv][maina][1:v][1:a]concat=n=2:v=1:a=1[outv][outa]" \
    -map "[outv]" -map "[outa]" $OUT

That's a chunky boy, so let me break down exactly what is happening.

Arguments

IN=$1
OUT=$2
OVER=$3
OVER_START=$4
OUTRO=$5
DURATION=$(get_vid_duration $IN)
FADE_OUT_DURATION=$6
FADE_IN_DURATION=$7
FADE_OUT_START=$(bc -l <<< "$DURATION - $FADE_OUT_DURATION")
MILLI=${OVER_START}000

These are all the arguments I'm passing into the script to build the ffmpeg command.

  • IN=$1 – this is the path to the main video that I want to use, probably a recording I did earlier in the day.

  • OUT=$2 – this is the path I want to save the final video to.

  • OVER=$3 – this is the file path to the subscription animation I started using. I thought it better to pass this in, since I may change what animation I'm using at some point.

  • OVER_START=$4 – the timestamp, in seconds, to start playing the subscription animation in the main video. It's needed to offset the animation's video frame timestamps and delay its audio.

  • OUTRO=$5 – the file path to the outro video that gets appended to the end of the main video.

  • DURATION=$(get_vid_duration $IN) – I'm using another script to get the duration, in seconds, of the main video. It's using ffprobe to grab the metadata in a specific format.

Here is the get_vid_duration script for reference:

#!/usr/bin/env sh

IN=$1

ffprobe -i $IN -show_entries format=duration -v quiet -of csv="p=0"
  • FADE_OUT_DURATION=$6 – the duration in seconds of the fade out effect. It is also used to calculate the starting time of the fade out effect.

  • FADE_IN_DURATION=$7 – same as last but for the fade in effect.

  • FADE_OUT_START=$(bc -l <<< "$DURATION - $FADE_OUT_DURATION") – uses the duration and the fade out duration to calculate the exact second to start the fade out effect. The arithmetic is piped into bc, a command-line calculator.

  • MILLI=${OVER_START}000 – the overlay start time converted to milliseconds (by appending three zeros to the whole-second value). One of the filters I use needs milliseconds instead of seconds.
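
Put together, calling the script ends up looking something like this (the script name, file names, and timings are made up for the example):

# in, out, sub animation, animation start (s), outro, fade out duration (s), fade in duration (s)
finalize_video raw.mkv final.mp4 sub-animation.mov 45 outro.mp4 3 1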

Filtergraph

"[0:v]setpts=PTS-STARTPTS[v0];
movie=$OVER:s=dv+da[overv][overa];
[overv]setpts=PTS-STARTPTS+$OVER_START/TB[v1];
[v0][v1]overlay=-600:0:eof_action=pass,fade=t=in:st=0:d=$FADE_IN_DURATION,fade=t=out:st=$FADE_OUT_START:d=$FADE_OUT_DURATION[mainv];
[overa]adelay=$MILLI|$MILLI,volume=0.5[a1];
[0:a:0][0:a:1][a1]amix=inputs=3:duration=longest:dropout_transition=0:weights=3 3 1[maina];
[mainv][maina][1:v][1:a]concat=n=2:v=1:a=1[outv][outa]"
  • [0:v]setpts=PTS-STARTPTS[v0]; – this is making sure that the main video's video stream is starting at the same 00:00:00 timestamp as the animation for proper offsetting. This might not be necessary but I'd rather make sure.

  • movie=$OVER:s=dv+da[overv][overa]; – loading in the sub animation's video and audio stream to be available for use in the rest of the filtergraph.

  • [overv]setpts=PTS-STARTPTS+$OVER_START/TB[v1]; – offset the sub animation's timestamps by the OVER_START argument.

  • [v0][v1]overlay=-600:0:eof_action=pass – overlay the sub animation's video stream over the main video stream with an offset on the x position of -600 (bumps it over to the left).

  • fade=t=in:st=0:d=$FADE_IN_DURATION – adds a fade in effect at the start of the video stream, with a duration of FADE_IN_DURATION.

  • fade=t=out:st=$FADE_OUT_START:d=$FADE_OUT_DURATION[mainv]; – adds a fade out effect at the end of the video stream, starting at FADE_OUT_START and lasting FADE_OUT_DURATION.

  • [overa]adelay=$MILLI|$MILLI – adds a delay of MILLI milliseconds to the sub animation's audio, to sync it up with the video stream that was offset.

  • volume=0.5[a1]; – the sub animation's little ding sound is kinda loud, so I cut its volume in half.

  • [0:a:0][0:a:1][a1]amix=inputs=3:duration=longest:dropout_transition=0:weights=3 3 1[maina]; – we mix both audio streams from the main video and the audio stream from the sub animation into one stream. duration=longest sets the combined stream's length to that of the longest input. dropout_transition and weights are used to offset the jump in volume that occurs when the sub animation's sound ends. It's not perfect but it helps.

  • [mainv][maina][1:v][1:a]concat=n=2:v=1:a=1[outv][outa] – finally, we take the processed video and audio streams and concatenate the outro's video and audio streams (passed into the script) onto the end of them. I just use a blank screen with some music playing for now.

Output

-map "[outv]" -map "[outa]" $OUT

Finally, we map the fully processed video and audio stream to the output file. This way, ffmpeg will write those streams out to the file, instead of the unprocessed streams straight from the input files.

With that, we have successfully:

  • [x] Overlaid the sub animation, at the desired time, in the main video.
  • [x] Added a fade in effect to the start of the video.
  • [x] Added a fade out effect to the end of the video.
  • [x] Concatenated the outro to the end of the video after the fade out effect.

Things I would like to add:

  • [] Color correction – Hard to do right now since I don't have consistent lighting in my office.
  • [] Better Outro – Something instead of a blank screen with music.
  • [] Get an Intro – Get a decent intro to add to the start of the video.

#ffmpeg #videoediting #productivity

YouTube Video

This one was a doozy to figure out but I finally managed to overlay a little subscribe animation over my main videos.

Normally I would explain it, but it's pretty long-winded, so if you want the full explanation, please watch the video.

The short answer is, it's using a complex filtergraph to offset the timestamps on the frames of the sub animation and delay its audio, so they play over the main video when I want them to.

The command is as follows:

#!/usr/bin/env sh

IN=$1
OVER=$2
OUT=$3
START=$4
MILLI=${START}000

ffmpeg -i $IN -filter_complex \
"[0:v]setpts=PTS-STARTPTS[v0];
movie=$OVER:s=dv+da[overv][overa];
[overv]setpts=PTS-STARTPTS+$START/TB[v1];
[v0][v1]overlay=-600:0:eof_action=pass[out1];
[overa]adelay=$MILLI|$MILLI,volume=0.5[a1];
[0:a:0][0:a:1][a1]amix=inputs=3:duration=longest:dropout_transition=0:weights=3 3 1[outa]" \
-map "[out1]" -map "[outa]" $OUT

#ffmpeg #videoediting

Odysee YouTube


I have found it very useful to concatenate multiple video files together after working on them separately. It turns out, that is rather simple to do with ffmpeg.

How do we do this?

There are three methods I have found thus far:

  • Using the concat demuxer approach

    • This method is very fast as it avoids transcoding
    • This method only works if the files have the same video and audio encoding; otherwise artifacts will be introduced
  • Using file level concatenation approach

    • There are some encodings that support file level concatenation, kinda like just using cat on two files in the terminal
    • There are very few formats that can do this; the only one I've used is the MPEG-2 Transport Stream format (.ts)
  • Using a complex filtergraph with the concat filter

    • This method can concat videos with different encodings
    • This will cause a transcode to occur, so it takes time and may degrade quality
    • The syntax is hard to understand if you've never written complex filtergraphs for ffmpeg before

Let's look at the examples, first the concat demuxer approach:

ffmpeg -f concat -i list.txt -c copy out.mp4

Unlike most ffmpeg commands, this one takes in a text file listing the files we want to concatenate; the text file would look something like this:

file 'video1.mp4'
file 'video2.mp4'

The example for the file level concatenation would look like this:

ffmpeg -i "concat:video1.ts|video2.ts" -c copy out.ts

and the last example would be like so:

ffmpeg -i video1.mp4 -i video2.flv -filter_complex \
"[0:v][0:a][1:v][1:a] concat=n=2:v=1:a=1 [outv] [outa]" \
-map "[outv]" -map "[outa]" out.mp4

This one is probably pretty confusing, so let me explain the complex filtergraph syntax:

Unlike using filters normally with -vf or -af, when using a complex filtergraph we have to tell ffmpeg which streams of data we are operating on for each filter.

At the start you see:

[0:v][0:a][1:v][1:a]

This translates in plain English to:

Use the video stream of the first input source, use the audio stream from the first input source, use the video stream from the second input source, and use the audio stream from the second input source.

The square bracket syntax indicates:

[index_of_input:stream_type]

Those of us with experience in programming will understand why the index starts at 0 and not 1

Now after we declared what streams we are using, we have a normal filter syntax:

concat=n=2:v=1:a=1

concat is the name of the filter

n=2 is specifying there are two input sources

v=1 indicates each input source has only one video stream and to write only one video stream out as output

a=1 indicates each input source has only one audio stream and to write only one audio stream out as output

Next, we label the streams of data created by the filter using the bracket syntax:

[outv] [outa]

Here, we are calling the newly created video stream outv and the audio stream outa; we need these later when using the -map flag on the output

Lastly, we need to explicitly tell ffmpeg what streams of data to map to the output being written to the file, using the -map option

-map "[outv]" -map "[outa]"

Those names look familiar? It's what we labeled the streams created from the concat filter. We are telling ffmpeg:

Don't use the streams directly from the input files, instead use these data streams created by a filtergraph.

And with that, ya let it run and tada, you have concatenated two videos with completely different encodings, hurray!

#ffmpeg #videoediting