Streaming with ffmpeg is most certainly a thing, and can be very useful in a lot of different scenarios. We'll go over some of the basics, what does what, pitfalls and platform considerations.
In this first example we are just going to stream a video file using x264, while taking care of the basic considerations.
I'm going to split the command up into lines of options that belong together, so it doesn't overwhelm more than necessary. I will do my best to explain the basics of what each part does, and why. I often use PowerShell for these things, so when you see the ` symbol, that is just telling PS that the command continues on the next line as one long command. I'll be sure to add the full one-liner at the bottom of the post.
ffmpeg -re -i input.mp4 `
-c:v libx264 -preset veryfast `
-b:v 6M -maxrate 6M -bufsize 6M `
-r 30 -g 60 `
-x264-params "no-scenecut=1" `
-nal-hrd cbr `
-pix_fmt yuv420p `
-c:a aac -b:a 128k -ac 2 `
-f flv rtmp://strim.otterbro.com/myStream?superSecretKey
So, starting at line 1, we're calling ffmpeg and using the -re parameter to tell ffmpeg that we want it to read the input file at its native frame rate (speed). Placement matters here: it has to come before the input. If we try to stream a file without it, ffmpeg will just plow ahead as fast as it can, encoding and sending the stream way faster than the streaming platform expects. It should not be used when you are reading from a device (dshow etc.).
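To illustrate the device case, here is a rough sketch of a DirectShow capture that leaves out -re, since a live device already delivers frames in real time. The device names are hypothetical placeholders; the first command lists the names your own machine reports.

```powershell
# List the DirectShow devices ffmpeg can see (the names appear in the log output).
ffmpeg -hide_banner -list_devices true -f dshow -i dummy

# Capture from a device. Note there is no -re in front of -i.
# "Integrated Camera" and "Microphone (Realtek Audio)" are made-up example names.
ffmpeg -f dshow -i "video=Integrated Camera:audio=Microphone (Realtek Audio)" `
  -c:v libx264 -preset veryfast -b:v 6M -maxrate 6M -bufsize 6M `
  -pix_fmt yuv420p -c:a aac -b:a 128k -ac 2 `
  -f flv rtmp://strim.otterbro.com/myStream?superSecretKey
```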
ffmpeg -re -i input.mp4
We're still on line 1, and we are now providing the file we want to stream (our source), which we do using -i followed by the name of the file. If you are not currently in the same folder as your file, you will have to specify the path (C:\users\chris\video\myVid.mp4).
-c:v libx264 -preset veryfast
Yay, line 2, we're doing great!
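-c:v libx264 tells ffmpeg which video encoder we want to use. libx264 is the x264 encoder, i.e. H.264/AVC encoded in software (on the CPU).
-preset tells ffmpeg that we want to use a specific preset, and here we've chosen "veryfast". Presets trade encoding speed against compression efficiency: faster presets use less CPU, but need more bitrate for the same quality.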
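If you are unsure whether your CPU can keep up with a given preset, one way to check is a quick local encode that throws the output away. This is only a sketch: the 30-second duration is an arbitrary choice, and there is deliberately no -re so ffmpeg runs as fast as it can.

```powershell
# Encode the first 30 seconds with the chosen preset and discard the result.
# -an skips audio, -f null writes nothing; NUL is just a dummy output name on Windows.
ffmpeg -hide_banner -i input.mp4 -t 30 `
  -c:v libx264 -preset veryfast -b:v 6M -maxrate 6M -bufsize 6M `
  -an -f null NUL
```

Watch the speed= value in ffmpeg's progress line. It needs to stay comfortably above 1x for a live stream; if it doesn't, try a faster preset such as superfast or ultrafast.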
-b:v 6M -maxrate 6M -bufsize 6M
We are now specifying the bitrate we would like to use through -b:v. It looks like this because it stands for bitrate:video; we don't want ffmpeg thinking it's the audio bitrate (best to be specific). What we really want (CBR, constant bitrate) isn't really implemented, so we are kind of tricking VBR (variable bitrate) into behaving like it by capping the rate with -maxrate and giving it a very small buffer using -bufsize (which end up as x264's vbv-maxrate and vbv-bufsize). The safest choice is to set the buffer to 6M (same as the bitrate), but on most HLS/DASH platforms you might be able to get away with 2x the bitrate (there is some quality gain to be had here). That will create slightly bigger fluctuations in bitrate. If viewers are struggling with buffering, consider setting it to the same as the bitrate instead.
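As a sketch of the 2x variant mentioned above, the only change is the buffer size. The variable names are made up for readability; the rest is the same command as in this post.

```powershell
# Hypothetical variable names; the values follow the 6M example in this post.
$bitrate = '6M'
# 2x the bitrate: a looser buffer, so slightly bigger bitrate swings.
$bufsize = '12M'

ffmpeg -re -i input.mp4 `
  -c:v libx264 -preset veryfast `
  -b:v $bitrate -maxrate $bitrate -bufsize $bufsize `
  -r 30 -g 60 `
  -x264-params "no-scenecut=1" `
  -nal-hrd cbr `
  -pix_fmt yuv420p `
  -c:a aac -b:a 128k -ac 2 `
  -f flv rtmp://strim.otterbro.com/myStream?superSecretKey
```

If viewers start buffering, set $bufsize back to the same value as $bitrate.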
-r 30 -g 60
We choose a framerate using -r 30, which results in 30 fps. It doesn't have to be 30, but if you don't specify one, you are at the mercy of the source file's framerate, which might not be what you think it is or, even worse, could be variable (not good!).
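If you want to know what the source file actually reports before picking a value, ffprobe (which ships with ffmpeg) can print it. A small sketch, assuming input.mp4 is in the current folder:

```powershell
# Print the frame rates recorded for the first video stream.
# The quotes matter in PowerShell: an unquoted comma would split the argument in two.
ffprobe -v error -select_streams v:0 `
  -show_entries 'stream=r_frame_rate,avg_frame_rate' `
  -of default=noprint_wrappers=1 input.mp4
```

Both values are printed as fractions. If they differ noticeably, that is a hint (not proof) that the file may have a variable framerate, which is one more reason to set -r explicitly.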
-g 60 is 2x our framerate and sets the GOP length to 60 frames, resulting in a keyframe every 2 seconds. This is a good default for both latency and quality. We want the GOP to be fixed so we don't mess with the platform's segmenter, and so that we get even chunks and timing.
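The same rule works for other framerates: GOP length = framerate x seconds between keyframes. A tiny sketch with made-up variable names, here for a 60 fps stream:

```powershell
# Hypothetical variable names, just to show the arithmetic.
$fps = 60
$keyframeSeconds = 2
$gop = $fps * $keyframeSeconds

# Prints the two options to put into the ffmpeg command: -r 60 -g 120
"-r $fps -g $gop"
```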
-x264-params "no-scenecut=1"
Looks pretty scary, but we need this stuff to make sure a lot of HLS (and probably DASH) streaming platforms work properly.
-x264-params lets us talk directly to the encoder (x264) in order to set a few parameters. The reason we want to disable scenecut is that it would otherwise insert extra keyframes at scene changes and so vary the GOP length, which could cause problems for the streaming platform. This is intentionally a very short version of what is going on; I might explain more in a separate article.
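One way to convince yourself that the keyframes really land at a fixed interval is to encode a short local test file and list its keyframe timestamps with ffprobe. This is a sketch under a few assumptions: test.flv is a hypothetical file name, the 20-second duration is arbitrary, and on older ffprobe builds the timestamp field may be called pkt_pts_time instead of pts_time.

```powershell
# Encode 20 seconds to a local file with the same video settings as the stream (no audio).
ffmpeg -y -i input.mp4 -t 20 `
  -c:v libx264 -preset veryfast -b:v 6M -maxrate 6M -bufsize 6M `
  -r 30 -g 60 -x264-params "no-scenecut=1" `
  -pix_fmt yuv420p -an -f flv test.flv

# Print the timestamp (in seconds) of every keyframe in the test file.
ffprobe -v error -select_streams v:0 -skip_frame nokey `
  -show_entries frame=pts_time -of csv=p=0 test.flv
```

With -r 30 and -g 60 the timestamps should come out roughly 2 seconds apart. Run the encode again without the -x264-params option and you will likely see extra keyframes wherever the video cuts to a new scene.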
-nal-hrd cbr
Remember when I said CBR doesn't really exist with this encoder? This helps us get close to it by padding our video feed with "filler" data where necessary to achieve pseudo-CBR (along with our other parameters). We do this through the use of -nal-hrd cbr. The short version is that we want bandwidth consistency, so that viewers of the stream know immediately whether they can handle the feed, instead of only starting to buffer when a high-motion/action moment comes along.
-pix_fmt yuv420p
This is to make sure that our stream is compliant with most platforms and devices.
-pix_fmt sets the "pixel format", for which we've chosen YUV encoding with 4:2:0 chroma subsampling. This might not make a ton of sense unless you have prior knowledge about these things, and that is okay. Color formats, spaces and ranges are a gigantic field of their own, so let's not worry about it for now; just know that it is important, and that we want YUV 4:2:0.
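If you are curious what your source file uses, ffprobe can tell you. A small sketch, again assuming input.mp4 is in the current folder:

```powershell
# Print the pixel format of the first video stream in the source file.
ffprobe -v error -select_streams v:0 `
  -show_entries stream=pix_fmt `
  -of default=noprint_wrappers=1 input.mp4
```

If it already says yuv420p, the option changes nothing. If it says something like yuv444p or yuv420p10le, -pix_fmt yuv420p is what converts it to the format most platforms and devices expect.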
-c:a aac -b:a 128k -ac 2
This finally brings along something nice and easy. We need some audio to go with the stream (at least in this case), and we specify the audio codec through -c:a aac, much like with video, but this time it's "a" instead of "v" to be explicit with ffmpeg, and we've chosen AAC as our codec. We set the bitrate to 128 kbps using -b:a 128k.
-ac 2 is there to make sure we're only sending stereo, just in case the video file has some sort of surround audio (5.1+), as that will cause issues. Nice and easy :)
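To see whether the source actually carries surround audio, you can ask ffprobe about the first audio stream. A sketch, assuming the file has at least one audio track:

```powershell
# Print codec, channel count and layout of the first audio stream.
# The quotes keep PowerShell from splitting the argument at the commas.
ffprobe -v error -select_streams a:0 `
  -show_entries 'stream=codec_name,channels,channel_layout' `
  -of default=noprint_wrappers=1 input.mp4
```

A channels value above 2 means -ac 2 will be downmixing to stereo for you.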
-f flv rtmp://strim.otterbro.com/myStream?superSecretKey
Final line, and this is where we actually send the stream somewhere. In this scenario we are doing good ole RTMP through the use of -f flv rtmp://.... If you wanted to use something else here, like SRT or HLS, slight modifications would need to be made, but it's not too difficult. Make sure you're certain of the address, port, path and stream key.
Congratulations, you've made it. I'm sure this was a bit more encompassing than one would imagine, and this is honestly the most basic form while still taking care of the basic pitfalls. This should work for most things, and was written with HLS in mind as well, as one would assume there is a decent chance that is the form of delivery. I believe this would work for DASH as well, but I don't personally have a lot of knowledge about DASH, so take it with a grain of salt.
I've done my best to take most platforms into consideration, and this should fly with Twitch, YouTube and, I imagine, most other platforms: it uses a reasonable bitrate and common codecs (both video and audio), keeps keyframes consistent and reasonably spaced (they are what HLS/DASH segmenters use for splitting chunks), and keeps the feed quite consistent through a reasonable buffer size and padding.
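For reference, here is the command from this post as a single line, with no PowerShell line continuations. The address and stream key are the placeholder values used throughout, so swap in your own.

```powershell
ffmpeg -re -i input.mp4 -c:v libx264 -preset veryfast -b:v 6M -maxrate 6M -bufsize 6M -r 30 -g 60 -x264-params "no-scenecut=1" -nal-hrd cbr -pix_fmt yuv420p -c:a aac -b:a 128k -ac 2 -f flv rtmp://strim.otterbro.com/myStream?superSecretKey
```

If you want a dry run before going live, you can replace the rtmp:// address with a local file name such as test.flv (a hypothetical name) and inspect the result first.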
Thank you for your time, and I hope you learned something new.
Until next time, have a glorious day :)
Reminder, you can always reach out to me if you spot an error, inconsistency or poorly written/explained concept or topic.