Le Blog Utux

HTTP 200 GET /

Windows + Kdenlive + NVENC - Part 2: Benchmarks

Written by uTux, no comments

How do profiles work?

Kdenlive uses ffmpeg and MLT, so a render profile is basically MLT syntax for passing arguments to ffmpeg. But what about CBR, VBR, CQP and CRF? What is the magic behind these acronyms?

  • CBR: Constant Bit Rate. Predictable bandwidth, good for streaming. However, it is up to you to set the correct bitrate, which depends on the resolution of the video (720p, 1080p...) and on its content (fast-moving scenes require more bandwidth). Too low a value means bad quality, while too high a value means unnecessarily huge files.
  • VBR: Variable Bit Rate. You set a nominal and a maximum bitrate that the encoder will work within. This is similar to CBR, except that the output file should be smaller.
  • CRF (Constant Rate Factor) and CQP (Constant Quantization Parameter). I admit I do not get the difference between those two, but the idea is the same: you do not set the bitrate, but the quality level you want to achieve.

CBR and VBR are good for streaming because you need a predictable bandwidth. However, when you record a video locally, you do not really care about bandwidth but rather about quality and file size; in that case CRF / CQP are better suited.
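
To illustrate why a constant bitrate makes the output size predictable, here is a minimal Python sketch. The function name and the example values (30 Mbit/s for 45 minutes) are only illustrative assumptions, and audio and container overhead are ignored.

```python
def cbr_file_size_gb(bitrate_mbps: float, duration_min: float) -> float:
    """Approximate size, in decimal gigabytes, of a constant-bitrate stream."""
    total_bits = bitrate_mbps * 1_000_000 * duration_min * 60
    return total_bits / 8 / 1_000_000_000


# Illustrative values only: 30 Mbit/s of video for 45 minutes, audio ignored.
size_gb = cbr_file_size_gb(30, 45)
print(f"Estimated size: {size_gb:.3f} GB")
```

With CRF / CQP there is no such formula, since the bitrate follows the content: the size is only known once the render is done.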

Kdenlive Built-in CPU profiles:

  • x264 (CRF 23)
  • x265 (CRF 20)
  • VP9 (CRF 23)

Kdenlive Built-in GPU profiles:

  • NVENC H264 VBR (20-30 Mbps)
  • NVENC H265 (CBR 30 Mbps)

I added those profiles:

  • NVENC H264 CQP 20: f=mp4 vcodec=h264_nvenc rc=constqp qp=20 profile=high preset=quality ab=192k ar=44100 acodec=aac bf=2
  • NVENC H265 CQP 20: f=mp4 vcodec=hevc_nvenc rc=constqp qp=20 profile=high preset=quality ab=192k ar=44100 acodec=aac bf=2
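
Since a profile is just a space-separated list of key=value properties, it is easy to inspect with a few lines of Python. This is only a sketch for comparing the two profiles above; parse_profile is a hypothetical helper, not something provided by Kdenlive or MLT.

```python
def parse_profile(params: str) -> dict:
    """Split a space-separated list of key=value properties into a dict."""
    return dict(item.split("=", 1) for item in params.split())


h264_cqp = parse_profile(
    "f=mp4 vcodec=h264_nvenc rc=constqp qp=20 profile=high "
    "preset=quality ab=192k ar=44100 acodec=aac bf=2"
)
hevc_cqp = parse_profile(
    "f=mp4 vcodec=hevc_nvenc rc=constqp qp=20 profile=high "
    "preset=quality ab=192k ar=44100 acodec=aac bf=2"
)

# Properties whose values differ between the two profiles.
differences = sorted(k for k in h264_cqp if h264_cqp[k] != hevc_cqp.get(k))
print(differences)
```

The two profiles only differ by the vcodec property; rc=constqp and qp=20 are what select the CQP 20 mode.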

Benchmark

The input file was a 45-minute gaming session of Star Wars: Knights of the Old Republic, recorded in 1920x1080 at 60 fps and encoded in H264. I did not apply any effects, only video/audio cuts.

Computer specifications:

  • AMD Ryzen 3700X (8c/16t @ 3.6 GHz)
  • 16GB DDR4
  • Nvidia RTX3070 FE
  • 500GB NVMe SSD
  • Windows 10 x64
  • Kdenlive 20.12.13

Render time

While VP9 is a completely free and fairly good codec, it is incredibly slow (3h13 versus 0h18 for x264). H265 is slower than H264, probably for good reasons, although I am not able to see a difference in output quality. H264 CQP 20 (NVENC) is 34% faster than x264 CRF 23 (CPU). Keep in mind that x264 and x265 speed depends heavily on the number of CPU cores. If you can get your hands on a 3900X (12c/24t) or a 3950X (16c/32t), I suspect they can match or outperform NVENC.
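
To put the two render times quoted above in perspective, this small Python sketch converts them to minutes and compares them with the 45-minute length of the source video. It assumes the durations are written as hours and minutes (3h13 = 3 hours 13 minutes); to_minutes is a hypothetical helper.

```python
def to_minutes(duration: str) -> int:
    """Convert a duration written like '3h13' into minutes."""
    hours, minutes = duration.split("h")
    return int(hours) * 60 + int(minutes)


SOURCE_MINUTES = 45  # length of the input video

render_times = {"VP9": "3h13", "x264": "0h18"}
for codec, duration in render_times.items():
    minutes = to_minutes(duration)
    print(f"{codec}: {minutes} min, {SOURCE_MINUTES / minutes:.2f}x real time")

# How many times longer the VP9 render took compared to x264.
slowdown = to_minutes("3h13") / to_minutes("0h18")
print(f"VP9 took {slowdown:.1f} times longer than x264")
```

In other words, x264 renders faster than real time on this machine, while VP9 takes roughly ten times longer than x264.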

File size

H264 CBR 30 Mbps (NVENC) is the worst for recording to a file, resulting in a 12.56 GB file, while you can achieve 3.2 GB using H264 CQP 20 (NVENC). x265 seems to be the most efficient (only 1.92 GB), but that is probably related to the aggressive CRF 20 setting. Of course, I can't just set every parameter to 20, because the output quality for a given value depends on the codec. Yeah, it's black magic.
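
The file sizes above can be turned back into an average bitrate, which makes the three results easier to compare. This is a rough Python sketch that assumes the video lasts exactly 45 minutes, that GB means 10^9 bytes, and that the audio track is included in the size.

```python
def average_bitrate_mbps(size_gb: float, duration_min: float) -> float:
    """Average overall bitrate in Mbit/s for a file of the given size."""
    return size_gb * 8_000 / (duration_min * 60)


DURATION_MIN = 45  # assumed exact length of the rendered video

sizes_gb = {
    "H264 CBR (NVENC)": 12.56,
    "H264 CQP 20 (NVENC)": 3.2,
    "x265 CRF 20": 1.92,
}
for name, size in sizes_gb.items():
    print(f"{name}: {average_bitrate_mbps(size, DURATION_MIN):.1f} Mbit/s")
```

Under these assumptions the CBR file works out to roughly 37 Mbit/s on average, against about 9.5 Mbit/s for CQP 20 and 5.7 Mbit/s for x265.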

Limitations

Before jumping to conclusions, you need to be aware of some limitations of my benchmark:

  • The input file was encoded in H264; I did not try anything else. This may explain the bad results for VP9.
  • x264 and x265 performance scales with the number of CPU cores.
  • I do not know whether NVENC speed and quality depend on the GPU.
  • NVENC CQP 20 looks fine to me, but this is purely subjective. Another value might change the benchmark.
  • Kdenlive on Windows can't use NVENC to render video effects and will use the CPU instead, which is why I did not add any.

Conclusion

I usually stick with H264 CQP 20 (NVENC), which is the fastest, with good output quality (at least for recording video games) and a reasonable file size. While Nvidia recommends CQP 15, I can't tell the difference from CQP 20. Also remember that if you upload your video to YouTube, it will be re-encoded in AV1 / VP9 with lower quality settings anyway.

Again, you have to understand that there is no "best encoder" for all situations. This is what I think works best for me, but it might be totally different for you.


Windows + Kdenlive + NVENC - Part 1: Nvidia H264 & H265 Hardware encoders

Written by uTux, 4 comments

Kdenlive is a great piece of software, but I noticed a major drawback on the Windows version: rendering is really slow. For example, let's take a 45-minute H264 2560x1440 60 fps video file and crop it to 1920x1080 ("Position and Zoom" effect in Kdenlive). Render time is about 1h15 in Kdenlive while it is only 15 minutes in Adobe Premiere Pro; the difference is insane. Why is the latter so fast? Spoiler: GPU rendering.


Let's talk about how rendering works. Kdenlive uses ffmpeg and MLT. The Windows version of Kdenlive is built with an embedded minimal ffmpeg that does not support GPU hardware acceleration, which is sad. Fortunately, it is possible to download the full version of ffmpeg and install it into Kdenlive. Here is how to proceed.

Download

Installation

  • Run the Kdenlive installer, which is in fact a self-extracting archive. Extract the contents to a location, for example C:\Program Files\Kdenlive.
  • Extract ffmpeg-release-full-shared.7z to a temporary location, for example C:\Users\utux\Downloads.
  • Copy C:\Users\utux\Downloads\ffmpeg\bin\* to C:\Program Files\Kdenlive\bin\
  • Copy C:\Users\utux\Downloads\ffmpeg\presets\* to C:\Program Files\Kdenlive\share\ffmpeg\
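
The two copy steps above can also be scripted. Here is a minimal Python sketch (Python 3.8 or later), assuming the folder layout described in the steps; install_ffmpeg is a hypothetical helper, existing files with the same name are overwritten, and writing to C:\Program Files usually requires administrator rights.

```python
import shutil
from pathlib import Path


def install_ffmpeg(ffmpeg_dir: Path, kdenlive_dir: Path) -> None:
    """Copy the ffmpeg binaries and presets into the Kdenlive folder."""
    # ffmpeg\bin\* goes next to the Kdenlive executables.
    shutil.copytree(ffmpeg_dir / "bin", kdenlive_dir / "bin", dirs_exist_ok=True)
    # ffmpeg\presets\* goes to share\ffmpeg (created if missing).
    shutil.copytree(
        ffmpeg_dir / "presets",
        kdenlive_dir / "share" / "ffmpeg",
        dirs_exist_ok=True,
    )


# Example call with the paths used in this article (not run here):
# install_ffmpeg(Path(r"C:\Users\utux\Downloads\ffmpeg"),
#                Path(r"C:\Program Files\Kdenlive"))
```

Keeping a backup of the original Kdenlive bin folder before running this is probably a good idea, since the bundled ffmpeg files get replaced.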

Configuration

Start or restart Kdenlive, then click Configuration > Run Config Wizard. Check "Nvidia hardware acceleration" and make sure it is properly detected.

Kdenlive config wizard

The following render profiles should now be available:

  • NVENC H264 CBR
  • NVENC H264 VBR
  • NVENC H265 CBR
  • VAAPI Intel H264
  • VAAPI AMD H264
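
If the NVENC profiles do not show up, one thing worth checking is whether the copied ffmpeg build actually lists the NVENC encoders. The Python sketch below relies on the output of "ffmpeg -hide_banner -encoders" and assumes each encoder row has the form "flags name description"; the helper names are hypothetical. A listed encoder only means the build supports it, not that the GPU and driver will accept it.

```python
import subprocess


def nvenc_encoders(encoders_output: str) -> list:
    """Return the encoder names containing 'nvenc' from `ffmpeg -encoders` output."""
    names = []
    for line in encoders_output.splitlines():
        fields = line.split()
        # Assumed row layout: "<flags> <name> <description...>"
        if len(fields) >= 2 and "nvenc" in fields[1]:
            names.append(fields[1])
    return names


def list_nvenc(ffmpeg_exe: str) -> list:
    """Run the given ffmpeg executable and list its NVENC encoders."""
    result = subprocess.run(
        [ffmpeg_exe, "-hide_banner", "-encoders"],
        capture_output=True, text=True, check=True,
    )
    return nvenc_encoders(result.stdout)


# Example call with the install path used in this article (not run here):
# print(list_nvenc(r"C:\Program Files\Kdenlive\bin\ffmpeg.exe"))
```

An empty list would suggest that Kdenlive is still using a minimal ffmpeg build.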

Try to render a project using one of these profiles and take a look at the Performance / GPU / Video Encode section in the Task Manager:

taskmgr

If the GPU Encode graph is low (20% or less), that means Kdenlive is rendering effects (such as "Position and Zoom") on the CPU. See the limitations below.

Limitations

  • Right now (Apr 2021), effects cannot be rendered by the GPU (at least on Windows). This may lead to frustrating situations where the GPU encoder only works at 20% and rendering is slow. I used to capture 1080p games on a 1440p desktop and then crop with the "Position and Zoom" effect. I changed that: OBS Studio is now set to record in 1080p, removing the need for any transformation.
  • I could not make NVENC work for proxy clips, which is bad.
  • The Windows version of Kdenlive does not seem to support Movit, aka "Experimental GPU processing", which adds effects that can be rendered by the GPU. For example, instead of "Position and Zoom" you could use "Pan and zoom (GPU)". You should try the Flatpak version (Linux).