Commit graph

162 commits

Author SHA1 Message Date
Timo Rothenpieler
9f3902f107 avcodec/nv{enc,dec}: use sane version checking macro
For some odd reason, the Nvidia version macro puts the minor version in
the MSBs, so comparing against it is impossible.
2018-04-13 11:19:43 +02:00
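To illustrate the problem described in 9f3902f107, here is a minimal sketch of a major/minor version check. All names prefixed with EX_ or ex_ are hypothetical stand-ins, and the 24-bit shift is an assumption about how the packed value is built; the actual macro added by the commit is not shown in this log.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the values supplied by the SDK headers. */
#define EX_API_MAJOR 8
#define EX_API_MINOR 1

/* True when the headers are at least version (major, minor).
 * The comparison is done on the two fields separately, so it does not
 * depend on how the SDK packs them into a single integer. */
#define EX_API_AT_LEAST(major, minor) \
    ((major) < EX_API_MAJOR || \
     ((major) == EX_API_MAJOR && (minor) <= EX_API_MINOR))

/* Assumed packing: major in the low bits, minor shifted into the top
 * byte. With this layout a plain integer comparison orders versions
 * by minor first, which is why it cannot be used directly. */
static uint32_t ex_pack(uint32_t major, uint32_t minor)
{
    return major | (minor << 24);
}
```

With the assumed packing, version 9.0 compares as smaller than 8.1, while the field-wise macro orders them correctly.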
Timo Rothenpieler
2108a6736f avcodec/nvenc: update required driver versions for nvenc 2018-04-11 14:55:28 +02:00
Timo Rothenpieler
86e9dba8fa avcodec/nvenc: add support for B frames as ref 2018-04-11 14:55:28 +02:00
Philip Langdale
6a89cdc474 avcodec/nvenc: Declare support for P016
nvenc doesn't support P016, but we have two problems today:

1) We declare support for YUV444P16 which nvenc also doesn't support.
   We do this because it's the only pix_fmt we have that can
   approximate nvenc's internal format that is YUV444P10 with data in
   MSBs instead of LSBs. Because the declared format is a 16bit one,
   it will be preferentially chosen when encoding >10bit content,
   but that content will normally be YUV420P12 or P016 which should
   get mapped to P010 and not YUV444P10.

2) Transcoding P016 content with nvenc should be possible in a pure
   hardware pipeline, and that can't be done if nvenc doesn't say it
   accepts P016. By mapping it to P010, we can use it, albeit with
   truncation. I have established that swscale doesn't know how to
   dither to 10bits so we'd get truncation anyway, even if we tried
   to do this 'properly'.
2018-03-02 14:52:48 -08:00
Timo Rothenpieler
932037c6bb avcodec/nvenc: also clear data pointer after unregistering a resource
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2018-01-28 13:05:09 +01:00
Timo Rothenpieler
48e52e4edd avcodec/nvenc: add some more error case checks
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2018-01-28 12:56:31 +01:00
Timo Rothenpieler
32bc4e77f6 avcodec/nvenc: unregister input resource when unmapping
Currently the resource is only ever unregistered when the
registered_frames array is fully in use and an unmapped entry is re-used
and cleaned up.
I'm pretty sure the frame will have been cleaned up before that happens,
so I'm kinda surprised this never blew up.

Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2018-01-28 12:39:06 +01:00
Timo Rothenpieler
bbe1b21022 avcodec/nvenc: refcount input frame mappings
If some logic like vsync in ffmpeg.c duplicates frames, it might pass
the same frame twice, which will result in a crash due to it being
effectively mapped and unmapped twice.

Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2018-01-28 12:29:24 +01:00
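The idea behind bbe1b21022 can be sketched as a reference count on each mapping. The struct and function names below are hypothetical and the hardware calls are replaced by counters; this is an illustration of the technique, not the code from the commit.

```c
#include <stddef.h>

/* Hypothetical per-frame bookkeeping. */
typedef struct ExFrameSlot {
    int mapped;     /* outstanding map requests for this frame */
    int hw_maps;    /* how often the (pretend) hardware map ran */
    int hw_unmaps;  /* how often the (pretend) hardware unmap ran */
} ExFrameSlot;

/* Map the frame; only the first request touches the hardware. */
static void ex_map(ExFrameSlot *s)
{
    if (s->mapped++ == 0)
        s->hw_maps++;
}

/* Unmap the frame; only the last release touches the hardware.
 * Returns -1 if the frame was not mapped at all. */
static int ex_unmap(ExFrameSlot *s)
{
    if (s->mapped <= 0)
        return -1;
    if (--s->mapped == 0)
        s->hw_unmaps++;
    return 0;
}
```

If the same frame is submitted twice, the second map only bumps the counter, and the resource is released once the matching second unmap arrives.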
Pan Bian
eb69e7bed8 avcodec/nvenc: set correct error code
In function process_output_surface(), the return value is 0 on the path
where av_mallocz() returns a NULL pointer. 0 indicates success, which
does not match what happened. Return "AVERROR(ENOMEM)" instead of "0".

Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2017-11-29 10:42:58 +01:00
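The pattern fixed in eb69e7bed8 can be sketched as follows. EX_AVERROR is a local stand-in for FFmpeg's AVERROR() under the assumption of positive errno values, and ex_process_output() with its injectable allocator is hypothetical, introduced only so the failure path can be exercised.

```c
#include <errno.h>
#include <stdlib.h>

/* Stand-in for AVERROR(): negate a positive errno value. */
#define EX_AVERROR(e) (-(e))

/* Hypothetical function: allocate a buffer and hand it to the caller.
 * On allocation failure it reports ENOMEM instead of returning 0,
 * which the caller would read as success. */
static int ex_process_output(size_t size, void *(*alloc)(size_t), void **out)
{
    void *buf = alloc(size);
    if (!buf)
        return EX_AVERROR(ENOMEM);
    *out = buf;
    return 0;
}

/* Allocator that always fails, used to trigger the error path. */
static void *ex_failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}
```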
Mark Thompson
1dc483a6f2 compat/cuda: Pass a logging context to load functions
Reviewed-by: Timo Rothenpieler <timo@rothenpieler.org>
2017-11-20 15:47:05 +00:00
Timo Rothenpieler
4e93f00b06 avcodec/nvenc: check pop_context return value 2017-11-17 23:34:18 +01:00
Hendrik Leppkes
bff6d98ba3 nvenc: support d3d11 surface input 2017-11-15 10:35:44 +01:00
Hendrik Leppkes
6fcbf39f9e nvenc: factor context push/pop into functions
This reduces code repetition, and will allow adding further push/pop
refinement for D3D11 devices in future commits.
2017-11-15 10:35:39 +01:00
Timo Rothenpieler
d0961d3069 avcodec/nvenc: sanitize variable names 2017-09-07 12:08:32 +02:00
Timo Rothenpieler
a56d0497cb avcodec/nvenc: migrate to new encode API
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2017-09-07 12:08:32 +02:00
Timo Rothenpieler
4e6638abb4 avcodec/nvenc: always output picture timing SEI
Interlaced encoding profits from it, or might even need it in some
players.
No harm in enabling it unconditionally.

Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2017-09-02 16:01:57 +02:00
Timo Rothenpieler
0e995eac20 avcodec/nvenc: only push cuda context on encoder close if encoder exists
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2017-09-01 10:52:15 +02:00
Timo Rothenpieler
a0b69e2b0a avcodec/nvenc: add support for specifying entropy coding mode
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2017-09-01 10:52:15 +02:00
Ganapathy Kasi
43c417ac1a avcodec/nvenc: fix hw accelerated transcode with bframes
hw accelerated transcode (h264_cuvid -> h264_nvenc with -hwaccel cuvid) was
broken after the filtergraph initialization was changed to initialize the
decoder first, followed by the encoder (commit af1761f7b5).
When initializing the encoder with bframes, local buffers are allocated
internally in the encoder, which fails since no cuda context is available.
Pushing the correct cuda context before encoder initialization fixes the issue.
Also add push/pop of the cuda ctx during create/destroy/map/unmap of resources
and when destroying the encoder session.

Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2017-06-02 21:32:35 +02:00
Timo Rothenpieler
cb3358b68f avcodec/nvenc: print minimum driver version on error 2017-06-01 11:55:25 +02:00
Timo Rothenpieler
a1652aca7e avcodec/nvenc: remove unnecessary alignment
Fixes #6260
2017-05-23 11:24:43 +02:00
Sumit Agarwal
01775730fd avcodec/nvenc: add weighted prediction support
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2017-05-10 10:22:41 +02:00
Ben Chang
18a659d1b6 avcodec/nvenc: add fractional CQ support
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2017-05-10 10:21:25 +02:00
Timo Rothenpieler
cfbebe9dda avcodec/nvenc: deprecated old rc modes, add new ones 2017-05-09 18:38:30 +02:00
Timo Rothenpieler
23538ad2eb avcodec/nvenc: remove usage of deprecated fields 2017-05-09 18:38:30 +02:00
Timo Rothenpieler
f89a89c550 avcodec/nvenc: use frames hwctx when registering a frame 2017-05-07 13:38:30 +02:00
Timo Rothenpieler
dad6f44bbd avcodec/nvenc: support external context in sw mode 2017-05-07 13:35:25 +02:00
Ben Chang
8de3458a07 avcodec/nvenc: surface allocation reduction
This patch aims to reduce the number of input/output surfaces
NVENC allocates per session. The previous default set the number of allocated
surfaces to 32 (unless there was a user-specified param or lookahead involved).
Having a large number of surfaces consumes extra video memory (especially for
higher resolution encoding). The patch changes the surface calculation for the
default, B-frames, and lookahead scenarios respectively.

The other change involves surface selection. Previously, if a session
allocated x surfaces, only x-1 surfaces were used (due to the combination
of output delay and lock toggle logic). To prevent unused surfaces,
surface rotation now uses a predefined fifo.

Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2017-04-26 21:57:54 +02:00
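The surface rotation described in 8de3458a07 can be pictured as a queue of free surface indices. The sketch below is a hypothetical fixed-size ring buffer, not the FIFO implementation used by the patch; it only shows that every one of the x allocated surfaces gets handed out before the queue runs dry.

```c
#define EX_MAX_SURFACES 8

/* Hypothetical FIFO of free surface indices. */
typedef struct ExSurfaceFifo {
    int slots[EX_MAX_SURFACES];
    int head;      /* position of the next index to hand out */
    int count;     /* number of free surfaces currently queued */
    int capacity;  /* total number of surfaces in the session */
} ExSurfaceFifo;

/* Queue all n surfaces as free. Returns -1 for an unsupported n. */
static int ex_fifo_init(ExSurfaceFifo *f, int n)
{
    if (n < 1 || n > EX_MAX_SURFACES)
        return -1;
    f->head = 0;
    f->count = n;
    f->capacity = n;
    for (int i = 0; i < n; i++)
        f->slots[i] = i;
    return 0;
}

/* Take the oldest free surface, or -1 if none is free. */
static int ex_fifo_pop(ExSurfaceFifo *f)
{
    if (f->count == 0)
        return -1;
    int idx = f->slots[f->head];
    f->head = (f->head + 1) % f->capacity;
    f->count--;
    return idx;
}

/* Return a surface to the queue. Returns -1 if the queue is full. */
static int ex_fifo_push(ExSurfaceFifo *f, int idx)
{
    if (f->count == f->capacity)
        return -1;
    f->slots[(f->head + f->count) % f->capacity] = idx;
    f->count++;
    return 0;
}
```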
Timo Rothenpieler
d84c2298e2 avcodec/nvenc: apply quantization factors to cqp 2017-03-23 17:10:52 +01:00
Timo Rothenpieler
7fb2a7afa1 avcodec/nvenc: Deprecate usage of global_quality, introducing qp 2017-03-23 17:10:52 +01:00
Clément Bœsch
b7cc4eb303 lavc/nvenc: misc cosmetics to reduce diff with Libav 2017-03-20 23:04:28 +01:00
Konda Raju
2db5ab73d4 avcodec/nvenc: allow different const-qps for I, P and B frames
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2017-03-17 10:42:55 +01:00
Konda Raju
5f44a4a0a9 avcodec/nvenc: add initial QP value options
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2017-03-01 13:15:34 +01:00
Ganapathy Raman Kasi
a549243b89 avcodec/nvenc: remove qmin and qmax constraints for vbr
qmin and qmax are not necessary for nvenc vbr.
Enforcing this constraint prevents the user from using vbr 2 pass mode without explicitly setting the qmin and qmax options.

Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2017-03-01 12:20:54 +01:00
Timo Rothenpieler
be74ba648c avcodec/nvenc: push cuda context before encoding a frame
Thanks to Miroslav Slugeň for figuring out what was going on here.
2017-02-14 11:24:13 +01:00
Timo Rothenpieler
8a3fea14ae avcodec/nvenc: set frame buffer format for mapped frames 2017-02-13 11:30:52 +01:00
Timo Rothenpieler
6b0a3ee6f8 avcodec/nvenc: add logging for more error cases 2017-01-20 10:29:36 +01:00
Timo Rothenpieler
5403d90f32 avcodec/nvenc: make gpu indices independend of supported capabilities 2017-01-20 10:29:36 +01:00
Miroslav Slugen
9b425bd24c avcodec/nvenc: Add bluray_compat basic implementation
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2017-01-01 14:47:25 +01:00
Miroslav Slugen
1841eda679 avcodec/nvenc: Make AUD optional for h264_nvenc and hevc_nvenc
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2017-01-01 14:37:09 +01:00
Miroslav Slugeň
f8c503d927 avcodec/nvenc: round qpIntra and qpInter calculation
Round qpIntra and qpInter calculation instead of old floor behavior.

Adopted from vaapi_encode_h264.c

Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2017-01-01 14:34:42 +01:00
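The difference between the old floor behavior and rounding in f8c503d927 can be sketched like this. The function names, the factor/offset parameters, and the 0..51 clipping range are assumptions made for illustration; the exact expression used in nvenc is not shown in this log.

```c
/* Clamp v into [lo, hi]. */
static int ex_clip(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Old behavior: the cast truncates, which equals floor for the
 * non-negative values that survive the clip. */
static int ex_qp_floor(int qp, double factor, double offset)
{
    return ex_clip((int)(qp * factor + offset), 0, 51);
}

/* New behavior: add 0.5 before truncating to round to nearest. */
static int ex_qp_round(int qp, double factor, double offset)
{
    return ex_clip((int)(qp * factor + offset + 0.5), 0, 51);
}
```

For example, a base QP of 25 with a factor of 0.75 gives 18.75, which the floor variant turns into 18 and the rounding variant into 19.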
Ruta Gadkari
67db4ff3b6 NVENC: Update check for Lookahead
Reviewed-by: Timo Rothenpieler <timo@rothenpieler.org>
2016-12-26 12:13:39 -03:00
Timo Rothenpieler
c2f3af57a5 avcodec/nvenc: mark intentional fall through 2016-11-30 12:36:23 +01:00
Miroslav Slugeň
f2dd6aee80 avcodec/nvenc: always reduce DAR width and height
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2016-11-30 12:36:23 +01:00
Philip Langdale
27038693bb avcodec/nvenc: Delay identification of underlying format of cuda frames
When input surfaces are cuda frames, we will not know what the actual
underlying format (nv12, p010, etc) is at surface allocation time.

On the other hand, we will know when the input frames are actually
registered and associated with a surface.

So, let's delay format discovery until registration time, which is
actually how we handle other frame properties, such as dimensions.

By itself, this change doesn't allow for transcoding of 10bit
content from cuvid, but it reduces the problem to the hardcoding of
the sw format in ffmpeg_cuvid.c

Signed-off-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2016-11-30 12:36:23 +01:00
Philip Langdale
829db8effd avcodec/nvenc: Remove aspect-ratio decompensation logic
This dubious behaviour in nvenc was finally removed by nvidia, and
as we refuse to run on anything older than 7.0, we don't need to
keep it around for old versions.
2016-11-25 10:13:58 -08:00
Miroslav Slugeň
de2faec2fa avcodec/nvenc: better surface allocation alghoritm, fix rc_lookahead
User-selectable surfaces are not working correctly: if you set the number of
surfaces on the command line, it will always use a minimum of 32 or 48,
depending on the selected resolution, but nvenc does not need that many
surfaces.

From now on you can define as few as 1 surface and nvenc will still
work. This will of course lower GPU memory usage by 95% and reduce async_delay to zero.

That was the easy part, now a little bit more...

The next part of this patch makes rc_lookahead always take priority over
the user-defined surfaces value when determining the number of surfaces.
The maximum rc_lookahead in the nvidia documentation is 32, but it could
increase in future generations, so there is no limit for this yet. The
async_depth value is still accepted and preferred over rc_lookahead.

There was also a bug: when you requested rc_lookahead > 31, it was
always capped at 31, because the surface count was recalculated after
the lookahead was set. This is now fixed.

Results:
If you set -rc_lookahead 32 and -bf 3, it will now use only 40 surfaces
and lower GPU memory usage by 20%. It will also increase PSNR by 0.012dB.

Two more comments:

1. In my internal tests, I don't understand the addition of 4 more surfaces
when lookahead is calculated. I didn't use them and everything works the
same as with those 4 extra surfaces. Does anybody know what is going on
there? It looks like they were meant for B frames, which are calculated
separately, because the B frame maximum is 4.

2. rc_lookahead defaults to -1, but the test condition if
(ctx->rc_lookahead), which enables lookahead, is then always true. I don't
know if this is intended behavior, but by default lookahead is
always on!

This is the default configuration when rc_lookahead is -1 (not defined on
the command line), which may not be intended:
ctx->encode_config.rcParams.enableLookahead = 1;
ctx->encode_config.rcParams.lookaheadDepth  = 0;
ctx->encode_config.rcParams.disableIadapt   = 0;
ctx->encode_config.rcParams.disableBadapt   = 0;

Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2016-11-22 10:34:27 +01:00
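The second comment in de2faec2fa comes down to C treating any non-zero integer, including -1, as true. The sketch below uses hypothetical names and clamps the depth at 0 as an assumption to reproduce the values quoted in the message; the `> 0` variant is one possible way to treat -1 as "not set", not necessarily the fix that was adopted.

```c
/* Hypothetical subset of the rate control parameters. */
typedef struct ExRcParams {
    int enable_lookahead;
    int lookahead_depth;
} ExRcParams;

/* Truthiness check: -1 is non-zero, so the default enables lookahead. */
static void ex_setup_truthy(int rc_lookahead, ExRcParams *rc)
{
    rc->enable_lookahead = 0;
    rc->lookahead_depth  = 0;
    if (rc_lookahead) {
        rc->enable_lookahead = 1;
        /* Assumed clamp: negative values become a depth of 0. */
        rc->lookahead_depth  = rc_lookahead > 0 ? rc_lookahead : 0;
    }
}

/* Explicit check: only positive values enable lookahead. */
static void ex_setup_explicit(int rc_lookahead, ExRcParams *rc)
{
    rc->enable_lookahead = 0;
    rc->lookahead_depth  = 0;
    if (rc_lookahead > 0) {
        rc->enable_lookahead = 1;
        rc->lookahead_depth  = rc_lookahead;
    }
}
```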
Timo Rothenpieler
a66835bcb1 avcodec/nvenc: use dynamically loaded CUDA 2016-11-22 10:34:27 +01:00
Matt Oliver
6ead033bca avcodec/nvenc.c: Use new safe dlopen code.
Signed-off-by: Matt Oliver <protogonoi@gmail.com>
2016-11-05 18:09:03 +11:00
Sven C. Dack
da4d0fa86b avcodec/nvenc: add test for Temporal AQ support
Adds a check to see if the hardware supports temporal aq.

Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2016-10-19 12:41:41 +02:00