6683 commits

Author SHA1 Message Date
Lynne
842fa198e9
hwcontext_vulkan: fix build with old Vulkan header versions 2025-05-21 03:11:07 +09:00
Lynne
eabb62813e
hwcontext_vulkan: only try exporting DMABUF memory on !WIN32 and only for DMABUF tiling 2025-05-20 19:53:02 +09:00
Lynne
7c3c5c8052
hwcontext_vulkan: correct image transfer usage flags
By pure coincidence, BUFFER and IMAGE flags were equal for those
two usage types.
2025-05-20 19:53:02 +09:00
Lynne
435db9bb49
vulkan: enable VK_KHR_shader_subgroup_rotate
Yet another thing that should've been always present.
2025-05-20 19:53:02 +09:00
Henrik Gramner
fd18ae88ae avcodec/x86/vp9: Add AVX-512ICL for 16x16 and 32x32 8bpc inverse transforms 2025-05-19 15:56:27 +02:00
Ramiro Polla
b6803bf104 aarch64: increase default alignment for functions and constants
Use 16-byte alignment (align=4) instead of 4-byte (align=2) in the function and
const macros. This improves instruction fetch and NEON load performance on
modern AArch64 CPUs.
2025-05-19 13:20:51 +02:00
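
The align values in the commit above are power-of-two exponents, so align=4 means 2^4 = 16 bytes and align=2 means 2^2 = 4 bytes. The arithmetic can be sketched in C as follows. The helper names `align_bytes` and `align_up` are hypothetical and do not come from the FFmpeg sources.

```c
#include <stdint.h>

/* Hypothetical helper: convert an alignment exponent to a byte count. */
static unsigned align_bytes(unsigned exponent)
{
    return 1u << exponent;
}

/* Hypothetical helper: round addr up to the next multiple of 2^exponent. */
static uintptr_t align_up(uintptr_t addr, unsigned exponent)
{
    uintptr_t mask = ((uintptr_t)1 << exponent) - 1;
    return (addr + mask) & ~mask;
}
```

With these definitions, an address that is already 4-byte aligned (such as 0x1004) stays put for exponent 2 but moves to the next 16-byte boundary for exponent 4.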
Michael Niedermayer
8c920c4c39
Remove libpostproc
Libpostproc will be available as source plugin at
https://github.com/michaelni/FFmpeg/tree/sourceplugin-libpostproc
OR
https://github.com/michaelni/libpostproc

whatever turns out more convenient to maintain

For the upcoming 8.0 release, libpostproc will be included, so as not to
cause delays or inconveniences

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2025-05-07 15:35:47 +02:00
Link Mauve
d5f4a55123 avutil/hwcontext_vulkan: Query the correct format
In the call to vkGetPhysicalDeviceImageFormatProperties2(), we were
previously requesting the properties of the first fallback format (e.g.
VK_FORMAT_R8_UNORM for VK_FORMAT_G8_B8R8_2PLANE_420_UNORM) instead of
the actual format in use.

We don’t do anything with it afterwards, but there is no reason to keep
querying the wrong format.
2025-05-07 15:16:58 +02:00
softworkz
7a05f57250 avutil/opt: Print default option value for AV_OPT_TYPE_INT64
Signed-off-by: softworkz <softworkz@hotmail.com>
2025-05-03 07:56:46 +02:00
Lynne
33a4d36101
hwcontext_vulkan: support AV_PIX_FMT_GBRP
Support was partially added previously in vulkan.c, but now it's fully
supported.
2025-05-01 09:34:44 +02:00
Lynne
37bd915042
vulkan: use _KHR suffix for push descriptor properties struct ID 2025-04-23 01:46:37 +02:00
Lynne
3bb2b8aff4
hwcontext_vulkan: enable subgroupSizeControl
We already use this feature for setting the subgroup size,
but this feature was not enabled.

Fixes a validation warning.
2025-04-22 13:43:20 +02:00
Lynne
96ddce1b3c
vulkan: move OPT_CHAIN out of hwcontext_vulkan
This allows for it to be shared.
Technically, implementations should not give drivers structs
that the drivers are not familiar with.
2025-04-22 13:43:19 +02:00
Lynne
cee34e0a55
vulkan: check that the max number of push descriptors is not exceeded
Just correctness. We don't exceed this on any known hardware, but
it's better to check.
If we do, we simply fall back to regular descriptors.
2025-04-22 13:43:19 +02:00
Lynne
5098b1a345
vulkan: move feature<->usage mapping code outside of hwcontext_vulkan.c
Allows for it to be reused. In particular, for a future patch to make
vulkan hwaccels output DMABUF-backed VkImages.
2025-04-22 13:43:17 +02:00
softworkz
9e1162bdf1 avutil/hwcontext: Add item_name function for AVHWDeviceContext
Signed-off-by: softworkz <softworkz@hotmail.com>
2025-04-21 00:19:20 +02:00
softworkz
bf1579c904 avutil/log,hwcontext: Add AV_CLASS_CATEGORY_HWDEVICE
Signed-off-by: softworkz <softworkz@hotmail.com>
2025-04-21 00:19:11 +02:00
Lynne
7cd1edeaa4
vulkan: drop bgr_workaround
Vulkan's main issue around using BGR is simple.
The letters in the shader don't match up (rgba in shader, bgra in format).
So of course, rather than allowing "bgra" or other permutations of
formats in the shader, they went the nuclear option and spent months writing
an extension to get rid of the need to have a format in the shader to begin
with.

All this to solve a problem that should never have existed to begin with.
This fixes BGRA images since enabling WithoutFormat, as the GPU now remaps
without your involvement.
2025-04-19 18:45:13 +02:00
Lynne
ca6392e0a7
vulkan: always enable ReadWithoutFormat/WriteWithoutFormat
This implements support for reading and writing storage images with
no format.
The issue is that we define our images as arrays, and arrays can
only have a single type, which means that f.ex. NV12 needs two
different images, R8 and RG8.

The only driver known not to advertise support for the extension
as a whole is Intel, because they have partial support for odd formats
we never use. Therefore, just always enable it by default.
2025-04-19 10:59:11 +02:00
Lynne
bb3ce284d7 vulkan: use a single command buffer per command buffer pool
We violated the spec, which, despite the actual command buffer pool
*not* being involved in any functions which require external synchronization
of the pool, *requires* external synchronization even if only the
command buffers are used.

This also has the effect of *significantly* speeding up execution
in case command buffers are contended.
2025-04-16 23:38:16 +02:00
James Almer
0e59675698 avutil/hwcontext_vulkan: use the typedef'd name for the expect_assume struct
Signed-off-by: James Almer <jamrial@gmail.com>
2025-04-15 16:52:51 -03:00
James Almer
f29475a89e avutil/hwcontext_vulkan: check if expect_assume is supported by the header
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: James Almer <jamrial@gmail.com>
2025-04-15 16:44:45 -03:00
Lynne
e040c087c7
vulkan: add support for expect/assume
This commit adds support for compiler hints.
While on AMD these are not used/needed, Nvidia benefits from them, with
a sizeable 10% speedup on 4k.
2025-04-14 06:10:43 +02:00
Lynne
66b8c92df2
vulkan_ffv1: cache only 2 lines when decoding RGB
This reduces the intermediate VRAM used for RGB decoding by a
factor of 100x for 6k video.
This also speeds the decoder up by 16% for 4k RGB24 and 31% for 6k video.

This is equivalent to what the software decoder does, but with fewer pointers.
2025-04-14 06:10:42 +02:00
Lynne
a1137f9214
hwcontext_vulkan: disable descriptor buffer extension on Intel
Temporary workaround. Will be replaced with a version check once a fix is
in the works and the Mesa version that will carry it is known.
2025-04-14 06:10:41 +02:00
Lynne
4f64df2928
vulkan: remove unused field from exec pools
This used to be involved in a mechanism to switch between queue indices,
but since the rewrite of the rewrite of the rewrite, it was rewritten out.
2025-04-14 06:10:40 +02:00
Lynne
11911aef46
vulkan_shaderc/glslang: print full shaders on TRACE rather than VERBOSE
Way too spammy.
2025-04-14 06:10:40 +02:00
Lynne
7b0156201b
vulkan: fix logging level when erroring upon creating shader module 2025-04-14 06:10:34 +02:00
Andreas Rheinhardt
516bcfc169 avutil/aes: Use #if checks instead of if (ARCH_X86)
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-13 22:47:32 +02:00
Andreas Rheinhardt
f81ace52f8 avutil/aes: Make aes_init_static() av_cold
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-13 22:47:26 +02:00
James Almer
a039726c2a avutil/x86/aes: remove a few branches
The rounds value is constant and can be one of three hardcoded values, so
instead of checking it on every loop, just split the function into three
different implementations for each value.

Before:
aes_decrypt_128_aesni:                                  93.8 (47.58x)
aes_decrypt_192_aesni:                                 106.9 (49.30x)
aes_decrypt_256_aesni:                                 109.8 (56.50x)
aes_encrypt_128_aesni:                                  93.2 (47.70x)
aes_encrypt_192_aesni:                                 111.1 (48.36x)
aes_encrypt_256_aesni:                                 113.6 (56.27x)

After:
aes_decrypt_128_aesni:                                  71.5 (63.31x)
aes_decrypt_192_aesni:                                  96.8 (55.64x)
aes_decrypt_256_aesni:                                 106.1 (58.51x)
aes_encrypt_128_aesni:                                  81.3 (55.92x)
aes_encrypt_192_aesni:                                  91.2 (59.78x)
aes_encrypt_256_aesni:                                 109.0 (58.26x)

Signed-off-by: James Almer <jamrial@gmail.com>
2025-04-10 12:02:34 -03:00
James Almer
aeed747f41 avutil/aes: use pthread_once to fill the static tables
Signed-off-by: James Almer <jamrial@gmail.com>
2025-04-09 08:44:50 -03:00
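
As a rough illustration of the one-time initialization pattern named in the commit above, the sketch below fills a static table exactly once through `pthread_once`. The table, its contents, and the names `example_table`, `example_init_static` and `example_get_table` are placeholders chosen for this example. They are not the AES tables or functions from libavutil, which may also go through its own threading wrappers.

```c
#include <pthread.h>
#include <stdint.h>

/* Placeholder table; the real AES tables are larger and computed differently. */
static uint8_t example_table[256];
static int example_init_calls;
static pthread_once_t example_once = PTHREAD_ONCE_INIT;

/* Runs at most once, no matter how many callers race to request the table. */
static void example_init_static(void)
{
    for (int i = 0; i < 256; i++)
        example_table[i] = (uint8_t)(i ^ 0x63); /* arbitrary filler values */
    example_init_calls++;
}

/* Every caller goes through pthread_once before touching the table. */
static const uint8_t *example_get_table(void)
{
    pthread_once(&example_once, example_init_static);
    return example_table;
}
```

Calling `example_get_table()` repeatedly, from one thread or several, leaves `example_init_calls` at 1.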
Andreas Rheinhardt
830fab6891 avutil/tests/channel_layout: Improve enum range check
Both GCC and Clang use unsigned as underlying type of
an enum with no negative enumeration constants, making
checks like "layout->order >= 0" here tautologically true.
Clang warns about this. Combine both range checks
by casting to unsigned to suppress this warning.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-09 11:45:14 +02:00
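
The combined range check described in the commit above can be sketched like this. The enum `example_order` is a hypothetical stand-in, not the definition used by libavutil. A value that would be negative as a signed integer becomes very large after the cast to unsigned, so a single upper-bound comparison rejects it as well.

```c
#include <stdbool.h>

/* Hypothetical enum with no negative constants; not FFmpeg's definition. */
enum example_order {
    EXAMPLE_ORDER_UNSPEC,
    EXAMPLE_ORDER_NATIVE,
    EXAMPLE_ORDER_CUSTOM,
    EXAMPLE_ORDER_NB /* number of valid values */
};

/* One comparison covers both bounds: "order >= 0" is implied by the cast. */
static bool order_in_range(enum example_order order)
{
    return (unsigned)order < EXAMPLE_ORDER_NB;
}
```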
James Almer
1e5c65f539 avutil/dict: fix memleak in av_dict_set()
Regression since 19e9a203b7.

Signed-off-by: James Almer <jamrial@gmail.com>
2025-04-07 23:38:09 -03:00
Michael Niedermayer
f3f1a48a07
APIChanges & version bump for AV_DICT_DEDUP
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2025-04-07 01:46:05 +02:00
rcombs
19e9a203b7
lavu/dict: add AV_DICT_DEDUP
This is useful when multiple metadata inputs may set the same value
(e.g. both a container-specific header and an ID3 tag).

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2025-04-07 01:02:35 +02:00
James Almer
a35b4e8d29 avutil/x86/aes: ignore the upper bits in count
The argument is an int.

Signed-off-by: James Almer <jamrial@gmail.com>
2025-04-06 11:02:09 -03:00
James Almer
3f30ae823e avutil/aes_ctr: simplify incrementing the counter
Signed-off-by: James Almer <jamrial@gmail.com>
2025-04-05 20:46:40 -03:00
James Almer
fe73b84879 avutil/aes_ctr: simplify and optimize av_aes_ctr_crypt()
Process data in chunks of four or eight bytes, depending on host, instead of
one at a time.

before:
55561 decicycles in av_aes_ctr_crypt

after:
52204 decicycles in av_aes_ctr_crypt

Signed-off-by: James Almer <jamrial@gmail.com>
2025-04-05 20:46:40 -03:00
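
A minimal sketch of the chunked processing idea from the commit above, assuming a keystream that has already been generated: the data is XORed eight bytes at a time, with a byte-wise loop for the tail. The function name `xor_keystream` and its signature are invented for this example and are not the libavutil implementation. The `memcpy` calls are one portable way to do unaligned 64-bit loads and stores.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: dst = src XOR keystream, processed in 8-byte chunks. */
static void xor_keystream(uint8_t *dst, const uint8_t *src,
                          const uint8_t *keystream, size_t len)
{
    size_t i = 0;

    /* Bulk of the data: one 64-bit XOR per iteration. */
    for (; i + 8 <= len; i += 8) {
        uint64_t a, b;
        memcpy(&a, src + i, 8);
        memcpy(&b, keystream + i, 8);
        a ^= b;
        memcpy(dst + i, &a, 8);
    }

    /* Remaining 0 to 7 bytes, one at a time. */
    for (; i < len; i++)
        dst[i] = src[i] ^ keystream[i];
}
```

Because XOR is its own inverse, running the function a second time with the same keystream restores the original data.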
Rodger Combs
2ea3c51795 lavu/aes: add x86 AESNI optimizations
crypto_bench comparison for AES-128-ECB:

lavu_aesni AES-128-ECB  size: 1048576  runs:   1024  time:    0.596 +- 0.081
lavu_c     AES-128-ECB  size: 1048576  runs:   1024  time:   17.007 +- 2.131
crypto     AES-128-ECB  size: 1048576  runs:   1024  time:    0.612 +- 1.857
gcrypt     AES-128-ECB  size: 1048576  runs:   1024  time:    1.123 +- 0.224
tomcrypt   AES-128-ECB  size: 1048576  runs:   1024  time:    9.038 +- 0.790

Improved-By: Henrik Gramner <henrik@gramner.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2025-04-05 20:46:40 -03:00
James Almer
2daaafafc6 avutil/tests/aes_ctr: also randomize the encryption key
And not just the IV.

Signed-off-by: James Almer <jamrial@gmail.com>
2025-04-05 20:46:40 -03:00
James Almer
462d35dc72 avutil/tests/aes_ctr: reindent after the previous commit
Signed-off-by: James Almer <jamrial@gmail.com>
2025-04-05 20:46:40 -03:00
James Almer
19085287b4 avutil/tests/aes_ctr: also check the encrypted buffer
The test in its current form is just ensuring the plain text output is the same
as the plain text input, not bothering to check if anything was done with the
latter. av_aes_ctr_crypt() could be a simple memcpy under the hood and this
test would still succeed.

To check the integrity of the encrypted buffer, both the IV and the key need to
be fixed. As such, and in order to not remove the existing randomization of the
input IV, do two runs, one with random initialization data, and one with static
data.

Signed-off-by: James Almer <jamrial@gmail.com>
2025-04-05 20:46:40 -03:00
James Almer
0a34f009aa avutil/tests/aes_ctr: test more than a single block worth of data
This should exercise the implementation more thoroughly after an upcoming
change.

Signed-off-by: James Almer <jamrial@gmail.com>
2025-04-05 20:46:39 -03:00
Andreas Rheinhardt
1722f08acf avutil/Makefile: Only include half2float, float2half when needed
They are not needed for shared builds (and because --gc-sections
is not the default for shared builds, they have been included by default
in libavutil since bf22c4cc3e).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-03 06:04:57 +02:00
Andreas Rheinhardt
4d2e38b376 avutil/hwcontext_vulkan: Remove unused variable
Forgotten in 8c7b00ba3a.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-03 06:04:57 +02:00
llyyr
8c7b00ba3a avutil/hwcontext_vulkan: stop checking for deprecated and removed flag
AV_VK_FRAME_FLAG_CONTIGUOUS_MEMORY was deprecated in e0f2d2e702
and removed in 09a5760299

Fixes: e0f2d2e702
Fixes: 09a5760299
2025-03-29 00:40:48 +01:00
James Almer
b338d1b35b libs: bump major version for all libraries
Signed-off-by: James Almer <jamrial@gmail.com>
2025-03-28 14:44:34 -03:00
Andreas Rheinhardt
0ccf385e13 avutil/float_dsp: Unavpriv avpriv_scalarproduct_float_c()
Not worth the overhead of exporting it.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2025-03-28 14:33:08 -03:00
Andreas Rheinhardt
c389d9ac78 avutil/dict: Unavpriv avpriv_dict_set_timestamp()
And move it to lavf, its only user.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2025-03-28 14:33:08 -03:00