This is inspired by the equivalent dav1d attribute introduced
by Henrik Gramner in e4c4af02f3de5e6cea6f81272a2981c0fa7bae28.
Also already use it to beautify declarations.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This simplifies the code, reduces allocations, and critically, does
not store references of frames, along with references to hw_frames_ctx.
The issue was that storing refs to frames while transferring stored
refs to hw_frames_ctx of frames, and so created a circular dependency,
which caused the Vulkan device to never be terminated.
This only stores what it strictly needs as a dependency, and enables
the frames context to be freed, even while doing asynchronous transfers.
BGR formats in Vulkan cannot be used in storage images, as the
pixel labels on storage images are always ordered as RGB, and
swizzling is not an option due to old hardware limitations.
This means that you must always use an RGB format and manually
swizzle when reading or writing to BGR images, or simply not use
a format in the shader itself.
This adds support for the latter.
This enables users to specify a number that would be appended to
the buf_content string.
Saves users from needing to manually print to a string.
An earlier commit tried doing this via .elems, but it was
faulty, as this also incremented the total number of descriptors
in the descriptor set.
On a Zen 5, on Ubuntu 24.04 (with CLOCKS_PER_SEC 1000000), the
value of clock() in this loop increments by 0 most of the time,
and when it does increment, it usually increments by 1 compared
to the previous round.
Due to the "last_t + 2*last_td + (CLOCKS_PER_SEC > 1000) >= t"
expression, we only manage to take one step forward in this loop
(incrementing i) if clock() increments by 2, while it incremented
by 0 in the previous iteration (last_td).
This is similar to the change done in
c4152fc42e, to speed it up on
systems with very small CLOCKS_PER_SEC. However in this case,
CLOCKS_PER_SEC is still very large, but the machine is fast enough
to hit every clock increment repeatedly.
For this case, use the number of repetitions of each timer value
as entropy source; require a change in the number of repetitions
in order to proceed to the next buffer index.
This helps the fate-random-seed test to actually terminate within
a reasonable time on such a system (where it previously could hang,
running for many minutes).
Signed-off-by: Martin Storsjö <martin@martin.st>
amf_device_create() calls amf_device_uninit() on errors, but if things
were not initialized it will null deref amf_ctx->factory.
Fixes: https://github.com/mpv-player/mpv/issues/15814
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
Adds hwcontext_amf, enabling a shared AMF context for encoders,
decoders, and AMF-based filters, without copy to the host memory.
Code also was tested in HandBrake.
Benefits:
- Optimizations for direct video memory access from CPU
- Significant performance boost in full AMF pipelines with filters
- Integration of GPU filters like VPP, Super Resolution, and
Compression Artefact Removal(in future plans)
- VCN power management control for decoders.
- Ability to specify which VCN instance to use for decoding
(like for encoder)
- AMD will soon introduce full AMF API for multimedia accelerator MA35D
- With AMF API, integration will be much easier:
GPU and the accelerator will have the same API
- including encoder, decoder, scaler, color converter,
Windows and Linux.
Learn more:
https://www.amd.com/en/products/accelerators/alveo/ma35d.html
Changes by versions:
v2: Header file cleanup.
v3: Removed an unnecessary class.
v4: code cleanup and improved error handling
v5: Fixes related to HandBrake integration.
v6: Sequential filters error and memory leak have been fixed.
This commit adds support for 4:2:2 pixel formats, namely NV16 and
P216 for NVIDIA GPUs.
Signed-off-by: Diego de Souza <ddesouza@nvidia.com>
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
This function can return AVERROR_BUG in theory if something
went wrong, but so can the caller, so we should propagate that
error message upward in that case.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
lavapipe recently added support for external_semaphore_fd, but only for syncfiles,
not for opaque file descriptors.
The code is written to allow using syncfiles later on.
Ref: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12422
It should be more clear what this flag is indicating with a more
verbose comment documenting it.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Reviewed-by: Marton Balint <cus@passwd.hu>
Reviewed-by: Alexander Strasser <eclipse7@gmx.net>
Reviewed-by: Marth64 <marth64@proxyid.net>
The issue is that some compilers complain if a struct or array
is empty.
This extension does nothing by default, and can be useful, so just add it
to keep the array non-empty.
As discussed in the previous commit, we often need a convenient way of
stripping all side data related to a certain aspect of the frame. This helper
accomplishes just that.
I considered also adding a way to match only side data matching *all*
properties, but I think this is sufficiently useless in practise to not warrant
inclusion in the API.
Many filters modify certain aspects of frame data, e.g. through resizing
(vf_*scale* family), color volume mapping (vf_lut*, vf_tonemap*), or
possibly others.
When this happens, we should strip all frame side data that will no
longer be correct/relevant after the operation. For example, changing
the image size should invalidate AV_FRAME_DATA_PANSCAN because the crop
window (given in pixels) no longer corresponds to the actual image size.
For another example, tone-mapping filters (e.g. from HDR to SDR) should
strip all of the dynamic HDR related metadata.
Since there are a lot of different side data types that are affected by such
operations, it makes sense to establish this information in a common, easily
accessible way. The existing side data properties enum is a perfect fit for
this.
We recently introduced a public field which was a superset
of the queue context we used to have.
Switch to using it entirely.
This also allows us to get rid of the NIH function which was
valid only for video queues.
This code was simply incorrect through and through. It did not
protect what actually has to be protected in a multi-threaded setup.
Perhaps it was used to silence threading errors?
Either way, remove it, and document the correct way to use execution
pools in a threaded environment.
It's currently actually not used in MSVC builds, since
6e49b86996.
Older versions of MSVC (or, in particular, older versions of UCRT)
don't have stdalign.h; it's available since WinSDK 10.0.20348.0;
such a new enough version has been installed by default only since
MSVC 2022 17.4 and newer.
With this change, ffmpeg can still be built with MSVC 2019 16.8
(v19.28).
Signed-off-by: Martin Storsjö <martin@martin.st>
This can be useful in other places, e.g. it can replace objpool in
fftools.
The API is modified in the following nontrivial ways:
* opaque pointers can be passed through to all user callbacks
* read and write were previously separate callbacks in order to
accomodate the caller wishing to write a new reference to the FIFO and
keep the original one; the two callbacks are now merged into one, and
a flags argument is added that allows to request such behaviour on a
per-call basis
* new peek and drain functions