FFmpeg/libswscale
Shreesh Adiga e18f87ed9f swscale/x86/rgb2rgb: add AVX512ICL version of uyvytoyuv422
The scalar loop is replaced with masked AVX512 instructions.
For extracting the Y from UYVY, vperm2b is used instead of
various AND and packuswb.

Instead of loading the vectors with interleaved lanes, as the
AVX2 version does, a normal load is used. After the packuswb
for U and V, an extra permute operation is done to get the
required layout.
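
As a rough illustration of the data movement described above (not the assembly from this commit), the following scalar C sketch shows what a UYVY to planar YUV 4:2:2 row conversion produces, and how a byte permute with an index table of odd offsets can pull out the Y samples in one step. The helper names `permute_bytes`, `uyvy_row_to_yuv422` and `extract_y_by_permute` are hypothetical, the 8-sample chunk size is arbitrary, and the width restrictions are simplifying assumptions; the real code handles tails with masked instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: scalar emulation of a byte permute,
 * dst[i] = src[idx[i]] for i in [0, n). */
static void permute_bytes(uint8_t *dst, const uint8_t *src,
                          const uint8_t *idx, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[idx[i]];
}

/* Scalar reference for one row. UYVY packs two pixels into four
 * bytes as U0 Y0 V0 Y1. `width` is the number of luma samples and
 * is assumed to be even in this sketch. */
static void uyvy_row_to_yuv422(uint8_t *y, uint8_t *u, uint8_t *v,
                               const uint8_t *src, size_t width)
{
    assert(width % 2 == 0);
    for (size_t i = 0; i < width / 2; i++) {
        u[i]         = src[4 * i + 0];
        y[2 * i]     = src[4 * i + 1];
        v[i]         = src[4 * i + 2];
        y[2 * i + 1] = src[4 * i + 3];
    }
}

/* Extract only Y using the permute helper: every odd byte of the
 * packed input is a luma sample. Processes 8 luma samples (16 source
 * bytes) per step; `width` is assumed to be a multiple of 8 here. */
static void extract_y_by_permute(uint8_t *y, const uint8_t *src,
                                 size_t width)
{
    static const uint8_t odd_idx[8] = { 1, 3, 5, 7, 9, 11, 13, 15 };

    assert(width % 8 == 0);
    for (size_t i = 0; i < width; i += 8)
        permute_bytes(y + i, src + 2 * i, odd_idx, 8);
}
```

In a vector implementation the index table would live in a register and the permute would cover a whole vector at once, which is presumably what lets a single permute stand in for the AND and pack sequence mentioned in the message.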

AMD 7950x Zen 4 benchmark data:
uyvytoyuv422_c:                                      29105.0 ( 1.00x)
uyvytoyuv422_sse2:                                    3888.0 ( 7.49x)
uyvytoyuv422_avx:                                     3374.2 ( 8.63x)
uyvytoyuv422_avx2:                                    2649.8 (10.98x)
uyvytoyuv422_avx512icl:                               1615.0 (18.02x)

Signed-off-by: Shreesh Adiga <16567adigashreesh@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2025-02-18 12:43:57 -03:00
aarch64 swscale/aarch64/rgb2rgb_neon: Implemented {yuyv, uyvy}toyuv{420, 422} 2025-02-17 11:39:42 +02:00
arm swscale/internal: group user-facing options together 2024-11-21 12:49:56 +01:00
loongarch loongarch: fixes fate-checkasm-sw_rgb failure 2025-01-15 01:27:36 +01:00
ppc swscale/ppc: disable YUV2RGB AltiVec acceleration 2024-12-02 02:51:39 +01:00
riscv swscale/range_convert: saturate output instead of limiting input 2024-12-05 21:10:29 +01:00
tests tests/swscale: allow nonzero positive return codes from sws_scale_frame() 2024-12-18 17:30:48 +01:00
x86 swscale/x86/rgb2rgb: add AVX512ICL version of uyvytoyuv422 2025-02-18 12:43:57 -03:00
alphablend.c swscale/internal: group user-facing options together 2024-11-21 12:49:56 +01:00
bayer_template.c swscale/internal: constify SwsFunc 2024-10-07 19:51:34 +02:00
cms.c swscale/cms,graph,lut3d: Use ff_-prefix, don't export internal functions 2025-01-12 15:41:39 +01:00
cms.h swscale/cms,graph,lut3d: Use ff_-prefix, don't export internal functions 2025-01-12 15:41:39 +01:00
csputils.c swscale/csputils: add internal colorspace math helpers 2024-12-23 12:33:43 +01:00
csputils.h swscale/csputils: add internal colorspace math helpers 2024-12-23 12:33:43 +01:00
gamma.c swscale: rename SwsContext to SwsInternal 2024-10-24 22:50:00 +02:00
graph.c swscale/graph: copy scaler_params to the legacy subpass context 2025-02-07 13:17:37 -03:00
graph.h swscale/cms,graph,lut3d: Use ff_-prefix, don't export internal functions 2025-01-12 15:41:39 +01:00
half2float.c swscale/input: add rgbaf16 input support 2022-08-19 22:09:36 +02:00
hscale.c swscale/range_convert: fix mpeg ranges in yuv range conversion for non-8-bit pixel formats 2024-12-05 21:10:29 +01:00
hscale_fast_bilinear.c swscale: rename SwsContext to SwsInternal 2024-10-24 22:50:00 +02:00
input.c swscale: 16bit planar float input support 2025-01-21 21:06:14 +01:00
libswscale.v build: Change structure of the linker version script templates 2016-05-29 16:43:11 +02:00
log2_tab.c lsws: duplicate ff_log2_tab 2014-08-12 20:52:21 +02:00
lut3d.c swscale/cms,graph,lut3d: Use ff_-prefix, don't export internal functions 2025-01-12 15:41:39 +01:00
lut3d.h swscale/cms,graph,lut3d: Use ff_-prefix, don't export internal functions 2025-01-12 15:41:39 +01:00
Makefile swscale/lut3d: add 3DLUT dispatch system 2024-12-23 12:33:43 +01:00
options.c swscale/options: add -sws_dither none alias 2024-12-23 12:47:10 +01:00
output.c swscale/output: Fix undefined overflow in yuv2rgba64_full_X_c_template() 2025-01-08 23:23:24 +01:00
rgb2rgb.c swscale/swscale_unscaled: add unscaled x2rgb10le to packed RGB 2024-11-06 17:34:32 -03:00
rgb2rgb.h swscale/swscale_unscaled: add unscaled x2rgb10le to packed RGB 2024-11-06 17:34:32 -03:00
rgb2rgb_template.c swscale/swscale_unscaled: add unscaled conversion for AYUV/VUYA/UYVA 2024-11-02 15:01:31 -03:00
slice.c swscale/slice: fix init of 32 bpc planes 2024-12-16 12:21:55 +01:00
swscale.c swscale/swscale: don't reject scaling when color parameters are not supported but conversion is not required 2025-01-22 12:15:18 -03:00
swscale.h swscale: add ICC intent enum and option 2024-12-23 12:33:43 +01:00
swscale_internal.h swscale: use 16-bit intermediate precision for RGB/XYZ conversion 2024-12-26 20:31:36 +01:00
swscale_unscaled.c swscale: 16bit planar float input support 2025-01-21 21:06:14 +01:00
swscaleres.rc
utils.c swscale: 16bit planar float input support 2025-01-21 21:06:14 +01:00
utils.h swscale/swscale: don't reject scaling when color parameters are not supported but conversion is not required 2025-01-22 12:15:18 -03:00
version.c lib*/version: Use static_assert for static asserts 2024-03-31 00:08:42 +01:00
version.h swscale: add ICC intent enum and option 2024-12-23 12:33:43 +01:00
version_major.h libs: bump major version for all libraries 2024-03-07 11:29:43 -03:00
vscale.c swscale/internal: group user-facing options together 2024-11-21 12:49:56 +01:00
yuv2rgb.c swscale/internal: group user-facing options together 2024-11-21 12:49:56 +01:00