Commit graph

21275 commits

Author SHA1 Message Date
Vittorio Giovara
ff9db5cfd1 lavc: Use a stricter check for the color properties values
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
2016-12-02 11:36:42 -05:00
Diego Biurrun
0a35f128f3 cabac: x86: Give optimizations header a more meaningful name 2016-12-01 08:23:54 +01:00
Martin Storsjö
cad42fadcd aarch64: vp9itxfm: Skip empty slices in the first pass of idct_idct 16x16 and 32x32
This work is sponsored by, and copyright, Google.

Previously all subpartitions except the eob=1 (DC) case ran with
the same runtime:

vp9_inv_dct_dct_16x16_sub16_add_neon:   1373.2
vp9_inv_dct_dct_32x32_sub32_add_neon:   8089.0

By skipping individual 8x16 or 8x32 pixel slices in the first pass,
we reduce the runtime of these functions like this:

vp9_inv_dct_dct_16x16_sub1_add_neon:     235.3
vp9_inv_dct_dct_16x16_sub2_add_neon:    1036.7
vp9_inv_dct_dct_16x16_sub4_add_neon:    1036.7
vp9_inv_dct_dct_16x16_sub8_add_neon:    1036.7
vp9_inv_dct_dct_16x16_sub12_add_neon:   1372.1
vp9_inv_dct_dct_16x16_sub16_add_neon:   1372.1
vp9_inv_dct_dct_32x32_sub1_add_neon:     555.1
vp9_inv_dct_dct_32x32_sub2_add_neon:    5190.2
vp9_inv_dct_dct_32x32_sub4_add_neon:    5180.0
vp9_inv_dct_dct_32x32_sub8_add_neon:    5183.1
vp9_inv_dct_dct_32x32_sub12_add_neon:   6161.5
vp9_inv_dct_dct_32x32_sub16_add_neon:   6155.5
vp9_inv_dct_dct_32x32_sub20_add_neon:   7136.3
vp9_inv_dct_dct_32x32_sub24_add_neon:   7128.4
vp9_inv_dct_dct_32x32_sub28_add_neon:   8098.9
vp9_inv_dct_dct_32x32_sub32_add_neon:   8098.8

I.e. in general a very minor overhead for the full subpartition case due
to the additional cmps, but a significant speedup for the cases when we
only need to process a small part of the actual input data.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-30 23:57:05 +02:00
Martin Storsjö
9c8bc74c2b arm: vp9itxfm: Skip empty slices in the first pass of idct_idct 16x16 and 32x32
This work is sponsored by, and copyright, Google.

Previously all subpartitions except the eob=1 (DC) case ran with
the same runtime:

                                     Cortex A7       A8       A9      A53
vp9_inv_dct_dct_16x16_sub16_add_neon:   3188.1   2435.4   2499.0   1969.0
vp9_inv_dct_dct_32x32_sub32_add_neon:  18531.7  16582.3  14207.6  12000.3

By skipping individual 4x16 or 4x32 pixel slices in the first pass,
we reduce the runtime of these functions like this:

vp9_inv_dct_dct_16x16_sub1_add_neon:     274.6    189.5    211.7    235.8
vp9_inv_dct_dct_16x16_sub2_add_neon:    2064.0   1534.8   1719.4   1248.7
vp9_inv_dct_dct_16x16_sub4_add_neon:    2135.0   1477.2   1736.3   1249.5
vp9_inv_dct_dct_16x16_sub8_add_neon:    2446.7   1828.7   1993.6   1494.7
vp9_inv_dct_dct_16x16_sub12_add_neon:   2832.4   2118.3   2266.5   1735.1
vp9_inv_dct_dct_16x16_sub16_add_neon:   3211.7   2475.3   2523.5   1983.1
vp9_inv_dct_dct_32x32_sub1_add_neon:     756.2    456.7    862.0    553.9
vp9_inv_dct_dct_32x32_sub2_add_neon:   10682.2   8190.4   8539.2   6762.5
vp9_inv_dct_dct_32x32_sub4_add_neon:   10813.5   8014.9   8518.3   6762.8
vp9_inv_dct_dct_32x32_sub8_add_neon:   11859.6   9313.0   9347.4   7514.5
vp9_inv_dct_dct_32x32_sub12_add_neon:  12946.6  10752.4  10192.2   8280.2
vp9_inv_dct_dct_32x32_sub16_add_neon:  14074.6  11946.5  11001.4   9008.6
vp9_inv_dct_dct_32x32_sub20_add_neon:  15269.9  13662.7  11816.1   9762.6
vp9_inv_dct_dct_32x32_sub24_add_neon:  16327.9  14940.1  12626.7  10516.0
vp9_inv_dct_dct_32x32_sub28_add_neon:  17462.7  15776.1  13446.2  11264.7
vp9_inv_dct_dct_32x32_sub32_add_neon:  18575.5  17157.0  14249.3  12015.1

I.e. in general a very minor overhead for the full subpartition case due
to the additional loads and cmps, but a significant speedup for the cases
when we only need to process a small part of the actual input data.

In common VP9 content in a few inspected clips, 70-90% of the non-dc-only
16x16 and 32x32 IDCTs only have nonzero coefficients in the upper left
8x8 or 16x16 subpartitions respectively.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-30 23:54:07 +02:00
Martin Storsjö
3c87039a40 arm: vp9itxfm: Only reload the idct coeffs for the iadst_idct combination
This avoids reloading them if they haven't been clobbered, if the
first pass also was idct.

This is similar to what was done in the aarch64 version.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-30 23:53:52 +02:00
Clément Bœsch
c4c5f5386c vp9dsp: add DC only versions for idct/idct.
before:

time ./avconv -v 0 -nostats -threads 1 -i sintel_vp9_500kbps.webm -f null -
real    0m11.125s
user    0m11.059s
sys     0m0.050s

time ./avconv -v 0 -nostats -threads 1 -i sintel_vp9_500kbps.webm -f null -
real    0m10.944s
user    0m10.819s
sys     0m0.064s

after:

time ./avconv -v 0 -nostats -threads 1 -i sintel_vp9_500kbps.webm -f null -
real    0m8.153s
user    0m8.034s
sys     0m0.050s

time ./avconv -v 0 -nostats -threads 1 -i sintel_vp9_500kbps.webm -f null -
real    0m8.038s
user    0m7.980s
sys     0m0.039s

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-30 23:48:28 +02:00
Diego Biurrun
e4382a4ab4 hevc: Eliminate pointless variable indirection 2016-11-30 14:11:44 +01:00
Diego Biurrun
5c89022542 hevc: Drop pointless av_unused attribute 2016-11-30 14:11:43 +01:00
Diego Biurrun
0983f9117f metasound: Drop unused tables 2016-11-30 13:44:05 +01:00
Diego Biurrun
212c6a1d70 mjpegdec: Check return values of functions that may fail 2016-11-29 13:13:35 +01:00
Diego Biurrun
3ee5f25d37 dxva2: Adjust printf length modifiers where appropriate 2016-11-29 13:13:35 +01:00
Anton Khirnov
3fe2a01df7 lavc: move decoding-related code from utils.c to a new file 2016-11-29 10:39:20 +01:00
Anton Khirnov
328cd2b599 lavc: move encoding-related code from utils.c to a new file 2016-11-29 10:39:20 +01:00
James Almer
45d199d5b0 aac_adtstoasc_bsf: validate and forward extradata if the stream is already ASC
Fixes AAC AudioSpecificConfig passthrough.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-29 10:39:20 +01:00
Andreas Cadhalpun
1762a39e09 mss2: only use error correction for matching block counts
This fixes a heap-buffer-overflow in ff_er_frame_end when decoding mss2
with coded_width/coded_height larger than width/height.

Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2016-11-29 10:38:01 +01:00
Diego Biurrun
eb135516e6 ac3enc: Avoid unnecessary macro indirections 2016-11-28 17:19:30 +01:00
Diego Biurrun
f0d3e43bd7 ac3enc: Reshuffle functions to avoid forward declarations 2016-11-28 17:19:30 +01:00
Diego Biurrun
e22c63ac74 ac3enc: Reshuffle some float/fixed-mode ifdefs to avoid a dummy function 2016-11-28 17:19:30 +01:00
Anton Khirnov
4adbb44ad1 tta: avoid undefined shifts
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2016-11-25 21:42:33 +01:00
Anton Khirnov
dc4b625028 tta: use get_unary() instead of a custom implementation
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2016-11-25 21:42:33 +01:00
Martin Storsjö
2f99117f6f aarch64: vp9itxfm: Don't repeatedly set x9 when nothing overwrites it
Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-24 13:39:21 +02:00
Alexandra Hájková
178b4ea5f9 xsubdec: Convert to the new bitstream reader 2016-11-24 11:22:13 +01:00
Alexandra Hájková
be35ef92a4 xan: Convert to the new bitstream reader 2016-11-24 11:22:13 +01:00
Alexandra Hájková
f9c59f26c8 wnv1: Convert to the new bitstream reader 2016-11-24 11:22:13 +01:00
Alexandra Hájková
0536e7d782 vima: Convert to the new bitstream reader 2016-11-24 11:22:12 +01:00
Alexandra Hájková
e5bdfc6790 vble: Convert to the new bitstream reader 2016-11-24 11:22:12 +01:00
Alexandra Hájková
104a4289f9 utvideodec: Convert to the new bitstream reader 2016-11-24 11:22:12 +01:00
Alexandra Hájková
85f760fedd twinvq: Convert to the new bitstream reader 2016-11-24 11:22:12 +01:00
Alexandra Hájková
0bea79afa6 tscc2: Convert to the new bitstream reader 2016-11-24 11:22:12 +01:00
Alexandra Hájková
8e4cadea5d truespeech: Convert to the new bitstream reader 2016-11-24 11:22:12 +01:00
Alexandra Hájková
0ac07d0b8d tiertex: Convert to the new bitstream reader 2016-11-24 11:22:12 +01:00
Alexandra Hájková
9ab1a3e283 truemotion2: Convert to the new bitstream reader 2016-11-24 11:22:12 +01:00
Alexandra Hájková
9f78e3a46d svq1dec: Convert to the new bitstream reader 2016-11-24 11:22:12 +01:00
Alexandra Hájková
6efbc88a5c smacker: Convert to the new bitstream reader 2016-11-24 11:22:11 +01:00
Alexandra Hájková
087bc8d704 sipr: Convert to the new bitstream reader 2016-11-24 11:22:11 +01:00
Alexandra Hájková
f26cbb555b rtjpeg: Convert to the new bitstream reader 2016-11-24 11:22:11 +01:00
Alexandra Hájková
c60cda7cb4 ra288: Convert to the new bitstream reader 2016-11-24 11:22:11 +01:00
Alexandra Hájková
7d8075cf47 ra144: Convert to the new bitstream reader 2016-11-24 11:22:11 +01:00
Martin Storsjö
79566ec8c7 arm: vp9itxfm: Rename a macro parameter to fit better
Since the same parameter is used for both input and output,
the name inout is more fitting.

This matches the naming used below in the dmbutterfly macro.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-23 23:56:56 +02:00
Martin Storsjö
721bc37522 arm/aarch64: vp9itxfm: Fix indentation of macro arguments
Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-23 23:56:16 +02:00
James Almer
aa498c3183 avpacket: fix leak on realloc in av_packet_add_side_data()
If realloc fails, the pointer is overwritten and the previously allocated buffer
is leaked, which goes against the expected functionality of keeping the packet
unchanged in case of error.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-23 13:17:52 +01:00
Andreas Cadhalpun
f92d7bdfdd libopusdec: default to stereo for invalid number of channels
This fixes an out-of-bounds read if avc->channels is 0.

Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-23 13:03:15 +01:00
Diego Biurrun
b34c6cd57a dvbsub: cosmetics: Group all debug code together 2016-11-23 07:40:46 +01:00
Diego Biurrun
b8cd7a3c8d dvbsub: Check for errors from system()
libavcodec/dvbsubdec.c:145:5: warning: ignoring return value of ‘system’, declared with attribute warn_unused_result [-Wunused-result]
libavcodec/dvbsubdec.c:148:5: warning: ignoring return value of ‘system’, declared with attribute warn_unused_result [-Wunused-result]
2016-11-23 07:36:32 +01:00
Diego Biurrun
6427379f23 als: Restructure DEBUG ifdefs to avoid unused function parameter warnings 2016-11-22 17:28:17 +01:00
Diego Biurrun
367f95af55 ac3enc: Restructure DEBUG ifdefs to avoid unused function parameter warnings 2016-11-22 17:28:17 +01:00
Diego Biurrun
81a3c42abe Drop some bogus Doxygen documentation. 2016-11-21 14:29:11 +01:00
Diego Biurrun
a1d9de304f Fix some mismatches between function parameter and doxygen parameter names. 2016-11-21 14:29:10 +01:00
Martin Storsjö
4d960a1185 aarch64: vp9itxfm: Use w3 instead of x3 for the int eob parameter
The clobbering tests in checkasm are only invoked when testing
correctness, so this bug didn't show up when benchmarking the
dc-only version.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-18 23:17:33 +02:00
Janne Grunau
e5b0fc170f arm: vp9itxfm: Simplify the stack alignment code
This is one instruction less for thumb, and only have got
1/2 arm/thumb specific instructions.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-18 23:17:26 +02:00