This is a small change, but it does have a big impact on bit allocation.
all the regressions marked in the report have no audible
difference (I didn't check them all though), but the improvements can
be heard.
This affects mostly high bit rates. It's related to issue #2686.
In the report, A is the patched version, B is unpatched, all
comparisons show deltas in the form (A-B), so a positive pSNR delta
means a better quality in the patched version, and negative a
regression. Regressions are only considered for pSNR deltas below
-1db, they're considered serious below -6db.
All measurements were done with tiny_psnr.
The summary of the report inline for quick reading:
Files: 58
Bitrates: 6
Tests: 347
Serious Regressions: 0 (0%)
Regressions: 10 (2%)
Improvements: 54 (15%)
Big improvements: 26 (7%)
Worst regression - sine_tester.flac - 384k
- StdDev: 1.68 pSNR: -3.05 maxdiff: -178.00
Best improvement - 07 - Bound.flac - 384k
- StdDev: -1700.05 pSNR: 20.64 maxdiff: -29595.00
Average - StdDev: -55.67 pSNR: 1.20 maxdiff: -1593.00
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Some files produced by the official encoder have up to 16bit of
padding instead of the expected padding to the byte.
Use a self-explanatory macro instead of a simple number.
CC: libav-stable@libav.org
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
9127 -> 8936 decicycles (fate-suite/vc1/SA10143.vc1)
13855 -> 10976 decicycles (fate-suite/vc1/SA20021.vc1)
tests done by the author over this function but with the whole
patchset applied not just this commit
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Several encoders were multiplying the buffer size by 8, in order to get
a bit size. However, the buffer_size argument is for the byte size of
the buffer. We had experienced crashes encoding prores (Anatoliy) at
size 4096x4096.
Change register constraint on the v variable from = to +. This was causing GCC
to think that the v variable was never read and therefore not initialize it.
This fixes about 20 fate failures on mips64el.
Signed-off-by: James Cowgill <james410@cowgill.org.uk>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The float_copy and fmul_and_reverse functions are refactored out from the
multiple copies in this file.
Signed-off-by: James Cowgill <james410@cowgill.org.uk>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The optimized C version of this code actually runs faster than this
version, so remove it.
Signed-off-by: James Cowgill <james410@cowgill.org.uk>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Remove some assembly that the compiler can easily handle optimally on its own.
GCC produces almost identical assembly.
Signed-off-by: James Cowgill <james410@cowgill.org.uk>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Q_fract should have be declared as 'const float*'.
Also fix the constness of some local variables affected by this.
Signed-off-by: James Cowgill <james410@cowgill.org.uk>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
GCC is perfectly happy generating optimized multiplication code on its own for
64-bit arches. GCC refuses to optimize the loongson code when in 32-bit mode,
so I've left that.
Signed-off-by: James Cowgill <james410@cowgill.org.uk>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>