Change #252791
| Category | ffmpeg |
| Changed by | Andreas Rheinhardt <andreas.rheinhardt@outlook.com> |
| Changed at | Fri 19 Dec 2025 20:55:37 |
| Repository | https://git.ffmpeg.org/ffmpeg.git |
| Project | ffmpeg |
| Branch | master |
| Revision | 6368d2baaea8121f9fa23fb40edb5308690d699d |
Comments
avcodec/x86/lossless_videodsp: Don't store in eight byte chunks

Use movu (movdqu) instead of movq+movhps.

Old benchmarks:
add_left_pred_int16_c:        2265.5 ( 1.00x)
add_left_pred_int16_ssse3:     595.4 ( 3.81x)
add_left_pred_rnd_acc_c:      1255.0 ( 1.00x)
add_left_pred_rnd_acc_ssse3:   326.2 ( 3.85x)
add_left_pred_rnd_acc_avx2:    279.0 ( 4.50x)
add_left_pred_zero_c:         1249.5 ( 1.00x)
add_left_pred_zero_ssse3:      326.1 ( 3.83x)
add_left_pred_zero_avx2:       277.0 ( 4.51x)

New benchmarks:
add_left_pred_int16_c:        2266.9 ( 1.00x)
add_left_pred_int16_ssse3:     509.9 ( 4.45x)
add_left_pred_rnd_acc_c:      1251.4 ( 1.00x)
add_left_pred_rnd_acc_ssse3:   282.6 ( 4.43x)
add_left_pred_rnd_acc_avx2:    208.9 ( 5.99x)
add_left_pred_zero_c:         1253.7 ( 1.00x)
add_left_pred_zero_ssse3:      280.0 ( 4.48x)
add_left_pred_zero_avx2:       206.8 ( 6.06x)

The checkasm test has been modified to use an unaligned destination for this test.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
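The store change described in the commit message can be illustrated with a rough C sketch using SSE2 intrinsics. This is not the code from libavcodec/x86/lossless_videodsp.asm. The function names `store_halves`, `store_full` and `stores_match` are hypothetical and exist only for this illustration. The sketch assumes that `_mm_storel_epi64` and `_mm_storeh_pi` correspond to movq and movhps, and that `_mm_storeu_si128` corresponds to movdqu. On builds without SSE2 it falls back to plain `memcpy`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: write 16 bytes as two 8-byte halves,
 * analogous to the old movq + movhps store pair. */
static void store_halves(uint8_t *dst, const uint8_t *src)
{
#if defined(__SSE2__)
    __m128i v = _mm_loadu_si128((const __m128i *)src);
    _mm_storel_epi64((__m128i *)dst, v);                    /* low 8 bytes  */
    _mm_storeh_pi((__m64 *)(dst + 8), _mm_castsi128_ps(v)); /* high 8 bytes */
#else
    memcpy(dst, src, 8);
    memcpy(dst + 8, src + 8, 8);
#endif
}

/* Hypothetical helper: write all 16 bytes with one unaligned store,
 * analogous to the new movdqu store. */
static void store_full(uint8_t *dst, const uint8_t *src)
{
#if defined(__SSE2__)
    __m128i v = _mm_loadu_si128((const __m128i *)src);
    _mm_storeu_si128((__m128i *)dst, v);
#else
    memcpy(dst, src, 16);
#endif
}

/* Returns 1 if both variants write identical bytes at dst + offset.
 * A nonzero offset gives an unaligned destination. */
static int stores_match(size_t offset)
{
    uint8_t src[16], a[32] = {0}, b[32] = {0};

    assert(offset <= 16); /* keep the 16-byte write inside the buffers */
    for (int i = 0; i < 16; i++)
        src[i] = (uint8_t)(i * 7 + 1);

    store_halves(a + offset, src);
    store_full(b + offset, src);

    return memcmp(a, b, sizeof(a)) == 0 &&
           memcmp(a + offset, src, 16) == 0;
}
```

Both helpers write the same bytes and neither requires an aligned destination, so calling `stores_match` with an odd offset exercises the unaligned case that the commit message says the checkasm test now covers. The sketch only checks that the results are equal. It says nothing about speed, which is what the benchmarks in the commit message measure.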
Changed files
- libavcodec/x86/lossless_videodsp.asm