Lightweight Multitask Learning for Robust JND Prediction using Latent Space and Reconstructed Frames
Sanaz Nami, Farhad Pakdaman, Mahmoud Reza Hashemi, Shervin Shirmohammadi, Moncef Gabbouj
Corresponding Author: Sanaz Nami ([email protected])
Abstract

The Just Noticeable Difference (JND) refers to the smallest distortion in an image or video that can be perceived by the Human Visual System (HVS), and is widely used in optimizing image/video compression. However, accurate JND modeling is very challenging due to its content dependence and the complex nature of the HVS. Recent solutions train deep learning-based JND prediction models, mainly based on a Quantization Parameter (QP) value representing a single JND level, and train separate models to predict each JND level. We point out that a single QP distance is insufficient to properly train a network with millions of parameters for a complex, content-dependent task. Inspired by recent advances in learned compression and multitask learning, we propose to address this problem by (1) learning to reconstruct the JND-quality frames jointly with the QP prediction, and (2) jointly learning several JND levels to augment the learning performance. We propose a novel solution where first, an effective feature backbone is trained by learning to reconstruct JND-quality frames from the raw frames. Second, JND prediction models are trained based on features extracted from the latent space (i.e., the compressed domain) or from the reconstructed JND-quality frames. Third, a multi-JND model is designed, which jointly learns three JND levels, further reducing the prediction error. Extensive experimental results demonstrate that our multi-JND method outperforms the state-of-the-art and achieves an average JND1 prediction error of only 1.57 in QP and 0.72 dB in PSNR. Moreover, the multitask learning approach and compressed-domain prediction facilitate lightweight inference by significantly reducing the complexity and the number of parameters.
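
The abstract describes a shared backbone that reconstructs JND-quality frames while latent-space heads regress the QPs of several JND levels. The sketch below is not the authors' released code; it is a minimal PyTorch illustration of that multitask structure, where the layer sizes, head design, and loss weights are assumptions made for clarity.

```python
# Minimal multitask sketch: a shared encoder feeds (a) a decoder that reconstructs
# the JND-quality frame and (b) three lightweight heads that predict the QP of
# JND1/JND2/JND3 from the latent (compressed-domain) features.
# All layer sizes and loss weights are illustrative assumptions.
import torch
import torch.nn as nn

class MultiJNDNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Shared feature backbone (encoder into a compressed latent space)
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, 128, 5, stride=2, padding=2), nn.ReLU(),
        )
        # Decoder trained to reconstruct the JND-quality frame from the latent
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),
        )
        # One small regression head per JND level, operating on latent features
        self.jnd_heads = nn.ModuleList([
            nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, 1))
            for _ in range(3)
        ])

    def forward(self, x):
        z = self.encoder(x)                                       # latent features
        recon = self.decoder(z)                                   # JND-quality frame
        qps = torch.cat([head(z) for head in self.jnd_heads], 1)  # (B, 3) QP predictions
        return recon, qps

def multitask_loss(recon, qps, target_frame, target_qps, alpha=1.0, beta=0.1):
    # Joint objective: frame reconstruction + QP regression (weights are assumptions)
    rec = nn.functional.mse_loss(recon, target_frame)
    reg = nn.functional.l1_loss(qps, target_qps)
    return alpha * rec + beta * reg
```

In this sketch, the reconstruction branch acts as an auxiliary task that shapes the latent features, while jointly regressing the three JND levels lets the heads share supervision, which is the motivation the abstract gives for the reduced prediction error and the lightweight compressed-domain inference.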
26 Dec 2023: Submitted to TechRxiv
02 Jan 2024: Published in TechRxiv