Neural Compression of 360-Degree Equirectangular Videos using Quality Parameter Adaptation
Abstract
This study proposes a practical approach for compressing 360-degree equirectangular videos using pretrained neural video compression (NVC) models. Without requiring additional training or changes to the model architecture, the proposed method extends quantization parameter adaptation techniques from traditional video codecs to NVC, exploiting the spatially varying sampling density of equirectangular projections. We introduce latitude-based adaptive quality parameters obtained through rate-distortion optimization for NVC. The proposed method uses vector bank interpolation for latent modulation, enabling flexible adaptation with arbitrary quality parameters and mitigating the rounding errors that limit conventional adaptive quantization parameters. Experimental results demonstrate that applying this method to the DCVC-RT framework yields BD-Rate savings of 5.2% in terms of the weighted spherical PSNR (WS-PSNR) for the JVET class S1 test sequences, with only a 0.3% increase in processing time.
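For reference, WS-PSNR weights each pixel's squared error by the cosine of its latitude so that the oversampled rows near the poles of an equirectangular frame do not dominate the score. The sketch below follows the standard definition for a single luma plane; the function name and NumPy implementation are our own illustration, not code from the paper.

```python
import numpy as np

def ws_psnr(ref: np.ndarray, rec: np.ndarray, max_val: float = 255.0) -> float:
    """WS-PSNR of a reconstructed equirectangular luma plane (H x W arrays).

    Each row is weighted by cos(latitude), matching the area the row actually
    covers on the sphere.
    """
    h, w = ref.shape
    rows = np.arange(h)
    # Latitude of each row centre, in radians (0 at the equator, +/- pi/2 at the poles).
    lat = (rows + 0.5 - h / 2.0) * np.pi / h
    weights = np.cos(lat)[:, None]                       # (H, 1), broadcast over width
    err = ref.astype(np.float64) - rec.astype(np.float64)
    wmse = np.sum(weights * err ** 2) / (np.sum(weights) * w)
    return 10.0 * np.log10(max_val ** 2 / wmse)
```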
Summary
This paper addresses the challenge of efficiently compressing 360-degree equirectangular videos with neural video compression (NVC) techniques. The core problem is that equirectangular projection introduces spatial distortion, particularly at higher latitudes, which degrades compression performance if left unaddressed.

The authors propose a practical solution that adapts quantization parameter (QP) adjustment techniques, commonly used in traditional video codecs, to the NVC domain. This is achieved without retraining or modifying the architecture of a pretrained NVC model (DCVC-RT). The method leverages the inherent relationship between latitude and spatial sampling density in equirectangular projections by introducing latitude-based adaptive quality parameters, which are tuned through rate-distortion (RD) optimization. To overcome the rounding errors that limit traditional QP adaptation, the method employs vector bank interpolation for latent modulation, allowing flexible adaptation with arbitrary quality parameters.

The key finding is that this approach yields significant compression gains for 360-degree video: a 5.2% BD-Rate reduction (WS-PSNR) on the JVET class S1 test sequences when applied to the DCVC-RT framework, at a negligible cost of only a 0.3% increase in processing time. This is a valuable contribution because it offers a practical and efficient way to improve the compression of 360-degree videos with existing NVC models, avoiding specialized training or architectural changes. This is particularly important as 360-degree video applications become more prevalent and demand efficient encoding and decoding solutions.
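As a rough illustration of the latitude-to-quality mapping, the sketch below uses a classic heuristic from traditional codecs: because the quantization step roughly doubles for every +6 QP, allowing the distortion to grow by 1/cos(latitude) corresponds to a per-row offset of about -3·log2(cos(latitude)). This is only a stand-in; the paper derives its per-latitude quality parameters for DCVC-RT through rate-distortion optimization, and the function below (its name and scale constant included) is our own assumption.

```python
import numpy as np

def latitude_qp_map(height: int, base_qp: int) -> np.ndarray:
    """Per-row QP values for an equirectangular frame (illustrative heuristic).

    Rows near the poles are oversampled by roughly 1/cos(latitude), so they can
    be quantized more coarsely without hurting spherical quality. Adding about
    -3 * log2(cos(latitude)) to the base QP keeps the weighted distortion
    approximately uniform under the usual QP-to-quantization-step relation.
    """
    rows = np.arange(height)
    lat = (rows + 0.5 - height / 2.0) * np.pi / height   # latitude of each row centre
    w = np.clip(np.cos(lat), 1e-3, None)                 # spherical (WS-PSNR) weight
    delta_qp = np.round(-3.0 * np.log2(w)).astype(int)   # coarser toward the poles
    return base_qp + delta_qp
```

Note that rounding the offsets to integer QP values is exactly the limitation the paper sidesteps in the NVC setting: its vector bank interpolation accepts continuous quality parameters, so no rounding is needed.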
Key Insights
- Novel Adaptation: The paper successfully adapts a well-established technique from traditional video coding (QP adaptation) to the relatively new field of neural video compression.
- Latitude-Based QP Adjustment: It exploits the properties of the equirectangular projection format by adjusting the quality parameter with latitude, accounting for the projection's varying sampling density.
- Vector Bank Interpolation: Vector bank interpolation for latent modulation allows continuous quality parameter adaptation, avoiding the rounding errors that arise from discrete QP values (a minimal sketch follows this list).
- Performance Improvement: The proposed method achieves a 5.2% BD-Rate reduction in terms of WS-PSNR on the JVET class S1 sequences, a significant improvement in compression efficiency.
- Computational Efficiency: The method incurs only a 0.3% increase in processing time, making it a practical solution for real-time 360-degree video compression.
- LDP Configuration Stability: Unlike VVC, the proposed method achieves stable improvements across all sequences under the Low-Delay P (LDP) configuration.
- Implicit Spatial and Temporal Quality Modulation: The quality adaptation mechanism in DCVC-RT modulates spatial and temporal quality simultaneously, which underlies the stable gains in the LDP configuration.
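To make the vector bank interpolation idea concrete, the sketch below linearly blends the two neighbouring entries of a bank of learned channel-wise modulation vectors and applies the result to a latent row by row. The names (q_bank, modulate_latent) and the exact modulation scheme are hypothetical and may differ from DCVC-RT's internal quality embedding; the point is only that a fractional quality index never has to be rounded to a discrete level.

```python
import torch

def interp_quality_vector(q_bank: torch.Tensor, q: float) -> torch.Tensor:
    """Blend the two bank entries surrounding a continuous quality index.

    q_bank: (N, C) learned modulation vectors, one per discrete quality level.
    q:      continuous index in [0, N - 1]; the fractional part sets the mix.
    """
    n = q_bank.shape[0]
    q = min(max(float(q), 0.0), float(n - 1))
    lo = int(q)                      # lower bank index
    hi = min(lo + 1, n - 1)          # upper bank index
    frac = q - lo                    # interpolation weight
    return (1.0 - frac) * q_bank[lo] + frac * q_bank[hi]

def modulate_latent(latent: torch.Tensor, q_bank: torch.Tensor, q_rows: torch.Tensor) -> torch.Tensor:
    """Scale a (C, H, W) latent with per-row quality indices (H,) derived from latitude."""
    vecs = torch.stack([interp_quality_vector(q_bank, q) for q in q_rows.tolist()])  # (H, C)
    return latent * vecs.t().unsqueeze(-1)   # broadcast (C, H, 1) across the width
```

Because the per-row indices can be any real values, the latitude-dependent quality offsets can be applied exactly as computed, rather than snapped to the nearest discrete level as in traditional QP adaptation.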
Practical Implications
- Improved 360-Degree Video Streaming: The method can be directly applied to improve the streaming quality and reduce the bandwidth requirements for 360-degree video content.
- Beneficiaries: Content providers, video streaming platforms, and users of 360-degree video applications (e.g., VR, AR) would benefit from the improved compression efficiency.
- Integration with Existing NVC Frameworks: Engineers can integrate the proposed QP adaptation technique into existing NVC frameworks like DCVC-RT with relative ease, as it requires neither retraining nor major architectural changes.
- Scalable and Interoperable 360-Degree Video Services: The authors suggest future work on models that optimize encoding efficiency while maintaining backward compatibility with conventional videos, enabling scalable and interoperable 360-degree video services.
- Future Research: The paper opens up avenues for further research into adaptive quality parameter control in NVC, particularly for different projection formats and distortion metrics relevant to immersive video.