PhD Seminar

Name of the Speaker: Ms. Joshitha R (EE19D701)
Guide: Dr. Mansi Sharma
Co-Guide: Dr. Kaushik Mitra
Venue: ESB-244 (Seminar Hall)
Online meeting link: https://meet.google.com/sgy-nibv-xbr
Date/Time: 7th May 2024 (Tuesday), 10:00 AM
Title: Efficient Representation, Coding and Streaming of Light Field Videos for Glasses-free 3D Displays

Abstract

Applications of light fields for autostereoscopic, or glasses-free, displays have recently gained interest in the research community, since computational multi-view light field displays can enhance the viewing experience by providing a more immersive and realistic view. A richer 3D representation of a moving environment can be obtained from a dynamic light field, or light field video. However, unlike conventional video, working with dynamic light fields imposes additional storage and transmission requirements and entails longer processing times. This results in higher data rates across all devices and services used for light field exchange and display. Efficient representation and coding of light field videos, capitalizing on the inherent redundancies in the spatial, angular, and temporal domains, is therefore necessary for streaming and display applications.

In this talk, I shall discuss data-driven algorithms that efficiently represent and encode dynamic light field data. I will present an integrated deep learning network that synchronously applies aperture coding and pixel-wise exposure coding to light field video, producing a single acquired image within a single exposure time. A data-driven dynamic mode decomposition (DMD) based approach, applied to these learnt acquired images arranged as time snapshots, effectively exploits the underlying spatial, angular, and temporal correlations. In addition, High Efficiency Video Coding (HEVC) removes intra-frame, inter-frame, and other intrinsic redundancies while maintaining reconstruction quality across various quantization parameters. The proposed scheme thus treats dynamic light fields as mathematical dynamical systems and leverages the dynamic modes of acquired images obtained via coded-aperture patterns. I will also compare the scheme with other data-driven light field coding methods, involving (a) block singular value decomposition in a Krylov subspace and (b) Tucker decomposition with tensor sketching.
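As a rough, self-contained illustration of the dynamical-systems view (a sketch, not the speaker's implementation), the Python fragment below simulates a hypothetical coded acquisition that collapses angular views and pixel-wise exposures into one image per frame, then runs exact DMD on the vectorized acquired images arranged as time snapshots; all masks, dimensions, and the truncation rank r are illustrative placeholders.

import numpy as np

def dmd(snapshots, r):
    """snapshots: (pixels, frames) matrix of vectorized acquired images."""
    X1, X2 = snapshots[:, :-1], snapshots[:, 1:]        # time-shifted snapshot pairs
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)   # low-rank snapshot basis
    Ur, sr, Vr = U[:, :r], s[:r], Vh[:r].conj().T       # rank-r truncation
    A_tilde = Ur.conj().T @ X2 @ Vr / sr                # reduced linear operator
    eigvals, W = np.linalg.eig(A_tilde)                 # temporal dynamics
    modes = X2 @ Vr / sr @ W                            # exact DMD modes
    amps = np.linalg.lstsq(modes, snapshots[:, 0], rcond=None)[0]
    return modes, eigvals, amps

# Hypothetical coded acquisition: a per-view aperture code and a pixel-wise
# shutter code collapse each multi-view frame into a single sensor image.
T, V, H, W = 30, 5, 64, 64                               # frames, views, image size
L = np.random.rand(T, V, H, W)                           # toy light field video
aperture = np.random.rand(V)                             # per-view aperture weights
shutter = (np.random.rand(T, H, W) > 0.5).astype(float)  # pixel-wise exposure mask
acquired = np.einsum('v,tij,tvij->tij', aperture, shutter, L)
X = acquired.reshape(T, -1).T                            # one snapshot per column

modes, lams, b = dmd(X, r=8)
time_dyn = np.vander(lams, T, increasing=True)           # lams[k] ** t
X_hat = ((modes * b) @ time_dyn).real                    # rank-8 approximation of X

Each mode/eigenvalue pair captures a coherent spatio-temporal pattern, so the truncation rank directly trades compactness against fidelity; per the description above, HEVC would then further compress the resulting representation.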

Overall, the algorithms exploit intra-view, inter-view, and other redundancies among light field views in the spatial, angular, and temporal directions to enable scalable light field video coding. The schemes provide flexible encoding of the input dynamic light field, satisfying diverse data rates through the choice of approximation rank and quantization parameters. Compression rates can be tailored to meet specific storage or bandwidth constraints without compromising the quality of the reconstructed light field video. The proposed scheme can complement other existing or future light field coding or video processing methods. The coding approaches also apply to light field streaming in augmented reality (AR), virtual reality (VR), and 3D platforms.
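To make the rate/quality knob concrete, the toy Python example below uses a plain truncated SVD as a stand-in for the block-SVD and Tucker variants mentioned earlier, showing how the approximation rank scales the number of stored coefficients against reconstruction PSNR (all data and sizes are synthetic).

import numpy as np

Y = np.random.rand(4096, 64)                    # toy matrix: one vectorized view per column
U, s, Vh = np.linalg.svd(Y, full_matrices=False)
for r in (4, 8, 16, 32):                        # candidate approximation ranks
    Y_r = (U[:, :r] * s[:r]) @ Vh[:r]           # rank-r reconstruction
    n_coeffs = r * (U.shape[0] + Vh.shape[1] + 1)        # factor entries to store/transmit
    psnr = 10 * np.log10(1.0 / np.mean((Y - Y_r) ** 2))  # peak value 1.0 for [0, 1) data
    print(f"rank {r:2d}: {n_coeffs:7d} coefficients, PSNR {psnr:5.1f} dB")

Higher ranks retain more of the signal at a larger coefficient budget, while the downstream codec's quantization parameter provides a second, independent control, which is the flexibility described above.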