PhD Viva


Name of the Speaker: Mohit Lamba (EE18D009)
Guide: Dr. Kaushik Mitra
Online meeting link: meet.google.com/dau-obru-yxp
Date/Time: 25th November 2022, 4.00pm
Title: Fast and Efficient Restoration of Dark Multi-View Images

Abstract

Low-light image enhancement has been an actively researched area for decades and has produced excellent night-time single-image restoration methods. Years of research have produced state-of-the-art algorithms exploiting techniques ranging from histogram equalization to Retinex theory and, more recently, convolutional neural networks. Despite these advances, the existing literature on low-light enhancement has two major limitations: (a) current methods are by design limited to single-image enhancement, even though the plight of low-light conditions is equally shared by all optical systems, and (b) no existing method can restore extremely dark night-time images captured in near-zero lux conditions within a reasonable computational budget. Addressing these problems can endow night-time capabilities to several devices and applications, such as smartphones and self-driving cars.

In this thesis, we present deep learning architectures for restoring extreme low-light images captured in different modalities, namely single/monocular image restoration, stereo image restoration, and Light Field (LF) restoration. A practical low-light solution must also respect constraints such as limited GPU memory and processing power, and should strike a balance between network latency, memory utilization, model parameters, and reconstruction quality. Existing methods, however, target only restoration quality and compromise on speed and memory requirements, raising concerns about their real-world usability. Our models are exceptionally lightweight, remarkably fast, and produce restorations that are perceptually on par with state-of-the-art computationally intense models.

For monocular image restoration, we do most of the processing in the higher scale-spaces, skipping the intermediate scales wherever possible. Also unique to our model is the ability to process all the scale-spaces concurrently, offering an additional 30% speed-up without compromising the restoration quality. Pre-amplification of the dark raw image is an important step in extreme low-light image enhancement. Most of the existing state-of-the-art methods need the ground-truth exposure value to estimate the pre-amplification factor, which is not practically feasible. We therefore propose an amplifier module that estimates the amplification factor using only the input raw image and can be used off-the-shelf with pre-trained models without any fine-tuning. We show that our model can restore an ultra-high-definition 4K resolution image in just 1 second on a CPU and at 32 fps on a GPU while maintaining a competitive restoration quality. We also show that our proposed model, without any fine-tuning, generalizes well to downstream tasks such as object detection.

We also propose a lightweight and fast hybrid U-Net architecture for low-light stereo image enhancement. In the initial few scale-spaces, we process the left and right features individually, because the two features do not align well due to large disparity. At coarser scale-spaces, the disparity between the left and right features decreases and the network's receptive field increases. We use this fact to reduce computations by processing the left and right features simultaneously, which also benefits epipole preservation. Since, for fast inference, our architecture does not use any 3D convolutions, we train the network with an Epipole-Aware loss module. This module computes quick and coarse depth estimates to better enforce the epipolar constraints.
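To make the monocular design above concrete, the following is a minimal PyTorch-style sketch of concurrent scale-space processing. It is a toy illustration, not the thesis architecture: each scale's branch reads a resized copy of the raw input directly, so no branch waits on another. All module names, channel counts, and the 1/8 scale are illustrative assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ConcurrentScaleNet(nn.Module):
        """Toy sketch: each scale-space branch sees only a resized copy of
        the input, so the branches are independent and can run concurrently."""
        def __init__(self, ch=16):
            super().__init__()
            # One lightweight branch per scale; sizes are arbitrary choices.
            self.branch_full = nn.Conv2d(4, ch, 3, padding=1)   # full resolution
            self.branch_low  = nn.Conv2d(4, ch, 3, padding=1)   # 1/8 resolution
            self.fuse = nn.Conv2d(2 * ch, 4, 3, padding=1)

        def forward(self, raw):                  # raw: (B, 4, H, W) packed Bayer
            f_full = self.branch_full(raw)
            low = F.interpolate(raw, scale_factor=1/8, mode='bilinear',
                                align_corners=False)
            f_low = self.branch_low(low)
            f_low = F.interpolate(f_low, size=raw.shape[-2:], mode='bilinear',
                                  align_corners=False)
            return self.fuse(torch.cat([f_full, f_low], dim=1))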
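The amplifier module can likewise be sketched as a tiny network that maps the dark raw image itself to a single gain factor, so no ground-truth exposure metadata is required. This is a hypothetical illustration: the layer sizes and the max_gain bound are assumptions, not values from the thesis.

    import torch
    import torch.nn as nn

    class Amplifier(nn.Module):
        """Toy sketch: predict a scalar pre-amplification factor from the
        dark raw image alone, with no ground-truth exposure value."""
        def __init__(self, max_gain=300.0):     # max_gain is an assumed bound
            super().__init__()
            self.max_gain = max_gain
            self.features = nn.Sequential(
                nn.Conv2d(4, 8, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(8, 8, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1))        # global image statistics
            self.to_gain = nn.Linear(8, 1)

        def forward(self, raw):                 # raw: (B, 4, H, W)
            g = self.to_gain(self.features(raw).flatten(1))
            gain = 1.0 + (self.max_gain - 1.0) * torch.sigmoid(g)  # in [1, max]
            return raw * gain.view(-1, 1, 1, 1) # amplified raw for the restorer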
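For the stereo hybrid U-Net, the key structural idea, per-view processing at fine scales and joint processing only at coarse scales, can be sketched as follows; again this is an illustrative toy with assumed module names and sizes, not the thesis network.

    import torch
    import torch.nn as nn

    class HybridStereoEncoder(nn.Module):
        """Toy sketch: left/right views are encoded separately at fine scales
        (large disparity, poor alignment) and concatenated for joint
        processing only at the coarsest scale, where disparity shrinks and
        the receptive field is large."""
        def __init__(self, ch=16):
            super().__init__()
            self.fine = nn.Sequential(          # per-view encoder, weights shared
                nn.Conv2d(3, ch, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.ReLU())
            self.joint = nn.Conv2d(2 * ch, 2 * ch, 3, padding=1)  # both views

        def forward(self, left, right):         # (B, 3, H, W) each
            fl, fr = self.fine(left), self.fine(right)
            return self.joint(torch.cat([fl, fr], dim=1))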
Extensive benchmarking in terms of visual enhancement and downstream depth estimation shows that our architecture not only performs significantly better but also offers a 4-60x speed-up with 15-100x fewer floating-point operations, making it suitable for real-world applications.

To facilitate learning-based techniques for low-light LF imaging, we collected a comprehensive LF dataset of various scenes. For each scene, we captured four LFs: one with near-optimal exposure and ISO settings, and the others at different levels of low-light conditions varying from low to extreme low-light settings. We also propose the L3F-wild dataset, which contains LFs captured late at night at almost zero lux.

Existing single-frame low-light enhancement techniques do not harness the geometric cues present in the different LF views and so lead to either blurry or overly noisy restorations. Hence, we propose deep neural network architectures for LFs. Our networks not only visually enhance each LF view but also preserve the epipolar geometry across views. We achieve this by extracting both global and view-specific features and later fusing them appropriately using our RNN-inspired feedforward network.

Our LF networks can also be used for low-light enhancement of single-frame images, despite being engineered for LF data. We do so by proposing a transformation that converts any single-frame DSLR image into a pseudo-LF. This allows the same architecture to be used for both LF and single-image low-light enhancement. With all of the above advantages intact, our latest LF-based neural network offers considerable speed-up with a significantly lower memory footprint.
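As a rough illustration of fusing global and view-specific features, the sketch below applies a shared per-view encoder and a cheap global summary, and fuses them one view at a time, loosely like an unrolled RNN. It is a toy under assumed shapes and names, not the thesis network.

    import torch
    import torch.nn as nn

    class ViewFusion(nn.Module):
        """Toy sketch: a global feature summarising all LF views is fused
        with each view's own features, one feedforward step per view."""
        def __init__(self, ch=16):
            super().__init__()
            self.view_enc   = nn.Conv2d(3, ch, 3, padding=1)    # per view
            self.global_enc = nn.Conv2d(3, ch, 3, padding=1)    # all views
            self.fuse       = nn.Conv2d(2 * ch, 3, 3, padding=1)

        def forward(self, lf):                  # lf: (B, V, 3, H, W), V views
            g = self.global_enc(lf.mean(dim=1)) # cheap global summary
            out = []
            for v in range(lf.shape[1]):        # one step per view
                f = self.view_enc(lf[:, v])
                out.append(self.fuse(torch.cat([f, g], dim=1)))
            return torch.stack(out, dim=1)      # restored views, same layout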
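The pseudo-LF transformation can be pictured as cropping a grid of slightly shifted windows from a single image, mimicking the small baseline between LF sub-aperture views. The function below is a hypothetical sketch; the views and shift values are arbitrary assumptions, not the thesis parameters.

    import torch

    def pseudo_light_field(img, views=3, shift=8):
        """Toy sketch: fake an LF from one image by extracting a views x views
        grid of windows, each offset by `shift` pixels, so that the same LF
        architecture can consume a single DSLR image."""
        _, H, W = img.shape                     # img: (3, H, W)
        h = H - (views - 1) * shift
        w = W - (views - 1) * shift
        grid = [img[:, r * shift: r * shift + h, c * shift: c * shift + w]
                for r in range(views) for c in range(views)]
        return torch.stack(grid)                # (views * views, 3, h, w)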