MS Seminar


Name of the Speaker: Mr. Kuppa Sai Sri Teja (EE23S042)
Guide: Dr. Kaushik Mitra
Online meeting link: http://meet.google.com/fqv-dmdi-akv
Date/Time: 7th May 2025 (Wednesday), 3:30 PM
Title: Smarter Surveillance Systems with AI-Driven Data Diversity and Novel Imaging for Robust 3D Capture.

Abstract:

Modern surveillance systems are indispensable across domains such as industrial safety, public infrastructure, and healthcare. However, their effectiveness is often hindered by challenging real-world conditions, including low-light environments, motion blur, lens flare, and extreme illumination. These visual degradations directly impact the performance of downstream AI tasks such as person tracking and person re-identification. This thesis addresses these challenges on two fundamental fronts: (i) improving the diversity of training data using generative AI, and (ii) leveraging a novel imaging modality to enhance visual input under challenging conditions.

To enhance dataset diversity, we present DIVA, which makes two key contributions: (1) a large-scale collection of high-resolution traditional Indian garments, and (2) a diffusion-based virtual try-on model that improves diversity by using scribble maps and pose-aware images as priors. DIVA enhances person-centric models, particularly in contexts where real-world annotated data is limited or difficult to acquire.

To improve 3D reconstruction, we leverage a novel camera system. We present PhotonSplat, a neural rendering pipeline designed to reconstruct scenes under extremely low-light and motion-degraded conditions using Single-Photon Avalanche Diode (SPAD) sensors. While SPAD sensors can capture ultra-fast photon events, they inherently suffer from high binary noise. PhotonSplat mitigates this with spatially-aware 3D filtering that reduces noise while preserving scene structure. Additionally, our work integrates video colorization models and severely blurred reference images to colorize the reconstructed 3D scenes, enabling practical use in detection and editing applications. To support these efforts, we also introduce PhotonScenes, a real-world multi-view dataset captured with SPAD arrays under adverse visual conditions. By combining synthetic data generation with sensor-level image reconstruction, this work contributes toward better surveillance systems for complex operational environments.
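
To make the SPAD imaging idea concrete, the following is a minimal, illustrative Python sketch, not the actual PhotonSplat pipeline: it assumes a simple Bernoulli photon-detection model for 1-bit SPAD frames and uses a plain 2D Gaussian blur as a stand-in for the spatially-aware 3D filtering described above; all names, shapes, and parameters here are hypothetical.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def estimate_intensity(binary_frames, sigma=1.5, eps=1e-6):
        """Estimate a linear intensity image from a stack of 1-bit SPAD frames.

        binary_frames: (T, H, W) array of 0/1 photon detections.
        Under a Bernoulli photon-detection model, the per-pixel detection
        rate p relates to photon flux phi as p = 1 - exp(-phi), so
        phi = -ln(1 - p).
        """
        p = binary_frames.mean(axis=0)          # temporal average of binary detections
        p = np.clip(p, 0.0, 1.0 - eps)          # avoid log(0) at saturated pixels
        flux = -np.log1p(-p)                    # invert the Bernoulli model
        # Plain 2D smoothing; PhotonSplat instead filters in 3D with scene awareness.
        return gaussian_filter(flux, sigma=sigma)

    # Toy usage: 500 simulated binary frames of a dim synthetic scene.
    rng = np.random.default_rng(0)
    true_flux = 0.05 + 0.2 * rng.random((64, 64))
    frames = (rng.random((500, 64, 64)) < (1 - np.exp(-true_flux))).astype(np.float32)
    print(estimate_intensity(frames).shape)     # (64, 64)

Even in this toy model, averaging many binary frames trades temporal resolution for signal, which is the trade-off that motivates filtering jointly across space and 3D scene structure rather than per frame.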