L2S-Talk by Prof. Babu (IISc Bangalore): Controllable and 3D-Aware Diffusion Models, and Leveraging Vision-Language Models for Domain Generalization

Monday, 22 September, 14:00-15:00 via Zoom (link upon request)

The talk is part of the ZESS lecture series and is hosted by the DFG research unit “Learning to Sense” (L2S).

Generative diffusion models have transformed image synthesis, with ongoing challenges in controllability, fairness, and 3D-awareness. We present recent advances that address these issues through methods for distribution-guided debiasing, fine-grained orientation and placement control, expressive multi-entity composition, and realistic visual effects such as mirror reflections. Complementing this line of work, we discuss a separate direction focused on domain generalization in image classification, where vision-language models such as CLIP are leveraged to transfer knowledge across distributions and improve robustness under shifts. Together, these two strands highlight progress both in expanding the controllability of generative models and in enabling generalizable recognition systems, pointing toward more adaptable and reliable visual AI.

Prof. Dr. R. Venkatesh Babu is a full Professor and Chair of the Dept. of Computational and Data Sciences (CDS) at IISc, where he leads the Vision and AI Lab (VAL), formerly the Video Analytics Lab. He received his Ph.D. degree from the Dept. of Electrical Engineering, Indian Institute of Science, Bangalore. Thereafter, he held postdoctoral positions at NTNU, Norway, and IRISA/INRIA, Rennes, France, through an ERCIM fellowship. Subsequently, he worked as a research fellow at NTU, Singapore. He spent a couple of years working in industry before returning to the institute in August 2010. His research interests include Computer Vision, Machine Learning, and Multimedia. He is a recipient of the SERB Star (2020) and Satish Dhawan Young Engineer (2019) awards. He served as a Program Chair of the AIML-Systems 2023 conference. He serves as an associate editor for IEEE Trans. Image Processing (TIP), IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), Pattern Recognition, and CVIU. He also serves as an area chair for major Vision and Machine Learning conferences: CVPR, AAAI, NeurIPS, ICLR, ICCV, WACV, ACML, and AISTATS. He has served as co-session chair for oral sessions at ICCV 2023 and ECCV 2024.

Jan

Head of Outreach and PR and coordinator of DFG Research Unit "Learning to Sense". ZESS staff photographer.
