Publications

What Variables Affect Out-of-Distribution Generalization in Pretrained Models?

Vision language models fail to translate detailed visual features into words

Vision language models are blind

Very Short and Accurate Explanations by Design

Unmasking the functionality of early layers in VLMs

Towards Safer and Understandable Driver Intention Prediction

Toward a Principled Theory of XAI via Spectral Analysis

The Myth of Robust Classes: How Shielding Skews Perceived Stability

TAB: Transformer Attention Bottlenecks enable User Intervention and Debugging in Vision-Language Models

SUB: Benchmarking CBM Generalization via Synthetic Attribute Substitutions

Patch-wise Retrieval: An Interpretable Instance-Level Image Matching

LucidPPN: Unambiguous Prototypical Parts Network for User-centric Interpretable Computer Vision

LR0.FM: Low-Res Benchmark and Improving Robustness for Zero-Shot Classification in Foundation Models

Keep the Faith: Faithful Explanations in Convolutional Neural Networks for Case-Based Reasoning

Is CLIP ideal? No. Can we fix it? Yes!

Interpretable Open-Vocabulary Referring Object Detection with Reverse Contrast Attention

Hallucinatory Image Tokens: A Training-free EAZY Approach on Detecting and Mitigating Object Hallucinations in LVLMs

Granular Concept Circuits: Toward a Fine-Grained Circuit Discovery for Concept Representations

Gradient-based Visual Explanation for Transformer-based CLIP

GFR-CAM: Gram-Schmidt Feature Reduction for Hierarchical Class Activation Maps

Explaining Object Detection Through Difference Map

Do VLMs Have Bad Eyes? Diagnosing Compositional Failures via Mechanistic Interpretability

DCBM: Data-Efficient Visual Concept Bottleneck Models

Controlling Neural Collapse Enhances Out-of-Distribution Detection and Transfer Learning

CoCo-Bot: Energy-based Composable Concept Bottlenecks for Interpretable Generative Models

Causal Interpretation of Sparse Autoencoder Features in Vision

As large as it gets: Learning infinitely large Filters via Neural Implicit Functions in the Fourier Domain

AIM: Amending Inherent Interpretability via Self-Supervised Masking