CoCo-Bot: Energy-based Composable Concept Bottlenecks for Interpretable Generative Models

Abstract

Concept Bottleneck Models (CBMs) were originally designed to enhance interpretability in classification by routing predictions through human-understandable concepts. Their recent extension to generative models offers semantic-level control via concept interventions. However, existing generative CBMs often depend on auxiliary visual cues at the bottleneck, which can compromise interpretability and compositionality. We propose CoCo-Bot, a post-hoc composable concept bottleneck for generative models that eliminates auxiliary cues, so that all semantic control is achieved via explicit concepts. CoCo-Bot uses diffusion-based energy functions to enable robust post-hoc interventions, including concept composition and negation. Experiments using StyleGAN2 pre-trained on CelebA-HQ show that CoCo-Bot improves concept-level controllability and interpretability while maintaining competitive visual quality.

Publication
Non-Proceedings Track