Date of Graduation
5-2026
Document Type
Dissertation
Degree Name
Doctor of Philosophy in Computer Science (PhD)
Degree Level
Graduate
Department
Computer Science & Computer Engineering
Advisor/Mentor
Wu, Xintao
Committee Member
Panda, Brajendra
Second Committee Member
Zhang, Lu
Third Committee Member
Arnold, Mark
Keywords
Causality; Generative Modeling; Representation Learning; Trustworthy Artificial Intelligence
Abstract
The hallmark of human intelligence is causal reasoning: the ability to infer relationships between causes and effects through observation and intervention. While modern deep learning excels at identifying statistical patterns, current generative models often fail to capture the structural causal mechanisms of the data-generating process, which leaves them vulnerable to shortcut learning and spurious associations. To generalize, to remain interpretable, and to schedule and plan in the real world, artificial intelligence must move from simple association to higher-level causal reasoning. This dissertation develops fundamental methodologies for causal generative modeling by integrating Pearl’s Structural Causal Model (SCM) formalism with deep generative architectures.

Specifically, this research addresses four open questions in generative modeling:
(1) How can we develop a new notion of disentanglement for causally related generative factors, along with a flexible causal representation learning framework with theoretical guarantees?
(2) How can we integrate causal modeling into the training process of state-of-the-art diffusion generative models to enable high-fidelity counterfactual generation?
(3) How can we robustly frame and evaluate the causal reasoning of pre-trained Large Vision-Language Models (LVLMs)?
(4) How can we use the strengths of generative foundation models to develop a unified inference-time framework for causal generative modeling, from concept discovery to counterfactual generation?

To address these questions, we develop the following methods:
1. We introduce ICM-VAE, a variational Bayes framework that leverages structured priors inspired by the Principle of Independent Causal Mechanisms to learn disentangled latent causal factors. We demonstrate theoretically and empirically that this approach enables the recovery of modular and disentangled causal mechanisms.
2. We propose CausalDiffAE, a novel framework that integrates the learning of causal mechanisms into the training process of diffusion probabilistic models to enable high-fidelity counterfactual image generation.
3. We investigate the reasoning abilities of pre-trained LVLMs through a causal lens. We propose CausalVLBench, a benchmarking framework that evaluates the formal causal reasoning capabilities of LVLMs across three novel tasks: causal structure inference, intervention target prediction, and counterfactual prediction.
4. We present a unified paradigm, the Foundation Model Powered Causal Generative Model (FM-CGM), which uses LVLMs for concept inference and text-to-image diffusion models for counterfactual generation. Within this paradigm, we develop Causal Semantic Guidance, an inference-time diffusion-based editing method that performs minimal and faithful counterfactual image edits.

Collectively, these methodologies provide a framework for building AI systems that do not merely mimic data distributions but also understand and manipulate causal variables. This shift has significant implications for high-stakes domains such as healthcare and scientific discovery.
Citation
Komanduri, A. (2026). Toward Causal Generative Modeling: From Representation Learning to Controllable Generation. Graduate Theses and Dissertations. Retrieved from https://scholarworks.uark.edu/etd/6165