Date of Graduation

12-2024

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Engineering (PhD)

Degree Level

Graduate

Department

Electrical Engineering and Computer Science

Advisor/Mentor

Zhang, Lu

Committee Member

Li, Qinghua

Second Committee Member

Panda, Brajendra N.

Third Committee Member

Wu, Xintao

Fourth Committee Member

Zhang, Shengfan

Keywords

causal discovery; deep learning; large language model (LLM); machine learning; root cause; time series

Abstract

Causal structure learning from observational data has been an active field of research over the past decades. Various algorithms and models have been proposed in the literature, including constraint-based methods and score-based methods, among them the emerging deep learning-based methods. However, most of these approaches apply only to static, non-temporal data. In many applications the data is temporal: monitoring systems, weather surveillance systems, and stock data, to name but a few. Incorporating temporal information is therefore an important extension of the causal discovery field. With the growth of observational data, it has become possible to discover causal relationships from time series by observing their behavior over time. However, existing methods for time series causal discovery usually suffer from one or more of the following limitations: (1) they assume acyclicity, i.e., that self-causes either never or always exist; (2) they apply only to discrete time series; (3) they are based on linear models only; and (4) they assume stationarity, i.e., that causal dependencies repeat with the same time lag at all time points. This dissertation addresses the problem of learning a summary causal graph from multivariate time series data while overcoming all of the above limitations. We present causal discovery algorithms based on both constraint-based and score-based approaches. The goal of this dissertation is to evaluate different techniques of time-series causal discovery for multivariate data that determine a summary causal graph: a directed graph representing the underlying causal relationships among the variables in the data, without specifying the exact time lags. We leverage the power of deep neural networks and incorporate them into several causal discovery strategies: a constraint-based approach, score-based and noise-based approaches, a hybrid approach, and an LLM-based approach.
For the constraint-based approach, building on the theory of µ-separation, we develop an algorithm called µ-PC that extends the well-known PC algorithm to the time domain. It uses a conditional independence testing technique based on a Recurrent Neural Network (RNN), making it applicable to both discrete and continuous time series. For the score-based approach, we develop a model named Neural Time-invariant Causal Discovery (NTiCD), which is based on the principle of Granger causality. NTiCD is a continuous optimization-based technique that leverages deep neural networks to compute the score values. To this end, we use an LSTM to obtain hidden non-linear representations of the temporal variables in the time series data. These features are then aggregated using graph convolutional networks and decoded by an MLP that forecasts future values of the time series. The model is optimized with a score function subject to a regularized loss. The final output is a summary causal graph that captures the time-invariant causal relations within and between time series. Next, we propose a hybrid approach named Neural-HATS (Neural Hybrid Approach for Time Series Causal Discovery), a framework that combines conditional independence (CI) testing with continuous optimization-based learning algorithms to uncover causal structures in time series data. The approach features an attention-based encoder-decoder architecture integrated with Kernel Conditional Independence (KCI) testing, enabling direct CI tests between time series. These tests are then incorporated into continuous optimization learning algorithms for enhanced causal discovery. This integration not only refines the causal inference process but also expands the capabilities of continuous optimization algorithms, significantly improving their performance without requiring extensive computational resources.
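A minimal sketch of the score-based, continuous-optimization idea behind NTiCD: a weighted adjacency matrix is fitted by minimizing one-step forecast error plus a sparsity penalty, then thresholded into a summary graph. This toy uses a linear one-lag model and plain gradient descent in place of the LSTM/GCN architecture; the data, coefficients, and threshold are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear VAR(1) with chain x0 -> x1 -> x2 (hypothetical data standing in
# for the multivariate time series studied in the dissertation).
T, d = 2000, 3
A_true = np.array([[0.0, 0.8, 0.0],
                   [0.0, 0.0, 0.8],
                   [0.0, 0.0, 0.0]])    # A_true[i, j] != 0 means x_i -> x_j
X = np.zeros((T, d))
for t in range(1, T):
    X[t] = X[t - 1] @ A_true + 0.1 * rng.standard_normal(d)
X = (X - X.mean(0)) / X.std(0)          # standardize each series

past, future = X[:-1], X[1:]

# Score-based search: minimize forecast error plus an L1 sparsity penalty
# by gradient descent on a weighted adjacency matrix W.
W = np.zeros((d, d))
lam, lr = 0.01, 0.1
for _ in range(500):
    resid = future - past @ W                        # one-step-ahead residuals
    grad = -past.T @ resid / len(past) + lam * np.sign(W)
    W -= lr * grad

summary = (np.abs(W) > 0.3).astype(int)              # threshold -> summary graph
print(summary)
```

On this toy data the thresholded matrix recovers the chain's edge pattern; NTiCD replaces the linear forecaster with the neural architecture described above.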
We evaluate the performance of all our algorithms on several synthetic and real datasets. Next, we implement a noise-based approach called Granger Causal discovery using Residual Independence (GCRI). The GCRI framework uses an autoencoder to uncover Granger causal relationships in time series together with the distribution of the exogenous variables. We assume that the exogenous variables are mutually independent and impose constraints in the encoder's loss function to enforce this independence. The encoder models abductive reasoning to derive mutually independent exogenous variables, while the decoder applies deductive reasoning to predict the inputs from a limited window of past exogenous variables and time series data. In this way, GCRI performs Granger causal discovery on multivariate time series data using a noise-based approach; we demonstrate its performance on several synthetic and real datasets. We also demonstrate GCRI's effectiveness in identifying the root cause of anomalies through a case study. By modeling the distribution of exogenous variables in multivariate time series data, GCRI is able to detect exogenous interventions as the root cause of anomalies. Using synthetic non-linear time series, our framework accurately localizes the source of an anomaly, demonstrating high precision in root cause analysis. This case study underscores the practical applicability of GCRI for anomaly detection in complex time series data. Notably, none of the proposed algorithms assume data linearity, stationarity, or acyclicity. Finally, we propose a framework that leverages large language models (LLMs) to enhance the performance of both score-based and constraint-based causal discovery methods. Our LLM-guided initialization approach integrates LLMs with data-driven algorithms, allowing for more precise causal discovery while incorporating valuable domain knowledge.
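The residual-independence idea can be sketched with a linear toy example (a stand-in for GCRI's autoencoder; the series and coefficients here are invented): regressing a variable on its true lagged parents recovers near-independent exogenous noise, while omitting a true cause leaves dependence in the residuals.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy series where x0 drives x1 (hypothetical stand-in for the GCRI setting).
T = 5000
e0, e1 = rng.standard_normal(T), rng.standard_normal(T)   # exogenous noise
x0, x1 = np.zeros(T), np.zeros(T)
for t in range(1, T):
    x0[t] = 0.5 * x0[t - 1] + e0[t]
    x1[t] = 0.5 * x1[t - 1] + 0.8 * x0[t - 1] + e1[t]

def residuals(y, parents):
    """Least-squares residuals of y_t on lagged parents: a linear proxy for
    the exogenous noise that GCRI's encoder is trained to recover."""
    Z = np.column_stack([p[:-1] for p in parents])
    beta, *_ = np.linalg.lstsq(Z, y[1:], rcond=None)
    return y[1:] - Z @ beta

# Correct parent sets -> residuals are (nearly) mutually independent noise.
r0 = residuals(x0, [x0])
r1 = residuals(x1, [x1, x0])
good = abs(np.corrcoef(r0, r1)[0, 1])

# Omitting the true cause x0 leaves structure: x1's residuals still
# correlate with the lagged cause, flagging the missing edge.
r1_bad = residuals(x1, [x1])
bad = abs(np.corrcoef(r1_bad[1:], x0[1:-1])[0, 1])
print(good, bad)
```

GCRI generalizes this check to non-linear dynamics by penalizing dependence among the encoder's recovered noise terms directly in the loss.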
By utilizing LLMs, we generate an initial causal structure from real-world observational data, ensuring that expert knowledge is embedded without violating the core principles of temporal causal discovery. The LLM processes the dataset and its description to create an initial causal graph, which is then refined using traditional temporal causal discovery methods, producing a more accurate, robust, and interpretable causal structure.
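A toy sketch of the LLM-guided initialization loop, under stated assumptions: the LLM's proposed edge set is hard-coded here (in practice it would be parsed from the model's response to the dataset description), the variable names are invented, and a simple Granger-style error-reduction test stands in for the temporal causal discovery methods that refine the prior.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical LLM-proposed initial graph (one edge is spurious).
llm_prior = {("rain", "traffic"), ("rain", "umbrella_sales"),
             ("traffic", "umbrella_sales")}

# Simulate data consistent with rain -> traffic and rain -> umbrella_sales only.
T = 3000
rain = rng.standard_normal(T)
traffic, umbrella = np.zeros(T), np.zeros(T)
for t in range(1, T):
    traffic[t] = 0.7 * rain[t - 1] + 0.3 * rng.standard_normal()
    umbrella[t] = 0.7 * rain[t - 1] + 0.3 * rng.standard_normal()
series = {"rain": rain, "traffic": traffic, "umbrella_sales": umbrella}

def granger_gain(cause, effect):
    """Relative error reduction from adding the candidate cause's lag to an
    autoregression on the effect's own lag (a simple Granger-style check)."""
    y = series[effect][1:]
    own = series[effect][:-1].reshape(-1, 1)
    full = np.column_stack([own, series[cause][:-1]])
    def rss(Z):
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        return np.var(y - Z @ beta)
    return 1 - rss(full) / rss(own)

# Refinement step: keep only the prior edges the data supports
# (the 0.1 threshold is arbitrary, chosen for this illustration).
refined = {e for e in llm_prior if granger_gain(*e) > 0.1}
print(sorted(refined))
```

Here the data-driven pass prunes the spurious traffic-to-umbrella edge while keeping the two edges the dynamics actually support, illustrating how the LLM prior is corrected rather than trusted outright.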
