\n \n \n
\n
\n\n \n \n \n \n \n \n iSCAN: Identifying Causal Mechanism Shifts among Nonlinear Additive Noise Models.\n \n \n \n \n\n\n \n Chen, T.; Bello, K.; Aragam, B.; and Ravikumar, P.\n\n\n \n\n\n\n
NeurIPS-23. Advances in Neural Information Processing Systems. 2023.\n
\n\n
\n\n
\n\n
\n\n \n \n proceedings\n \n \n \n preprint\n \n \n \n code\n \n \n\n \n\n \n link\n \n \n\n bibtex\n \n\n \n \n \n abstract \n \n\n \n \n \n 20 downloads\n \n \n\n \n \n \n \n \n \n \n\n \n \n \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@article{chen2023iscan,\n title={iSCAN: Identifying Causal Mechanism Shifts among Nonlinear Additive Noise Models},\n author={Tianyu Chen and Kevin Bello and Bryon Aragam and Pradeep Ravikumar},\n journal={<span style="color: #0088cc; font-style: normal">NeurIPS-23.</span> Advances in Neural Information Processing Systems},\n year={2023},\n abstract={Structural causal models (SCMs) are widely used in various disciplines to represent causal relationships among variables in complex systems. Unfortunately, the underlying causal structure is often unknown, and estimating it from data remains a challenging task. In many situations, however, the end goal is to localize the changes (shifts) in the causal mechanisms between related datasets instead of learning the full causal structure of the individual datasets. Some applications include root cause analysis, analyzing gene regulatory network structure changes between healthy and cancerous individuals, or explaining distribution shifts. This paper focuses on identifying the causal mechanism shifts in two or more related datasets over the same set of variables—without estimating the entire DAG structure of each SCM. Prior work under this setting assumed linear models with Gaussian noises; instead, in this work we assume that each SCM belongs to the more general class of nonlinear additive noise models (ANMs). A key technical contribution of this work is to show that the Jacobian of the score function for the mixture distribution allows for the identification of shifts under general non-parametric functional mechanisms. Once the shifted variables are identified, we leverage recent work to estimate the structural differences, if any, for the shifted variables. Experiments on synthetic and real-world data are provided to showcase the applicability of this approach. Code implementing the proposed method is open-source and publicly available at https://github.com/kevinsbello/iSCAN.},\n url_Proceedings={https://openreview.net/pdf?id=GEtXhqKW6X},\n url_Preprint={https://arxiv.org/pdf/2306.17361.pdf},\n url_Code={https://github.com/kevinsbello/iscan/},\n keyword={Causality and Graphical Models,Distribution Shifts,Latent Variable Models and Generative Models,Statistical Machine Learning,Interpretability and Fairness},\n}\n\n
\n
\n\n\n
\n Structural causal models (SCMs) are widely used in various disciplines to represent causal relationships among variables in complex systems. Unfortunately, the underlying causal structure is often unknown, and estimating it from data remains a challenging task. In many situations, however, the end goal is to localize the changes (shifts) in the causal mechanisms between related datasets instead of learning the full causal structure of the individual datasets. Some applications include root cause analysis, analyzing gene regulatory network structure changes between healthy and cancerous individuals, or explaining distribution shifts. This paper focuses on identifying the causal mechanism shifts in two or more related datasets over the same set of variables—without estimating the entire DAG structure of each SCM. Prior work under this setting assumed linear models with Gaussian noises; instead, in this work we assume that each SCM belongs to the more general class of nonlinear additive noise models (ANMs). A key technical contribution of this work is to show that the Jacobian of the score function for the mixture distribution allows for the identification of shifts under general non-parametric functional mechanisms. Once the shifted variables are identified, we leverage recent work to estimate the structural differences, if any, for the shifted variables. Experiments on synthetic and real-world data are provided to showcase the applicability of this approach. Code implementing the proposed method is open-source and publicly available at https://github.com/kevinsbello/iSCAN.\n
\n\n\n
\n\n\n
\n
\n\n \n \n \n \n \n \n Global Optimality in Bivariate Gradient-based DAG Learning.\n \n \n \n \n\n\n \n Deng, C.; Bello, K.; Aragam, B.; and Ravikumar, P.\n\n\n \n\n\n\n
NeurIPS-23. Advances in Neural Information Processing Systems. 2023.\n
\n\n
\n\n
\n\n
\n\n \n \n proceedings\n \n \n \n preprint\n \n \n\n \n\n \n link\n \n \n\n bibtex\n \n\n \n \n \n abstract \n \n\n \n \n \n 7 downloads\n \n \n\n \n \n \n \n \n \n \n\n \n \n \n \n \n \n \n\n\n\n
\n
@article{deng2023global,\n title={Global Optimality in Bivariate Gradient-based DAG Learning},\n author={Deng, Chang and Bello, Kevin and Aragam, Bryon and Ravikumar, Pradeep},\n journal={<span style="color: #0088cc; font-style: normal">NeurIPS-23.</span> Advances in Neural Information Processing Systems},\n year={2023},\n abstract={Recently, a new class of non-convex optimization problems motivated by the statistical problem of learning an acyclic directed graphical model from data has attracted significant interest. While existing work uses standard first-order optimization schemes to solve this problem, proving the global optimality of such approaches has proven elusive. The difficulty lies in the fact that unlike other non-convex problems in the literature, this problem is not "benign", and possesses multiple spurious solutions that standard approaches can easily get trapped in. In this paper, we prove that a simple path-following optimization scheme globally converges to the global minimum of the population loss in the bivariate setting.},\n url_Proceedings={https://openreview.net/pdf?id=5MG5C5aS6m},\n url_Preprint={https://arxiv.org/pdf/2306.17378.pdf},\n keyword={Causality and Graphical Models,Optimization},\n}\n\n
\n
\n\n\n
\n Recently, a new class of non-convex optimization problems motivated by the statistical problem of learning an acyclic directed graphical model from data has attracted significant interest. While existing work uses standard first-order optimization schemes to solve this problem, proving the global optimality of such approaches has proven elusive. The difficulty lies in the fact that unlike other non-convex problems in the literature, this problem is not \"benign\", and possesses multiple spurious solutions that standard approaches can easily get trapped in. In this paper, we prove that a simple path-following optimization scheme globally converges to the global minimum of the population loss in the bivariate setting.\n
\n\n\n
\n\n\n
\n
\n\n \n \n \n \n \n \n Optimizing NOTEARS objectives via topological swaps.\n \n \n \n \n\n\n \n Deng, C.; Bello, K.; Aragam, B.; and Ravikumar, P. K.\n\n\n \n\n\n\n
ICML-23. Proceedings of the 40th International Conference on Machine Learning, 202: 7563–7595. 2023.\n
\n\n
\n\n
\n\n
\n\n \n \n proceedings\n \n \n \n preprint\n \n \n \n code\n \n \n\n \n\n \n link\n \n \n\n bibtex\n \n\n \n \n \n abstract \n \n\n \n \n \n 8 downloads\n \n \n\n \n \n \n \n \n \n \n\n \n \n \n \n \n \n \n\n\n\n
\n
@article{deng2023optimizing,\n title={Optimizing NOTEARS objectives via topological swaps},\n author={Deng, Chang and Bello, Kevin and Aragam, Bryon and Ravikumar, Pradeep Kumar},\n journal={<span style="color: #0088cc; font-style: normal">ICML-23.</span> Proceedings of the 40th International Conference on Machine Learning},\n pages = {7563--7595},\n volume = {202},\n year={2023},\n abstract = {Recently, an intriguing class of non-convex optimization problems has emerged in the context of learning directed acyclic graphs (DAGs). These problems involve minimizing a given loss or score function, subject to a non-convex continuous constraint that penalizes the presence of cycles in a graph. In this work, we delve into the optimality challenges associated with this class of non-convex programs. To address these challenges, we propose a bi-level algorithm that leverages the non-convex constraint in a novel way. The outer level of the algorithm optimizes over topological orders by iteratively swapping pairs of nodes within the topological order of a DAG. A key innovation of our approach is the development of an effective method for generating a set of candidate swapping pairs for each iteration. At the inner level, given a topological order, we utilize off-the-shelf solvers that can handle linear constraints. The key advantage of our proposed algorithm is that it is guaranteed to find a local minimum or a KKT point under weaker conditions compared to previous work and finds solutions with lower scores. Extensive experiments demonstrate that our method outperforms state-of-the-art approaches in terms of achieving a better score. Additionally, our method can also be used as a post-processing algorithm to significantly improve the score of other algorithms. Code implementing the proposed method is available at https://github.com/duntrain/topo.},\n url_Proceedings={https://proceedings.mlr.press/v202/deng23a.html},\n url_Preprint={https://arxiv.org/pdf/2305.17277.pdf},\n url_Code={https://github.com/duntrain/topo/},\n keyword={Causality and Graphical Models,Optimization},\n}\n\n
\n
\n\n\n
\n Recently, an intriguing class of non-convex optimization problems has emerged in the context of learning directed acyclic graphs (DAGs). These problems involve minimizing a given loss or score function, subject to a non-convex continuous constraint that penalizes the presence of cycles in a graph. In this work, we delve into the optimality challenges associated with this class of non-convex programs. To address these challenges, we propose a bi-level algorithm that leverages the non-convex constraint in a novel way. The outer level of the algorithm optimizes over topological orders by iteratively swapping pairs of nodes within the topological order of a DAG. A key innovation of our approach is the development of an effective method for generating a set of candidate swapping pairs for each iteration. At the inner level, given a topological order, we utilize off-the-shelf solvers that can handle linear constraints. The key advantage of our proposed algorithm is that it is guaranteed to find a local minimum or a KKT point under weaker conditions compared to previous work and finds solutions with lower scores. Extensive experiments demonstrate that our method outperforms state-of-the-art approaches in terms of achieving a better score. Additionally, our method can also be used as a post-processing algorithm to significantly improve the score of other algorithms. Code implementing the proposed method is available at https://github.com/duntrain/topo.\n
\n\n\n
\n\n\n
\n
\n\n \n \n \n \n \n \n Bayesian Dynamic DAG Learning: Application in Discovering Dynamic Effective Connectome of Brain.\n \n \n \n \n\n\n \n Bagheri, A.; Pasande, M.; Bello, K.; Akhondi-Asl, A.; and Araabi, B. N.\n\n\n \n\n\n\n ArXiv:2309.07080, 2023.\n
\n\n
\n\n
\n\n
\n\n \n \n preprint\n \n \n\n \n\n \n link\n \n \n\n bibtex\n \n\n \n \n \n abstract \n \n\n \n \n \n 12 downloads\n \n \n\n \n \n \n \n \n \n \n\n \n \n \n \n \n \n \n\n\n\n
\n
@misc{bagheri2023bayesian,\n title={Bayesian Dynamic DAG Learning: Application in Discovering Dynamic Effective Connectome of Brain},\n author={Bagheri, Abdolmahdi and Pasande, Mohammad and Bello, Kevin and Akhondi-Asl, Alireza and Araabi, Babak Nadjar},\n howpublished = {ArXiv:2309.07080},\n abstract={Understanding the complex mechanisms of the brain can be unraveled by extracting the Dynamic Effective Connectome (DEC). Recently, score-based Directed Acyclic Graph (DAG) discovery methods have shown significant improvements in extracting the causal structure and inferring effective connectivity. However, learning DEC through these methods still faces two main challenges: one with the fundamental impotence of high-dimensional dynamic DAG discovery methods and the other with the low quality of fMRI data. In this paper, we introduce Bayesian Dynamic DAG learning with M-matrices Acyclicity characterization (BDyMA) method to address the challenges in discovering DEC. The presented dynamic causal model enables us to discover bidirected edges as well. Leveraging an unconstrained framework in the BDyMA method leads to more accurate results in detecting high-dimensional networks, achieving sparser outcomes, making it particularly suitable for extracting DEC. Additionally, the score function of the BDyMA method allows the incorporation of prior knowledge into the process of dynamic causal discovery which further enhances the accuracy of results. Comprehensive simulations on synthetic data and experiments on Human Connectome Project (HCP) data demonstrate that our method can handle both of the two main challenges, yielding more accurate and reliable DEC compared to state-of-the-art and baseline methods. Additionally, we investigate the trustworthiness of DTI data as prior knowledge for DEC discovery and show the improvements in DEC discovery when the DTI data is incorporated into the process.},\n url_Preprint={https://arxiv.org/abs/2309.07080},\n year={2023},\n keyword={Causality and Graphical Models,Applications},\n}\n\n
\n
\n\n\n
\n Understanding the complex mechanisms of the brain can be unraveled by extracting the Dynamic Effective Connectome (DEC). Recently, score-based Directed Acyclic Graph (DAG) discovery methods have shown significant improvements in extracting the causal structure and inferring effective connectivity. However, learning DEC through these methods still faces two main challenges: one with the fundamental impotence of high-dimensional dynamic DAG discovery methods and the other with the low quality of fMRI data. In this paper, we introduce Bayesian Dynamic DAG learning with M-matrices Acyclicity characterization (BDyMA) method to address the challenges in discovering DEC. The presented dynamic causal model enables us to discover bidirected edges as well. Leveraging an unconstrained framework in the BDyMA method leads to more accurate results in detecting high-dimensional networks, achieving sparser outcomes, making it particularly suitable for extracting DEC. Additionally, the score function of the BDyMA method allows the incorporation of prior knowledge into the process of dynamic causal discovery which further enhances the accuracy of results. Comprehensive simulations on synthetic data and experiments on Human Connectome Project (HCP) data demonstrate that our method can handle both of the two main challenges, yielding more accurate and reliable DEC compared to state-of-the-art and baseline methods. Additionally, we investigate the trustworthiness of DTI data as prior knowledge for DEC discovery and show the improvements in DEC discovery when the DTI data is incorporated into the process.\n
\n\n\n
\n\n\n\n\n\n