This presentation is part of the International Seminar Series in Causal
Inference. The aim of the seminar series is to bring distinguished
causal inference speakers to Copenhagen and to foster new connections
among local causal inference researchers across different disciplines
and institutions. The seminar is therefore accompanied by three
additional opportunities for connections:
1. A reception following the presentation. We encourage all to
participate; no registration is needed.
2. A lunch with the speaker on March 13, for PhD students only. If you
are a PhD student interested in participating, please sign up by
contacting Marie Pramming (marie.pramming@sund.ku.dk). Note that seats
are limited.
3. An opportunity to book a one-on-one meeting with the speaker on
March 12 or March 13. If you are interested, please contact Anne Helby
Petersen (ahpe@sund.ku.dk).
The seminar is organized by
the Pioneer Centre for SMARTBiomed and supported by the Danish Data
Science Academy.
Abstract: Numerous causal discovery
algorithms have been developed to automatically learn directed acyclic graphs
(DAGs) and other causal models from data. However, their adoption in
applied domains remains limited, as researchers often prefer to
construct DAGs manually based on domain knowledge. This preference
arises from several practical challenges with automated algorithms,
such as their tendency to produce results that contradict obvious domain
knowledge and their inability to distinguish Markov equivalent models.
To assist researchers in constructing DAGs manually, we propose an
iterative structure learning approach that combines domain knowledge
with data-driven insights. Our method leverages conditional independence
testing to iteratively identify variable pairs where an edge is either
missing or superfluous. Based on this information, we can choose to add
missing edges with appropriate orientation based on domain knowledge or
remove unnecessary ones. We also give a method to rank these missing
edges based on their impact on the overall model fit. In a simulation
study, we find that this iterative, knowledge-guided approach already
outperforms purely data-driven structure learning when the orientation
of each new edge is correctly determined in at least two out
of three cases. We present a proof-of-concept implementation using a
large language model as a domain expert and a graphical user interface
designed to assist human experts with DAG construction.
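For readers who want a concrete picture of the idea in the abstract, the following is a minimal sketch only, not the speaker's implementation. It assumes roughly linear-Gaussian data, uses a partial-correlation (Fisher z) test as the conditional independence test, and ranks candidate missing edges by p-value as a crude stand-in for "impact on model fit". The function names, the `order`/`parents` representation of the candidate DAG, and the simulated data are all hypothetical and chosen for illustration.

```python
import math
from itertools import combinations

import numpy as np


def partial_corr_pvalue(data, i, j, cond):
    """Two-sided Fisher z-test p-value for the partial correlation of
    columns i and j given the columns in cond (linear-Gaussian assumption)."""
    n = data.shape[0]
    idx = [i, j] + list(cond)
    corr = np.corrcoef(data[:, idx], rowvar=False)
    prec = np.linalg.pinv(corr)  # precision matrix of the selected columns
    r = -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])
    r = max(min(r, 0.999999), -0.999999)  # keep the log below finite
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - len(cond) - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def suggest_edits(data, order, parents, alpha=0.01):
    """Flag possibly missing and possibly superfluous edges in a candidate DAG.

    order   -- column indices in a topological order of the candidate DAG
    parents -- dict mapping a node to the set of its parents in that DAG
    """
    missing, superfluous = [], []
    for pos_a, pos_b in combinations(range(len(order)), 2):
        a, b = order[pos_a], order[pos_b]  # a precedes b: a is a non-descendant of b
        pa_b = set(parents.get(b, ()))
        cond = sorted(pa_b - {a})
        p = partial_corr_pvalue(data, a, b, cond)
        if a in pa_b:
            # Edge a -> b exists; if a and b look independent given b's
            # other parents, the edge may be superfluous.
            if p >= alpha:
                superfluous.append((a, b, p))
        elif p < alpha:
            # No edge, but the DAG implies independence given pa(b):
            # detected dependence suggests a missing edge.
            missing.append((a, b, p))
    missing.sort(key=lambda t: t[2])  # strongest violations first
    return missing, superfluous


# Illustrative simulation: the true structure is 0 -> 1 -> 2, with 3 isolated.
rng = np.random.default_rng(0)
n = 2000
x0 = rng.normal(size=n)
x1 = x0 + rng.normal(size=n)
x2 = x1 + rng.normal(size=n)
x3 = rng.normal(size=n)
data = np.column_stack([x0, x1, x2, x3])

# Candidate DAG from an imperfect "expert": lacks 1 -> 2, wrongly includes 0 -> 3.
candidate_parents = {1: {0}, 2: set(), 3: {0}}
missing, superfluous = suggest_edits(data, [0, 1, 2, 3], candidate_parents, alpha=0.001)
```

In the iterative setting described above, a domain expert would review the top-ranked pair in `missing`, decide whether and in which direction to add an edge, update `parents`, and rerun the checks. Note that the pair (0, 2) may also be flagged at first, since the dependence between those variables is unexplained until the edge 1 -> 2 is added.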
You can find CSS next to the Botanical Garden, 5 minutes from Nørreport station.
Meeting room 5.2.46 is the library of the Biostatistics section, located in building 5, 2nd floor, room 46. See the map below for directions inside CSS.