Twenty years ago, the late Leo Breiman sent a wake-up call to the
statistical community, criticizing the dominant use of ‘data
models’ (Breiman, 2001). In this talk, I will revisit his critiques in
light of the developments on algorithmic modeling, debiased machine
learning and targeted learning that have taken place over the past two
decades, largely within the causal inference literature (Vansteelandt,
2021). I will argue that these developments resolve Breiman’s critiques,
but are not ready for mainstream use by researchers without in-depth
training in causal inference. They focus almost exclusively on
evaluating the effects of dichotomous exposures; when even slightly more
complex settings are envisaged, this restrictive focus encourages
poor practice (such as dichotomization of a continuous exposure) or
makes users revert to the traditional modeling culture. Moreover, while
there is enormous value in the ability to quantify the effects of
specific interventions, this focus is also artificial in the many
scientific studies where no specific interventions are
I will accommodate these concerns via a general conceptual framework for assumption-lean regression, which I recently introduced in a discussion paper that was read before the Royal Statistical Society (Vansteelandt and Dukes, 2022). This framework builds heavily on the debiased / targeted machine learning literature, but is intended to be as broadly useful as standard regression methods, while continuing to resolve Breiman’s concerns and other typical concerns about regression.
A large part of this talk will be conceptual and is intended to be widely accessible; parts of the talk will demonstrate in more detail how assumption-lean regression works in the context of generalised linear models and Cox proportional hazards models (Vansteelandt et al., 2022).
Breiman, L. (2001). Statistical modeling: The two cultures (with comments and a rejoinder by the author). Statistical Science, 16(3), 199–231.
Vansteelandt, S. (2021). Statistical modelling in the age of data science. Observational Studies, 7(1), 217–228.
Vansteelandt, S., & Dukes, O. (2022). Assumption-lean inference for generalised linear model parameters (with discussion). Journal of the Royal Statistical Society: Series B (Statistical Methodology), 84(3), 657–685.
Vansteelandt, S., Dukes, O., Van Lancker, K., & Martinussen, T. (2022). Assumption-lean Cox regression. Journal of the American Statistical Association, 1–10.
Using analysis of covariance to improve the efficiency of clinical
trials has a long tradition within drug development and is explicitly
recognised as valuable by regulatory guidelines.
Nevertheless, it continues to attract criticism and it also raises
various issues. In this talk I shall look at some of them in
1. What the difference is between stratification and analysis of covariance.
2. How this relates to type I and type II sums of squares.
3. Whether propensity score adjustment is a valid alternative to analysis of covariance.
4. What problems arise in connection with hierarchical data.
5. What the Rothamsted approach teaches us and its relevance to Lord’s paradox.
6. What changes when we move from common two-parameter models, such as the Normal model, to single-parameter models such as the Poisson distribution.
7. Whether marginal or conditional estimates are generally to be preferred or whether there is a role for both.
8. What care must be taken when considering covariate by treatment interaction.
I shall conclude that using covariates wisely does require care but is valuable; that, despite general regulatory approval, it is underused; and that it would make a much bigger contribution to design efficiency than the currently fashionable topic of flexible designs.
Will appear later
You can find CSS next to the Botanical Garden, 5 minutes from Nørreport station.
Meeting room 5.2.46 is the library of the Biostatistics section, located in building 5, 2nd floor, room 46. See the map below for directions inside CSS.