This is the 2021 website for the course Advanced Statistical Topics in Health Research B held by the University of Copenhagen. This website will contain the practical information about the course.
Additional information will be given on the first day.
Many modern research projects collect data and use experimental designs that require advanced statistical methods beyond what is taught as part of the curriculum in introductory statistical courses. This course covers some of the more general statistical models and methods suitable for analyzing more complex data and designs encountered in health research such as methods for high-dimensional data, classification, imputation, and dimension reduction.
The course will contain equal parts theory and applications and consists of four full days of teaching and computer lab exercises. The intention is that participants will gain a thorough understanding of the statistical methods presented and be able to apply them in practice after completing the course. This course is aimed at health researchers with previous knowledge of statistics and the computer language R who need an overview of appropriate analytical methods and discussions with statisticians to be able to solve their problems.
A student who has met the objectives of the course will be able to:
We will be using the Gather platform in lieu of traditional platforms such as Zoom or Teams. Please spend 5 minutes to familiarize yourself with Gather before the course starts.
The link to our teaching room on Gather can be found here. Feel free to try it out beforehand. As you can see, the online classroom resembles an old-school computer game.
Press x to interact with an object close to you.
Before the course starts you should make sure that you have installed the latest version of:
Download and install Anaconda. During the installation, you don’t need to check any boxes or choose anything but the default installation options.
We will be working with a dataset about patients suffering from metastatic castrate-resistant prostate cancer. In order to access the data, you need to register at Project Data Sphere and fill out a request for data access. It is important to do this as early as possible in order to get access to the data we will be using for the exercises.
Under “Research Description & Goals” you can write that you will be using machine learning methods (including deep learning) to predict survival and treatment discontinuation for prostate cancer patients in line with the Prostate Cancer DREAM Challenge goals.
We will be using a few specialized R packages that you need to install prior to arriving. They are necessary for running the exercises. The code below should be run in R (e.g. by opening RStudio and copying it into the console) to install the packages that we will need. You need to be connected to the internet to install the packages. If you are a Windows user, you may need to run RStudio as an administrator in order to install the packages. This can be done by right-clicking the RStudio program icon and choosing “Run as administrator”.
Note: At the moment, keras does not work on the new Mac M1 computers (2020 models). If you only have access to such a computer, please contact Anne (ahpe [at] sund.ku.dk) as soon as possible, so she can set up an alternative way for you to work with the package.
If you are working on any other computer, follow this installation guide:
To get keras working, you first need to install the R package:
If the installation succeeds without any errors, you must continue to run the final two lines:
This may take a while and you may be prompted for further installation choices along the way.
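The installation commands themselves are not reproduced in this text. As a rough sketch only, and not necessarily the course's exact code, the usual sequence with the keras R package is to install the package from CRAN and then call `install_keras()` to set up the Python backend. The helper name `setup_keras` below is hypothetical and only wraps those two steps so that nothing is installed until you call it.

```r
# Hypothetical helper (an assumption, not the course's own code):
# installs the keras R package and then its Python backend.
setup_keras <- function() {
  # Step 1: install the R package from CRAN if it is not already available
  if (!requireNamespace("keras", quietly = TRUE)) {
    install.packages("keras", repos = "https://cloud.r-project.org/")
  }
  # Step 2: install the Python backend (TensorFlow/Keras); this can take a while
  keras::install_keras()
}

# Run setup_keras() in the R console when you are ready to install.
```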
Finally, you can test your installation by running the following code:
If your installation was successful, the output should end with the following information (maybe with slightly different numbers):
Results:
Trained on 48,000 samples (batch_size=128, epochs=3)
Final epoch (plot to see history):
        loss: 0.1509
    accuracy: 0.9561
    val_loss: 0.1148
val_accuracy: 0.9649
If not, try running each step again and see if it helps.
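The test code itself is not reproduced in this text. A minimal sketch of such a check, assuming it trains a small network on the MNIST digits that ship with keras, could look like the function below. The function name `run_keras_smoke_test`, the network layout, and the 20% validation split are assumptions; the split is chosen because it leaves 48,000 of MNIST's 60,000 training images for fitting, which would be consistent with the output shown above. Exact loss and accuracy values will differ.

```r
# Hypothetical smoke test (an assumption, not the course's own code).
# Requires the keras R package and a working backend installation.
run_keras_smoke_test <- function(epochs = 3, batch_size = 128) {
  # Load MNIST and flatten each 28 x 28 image into a vector of 784 values in [0, 1]
  mnist <- keras::dataset_mnist()
  x_train <- keras::array_reshape(mnist$train$x, c(nrow(mnist$train$x), 784)) / 255
  # One-hot encode the digit labels 0-9
  y_train <- keras::to_categorical(mnist$train$y, num_classes = 10)

  # A small fully connected network; the layer sizes are arbitrary
  model <- keras::keras_model_sequential()
  model <- keras::layer_dense(model, units = 128, activation = "relu",
                              input_shape = 784)
  model <- keras::layer_dense(model, units = 10, activation = "softmax")

  keras::compile(model,
                 optimizer = "rmsprop",
                 loss = "categorical_crossentropy",
                 metrics = "accuracy")

  # Hold out 20% of the training images for validation
  keras::fit(model, x_train, y_train,
             epochs = epochs,
             batch_size = batch_size,
             validation_split = 0.2)
}

# history <- run_keras_smoke_test()
# history   # printing the result shows the final-epoch summary
```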
The next step is to install the rstan package for R, as per these instructions. Once rstan is installed, we need the rethinking package. That package is not on CRAN, so we will install it directly from GitHub. You do that by running the following three lines:
install.packages(c("devtools", "mvtnorm", "loo", "coda"), repos = "https://cloud.r-project.org/", dependencies = TRUE)
library("devtools")
install_github("rmcelreath/rethinking")
Finally, you should install the following R packages:
install.packages(c("isdals", "rstan", "brms", "rstanarm", "bayesplot", "MESS"))
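Once the installations have finished, it can be useful to confirm that everything is in place before the first day. The following is a small sketch rather than part of the official instructions: it checks the packages named in the steps above using base R's `find.package()` and reports any that are still missing. The variable names are illustrative.

```r
# Packages named in the installation steps above
required <- c("rethinking", "isdals", "rstan", "brms",
              "rstanarm", "bayesplot", "MESS")

# find.package(quiet = TRUE) returns an empty vector when a package is not found,
# so this check does not load any of the packages
is_installed <- vapply(
  required,
  function(pkg) length(find.package(pkg, quiet = TRUE)) > 0,
  logical(1)
)

missing_pkgs <- required[!is_installed]

if (length(missing_pkgs) == 0) {
  message("All listed packages are installed.")
} else {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
}
```

If any packages are reported as missing, rerun the corresponding installation step and check the console for error messages.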
Claus Ekstrøm 2021