The goals of this exercise session are to:
Don’t forget to add your results to the scoreboard!
If you are working with neural networks, you need a bit of extra data prepping:
- Scale the data (for each variable, subtract the training data mean and divide by the training data standard deviation)
- Convert the input data to the matrix format that keras expects
- Convert the labels to one-hot encoding (two dummies).
All this is done for you if you run:
source("prepDataForNNs.R")
and now use NN_traindata_x, NN_traindata_DEATH2YRS, NN_testdata_x, NN_testdata_DEATH2YRS, … in the following exercises.
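For reference, here is a minimal sketch of what a prep script along these lines could do. The data frame names traindata and testdata are assumptions for illustration; the actual prepDataForNNs.R may differ in its details.
library(keras)
#split off the outcome and convert the x-variables to matrices
x_train <- as.matrix(traindata[, names(traindata) != "DEATH2YRS"])
x_test <- as.matrix(testdata[, names(testdata) != "DEATH2YRS"])
#scale both sets with the *training* means and standard deviations
means <- apply(x_train, 2, mean)
sds <- apply(x_train, 2, sd)
NN_traindata_x <- scale(x_train, center = means, scale = sds)
NN_testdata_x <- scale(x_test, center = means, scale = sds)
#one-hot encode the 0/1 labels into two dummy columns
NN_traindata_DEATH2YRS <- to_categorical(traindata$DEATH2YRS, num_classes = 2)
NN_testdata_DEATH2YRS <- to_categorical(testdata$DEATH2YRS, num_classes = 2)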
Below, we fit a “simple” neural network, Mindy, that uses all of the available features (i.e. x-variables) as input and contains a single hidden layer. More specifically, she will have the following structure:
- A hidden layer with 91 units that uses the sigmoid activation function to pass its output on to the next layer.
- An output layer with 2 units and the softmax activation function, so Mindy outputs two probabilities per observation: the first is the estimated probability of that observation having label 0, the second is the estimated probability of that observation having label 1.
We will use 30 epochs to train her and will use batches of 10 observations for each update of the weights.
Below, we define Mindy. Run the code line by line and make sure you understand roughly what is happening in each step.
#Load the keras package for neural networks
library(keras)
#define Mindy and compile her (i.e. make her ready to be trained)
mindy <- keras_model_sequential()
#Build model structure
mindy %>%
layer_dense(units = 91, activation = "sigmoid", input_shape = 91) %>%
layer_dense(units = 2, activation = "softmax")
#Compile: choose settings for how she will be trained
mindy %>% compile(loss = "binary_crossentropy",
optimizer = "rmsprop",
metrics = c("accuracy"))
#Look at the model
summary(mindy)
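#As a check: the hidden layer has 91*91 + 91 = 8372 parameters
#(weights plus biases) and the output layer has 91*2 + 2 = 184,
#so summary() should report 8556 trainable parameters in total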
#train Mindy on the training data
#note: Mindy needs to use the "NN_"-data
mindy_history <- mindy %>% fit(x = NN_traindata_x,
y = NN_traindata_DEATH2YRS,
epochs = 30,
batch_size = 10)
#measure her performance
mindy_perf <- mindy %>% evaluate(NN_testdata_x,
NN_testdata_DEATH2YRS)
#Make predictions from Mindy (probabilities), look at the first ten
mindy_preds <- mindy %>% predict(NN_testdata_x)
head(mindy_preds, 10)
#Predict labels from Mindy and make a confusion matrix
#Note: because of the one-hot encoding of the NN_testdata
#we need to pick the second column to get the dummy variable
#for the "1" label
mindy_predLabels <- mindy %>% predict_classes(NN_testdata_x)
table(mindy_predLabels, NN_testdata_DEATH2YRS[,2])
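#Aside: predict_classes() has been removed in newer versions of keras;
#if the call above errors, the same labels can be recovered from the
#predicted probabilities, e.g.:
#mindy_predLabels <- apply(mindy_preds, 1, which.max) - 1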
#Compute AUC for Mindy
#Note: we need to choose the second column of mindy_preds, as these
#are the probabilities of label 1. Similarly, we choose the second
#column of NN_testdata_DEATH2YRS because these are indicators
#of whether the label is 1.
library(pROC)
#compute the AUC (area under ROC curve)
mindy_roc <- roc(NN_testdata_DEATH2YRS[,2], mindy_preds[,2])
mindy_roc
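If you want to inspect the curve behind the AUC, pROC can plot it directly from the object we just created:
plot(mindy_roc)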
Congratulations! You have now fitted and evaluated your first neural network.
Run Mindy’s code again, but now use the following code for the fitting:
mindy_history <- mindy %>% fit(x = NN_traindata_x,
y = NN_traindata_DEATH2YRS,
epochs = 30,
batch_size = 10,
validation_data = list(NN_testdata_x,
NN_testdata_DEATH2YRS))
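With validation data supplied, keras tracks the test-set loss and accuracy after every epoch alongside the training metrics. Plotting the history object makes it easy to compare the two and spot overfitting:
plot(mindy_history)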
We will now experiment a bit with adding layers. Using Mindy as a template, you will build a new NN, Brad, that differs from Mindy only in the following aspect.
Brad should have the following layers:
- Five hidden layers, each using the sigmoid activation function.
- An output layer using the softmax activation function.
Here’s what you should do: define Brad, train him, and evaluate him the same way you did for Mindy (a sketch of a possible definition follows below).
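As a starting point, here is a sketch of what Brad’s definition could look like. The number of units per hidden layer is an assumption (91, matching the input width); feel free to choose other sizes.
brad <- keras_model_sequential()
brad %>%
  layer_dense(units = 91, activation = "sigmoid", input_shape = 91) %>%
  layer_dense(units = 91, activation = "sigmoid") %>%
  layer_dense(units = 91, activation = "sigmoid") %>%
  layer_dense(units = 91, activation = "sigmoid") %>%
  layer_dense(units = 91, activation = "sigmoid") %>%
  layer_dense(units = 2, activation = "softmax")
brad %>% compile(loss = "binary_crossentropy",
                 optimizer = "rmsprop",
                 metrics = c("accuracy"))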
Build a neural network and see if you can beat Mindy and the previous models from this morning in terms of accuracy. You can experiment with the following:
- Try different activation functions (sigmoid, relu, tanh) for the hidden layers (see the example below).
- Choose more or fewer epochs and see how it impacts performance.
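For example, swapping an activation function only requires changing the activation argument of the relevant layer. Here is a relu variant of Mindy (the name mindy_relu is just for illustration):
mindy_relu <- keras_model_sequential()
mindy_relu %>%
  layer_dense(units = 91, activation = "relu", input_shape = 91) %>%
  layer_dense(units = 2, activation = "softmax")
mindy_relu %>% compile(loss = "binary_crossentropy",
                       optimizer = "rmsprop",
                       metrics = c("accuracy"))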
Try controlling the number of epochs via early stopping. Add the argument below to your fit() call, and see if you can make sense of what it does. Don’t forget to redefine Brad so that you train fresh weights. Try changing the patience and restore_best_weights arguments and see what happens.
# Extra argument to try in fit() call:
callbacks = callback_early_stopping(patience = 5, restore_best_weights = TRUE)
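A sketch of how the full fit() call could look with the callback in place. Note that callback_early_stopping() monitors val_loss by default, so the validation_data argument from earlier is needed; the epoch budget of 100 is an arbitrary choice, since early stopping decides when training ends:
#fit with early stopping on the validation loss
brad_history <- brad %>% fit(x = NN_traindata_x,
                             y = NN_traindata_DEATH2YRS,
                             epochs = 100,
                             batch_size = 10,
                             validation_data = list(NN_testdata_x,
                                                    NN_testdata_DEATH2YRS),
                             callbacks = callback_early_stopping(patience = 5,
                                                                 restore_best_weights = TRUE))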