Build a Validation Set With TensorFlow's Keras API
In this episode, we'll demonstrate how to use TensorFlow's Keras API to create a validation set on-the-fly during training.
We'll continue working with the same model we built and trained in the previous episode, but first, let's discuss what exactly a validation set is.
What is a validation set?
Recall that we previously built a training set on which we trained our model. With each epoch that our model is trained, the model will continue to learn the features and characteristics of the data in this training set.
The hope is that later we can take this model, apply it to new data, and have the model accurately predict on data that it hasn't seen before based solely on what it learned from the training set.
Now, let's discuss where the addition of a validation set comes into play.
Before training begins, we can choose to remove a portion of the training set and place it in a validation set. Then, during training, the model will train only on the training set, and it will validate by evaluating the data in the validation set.
Essentially, the model is learning the features of the data in the training set, taking what it's learned from this data, and then predicting on the validation set. During each epoch, we will see not only the loss and accuracy results for the training set, but also for the validation set.
This allows us to see how well the model is generalizing on data it wasn't trained on because, recall, the validation data should not be part of the training data.
This also helps us see whether or not the model is overfitting. Overfitting occurs when the model only learns the specifics of the training data and is unable to generalize well on data that it wasn't trained on.
If you'd like to see overfitting covered in further detail, check out the overfitting episode in the Deep Learning Fundamentals series. Note that you can also see a more in depth breakdown of the training set vs. validation set in that series as well.
Now let's discuss how we can create a validation set.
Creating a validation set
There are two ways to create a validation set to use with a tf.keras.Sequential model.
Manually create validation set
The first way is to create a data structure to hold a validation set, and place data directly in that structure in the same way we did for the training set. This data structure should be a tuple valid_set = (x_val, y_val) of Numpy arrays or tensors, where x_val is a Numpy array or tensor containing validation samples, and y_val is a Numpy array or tensor containing the corresponding validation labels.
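As a minimal sketch, with made-up Numpy data standing in for real held-out samples and labels, the tuple might be assembled like this:

```python
import numpy as np

# Hypothetical data: 100 validation samples, each a single scaled
# feature, with binary labels -- stand-ins for real held-out data
x_val = np.random.rand(100, 1)        # validation samples
y_val = np.random.randint(0, 2, 100)  # validation labels

# The tuple Keras expects for the validation_data parameter
valid_set = (x_val, y_val)

print(valid_set[0].shape, valid_set[1].shape)
```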
When we call model.fit(), we pass in the validation set in addition to the training set by specifying the validation_data parameter.
model.fit(
x=scaled_train_samples
, y=train_labels
, validation_data=valid_set
, batch_size=10
, epochs=30
, verbose=2
)
When the model trains, it will continue to train only on the training set, but additionally, it will also evaluate the validation set at the end of each epoch.
Create validation set with Keras
There is another way to create a validation set, and it saves a step!
If we don't already have a specified validation set created, then when we call model.fit(), we can set a value for the validation_split parameter. It expects a fraction between 0 and 1. Suppose that we set this parameter to 0.1.
model.fit(
x=scaled_train_samples
, y=train_labels
, validation_split=0.1
, batch_size=10
, epochs=30
, verbose=2
)
With this parameter specified, Keras will split apart a fraction (10% in this example) of the training data to be used as validation data. The model will set apart this fraction of the training data, will not train on it, and will evaluate the loss and any model metrics on this data at the end of each epoch.
Note that the fit() function shuffles the data before each epoch by default. When the validation_split parameter is specified, however, the validation data is selected from the last samples in the x and y data before shuffling. Therefore, when we use validation_split in this way to create our validation data, we need to be sure that our data has been shuffled ahead of time, as we previously did in an earlier episode.
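To make this concrete, here is a small Numpy sketch, with toy arrays standing in for the real samples and labels, of shuffling the two arrays in unison and of the split point that a validation_split of 0.1 implies (the last 10% of the data):

```python
import numpy as np

# Toy stand-ins for the real training samples and labels
x = np.arange(20).reshape(20, 1)
y = np.arange(20)

# Shuffle samples and labels in unison with a single permutation,
# so each sample keeps its matching label
rng = np.random.default_rng(42)
perm = rng.permutation(len(x))
x, y = x[perm], y[perm]

# validation_split=0.1 takes the LAST 10% of the data (before
# fit()'s own per-epoch shuffling), which is why shuffling ahead
# of time matters
split_at = int(len(x) * (1 - 0.1))
x_train, x_val = x[:split_at], x[split_at:]
y_train, y_val = y[:split_at], y[split_at:]

print(len(x_train), len(x_val))  # 18 2
```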
Interpret Validation Metrics
Now, regardless of which method we use to create validation data, when we call model.fit(), then in addition to the loss and accuracy displayed for each epoch as we saw last time, we will now also see val_loss and val_accuracy to track the loss and accuracy on the validation set.
model.fit(
x=scaled_train_samples
, y=train_labels
, validation_split=0.1
, batch_size=10
, epochs=30
, verbose=2
)
Epoch 1/30
1890/1890 - 1s - loss: 0.6669 - accuracy: 0.5275 - val_loss: 0.6608 - val_accuracy: 0.5762
Epoch 2/30
1890/1890 - 0s - loss: 0.6508 - accuracy: 0.6217 - val_loss: 0.6419 - val_accuracy: 0.6571
Epoch 3/30
1890/1890 - 0s - loss: 0.6275 - accuracy: 0.7058 - val_loss: 0.6156 - val_accuracy: 0.7381
Epoch 4/30
1890/1890 - 0s - loss: 0.6039 - accuracy: 0.7481 - val_loss: 0.5908 - val_accuracy: 0.7619
Epoch 5/30
1890/1890 - 0s - loss: 0.5791 - accuracy: 0.7783 - val_loss: 0.5635 - val_accuracy: 0.8095
Epoch 6/30
1890/1890 - 0s - loss: 0.5524 - accuracy: 0.7963 - val_loss: 0.5340 - val_accuracy: 0.8286
Epoch 7/30
1890/1890 - 0s - loss: 0.5249 - accuracy: 0.8307 - val_loss: 0.5046 - val_accuracy: 0.8524
Epoch 8/30
1890/1890 - 0s - loss: 0.4974 - accuracy: 0.8492 - val_loss: 0.4751 - val_accuracy: 0.8619
Epoch 9/30
1890/1890 - 0s - loss: 0.4703 - accuracy: 0.8593 - val_loss: 0.4461 - val_accuracy: 0.8810
Epoch 10/30
1890/1890 - 0s - loss: 0.4445 - accuracy: 0.8656 - val_loss: 0.4191 - val_accuracy: 0.8857
Epoch 11/30
1890/1890 - 0s - loss: 0.4208 - accuracy: 0.8810 - val_loss: 0.3945 - val_accuracy: 0.8857
Epoch 12/30
1890/1890 - 0s - loss: 0.3995 - accuracy: 0.8905 - val_loss: 0.3724 - val_accuracy: 0.9048
Epoch 13/30
1890/1890 - 0s - loss: 0.3808 - accuracy: 0.8963 - val_loss: 0.3524 - val_accuracy: 0.9048
Epoch 14/30
1890/1890 - 0s - loss: 0.3643 - accuracy: 0.9021 - val_loss: 0.3354 - val_accuracy: 0.9048
Epoch 15/30
1890/1890 - 0s - loss: 0.3502 - accuracy: 0.9106 - val_loss: 0.3205 - val_accuracy: 0.9238
Epoch 16/30
1890/1890 - 0s - loss: 0.3380 - accuracy: 0.9116 - val_loss: 0.3074 - val_accuracy: 0.9286
Epoch 17/30
1890/1890 - 0s - loss: 0.3277 - accuracy: 0.9138 - val_loss: 0.2965 - val_accuracy: 0.9286
Epoch 18/30
1890/1890 - 0s - loss: 0.3191 - accuracy: 0.9185 - val_loss: 0.2869 - val_accuracy: 0.9286
Epoch 19/30
1890/1890 - 0s - loss: 0.3115 - accuracy: 0.9196 - val_loss: 0.2790 - val_accuracy: 0.9286
Epoch 20/30
1890/1890 - 0s - loss: 0.3053 - accuracy: 0.9238 - val_loss: 0.2717 - val_accuracy: 0.9381
Epoch 21/30
1890/1890 - 0s - loss: 0.2998 - accuracy: 0.9217 - val_loss: 0.2657 - val_accuracy: 0.9381
Epoch 22/30
1890/1890 - 0s - loss: 0.2950 - accuracy: 0.9243 - val_loss: 0.2604 - val_accuracy: 0.9381
Epoch 23/30
1890/1890 - 0s - loss: 0.2912 - accuracy: 0.9238 - val_loss: 0.2556 - val_accuracy: 0.9381
Epoch 24/30
1890/1890 - 0s - loss: 0.2875 - accuracy: 0.9328 - val_loss: 0.2514 - val_accuracy: 0.9381
Epoch 25/30
1890/1890 - 0s - loss: 0.2845 - accuracy: 0.9243 - val_loss: 0.2476 - val_accuracy: 0.9381
Epoch 26/30
1890/1890 - 0s - loss: 0.2818 - accuracy: 0.9275 - val_loss: 0.2445 - val_accuracy: 0.9429
Epoch 27/30
1890/1890 - 0s - loss: 0.2794 - accuracy: 0.9286 - val_loss: 0.2417 - val_accuracy: 0.9429
Epoch 28/30
1890/1890 - 0s - loss: 0.2775 - accuracy: 0.9339 - val_loss: 0.2388 - val_accuracy: 0.9429
Epoch 29/30
1890/1890 - 0s - loss: 0.2753 - accuracy: 0.9302 - val_loss: 0.2363 - val_accuracy: 0.9429
Epoch 30/30
1890/1890 - 0s - loss: 0.2735 - accuracy: 0.9296 - val_loss: 0.2345 - val_accuracy: 0.9429
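Note that fit() also returns a History object whose history attribute holds these per-epoch metrics as a dict of lists. As a minimal sketch (using a hand-written dict with a few of the values shown above, in place of a real training run), we can compare training and validation loss to check for overfitting:

```python
# Stand-in for history.history from a real model.fit() call,
# using the first few per-epoch values shown above
history = {
    "loss":     [0.6669, 0.6508, 0.6275],
    "val_loss": [0.6608, 0.6419, 0.6156],
}

# A growing gap between val_loss and loss is a sign of overfitting;
# here the gap stays small, so the model is generalizing well
gaps = [v - t for t, v in zip(history["loss"], history["val_loss"])]
print(max(abs(g) for g in gaps) < 0.05)  # True
```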
We can now see not only how well our model is learning the features of the training data, but also how well the model is generalizing to new, unseen data from the validation set. Next, we'll see how to use our model for inference.