TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, and community resources that lets researchers push the state of the art in ML and lets developers easily build and deploy ML-powered applications.
In this section, you will discover the life-cycle for a deep learning model and the two tf.keras APIs that you can use to define models.
A model has a life-cycle, and this very simple knowledge provides the backbone for both modeling a dataset and understanding the tf.keras API.
The five steps in the life-cycle are as follows:

1. Define the model.
2. Compile the model.
3. Fit the model.
4. Evaluate the model.
5. Make predictions.
Let’s take a closer look at each step in turn.
Defining the model requires that you first select the type of model that you need and then choose the architecture or network topology. From an API perspective, this involves defining the layers of the model, configuring each layer with a number of nodes and activation function, and connecting the layers together into a cohesive model. Models can be defined either with the Sequential API or the Functional API, and we will take a look at this in the next section.
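As a minimal sketch, a model could be defined with the Sequential API like this. The 8-feature input shape and layer sizes here are illustrative assumptions, not details from the text:

```python
# Minimal Sequential API sketch; the 8-feature input shape and the
# layer sizes are illustrative assumptions.
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Input

model = Sequential([
    Input(shape=(8,)),               # 8 input features
    Dense(10, activation="relu"),    # hidden layer: 10 nodes
    Dense(1, activation="sigmoid"),  # output layer: 1 node
])
model.summary()  # prints the network topology layer by layer
```

The Functional API would instead connect an `Input` and layer calls explicitly and wrap them in a `Model`; both styles are compared in the next section.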
Compiling the model requires that you first select a loss function that you want to optimize, such as mean squared error or cross-entropy. It also requires that you select an algorithm to perform the optimization procedure, typically stochastic gradient descent, or a modern variation, such as Adam. It may also require that you select any performance metrics to keep track of during the model training process.
From an API perspective, this involves calling a function to compile the model with the chosen configuration, which will prepare the appropriate data structures required for the efficient use of the model you have defined.
The optimizer can be specified as a string for a known optimizer class, e.g. 'sgd' for stochastic gradient descent, or you can configure an instance of an optimizer class and use that.
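The two styles can be sketched as follows; the one-layer model is only a placeholder for illustration:

```python
# Specifying the optimizer by string vs. by configured instance;
# the one-layer model is a placeholder for illustration.
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.optimizers import SGD

model = Sequential([Input(shape=(8,)), Dense(1)])

# 1. By string: uses the optimizer's default hyperparameters.
model.compile(optimizer="sgd", loss="mse")

# 2. By instance: configure hyperparameters explicitly.
model.compile(optimizer=SGD(learning_rate=0.01, momentum=0.9), loss="mse")
```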
For a list of supported optimizers, see this:
The three most common loss functions are:

- 'binary_crossentropy' for binary classification.
- 'sparse_categorical_crossentropy' for multi-class classification.
- 'mse' (mean squared error) for regression.
For a list of supported loss functions, see:
Metrics are defined as a list of strings for known metric functions or a list of functions to call to evaluate predictions.
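Putting these pieces together, a compile call for a hypothetical binary classification model might look like this; the specific loss and metric are illustrative choices:

```python
# Sketch of compiling for a binary classification problem; the
# loss and metric choices here are illustrative.
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Input

model = Sequential([Input(shape=(8,)), Dense(1, activation="sigmoid")])
model.compile(optimizer="adam",
              loss="binary_crossentropy",  # loss function to minimize
              metrics=["accuracy"])        # tracked during training
```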
Fitting the model requires that you first select the training configuration, such as the number of epochs (complete passes through the training dataset) and the batch size (the number of samples used to estimate the error gradient before the model weights are updated). Training applies the chosen optimization algorithm to minimize the chosen loss function, updating the model via the backpropagation of error algorithm. Fitting the model is the slow part of the whole process and can take seconds to hours to days, depending on the complexity of the model, the hardware you’re using, and the size of the training dataset. From an API perspective, this involves calling a function to perform the training process. This function will block (not return) until the training process has finished.
For help on how to choose the batch size, see this tutorial:
While fitting the model, a progress bar summarizes the status of each epoch and the overall training process. This can be reduced to a single line of model performance per epoch by setting the “verbose” argument to 2, and all output can be turned off during training by setting “verbose” to 0.
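The fitting step can be sketched on small synthetic data; the random dataset and the training configuration here are assumptions for illustration:

```python
# Sketch of fitting a model; the random synthetic dataset and the
# epoch/batch-size settings are illustrative assumptions.
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Input

X = np.random.rand(100, 8)           # 100 samples, 8 features
y = np.random.randint(0, 2, (100,))  # binary targets

model = Sequential([Input(shape=(8,)), Dense(1, activation="sigmoid")])
model.compile(optimizer="adam", loss="binary_crossentropy")

# verbose=2 prints one summary line per epoch; verbose=0 is silent.
history = model.fit(X, y, epochs=3, batch_size=32, verbose=0)
print(len(history.history["loss"]))  # one loss value per epoch: prints 3
```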
Evaluating the model requires that you first choose a holdout dataset used to evaluate the model. This should be data not used in the training process so that we can get an unbiased estimate of the performance of the model when making predictions on new data.
The speed of model evaluation is proportional to the amount of data you want to use for the evaluation, although it is much faster than training as the model is not changed.
From an API perspective, this involves calling a function with the holdout dataset and getting a loss and perhaps other metrics that can be reported.
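The evaluation step can be sketched as follows; the synthetic holdout set stands in for real unseen data, and the model here is an untrained placeholder (in practice you would evaluate your fitted model):

```python
# Sketch of evaluating on a holdout set; the synthetic data and the
# untrained placeholder model are for illustration only.
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Input

model = Sequential([Input(shape=(8,)), Dense(1, activation="sigmoid")])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

X_holdout = np.random.rand(20, 8)           # data not seen in training
y_holdout = np.random.randint(0, 2, (20,))

# Returns the loss followed by each compiled metric.
loss, acc = model.evaluate(X_holdout, y_holdout, verbose=0)
```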
Making a prediction is the final step in the life-cycle. It is why we wanted the model in the first place. It requires that you have new data for which a prediction is required, e.g. data where you do not have the target values.

From an API perspective, you simply call a function to make a prediction of a class label, probability, or numerical value: whatever you designed your model to predict.

You may want to save the model and later load it to make predictions. You may also choose to fit a model on all of the available data before you start using it.

Now that we are familiar with the model life-cycle, let’s take a look at the two main ways to use the tf.keras API to build models: sequential and functional.
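The prediction step above can be sketched as follows, again on synthetic stand-in data with an untrained placeholder model; in practice you would call this on your fitted model:

```python
# Sketch of making predictions on new rows without target values;
# the synthetic data and placeholder model are for illustration.
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Input

model = Sequential([Input(shape=(8,)), Dense(1, activation="sigmoid")])

X_new = np.random.rand(3, 8)           # 3 new samples, no targets
yhat = model.predict(X_new, verbose=0)
print(yhat.shape)  # one predicted probability per sample: prints (3, 1)
```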