Gradient Descent in Neural Networks
In this tutorial, we’ll learn about gradient descent in TensorFlow.
What is Gradient Descent?
- Gradient descent is a general-purpose optimization algorithm, and TensorFlow provides an implementation of it. It is used to minimize the cost (or loss) function of a model.
- The cost function measures how far the model's predictions are from the expected outputs. If the cost is low during evaluation, the model is performing well; if it is high, the model is not accurate.
- The goal of gradient descent is to find the set of model parameters that results in the lowest cost.
- It works by iteratively adjusting the model parameters in the direction of the steepest descent of the cost function, that is, against the gradient.
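To make the update rule concrete, here is a minimal sketch in plain Python rather than TensorFlow. The cost function (w - 3)^2, the learning rate, and the number of steps are illustrative assumptions and do not come from this tutorial.

```python
# Minimal sketch: gradient descent on a one-parameter cost function.
# The cost (w - 3)^2, the learning rate, and the step count are
# illustrative assumptions, not values from the tutorial.

def cost(w):
    return (w - 3.0) ** 2

def gradient(w):
    # Derivative of (w - 3)^2 with respect to w.
    return 2.0 * (w - 3.0)

def gradient_descent(w_start, learning_rate=0.1, steps=100):
    w = w_start
    for _ in range(steps):
        # Step against the gradient, i.e. in the direction of steepest descent.
        w -= learning_rate * gradient(w)
    return w

w_best = gradient_descent(w_start=0.0)
print(w_best, cost(w_best))
```

Each step moves w a little closer to 3, where this cost is lowest. A neural network applies the same idea to many weights and biases at once.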
TensorFlow provides a number of built-in optimizers, including stochastic gradient descent (tf.keras.optimizers.SGD). To use gradient descent in TensorFlow, we first define our model, compile it with an appropriate loss function and optimizer, and then call the fit() method to train the model.
Example: Training a model to predict whether a person will buy insurance, based on their age and affordability.
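The define, compile, and fit workflow might look like the following sketch. The toy data, the single sigmoid layer, the SGD learning rate, and the binary cross-entropy loss are all assumptions chosen for illustration; they are not taken from the example that follows.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy data: 8 samples with 2 features and a binary label.
rng = np.random.default_rng(0)
x = rng.random((8, 2)).astype("float32")
y = (x[:, 0] + x[:, 1] > 1.0).astype("float32")

# 1. Define the model: a single sigmoid unit (an assumed architecture).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(2,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# 2. Compile it with a loss function and a gradient descent optimizer.
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.1),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)

# 3. Train it. fit() runs the gradient descent updates for us.
history = model.fit(x, y, epochs=5, verbose=0)
print(history.history["loss"])
```

The history object returned by fit() records one loss value per epoch, which is a convenient way to check that the cost is going down.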
Input Data:
Here, we created a CSV file with some data, including the age, affordability, and bought columns.
Step 1:
Here, we first imported the required libraries: NumPy, TensorFlow, pandas, and Keras. Then we used the pd.read_csv() function to read the CSV file.
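As a rough sketch of this step, the snippet below loads a small table with pandas. The rows are hypothetical stand-ins for the tutorial's CSV file, and the data is read from an in-memory string so the snippet runs without a file on disk. Only the three column names come from the text above, and the other library imports are left out for brevity.

```python
import io

import pandas as pd

# Hypothetical rows standing in for the tutorial's CSV file; not the real data.
csv_text = """age,affordability,bought
22,1,0
25,0,0
47,1,1
52,0,0
46,1,1
58,1,1
"""

# pd.read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))
print(df.head())
print(df.shape)
```

With a real file, the call would take the file's path instead of the io.StringIO object.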
Step 2:
Here, we imported "train_test_split" from the "sklearn.model_selection" module. It is used to split the data into training and testing sets.
In this code, x_train and y_train are the training sets for the input and output variables, and x_test and y_test are the testing sets for the input and output variables. These variables are used to train and evaluate the machine learning model.
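A minimal sketch of the split, assuming scikit-learn is installed. The feature rows, the labels, test_size, and random_state are hypothetical values chosen for illustration; the tutorial does not state which settings it used.

```python
from sklearn.model_selection import train_test_split

# Hypothetical feature rows [age, affordability] and labels (bought).
features = [[22, 1], [25, 0], [47, 1], [52, 0], [46, 1],
            [58, 1], [55, 0], [60, 0], [62, 1], [61, 1]]
labels = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1]

# test_size=0.2 holds back 20% of the rows for testing (assumed value).
# random_state fixes the shuffle so the split is repeatable (assumed value).
x_train, x_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, random_state=25
)
print(len(x_train), len(x_test))
```

With 10 rows and test_size=0.2, 8 rows go to training and 2 to testing.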
Step 3:
In this block of code, we scaled the input data to the range 0 to 1 by dividing it by 100.
Scaling helps because when the input features are on a similar scale, it is usually easier for the model to learn the patterns and relationships in the data.
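The scaling step could be sketched as follows. This version assumes that only the age column needs dividing, because affordability is taken to be a 0/1 flag already; that assumption, the sample rows, and the variable names other than x_train_scaled are illustrative.

```python
import pandas as pd

# Hypothetical training inputs; not the tutorial's real rows.
x_train = pd.DataFrame({"age": [22, 47, 58], "affordability": [1, 1, 0]})

# Work on a copy so the unscaled frame is left untouched.
x_train_scaled = x_train.copy()

# Ages fall below 100, so dividing by 100 maps them into the 0 to 1 range.
# Assumption: affordability is already 0 or 1 and is left as it is.
x_train_scaled["age"] = x_train_scaled["age"] / 100
print(x_train_scaled)
```

The same division would need to be applied to the test inputs so that both sets are on the same scale.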
Step 4:
In this code, we declared our model with the required parameters and then trained it using the fit() method with the x_train_scaled and y_train data. During training, the model uses gradient descent to adjust the weights and biases of the network in the direction of the steepest descent of the loss function.
Here, epochs is the number of complete passes the training process makes over the training data. More epochs give gradient descent more update steps, which usually improves the accuracy of the predictions.
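To show roughly what happens inside fit() during each epoch, here is a NumPy sketch of gradient descent for a single sigmoid neuron with a log loss. It is a simplified stand-in and not TensorFlow's actual implementation: it makes one full-batch update per epoch, and the data, starting weights, and learning rate are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(y_true, y_pred, eps=1e-15):
    # Clip to avoid taking log(0).
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return float(-np.mean(y_true * np.log(y_pred)
                          + (1 - y_true) * np.log(1 - y_pred)))

def train(x, y, epochs, learning_rate=0.5):
    w = np.ones(x.shape[1])  # one weight per input feature (assumed start)
    b = 0.0
    losses = []
    for _ in range(epochs):
        y_pred = sigmoid(x @ w + b)
        losses.append(log_loss(y, y_pred))
        # Gradients of the mean log loss with respect to w and b.
        error = y_pred - y
        grad_w = x.T @ error / len(y)
        grad_b = float(np.mean(error))
        # Move the weights and bias against their gradients.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, losses

# Hypothetical scaled inputs [age / 100, affordability] and labels.
x = np.array([[0.22, 1], [0.25, 0], [0.47, 1],
              [0.52, 0], [0.46, 1], [0.58, 1]], dtype=float)
y = np.array([0, 0, 1, 0, 1, 1], dtype=float)

w, b, losses = train(x, y, epochs=200)
print(losses[0], losses[-1])
```

The loss recorded in the last epoch is lower than in the first, which is the effect that additional epochs are meant to have.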
Step 5:
In this image, we can see that the prediction for the data at index 5 (age = 56, affordability = 1) is close to 0.90, while the predictions for the rest of the data are below 0.60. So, according to the model, the person at index 5 will buy the insurance and the rest will not, which matches our CSV file.
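Turning predicted probabilities into yes/no answers can be sketched as below. The probabilities are hypothetical and are not the tutorial's actual outputs. The cutoff of 0.5 is a common convention and an assumption here, since the text does not state which cutoff it used.

```python
# Hypothetical sigmoid outputs for six test rows; not the tutorial's predictions.
predictions = [0.41, 0.38, 0.17, 0.47, 0.35, 0.90]

def to_labels(probabilities, threshold=0.5):
    # 1 means "will buy", 0 means "will not buy".
    return [1 if p >= threshold else 0 for p in probabilities]

labels = to_labels(predictions)
print(labels)
```

Only the last value clears the cutoff, so only that row is labeled as a buyer.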