Optimization Using Gradient Descent in Two Variables
In this lab, you will implement and visualize the gradient descent method to optimize functions of two variables. You will have a chance to experiment with the initial parameters and to investigate the results and limitations of the method.
Packages
Run the following cell to load the packages you'll need.
Let's explore a simple example of a function of two variables f(x,y) with one global minimum. Such a function was discussed in the videos. It is predefined and loaded into this notebook as f_example_3, together with its partial derivatives dfdx_example_3 and dfdy_example_3. At this stage, you do not need to worry about the exact expression for the function and its partial derivatives, so you can focus on the implementation of gradient descent and the choice of the related parameters. Run the following cell to plot the function.
To find the minimum, you can implement gradient descent starting from the initial point (x0,y0) and taking steps iteration by iteration using the following equations:
$$x_1 = x_0 - \alpha \frac{\partial f}{\partial x}(x_0, y_0),$$
$$y_1 = y_0 - \alpha \frac{\partial f}{\partial y}(x_0, y_0),$$
where α>0 is the learning rate. The number of iterations is also a parameter. The method is implemented with the following code:
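A minimal sketch of such an implementation might look as follows. The function name gradient_descent, its signature, and the stand-in bowl function f_bowl with its derivatives are assumptions made for illustration; they are not the lab's f_example_3.

```python
def gradient_descent(dfdx, dfdy, x, y, learning_rate=0.1, num_iterations=100):
    """Run a fixed number of gradient descent steps starting from (x, y)."""
    for _ in range(num_iterations):
        # Evaluate both partial derivatives at the current point first,
        # so that x and y are updated simultaneously.
        grad_x, grad_y = dfdx(x, y), dfdy(x, y)
        x = x - learning_rate * grad_x
        y = y - learning_rate * grad_y
    return x, y


# Hypothetical stand-in for f_example_3: a bowl with its minimum at (1, -2).
def f_bowl(x, y):
    return (x - 1) ** 2 + 2 * (y + 2) ** 2


def dfdx_bowl(x, y):
    return 2 * (x - 1)


def dfdy_bowl(x, y):
    return 4 * (y + 2)


x_min, y_min = gradient_descent(dfdx_bowl, dfdy_bowl, x=0.0, y=0.0,
                                learning_rate=0.1, num_iterations=100)
print("Approximate minimum point:", (x_min, y_min))
```

Computing both partial derivatives before either coordinate changes matters: updating x first and then evaluating the y-derivative at the new x would be a different method.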
Now to optimize the function, set up the parameters num_iterations, learning_rate, x_initial, y_initial and run gradient descent:
You can see the visualization by running the following code. Note that gradient descent in two variables performs steps on the plane, in the direction opposite to the gradient vector $\begin{bmatrix}\frac{\partial f}{\partial x}(x_0, y_0) \\ \frac{\partial f}{\partial y}(x_0, y_0)\end{bmatrix}$, with the learning rate α as a scaling factor.
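To make the direction of the steps concrete without any plotting, you can record the visited points and compare the first step with the gradient at the starting point. This is only a sketch: gradient_descent_path and the function f_demo are hypothetical names introduced here, not part of the lab's tooling.

```python
def gradient_descent_path(dfdx, dfdy, x, y, learning_rate, num_iterations):
    """Return the list of points visited, including the initial point."""
    path = [(x, y)]
    for _ in range(num_iterations):
        grad_x, grad_y = dfdx(x, y), dfdy(x, y)
        x, y = x - learning_rate * grad_x, y - learning_rate * grad_y
        path.append((x, y))
    return path


# Hypothetical example function (an assumption): f(x, y) = x^2 + 3 y^2.
def f_demo(x, y):
    return x ** 2 + 3 * y ** 2


def dfdx_demo(x, y):
    return 2 * x


def dfdy_demo(x, y):
    return 6 * y


learning_rate = 0.1
path = gradient_descent_path(dfdx_demo, dfdy_demo, 2.0, 1.0,
                             learning_rate, num_iterations=5)

# The first step equals the gradient at the start, scaled by -learning_rate.
(x0, y0), (x1, y1) = path[0], path[1]
step = (x1 - x0, y1 - y0)
grad0 = (dfdx_demo(x0, y0), dfdy_demo(x0, y0))
print("step:", step)
print("-learning_rate * gradient:",
      (-learning_rate * grad0[0], -learning_rate * grad0[1]))

# Function values along the path; for this function and learning rate
# they decrease at every step.
values = [f_demo(px, py) for px, py in path]
print("f along the path:", values)
```

A list of points like path is also what a contour-plot animation would typically draw, one segment per iteration.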
By uncommenting different lines, you can experiment with different sets of parameter values and compare the corresponding results. At the end of the animation, you can also click on the contour plot to choose the initial point and restart the animation automatically.
Run a few experiments and try to explain what is actually happening in each of the cases.
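One experiment worth trying is to change only the learning rate. The sketch below assumes a simple stand-in function, f(x, y) = x^2 + 10 y^2 (not the lab's function), with the update written out directly. It compares the distance to the minimum at (0, 0) after the same number of iterations for three hypothetical learning rates.

```python
import math


def run_experiment(learning_rate, num_iterations=30, x=1.0, y=1.0):
    """Gradient descent on the assumed function f(x, y) = x^2 + 10 y^2."""
    for _ in range(num_iterations):
        # Partial derivatives of the stand-in function: 2x and 20y.
        x, y = x - learning_rate * 2 * x, y - learning_rate * 20 * y
    return x, y


distances = {}
for learning_rate in (0.001, 0.05, 0.11):
    x_final, y_final = run_experiment(learning_rate)
    # Distance from the final point to the true minimum at the origin.
    distances[learning_rate] = math.hypot(x_final, y_final)
    print(f"learning_rate={learning_rate}: distance to minimum = "
          f"{distances[learning_rate]:.4g}")
```

With the smallest rate the point barely moves in 30 iterations, the middle rate gets close to the minimum, and the largest rate overshoots along the steep y direction so that the iterates move away from the minimum instead of toward it.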
Let's investigate a more complicated function, which was also shown in the videos:
You can find its global minimum point by using gradient descent with the following parameters:
However, the shape of the surface is much more complicated and not every initial point will bring you to the global minimum of this surface. Use the following code to explore various sets of parameters and the results of gradient descent.
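To see how the starting point matters, here is a minimal sketch on a hypothetical surface with two valleys, f(x, y) = (x^2 - 1)^2 + 0.3x + y^2. This function, its name f_two_valleys, and the parameter values are assumptions chosen for illustration; it is not the function from the videos.

```python
def f_two_valleys(x, y):
    # Assumed example: a deeper valley near x = -1 and a shallower one near x = 1.
    return (x ** 2 - 1) ** 2 + 0.3 * x + y ** 2


def dfdx_two_valleys(x, y):
    return 4 * x * (x ** 2 - 1) + 0.3


def dfdy_two_valleys(x, y):
    return 2 * y


def gradient_descent(dfdx, dfdy, x, y, learning_rate, num_iterations):
    for _ in range(num_iterations):
        grad_x, grad_y = dfdx(x, y), dfdy(x, y)
        x, y = x - learning_rate * grad_x, y - learning_rate * grad_y
    return x, y


# Same parameters, two different initial points.
result_left = gradient_descent(dfdx_two_valleys, dfdy_two_valleys,
                               x=-0.5, y=1.0,
                               learning_rate=0.05, num_iterations=200)
result_right = gradient_descent(dfdx_two_valleys, dfdy_two_valleys,
                                x=0.5, y=1.0,
                                learning_rate=0.05, num_iterations=200)

print("start x=-0.5 ->", result_left, "f =", f_two_valleys(*result_left))
print("start x= 0.5 ->", result_right, "f =", f_two_valleys(*result_right))
```

Both runs converge, but to different points: the run started at x = 0.5 settles in the shallower valley and never reaches the lower function value found by the other run. Gradient descent only uses local slope information, so it cannot tell that a deeper valley exists elsewhere.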