Backpropagation explained | Part 4 – Calculating the gradient

This is video number 4 in our journey through understanding backpropagation. In our last video, we focused on how to express certain facts about the training process mathematically. Now we’re going to use those expressions to differentiate the loss of the neural network with respect to its weights.

Recall from our video on the intuition for backpropagation that, for stochastic gradient descent to update the weights of the network, it first needs to calculate the gradient of the loss with respect to those weights. Calculating this gradient is exactly what we’ll be focusing on in this video.
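
To make that dependency concrete, here is a minimal sketch of a single gradient descent update, assuming the gradient is already known. The function name `sgd_step`, the learning rate, and the toy loss (the sum of squared weights) are hypothetical choices for illustration, not something taken from the video.

```python
import numpy as np

def sgd_step(weights, grad, learning_rate=0.1):
    """Return the weights after one gradient descent step."""
    # Move each weight a small step against its gradient component.
    return weights - learning_rate * grad

# Hypothetical toy loss: the sum of squared weights, whose gradient is 2 * w.
weights = np.array([1.0, -2.0, 0.5])
grad = 2.0 * weights

new_weights = sgd_step(weights, grad)
print(new_weights)
```

The update itself is a single line. The real work is producing `grad`, which is the part backpropagation handles.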

We’ll start by checking out the equation that backprop uses to differentiate the loss with respect to the weights in the network. We’ll see that this equation is made up of multiple terms, so next we’ll break it down and focus on each term individually. Lastly, we’ll combine the results from each term to obtain the final result, which will be the gradient of the loss function.
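
As a rough preview of that term-by-term approach, the sketch below applies the chain rule to a single output node and compares the result with a numerical estimate. It assumes a sigmoid activation and a squared-error loss on one sample. The network, activation, and loss used in the video may differ, and every name here (`grad_chain_rule`, `a_prev`, and so on) is hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, a_prev, y):
    """Squared error of a single sigmoid node (an assumed setup)."""
    z = np.dot(w, a_prev)   # weighted sum of the previous layer's outputs
    a = sigmoid(z)          # activation output of the node
    return (a - y) ** 2

def grad_chain_rule(w, a_prev, y):
    """Gradient of the loss with respect to w as a product of three terms."""
    z = np.dot(w, a_prev)
    a = sigmoid(z)
    dloss_da = 2.0 * (a - y)   # term 1: loss with respect to the activation
    da_dz = a * (1.0 - a)      # term 2: activation with respect to its input
    dz_dw = a_prev             # term 3: input with respect to each weight
    return dloss_da * da_dz * dz_dw

def grad_numeric(w, a_prev, y, eps=1e-6):
    """Central-difference estimate, used only as a sanity check."""
    grad = np.zeros_like(w)
    for i in range(w.size):
        step = np.zeros_like(w)
        step[i] = eps
        grad[i] = (loss(w + step, a_prev, y) - loss(w - step, a_prev, y)) / (2 * eps)
    return grad

# Hypothetical values chosen only for illustration.
w = np.array([0.5, -0.3, 0.8])
a_prev = np.array([1.0, 2.0, -1.0])
y = 1.0

analytic = grad_chain_rule(w, a_prev, y)
numeric = grad_numeric(w, a_prev, y)
print(analytic)
print(numeric)
```

The two printed vectors should agree closely, which suggests that multiplying the three individual terms does recover the gradient for this small case.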
