Training A Neural Network
Refer to Intro to Neural Networks
Big idea: TRAINING NETWORK MEANS TO MINIMIZE LOSS
Overall Process
Represent the outputs as numbers (0 represents Male, 1 represents Female)
Shift the data by subtracting the mean of each feature (normalization)
Calculate the loss function: measures the mistakes between the true value $y_{true}$ and the prediction $y_{pred}$
Use backpropagation to quantify how much each weight and bias contributes to the loss
Use an optimization algorithm that tells us how to change the weights and biases to minimize loss (e.g. gradient descent)
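The mean-shift step above can be sketched as follows; the example weight/height values are hypothetical, not from this document:

```python
import numpy as np

# Hypothetical dataset: each row is (weight in lb, height in in)
data = np.array([[133.0, 65.0],
                 [160.0, 72.0],
                 [152.0, 70.0],
                 [120.0, 60.0]])

# Subtract each column's mean so every feature is centered at 0
shifted = data - data.mean(axis=0)
print(shifted.mean(axis=0))  # each column now has mean ~0
```

Centering the features keeps the inputs in a small range around 0, which generally makes training easier.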
Loss
Quantifies how "good" the network is at predicting
Trying to minimize this
Mean Squared Error
- Takes the average of the squared errors: $\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_{true} - y_{pred})^2$
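The formula above as a minimal sketch (the function name `mse_loss` and the example labels are my own choices):

```python
import numpy as np

def mse_loss(y_true, y_pred):
    # average of the squared differences between truth and prediction
    return ((y_true - y_pred) ** 2).mean()

y_true = np.array([1, 0, 0, 1])
y_pred = np.array([0, 0, 0, 0])
print(mse_loss(y_true, y_pred))  # → 0.5
```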
Adjusting weights and biases to decrease loss
Assuming we only have 1 item in the dataset (for simplicity), with $y_{true} = 1$:
So $\text{MSE} = \frac{1}{1} \sum (y_{true} - y_{pred})^2 = (1 - y_{pred})^2$
We can write loss as a multivariable function of the weights and biases: $L(w_1, w_2, w_3, w_4, w_5, w_6, b_1, b_2, b_3)$
Question: How would tweaking $w_1$ affect the loss? How do we find this out?
Take the partial derivative of $L$ with respect to $w_1$
i.e. Solve for $\frac{\partial L}{\partial w_1}$
Solve for $\frac{\partial L}{\partial w_1}$
$\frac{\partial L}{\partial w_1} = \frac{\partial L}{\partial y_{pred}} \cdot \frac{\partial y_{pred}}{\partial w_1}$ - Chain Rule
So now, we must solve for $\frac{\partial L}{\partial y_{pred}}$ and $\frac{\partial y_{pred}}{\partial w_1}$
Solving for $\frac{\partial L}{\partial y_{pred}}$ (Easy!)
Remember $L = (1 - y_{pred})^2$, so simply differentiate
Result: $\frac{\partial L}{\partial y_{pred}} = -2(1 - y_{pred})$
Solving for $\frac{\partial y_{pred}}{\partial w_1}$ (Not as obvious)
Remember that $y_{pred}$ is really just the output $o_1$. From our neuron calculations before, the output is just $y_{pred} = o_1 = f(w_5 h_1 + w_6 h_2 + b_3)$
So to calculate $\frac{\partial y_{pred}}{\partial w_1}$...
(Refer to the neural network diagram)
But even now, $w_1$ does not appear in this equation... so we still can't solve properly, so we need to break this down even further.
Remember that we can also calculate $h_1$ and $h_2$ as $h_1 = f(w_1 x_1 + w_2 x_2 + b_1)$ and $h_2 = f(w_3 x_1 + w_4 x_2 + b_2)$
Now we have $w_1$! Since we only see $w_1$ in $h_1$ (meaning that $w_1$ only affects $h_1$), we can just include $h_1$: $\frac{\partial y_{pred}}{\partial w_1} = \frac{\partial y_{pred}}{\partial h_1} \cdot \frac{\partial h_1}{\partial w_1}$. So we can rewrite in a solvable form now.
Result: $\frac{\partial y_{pred}}{\partial h_1} = w_5 \, f'(w_5 h_1 + w_6 h_2 + b_3)$ and $\frac{\partial h_1}{\partial w_1} = x_1 \, f'(w_1 x_1 + w_2 x_2 + b_1)$
If you want to simplify this further...
Since we've seen $f'$ multiple times, might as well solve for that too. For the sigmoid activation $f(x) = \frac{1}{1 + e^{-x}}$, the derivative is $f'(x) = f(x)\,(1 - f(x))$
So to sum it up... $\frac{\partial L}{\partial w_1} = \frac{\partial L}{\partial y_{pred}} \cdot \frac{\partial y_{pred}}{\partial h_1} \cdot \frac{\partial h_1}{\partial w_1}$
Calculating $\frac{\partial L}{\partial w_1}$:
Result is 0.0214
Means that if $w_1$ increases, the loss also increases (only by a bit, because the slope is fairly flat) - think of a linear graph
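The whole chain-rule calculation can be checked in code. A minimal sketch, assuming the setup from the Intro to Neural Networks example (a sigmoid activation, all weights starting at 1, all biases at 0, and a single input $(x_1, x_2) = (-2, -1)$ with $y_{true} = 1$ - these starting values are assumptions, not stated in these notes):

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def dsigmoid(x):
    # derivative of sigmoid: f'(x) = f(x) * (1 - f(x))
    fx = sigmoid(x)
    return fx * (1 - fx)

# Assumed starting point (hypothetical, for illustration)
w1 = w2 = w3 = w4 = w5 = w6 = 1.0
b1 = b2 = b3 = 0.0
x1, x2 = -2.0, -1.0
y_true = 1.0

# Forward pass
sum_h1 = w1 * x1 + w2 * x2 + b1
sum_h2 = w3 * x1 + w4 * x2 + b2
h1, h2 = sigmoid(sum_h1), sigmoid(sum_h2)
sum_o1 = w5 * h1 + w6 * h2 + b3
y_pred = sigmoid(sum_o1)

# The three chain-rule factors from the derivation above
dL_dypred = -2 * (y_true - y_pred)
dypred_dh1 = w5 * dsigmoid(sum_o1)
dh1_dw1 = x1 * dsigmoid(sum_h1)

dL_dw1 = dL_dypred * dypred_dh1 * dh1_dw1
print(dL_dw1)  # ≈ 0.021, a small positive slope
```

The exact product is close to 0.0214 (the notes' figure comes from rounding the intermediate values).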
Optimization Algorithm
Using stochastic gradient descent (SGD): $w_1 \leftarrow w_1 - \eta \frac{\partial L}{\partial w_1}$
$\eta$ is the learning rate, a constant that controls how fast we train
If $\frac{\partial L}{\partial w_1}$ is positive, $w_1$ decreases; if it's negative, $w_1$ increases - either way, $L$ decreases
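One update step of the SGD rule, with a hypothetical learning rate:

```python
# One SGD step: w <- w - eta * dL/dw
eta = 0.1          # learning rate (hypothetical value)
w1 = 1.0           # current weight
dL_dw1 = 0.0214    # partial derivative from the section above

w1 = w1 - eta * dL_dw1
print(round(w1, 5))  # → 0.99786
```

Since the gradient is positive, the weight shrinks slightly, nudging the loss downward; repeating this step over the whole dataset is what "training" means.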