1. Unrolling Parameters
In neural networks, advanced optimization works differently than in logistic regression. In logistic regression, our parameter \(\theta\) is a single vector with one column. But in a neural network, the parameters are matrices \(\Theta^{(1)}, \Theta^{(2)}, \dots\), not a vector. So if we want to run back propagation through an advanced optimizer, we need to unroll the parameter matrices into one long vector (and reshape them back when computing the cost and gradients).
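Unrolling can be sketched as follows with NumPy; the layer sizes (3 inputs, 5 hidden units, 1 output) are assumed purely for illustration:

```python
import numpy as np

# Hypothetical parameter matrices (each includes a bias column)
Theta1 = np.random.rand(5, 4)   # maps layer 1 (3 units + bias) -> layer 2 (5 units)
Theta2 = np.random.rand(1, 6)   # maps layer 2 (5 units + bias) -> layer 3 (1 unit)

# Unroll: flatten each matrix and concatenate into one long vector
theta_vec = np.concatenate([Theta1.ravel(), Theta2.ravel()])

# Reshape back to the original matrices inside the cost/gradient function
Theta1_back = theta_vec[:20].reshape(5, 4)
Theta2_back = theta_vec[20:26].reshape(1, 6)
```

The round trip is lossless: the reshaped matrices are identical to the originals, which is what lets an optimizer work with a single flat vector.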
2. Gradient Checking
One property of back propagation is that there are many ways to introduce subtle bugs. For example, \(J(\Theta)\) may appear to be decreasing, but we might still wind up with a neural network that has a higher level of error than we would with a bug-free implementation. To catch such bugs, we use gradient checking.
With a specific \(\Theta\), we take two nearby points, \(\Theta - \epsilon\) and \(\Theta + \epsilon\), and compute
$$ \frac{J(\Theta +\epsilon )-J(\Theta -\epsilon )}{2\epsilon } $$
The term above is the slope of the secant line through those two points. With a small value of epsilon, such as \(\epsilon = 10^{-4}\), we can confirm that this slope closely approximates the derivative \(\frac{\partial}{\partial \Theta} J(\Theta)\).
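A minimal sketch of this centered-difference check, applied component by component to an unrolled parameter vector (the cost function here, \(J(\theta) = \sum_i \theta_i^2\), is an assumed toy example whose true gradient \(2\theta\) we can compare against):

```python
import numpy as np

def numerical_gradient(J, theta, eps=1e-4):
    """Approximate dJ/dtheta with the centered difference for each component."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        plus = theta.copy()
        plus[i] += eps
        minus = theta.copy()
        minus[i] -= eps
        grad[i] = (J(plus) - J(minus)) / (2 * eps)
    return grad

# Toy cost J(theta) = sum(theta^2), whose true gradient is 2 * theta
J = lambda t: np.sum(t ** 2)
theta = np.array([1.0, -2.0, 3.0])
approx = numerical_gradient(J, theta)  # close to [2.0, -4.0, 6.0]
```

Note that this loop calls the cost function twice per parameter, which is why the check is far too slow to leave on during training.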
The total implementation of backpropagation with gradient checking is:
- Implement backprop to compute DVec (the unrolled gradient).
- Implement numerical gradient checking to compute gradApprox.
- Make sure DVec and gradApprox are approximately equal.
- Turn off gradient checking before training, because the numerical check is very slow.
3. Random Initialization
Initializing all \(\Theta\) weights to zero does not work with neural networks: by symmetry, all hidden units compute the same function, and back propagation updates them to identical values repeatedly. Instead, we randomly initialize each weight in our \(\Theta\) matrices to a value in \([-\epsilon, +\epsilon]\).
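A minimal sketch of such an initializer; the function name and the value \(\epsilon_{init} = 0.12\) are assumptions for illustration, and the interval \([-\epsilon, +\epsilon]\) is obtained by rescaling uniform samples from \([0, 1)\):

```python
import numpy as np

def rand_initialize_weights(L_in, L_out, epsilon_init=0.12):
    """Return an (L_out, L_in + 1) weight matrix (the +1 is the bias column)
    with entries drawn uniformly from [-epsilon_init, +epsilon_init]."""
    return np.random.rand(L_out, L_in + 1) * 2 * epsilon_init - epsilon_init

Theta1 = rand_initialize_weights(3, 5)  # e.g. 3 inputs -> 5 hidden units
```

Because each weight starts at a different small value, the symmetry between hidden units is broken and backprop can drive them toward different features.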
4. Summary
To build a good neural network system, we need to decide how many layers to use and how many hidden units to put in each layer:
- Number of input units
- Number of output units
- Number of hidden layer + hidden units
The default is 1 hidden layer; if we use more than 1 hidden layer, it is recommended that every hidden layer have the same number of units.