4 Deep neural network

4.1 Deep L-layer Neural network

Logistic regression is a shallow model.

  • L: the number of layers in the network.
  • n[l]: the number of units (nodes) in layer l.
  • a[l]: the activations in layer l, a[l] = g[l](z[l]).
  • w[l]: the weights used to compute z[l] in layer l.
  • X = a[0], ŷ = a[L].

4.2 Forward Propagation in a Deep Network

skip
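A minimal sketch of the forward pass through an L-layer network, assuming ReLU activations for the hidden layers and a sigmoid output; the function name and parameter dictionary layout are illustrative, not from the lecture:

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def forward_propagation(X, params, L):
    """Compute a[L] = y_hat for an L-layer network.

    params holds W1..WL and b1..bL (hypothetical naming scheme).
    Hidden layers use ReLU; the output layer uses sigmoid.
    """
    A = X  # a[0] = X
    for l in range(1, L + 1):
        Z = params["W" + str(l)] @ A + params["b" + str(l)]  # z[l] = W[l] a[l-1] + b[l]
        A = sigmoid(Z) if l == L else relu(Z)                # a[l] = g[l](z[l])
    return A  # y_hat = a[L]
```

With X of shape (n[0], m), the loop reproduces the per-layer rule z[l] = W[l] a[l-1] + b[l] vectorized over all m examples at once.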

4.3 Getting your matrix dimensions right

skip
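The dimension rules can be checked mechanically: W[l] has shape (n[l], n[l-1]) and b[l] has shape (n[l], 1). A short sketch, assuming hypothetical layer sizes:

```python
import numpy as np

layer_dims = [5, 4, 3, 1]  # n[0]..n[3]; illustrative sizes, not from the notes
params = {}
for l in range(1, len(layer_dims)):
    # W[l]: (n[l], n[l-1]); b[l]: (n[l], 1), broadcast over the m examples
    params["W" + str(l)] = np.random.randn(layer_dims[l], layer_dims[l - 1]) * 0.01
    params["b" + str(l)] = np.zeros((layer_dims[l], 1))
    assert params["W" + str(l)].shape == (layer_dims[l], layer_dims[l - 1])
    assert params["b" + str(l)].shape == (layer_dims[l], 1)
```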

4.4 Why deep representations?

Example:
Face detection: the first layer of a deep neural network may act as a feature detector or an edge detector. Intuitively, the earlier layers of the network detect simpler functions, such as edges, and the later layers compose them, so the network can learn progressively more complex functions.
Circuit theory and deep learning
Informally: there are functions you can compute with a “small” L-layer deep neural network that shallower networks require exponentially more hidden units to compute.

4.5&4.6 Building blocks of deep neural networks & Forward and backward propagation

Forward and backward functions
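One building block pairs a forward function (a[l-1] → a[l], caching z[l]) with a backward function (da[l] → da[l-1], dW[l], db[l]). A minimal sketch for a single ReLU layer, with function names chosen for illustration:

```python
import numpy as np

def linear_forward_block(A_prev, W, b):
    # Forward function: a[l-1] -> a[l]; cache what backprop will need.
    Z = W @ A_prev + b
    A = np.maximum(0, Z)            # ReLU activation
    cache = (A_prev, W, Z)
    return A, cache

def linear_backward_block(dA, cache):
    # Backward function: da[l] -> (da[l-1], dW[l], db[l]).
    A_prev, W, Z = cache
    m = A_prev.shape[1]
    dZ = dA * (Z > 0)               # dz[l] = da[l] * g'[l](z[l]) for ReLU
    dW = dZ @ A_prev.T / m          # dW[l] = (1/m) dz[l] a[l-1].T
    db = dZ.sum(axis=1, keepdims=True) / m
    dA_prev = W.T @ dZ              # da[l-1] = W[l].T dz[l]
    return dA_prev, dW, db
```

Stacking L of these blocks, caching on the way forward and consuming the caches in reverse on the way back, gives one full iteration of forward and backward propagation.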

4.7 Parameters vs Hyperparameters

Parameters:

W[1],b[1],W[2],b[2],W[3],b[3]

Hyperparameters:
- Learning rate α
- # iterations
- # hidden layers (L)
- # hidden units (n[1], n[2], ...)
- choice of activation function
These hyperparameters control W and b.
Other hyperparameters:
- momentum
- mini batch size
- regularization parameters
- …
Applied deep learning is a very empirical process.
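In practice it helps to keep the hyperparameters listed above in one place so they are easy to vary during that empirical search. A sketch with purely illustrative values (none of these numbers are recommendations from the notes):

```python
# Hypothetical hyperparameter settings; the values are placeholders
# to be tuned empirically, not defaults from the lecture.
hyperparams = {
    "learning_rate": 0.01,    # alpha
    "num_iterations": 2500,
    "hidden_layers": 2,       # number of hidden layers
    "hidden_units": [4, 3],   # n[1], n[2]
    "activation": "relu",     # choice of activation function
    "mini_batch_size": 64,
}

for name, value in hyperparams.items():
    print(name, "=", value)
```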

4.8 What does this have to do with the brain?

