Lecture 3 – Neural Networks


1. Course plan: coming up

[cs224n] Lecture 3 – Neural Networks

Homeworks

A note on your experience!

Lecture Plan


2. Classification setup and notation

Classification intuition

Details of the softmax classifier
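
The softmax classifier on this slide can be sketched in a few lines of plain Python (an illustrative rendering, not the lecture's code; subtracting the max score before exponentiating is the standard trick for numerical stability):

```python
import math

def softmax(scores):
    """Turn a list of raw scores into a probability distribution:
    exponentiate (after subtracting the max for numerical stability),
    then normalize so the outputs sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])  # highest score -> highest probability
```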

Training with softmax and cross-entropy loss

Background: What is “cross entropy” loss/error?
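
As a one-line summary of this slide: when the true distribution is one-hot, cross-entropy reduces to the negative log-probability assigned to the correct class (sketch code, the function name is ours):

```python
import math

def cross_entropy(probs, true_class):
    # H(p, q) = -sum_c p(c) * log q(c); with a one-hot target p,
    # only the correct class's term survives.
    return -math.log(probs[true_class])

loss = cross_entropy([0.7, 0.2, 0.1], 0)  # -ln(0.7), about 0.357
```

Note the loss shrinks as the model puts more probability on the right class.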

Classification over a full dataset

Traditional ML optimization


3. Neural Network Classifiers

Neural Nets for the Win!

Classification difference with word vectors

Neural computation

An artificial neuron

A neuron can be a binary logistic regression unit
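
A minimal sketch of the point this slide makes: a single neuron computes a weighted sum plus bias and squashes it through a sigmoid, which is exactly binary logistic regression (illustrative code, names ours):

```python
import math

def neuron(x, w, b):
    # One artificial neuron = binary logistic regression:
    # affine combination z = w . x + b, then sigmoid(z) in (0, 1).
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))
```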

A neural network = running several logistic regressions at the same time

Matrix notation for a layer
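
In matrix notation, a layer is z = Wx + b followed by a = f(z) with f applied elementwise; a pure-Python rendering of those two lines (sigmoid chosen as f for illustration):

```python
import math

def layer(W, b, x):
    # z = Wx + b: one dot product per row of W, plus the bias.
    z = [sum(w * xj for w, xj in zip(row, x)) + bi
         for row, bi in zip(W, b)]
    # a = f(z): the non-linearity applied elementwise (sigmoid here).
    return [1.0 / (1.0 + math.exp(-zi)) for zi in z]
```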

Non-linearities (aka “f”): Why they’re needed
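
The core reason fits in one line: without a non-linearity, a stack of linear layers collapses into a single linear map,

```latex
W_2 (W_1 \mathbf{x}) = (W_2 W_1)\, \mathbf{x} = W' \mathbf{x},
```

so extra layers add no representational power unless a non-linear f sits between them.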


4. Named Entity Recognition (NER)

Named Entity Recognition on word sequences

Why might NER be hard?


5. Binary word window classification

Window classification

Window classification: Softmax

Simplest window classifier: Softmax
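
The input to a window classifier is simply the concatenation of the word vectors in a window around the center word; a toy sketch (the lookup table and helper name are ours):

```python
def window_vector(emb, sent, center, size=2):
    # Concatenate the vectors of the words at positions
    # [center - size, center + size] into one long input vector,
    # which is then fed to the softmax classifier.
    vec = []
    for word in sent[center - size : center + size + 1]:
        vec.extend(emb[word])
    return vec
```

With d-dimensional word vectors and window size 2, the classifier sees a 5d-dimensional input.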

Binary classification with unnormalized scores

Binary classification for NER Location

Neural Network Feed-forward Computation

Main intuition for extra layer

The max-margin loss
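
The loss on this slide, J = max(0, 1 − s + s_c), is a one-liner (s is the true window's score, s_c a corrupt window's score; sketch code):

```python
def max_margin(s_true, s_corrupt, margin=1.0):
    # J = max(0, margin - s + s_c): the loss is zero once the true
    # window's score beats the corrupt window's by at least the margin,
    # so training only pushes on examples that violate the margin.
    return max(0.0, margin - s_true + s_corrupt)
```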

Simple net for score

Remember: Stochastic Gradient Descent
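
The update this slide recalls, θ ← θ − α ∇θ J(θ), with the gradient estimated on a single example or small minibatch, can be sketched as:

```python
def sgd_step(theta, grad, lr=0.01):
    # One stochastic gradient descent update: move each parameter
    # a small step (learning rate lr) against its gradient.
    return [t - lr * g for t, g in zip(theta, grad)]
```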

Computing Gradients by Hand

Gradients

Jacobian Matrix: Generalization of the Gradient
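
For reference, the definition this slide generalizes: for a function f: ℝⁿ → ℝᵐ, the Jacobian is the m × n matrix of all partial derivatives,

```latex
\left( \frac{\partial \mathbf{f}}{\partial \mathbf{x}} \right)_{ij}
  = \frac{\partial f_i}{\partial x_j}.
```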

Chain Rule
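
For vector functions, the chain rule is a product of Jacobians: with h = f(z) and z = Wx + b,

```latex
\frac{\partial \mathbf{h}}{\partial \mathbf{x}}
  = \frac{\partial \mathbf{h}}{\partial \mathbf{z}}\,
    \frac{\partial \mathbf{z}}{\partial \mathbf{x}}.
```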

Example Jacobian: Elementwise activation Function
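
For h = f(z) with f applied elementwise, each output hᵢ depends only on zᵢ, so the Jacobian is diagonal:

```latex
\left( \frac{\partial \mathbf{h}}{\partial \mathbf{z}} \right)_{ij}
  = \begin{cases} f'(z_i) & \text{if } i = j \\ 0 & \text{otherwise} \end{cases}
  \qquad\Longrightarrow\qquad
  \frac{\partial \mathbf{h}}{\partial \mathbf{z}}
  = \operatorname{diag}\!\left( f'(\mathbf{z}) \right).
```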

Other Jacobians

Back to our Neural Net!

1. Break up equations into simple pieces

2. Apply the chain rule

3. Write out the Jacobians

Re-using Computation

Derivative with respect to Matrix: Output shape

Derivative with respect to Matrix

Why the Transposes?

What shape should derivatives be?

Next time: Backpropagation
