Batch Gradient Descent vs Mini-Batch Gradient Descent vs Stochastic Gradient Descent
Batch Gradient Descent
- Each step of gradient descent uses all the training examples.
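- Advantage: Stable updates. With a suitable learning rate α, it reaches the global optimum after enough iterations when the cost function is convex (e.g. linear regression).
- Disadvantage: On a large data set every step is computationally expensive, and training may even fail to complete.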
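
To make the "all training examples per step" point concrete, here is a minimal NumPy sketch. It assumes a linear regression model with a mean squared error cost, which these notes do not specify; the function name, the toy data and the hyperparameter values are made up for illustration.

```python
import numpy as np

def batch_gradient_descent(X, y, alpha=0.1, n_iters=1000):
    """Assumed model: linear regression with mean squared error cost."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(n_iters):
        # Gradient of (1 / 2m) * sum((X @ theta - y) ** 2), using all m examples
        grad = X.T @ (X @ theta - y) / m
        theta -= alpha * grad
    return theta

# Hypothetical noise-free toy data: y = 1 + 2x, with a column of ones for the intercept
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x

theta_batch = batch_gradient_descent(X, y)
print(theta_batch)
```

Each iteration touches the whole of X, so the cost of one step grows linearly with the number of examples.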
Stochastic Gradient Descent
- Each step uses one training example.
- Learning rate α is typically held constant. It can be slowly decreased over time if we want θ to converge.
- Advantage: Robust for large data sets, since each update looks at only one example.
- Disadvantage: Unstable. θ moves "around" the optimum instead of going straight to it as Batch does.
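NOTE: Shuffling the training examples is really important. If the data is visited in a fixed, ordered sequence, the updates can be biased and θ may end up at a poor local optimum.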
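
The points above (one example per update, shuffling, optionally decreasing α) can be sketched as follows. This again assumes linear regression with a squared error cost. The decay schedule `alpha0 / (1 + decay * step)` is just one common choice and is not taken from these notes; all names and values are illustrative.

```python
import numpy as np

def stochastic_gradient_descent(X, y, alpha0=0.1, decay=0.01, n_epochs=50, seed=0):
    """Assumed model: linear regression; one training example per update."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    theta = np.zeros(n)
    step = 0
    for _ in range(n_epochs):
        # Reshuffle at the start of every pass over the data
        for i in rng.permutation(m):
            # Assumed schedule: alpha shrinks slowly so theta can settle
            alpha = alpha0 / (1.0 + decay * step)
            # Gradient of 0.5 * (X[i] @ theta - y[i]) ** 2 for a single example
            grad = (X[i] @ theta - y[i]) * X[i]
            theta -= alpha * grad
            step += 1
    return theta

# Hypothetical noise-free toy data: y = 1 + 2x
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x

theta_sgd = stochastic_gradient_descent(X, y)
print(theta_sgd)
```

Setting `decay=0` gives the constant learning rate case. On noisy data, θ would then keep wandering around the optimum rather than settling.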
Mini-batch Gradient Descent
- Combines Batch with Stochastic: uses b examples in each iteration, where b is the batch size.
- Converges more smoothly than Stochastic.
- Additional parameter: batch_size
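
A minimal sketch of the mini-batch variant, under the same assumption of linear regression with a mean squared error cost. Only `batch_size` comes from these notes; the other names and the toy data are illustrative.

```python
import numpy as np

def minibatch_gradient_descent(X, y, alpha=0.1, batch_size=10, n_epochs=100, seed=0):
    """Assumed model: linear regression; batch_size examples per update."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(n_epochs):
        order = rng.permutation(m)  # shuffle once per epoch
        for start in range(0, m, batch_size):
            idx = order[start:start + batch_size]
            # Average gradient over the current mini-batch only
            grad = X[idx].T @ (X[idx] @ theta - y[idx]) / len(idx)
            theta -= alpha * grad
    return theta

# Hypothetical noise-free toy data: y = 1 + 2x
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x

theta_mb = minibatch_gradient_descent(X, y, batch_size=10)
print(theta_mb)
```

In this sketch, `batch_size=len(X)` reduces to Batch Gradient Descent and `batch_size=1` reduces to Stochastic Gradient Descent (both with a constant α), which is why mini-batch can be seen as a middle ground between the two.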