Batch Gradient Descent vs Mini-Batch Gradient Descent vs Stochastic Gradient Descent

Batch Gradient Descent
Stochastic Gradient Descent
Mini-batch Gradient Descent

Batch Gradient Descent

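Each step of gradient descent uses all the training examples.
Advantage: With a suitable learning rate, it reaches the global optimum of a convex cost function (such as linear regression) after enough iterations. For a non-convex cost, only a local optimum can be expected.
Disadvantage: On a large data set, each step is computationally expensive, and training may even fail to complete.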
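
As a minimal sketch of the idea above, the following fits a one-feature linear regression with batch gradient descent. The model, the mean-squared-error cost, the toy data, and the names `batch_gradient_descent`, `alpha`, and `n_iters` are assumptions chosen for illustration and do not come from the original notes.

```python
def batch_gradient_descent(xs, ys, alpha=0.1, n_iters=1000):
    """Fit y ~ theta0 + theta1 * x by minimizing mean squared error."""
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(n_iters):
        # Every step looks at ALL m training examples.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1


# Toy data generated from y = 1 + 2x (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
theta0, theta1 = batch_gradient_descent(xs, ys)
print(theta0, theta1)  # close to 1 and 2
```

The cost of one step grows linearly with the number of examples, which is the disadvantage noted above.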

Stochastic Gradient Descent

Each step uses one training example.
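Learning rate $\alpha$ is typically held constant. We can slowly decrease $\alpha$ over time if we want $\theta$ to converge.
Advantage: Robust for large data sets. Each update looks at a single example, so its cost does not grow with the size of the data set.
Disadvantage: Unstable. $\theta$ moves “around” the optimum rather than heading straight to it as Batch does.
NOTE: Shuffling the training examples is really important. It keeps the order of the data from biasing the updates and pulling $\theta$ toward a poor solution.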
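
The points above can be sketched on the same kind of toy problem. This is one possible implementation, not a reference one: the function name, the `decay` parameter, and the schedule `alpha / (1 + decay * step)` are hypothetical choices made here to show how $\alpha$ could be decreased over time. With `decay=0.0` the learning rate stays constant.

```python
import random


def stochastic_gradient_descent(xs, ys, alpha=0.05, n_epochs=500,
                                decay=0.0, seed=0):
    """Fit y ~ theta0 + theta1 * x, updating on one example at a time."""
    rng = random.Random(seed)
    theta0, theta1 = 0.0, 0.0
    indices = list(range(len(xs)))
    step = 0
    for _ in range(n_epochs):
        # Reshuffle each pass so the update order carries no pattern.
        rng.shuffle(indices)
        for i in indices:
            # Assumed schedule: constant when decay == 0, shrinking otherwise.
            lr = alpha / (1.0 + decay * step)
            error = theta0 + theta1 * xs[i] - ys[i]
            theta0 -= lr * error
            theta1 -= lr * error * xs[i]
            step += 1
    return theta0, theta1


# Toy data generated from y = 1 + 2x (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
theta0, theta1 = stochastic_gradient_descent(xs, ys)
print(theta0, theta1)  # close to 1 and 2
```

The toy data here is noise-free, so the iterates settle on the exact solution. On noisy data, a constant $\alpha$ leaves $\theta$ wandering near the optimum, which is when a decaying schedule helps.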

Mini-batch Gradient Descent

Combine Batch with Stochastic: use b examples in each iteration, where b is the batch size.
Converges more smoothly than Stochastic, because each gradient is averaged over b examples.
Additional parameter: batch_size
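
A mini-batch version of the same toy regression might look like the sketch below. Only `batch_size` comes from the notes above. The function name, the other parameters, and the data are assumptions for illustration.

```python
import random


def minibatch_gradient_descent(xs, ys, batch_size=2, alpha=0.05,
                               n_epochs=1000, seed=0):
    """Fit y ~ theta0 + theta1 * x using batch_size examples per update."""
    rng = random.Random(seed)
    theta0, theta1 = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(n_epochs):
        rng.shuffle(indices)
        # Walk through the shuffled data in chunks of batch_size
        # (the last chunk may be smaller).
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            errors = [theta0 + theta1 * xs[i] - ys[i] for i in batch]
            grad0 = sum(errors) / len(batch)
            grad1 = sum(e * xs[i] for e, i in zip(errors, batch)) / len(batch)
            theta0 -= alpha * grad0
            theta1 -= alpha * grad1
    return theta0, theta1


# Toy data generated from y = 1 + 2x (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
theta0, theta1 = minibatch_gradient_descent(xs, ys)
print(theta0, theta1)  # close to 1 and 2
```

Setting `batch_size=1` makes this behave like Stochastic Gradient Descent, and setting it to `len(xs)` makes it behave like Batch Gradient Descent.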