Evaluating a Learning Algorithm

Evaluating a Hypothesis

We can troubleshoot errors in our predictions by:

  • Getting more training examples: fixes high variance
  • Trying smaller sets of features: fixes high variance
  • Trying additional features: fixes high bias
  • Trying polynomial features: fixes high bias
  • Increasing λ: fixes high variance
  • Decreasing λ: fixes high bias

A hypothesis may achieve very low error on the training set and still be inaccurate, because it has overfit. To evaluate a hypothesis, we therefore split the data into two sets: a training set (70%) and a test set (30%).
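The 70/30 split above can be sketched in plain Python (a minimal version of what libraries like scikit-learn's `train_test_split` provide; shuffling first avoids ordering bias):

```python
import random

def train_test_split(data, test_ratio=0.3, seed=0):
    """Shuffle the examples, then hold out the last test_ratio fraction as the test set."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)

examples = list(range(10))
train, test = train_test_split(examples)
print(len(train), len(test))  # 7 3
```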

(Figure: Coursera Andrew Ng Machine Learning week 3 notes)

Model Selection and Train/Validation/Test Sets

One way to break down our dataset into the three sets is:

  • Training set: 60%
  • Cross validation set: 20%
  • Test set: 20%

We can now calculate three separate error values for the three different sets using the following method:

  1. Optimize the parameters in Θ using the training set for each polynomial degree.
  2. Find the polynomial degree d with the least error using the cross validation set.
  3. Estimate the generalization error using the test set with J_test(Θ^(d)), where d is the degree of the polynomial with the lowest cross validation error.

This way, the degree of the polynomial d has not been trained using the test set.
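The three steps can be sketched as below. The per-degree errors here are hypothetical stand-ins; in practice J_cv and J_test come from evaluating the Θ fitted on the training set for each degree d:

```python
# Hypothetical errors per polynomial degree d (in practice: fit Θ on the
# training set for each d, then evaluate on the CV and test sets).
j_cv   = {1: 0.62, 2: 0.31, 3: 0.12, 4: 0.13, 5: 0.19}   # cross validation error
j_test = {1: 0.60, 2: 0.33, 3: 0.14, 4: 0.18, 5: 0.25}   # test error

# Step 2: pick the degree with the lowest cross validation error.
best_d = min(j_cv, key=j_cv.get)

# Step 3: report generalization error on the test set for that degree only,
# so the test set never influences the choice of d.
generalization_error = j_test[best_d]
print(best_d, generalization_error)  # 3 0.14
```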

Bias vs Variance

Diagnosing Bias vs Variance
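The diagnostic from the lecture: with high bias, both J_train and J_cv are high and close together; with high variance, J_train is low but J_cv is much higher. A minimal sketch (the threshold values are illustrative assumptions; in practice you compare curves, not single numbers):

```python
def diagnose(j_train, j_cv, desired=0.05, gap=0.05):
    """Classify a (J_train, J_cv) pair; `desired` and `gap` are illustrative thresholds."""
    high_bias = j_train > desired              # training error itself is too high
    high_variance = (j_cv - j_train) > gap     # CV error much worse than training error
    if high_bias and not high_variance:
        return "high bias (underfitting)"
    if high_variance and not high_bias:
        return "high variance (overfitting)"
    if high_bias and high_variance:
        return "both high bias and high variance"
    return "looks ok"

print(diagnose(0.30, 0.32))  # high bias (underfitting)
print(diagnose(0.01, 0.25))  # high variance (overfitting)
```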


Regularization and Bias/Variance
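From the lecture, λ is chosen the same way as the polynomial degree: train Θ for each candidate λ, then pick the λ whose (unregularized) cross validation error is lowest. A sketch with hypothetical CV errors over the lecture's λ grid:

```python
# Hypothetical CV errors measured after training Θ at each candidate λ
# (λ = 0, 0.01, 0.02, 0.04, ... as in the lecture).
j_cv = {0: 0.41, 0.01: 0.35, 0.02: 0.30, 0.04: 0.22, 0.08: 0.15,
        0.16: 0.11, 0.32: 0.09, 0.64: 0.12, 1.28: 0.18, 2.56: 0.27,
        5.12: 0.38, 10.24: 0.52}

# Too-small λ overfits (high variance); too-large λ underfits (high bias);
# the CV error is lowest somewhere in between.
best_lambda = min(j_cv, key=j_cv.get)
print(best_lambda)  # 0.32
```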


Learning Curve
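A learning curve plots J_train and J_cv against the training-set size m: train on the first m examples, measure training error on those m examples and CV error on the full validation set. A minimal sketch, using a trivial mean predictor as a stand-in for any learner:

```python
def mse(predict, xs, ys):
    """Mean squared error of a prediction function on a dataset."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def fit_mean(ys):
    # Trivial "learner" that always predicts the training mean;
    # a stand-in for the hypothesis-fitting step.
    mean = sum(ys) / len(ys)
    return lambda x: mean

x_train = list(range(10)); y_train = [2 * x for x in x_train]
x_cv, y_cv = [10, 11, 12], [20, 22, 24]

train_curve, cv_curve = [], []
for m in range(1, len(x_train) + 1):
    model = fit_mean(y_train[:m])                             # train on first m examples
    train_curve.append(mse(model, x_train[:m], y_train[:m]))  # J_train on those m
    cv_curve.append(mse(model, x_cv, y_cv))                   # J_cv on the full CV set

# For this underpowered model, J_train rises and J_cv falls as m grows,
# with both plateauing at a high error: the high-bias signature.
```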


Diagnosing Neural Networks

  • A neural network with fewer parameters is prone to underfitting. It is also computationally cheaper.
  • A large neural network with more parameters is prone to overfitting. It is also computationally expensive. In this case you can use regularization (increase λ) to address the overfitting.

Building a Spam Classifier

Prioritizing What to Work On

  • Collect lots of data (for example, the “honeypot” project, though collecting more data doesn’t always help)
  • Develop sophisticated features (for example: using email header data in spam emails)
  • Develop algorithms to process your input in different ways (recognizing misspellings in spam).


Error Analysis

  • Start with a simple algorithm, implement it quickly, and test it early on your cross validation data.
  • Plot learning curves to decide if more data, more features, etc. are likely to help.
  • Manually examine the errors on examples in the cross validation set and try to spot a trend where most of the errors were made.

Handling Skewed Data


With skewed classes, accuracy alone is misleading, so we measure precision P = TP / (TP + FP) and recall R = TP / (TP + FN), and combine them into the F1 Score: 2·P·R / (P + R).
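The F1 score can be computed directly from confusion-matrix counts (the counts below are hypothetical, just for illustration):

```python
def precision_recall_f1(tp, fp, fn):
    precision = tp / (tp + fp)   # of predicted positives, how many are real
    recall = tp / (tp + fn)      # of real positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical confusion-matrix counts for a skewed-class classifier.
p, r, f1 = precision_recall_f1(tp=85, fp=15, fn=30)
print(p, r, f1)
```

A high F1 requires both precision and recall to be high, which is why it is preferred over accuracy when one class is rare.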
