Handwritten Digit Recognition (0-9)
Multi-class Logistic Regression
1. Vectorizing Logistic Regression
(1) Vectorizing the cost function
(2) Vectorizing the gradient
(3) Vectorizing the regularized cost function
(4) Vectorizing the regularized gradient
All four formulas above can be found in the previous blog post.
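For reference, the four formulas listed above can be written out as below. This is a sketch in the usual notation, not copied from the earlier post. It assumes X is the m x (n+1) design matrix with a leading column of ones, y is the m x 1 label vector, g is the sigmoid function, and lambda is the regularization strength. The bias parameter theta_0 (theta(1) in Octave) is left out of the penalty, which is what the vectorized lines in lrCostFunction.m compute.

```latex
% Assumed notation: X is m x (n+1), y is m x 1, theta is (n+1) x 1,
% log and g are applied element-wise.
\begin{align*}
h &= g(X\theta), \qquad g(z) = \frac{1}{1 + e^{-z}} \\
% (1) vectorized cost
J(\theta) &= -\frac{1}{m}\left[\, y^{T}\log(h) + (1 - y)^{T}\log(1 - h) \,\right] \\
% (2) vectorized gradient
\nabla J(\theta) &= \frac{1}{m}\, X^{T}(h - y) \\
% (3) vectorized regularized cost (the sum skips the bias term \theta_0)
J_{\mathrm{reg}}(\theta) &= J(\theta) + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^{2} \\
% (4) vectorized regularized gradient (\tilde{\theta} is \theta with its first entry set to 0)
\nabla J_{\mathrm{reg}}(\theta) &= \frac{1}{m}\, X^{T}(h - y) + \frac{\lambda}{m}\,\tilde{\theta},
\qquad \tilde{\theta} = (0, \theta_1, \dots, \theta_n)^{T}
\end{align*}
```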
lrCostFunction.m
function [J, grad] = lrCostFunction(theta, X, y, lambda)
%LRCOSTFUNCTION Compute cost and gradient for logistic regression with
%regularization
%   J = LRCOSTFUNCTION(theta, X, y, lambda) computes the cost of using
%   theta as the parameter for regularized logistic regression and the
%   gradient of the cost w.r.t. the parameters.

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;
grad = zeros(size(theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta
%
% Hint: The computation of the cost function and gradients can be
%       efficiently vectorized. For example, consider the computation
%
%           sigmoid(X * theta)
%
%       Each row of the resulting matrix will contain the value of the
%       prediction for that example. You can make use of this to vectorize
%       the cost function and gradient computations.
%
% Hint: When computing the gradient of the regularized cost function,
%       there're many possible vectorized solutions, but one solution
%       looks like:
%           grad = (unregularized gradient for logistic regression)
%           temp = theta;
%           temp(1) = 0;   % because we don't add anything for j = 0
%           grad = grad + YOUR_CODE_HERE (using the temp variable)
%

hx = sigmoid(X*theta);                     % m x 1 vector of predictions
reg = lambda/(2*m)*sum(theta(2:end).^2);   % penalty skips the bias term theta(1)
J = -1/m*(y'*log(hx)+(1-y)'*log(1-hx)) + reg;
theta(1) = 0;                              % bias term is not regularized in the gradient
grad = 1/m*X'*(hx-y)+lambda/m*theta;

% =============================================================

grad = grad(:);

end