zhongmiaozhimen

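The week 3 lectures did not work through the derivative of the logistic regression cost function, so I derive it here for the record:

The logistic regression cost function can be written as a single expression:

$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log(h_\theta(x^{(i)}))+(1-y^{(i)})\log(1-h_\theta(x^{(i)}))\right]$

where $h_\theta(x^{(i)}) = \frac{1}{1+e^{-\theta^\mathrm{T}x^{(i)}}}$ and $\log$ is the natural logarithm.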

To keep the derivation from getting long and unwieldy, we introduce some shorthand:

$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}K(\theta)$

where $K(\theta) = y^{(i)}\log(h_\theta(x^{(i)}))+(1-y^{(i)})\log(1-h_\theta(x^{(i)}))$ is the term contributed by the $i$-th sample, and

$h_\theta(x^{(i)}) = \frac{1}{1+e^{-\theta^\mathrm{T}x^{(i)}}}$

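OK, let's start the derivation. To take the partial derivative of $J(\theta)$ with respect to a single parameter $\theta_j$ (written with a prime below):

(1) Differentiation is linear, so the constant factor $-\frac{1}{m}$ can be pulled out and the derivative moved inside the sum. Only the expression inside the sum then needs to be differentiated:

$J'(\theta) = -\frac{1}{m}\sum_{i=1}^{m}K'(\theta)$

$K'(\theta) = \left(y\log(h_\theta(x))+(1-y)\log(1-h_\theta(x))\right)'$ (for readability, the superscript $(i)$ marking the $i$-th sample is dropped for now)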

(2) By the chain rule for the logarithm, $(\log u)' = \frac{1}{u}u'$, differentiating $K(\theta)$ further gives:

$K'(\theta) = y\frac{1}{h_\theta(x)}h_\theta'(x)+(1-y)\frac{1}{1-h_\theta(x)}(1-h_\theta(x))'$

(3) By the chain rule for powers, $(u^{n})' = nu^{n-1}u'$ (here with $n=-1$), and for the exponential, $(e^{u})' = e^{u}u'$, differentiating $h_\theta(x)$ gives:

$h_\theta'(x) = \left(\frac{1}{1+e^{-\theta^\mathrm{T}x}}\right)' = -\frac{(1+e^{-\theta^\mathrm{T}x})'}{(1+e^{-\theta^\mathrm{T}x})^{2}} = \frac{e^{-\theta^\mathrm{T}x}(\theta^\mathrm{T}x)'}{(1+e^{-\theta^\mathrm{T}x})^{2}} = \left(\frac{1}{1+e^{-\theta^\mathrm{T}x}}\left(1-\frac{1}{1+e^{-\theta^\mathrm{T}x}}\right)\right)(\theta^\mathrm{T}x)' = h_\theta(x)(1-h_\theta(x))(\theta^\mathrm{T}x)'$

Similarly, $(1-h_\theta(x))' = -\frac{e^{-\theta^\mathrm{T}x}(\theta^\mathrm{T}x)'}{(1+e^{-\theta^\mathrm{T}x})^{2}} = -h_\theta(x)(1-h_\theta(x))(\theta^\mathrm{T}x)'$
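The identity from step (3) can be sanity-checked numerically. The sketch below compares the closed form $\sigma(z)(1-\sigma(z))$, where $\sigma$ is the logistic function and $z$ stands for $\theta^\mathrm{T}x$, against a central finite difference. The helper names and the step size `eps` are assumptions made for this illustration.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_derivative(z):
    """Closed form from step (3): sigma'(z) = sigma(z) * (1 - sigma(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

def central_difference(f, z, eps=1e-6):
    """Numerical derivative (f(z + eps) - f(z - eps)) / (2 * eps)."""
    return (f(z + eps) - f(z - eps)) / (2.0 * eps)

# Compare the two at a few arbitrary points.
for z in (-3.0, -0.5, 0.0, 1.0, 4.0):
    analytic = sigmoid_derivative(z)
    numeric = central_difference(sigmoid, z)
    print(f"z={z:+.1f}  analytic={analytic:.8f}  numeric={numeric:.8f}")
```

The two columns should agree to several decimal places; the remaining difference comes from the finite step size and floating-point rounding.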

(4) Substituting the results of step (3) into step (2) and simplifying gives:

$K'(\theta) = (y-h_\theta(x))(\theta^\mathrm{T}x)'$
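For readers who want the simplification in step (4) spelled out, it expands as follows. The shorthand $h = h_\theta(x)$ and $z = \theta^\mathrm{T}x$ is introduced here only for brevity:

$K'(\theta) = y\frac{1}{h}h(1-h)z' + (1-y)\frac{1}{1-h}\left(-h(1-h)\right)z' = y(1-h)z' - (1-y)hz' = (y - yh - h + yh)z' = (y-h)z'$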

Substituting this into step (1) and simplifying gives:

$J'(\theta) = \frac{1}{m}\sum_{i=1}^{m}(h_\theta(x)-y)(\theta^\mathrm{T}x)'$

Finally, the partial derivative of $\theta^\mathrm{T}x$ with respect to the $j$-th parameter $\theta_j$ is $x_j$ (the $j$-th feature of the sample), which gives the final result:

$\frac{\partial J(\theta)}{\partial \theta_{j}} = \frac{1}{m}\sum_{i=1}^{m}(h_\theta(x^{(i)})-y^{(i)})x_{j}^{(i)}$
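The final formula can be checked against a numerical gradient of the cost function. The sketch below is one way to do that under stated assumptions: the function names, the toy data, the test point `theta`, and the step size `eps` are all hypothetical choices for illustration.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + math.exp(-z))

def hypothesis(theta, x):
    """h_theta(x) = sigmoid(theta^T x) for one sample x."""
    return sigmoid(sum(t * xj for t, xj in zip(theta, x)))

def cost(theta, X, y):
    """J(theta) as defined at the top of the derivation."""
    m = len(X)
    total = 0.0
    for xi, yi in zip(X, y):
        h = hypothesis(theta, xi)
        total += yi * math.log(h) + (1 - yi) * math.log(1 - h)
    return -total / m

def gradient(theta, X, y):
    """dJ/dtheta_j = (1/m) * sum_i (h_theta(x_i) - y_i) * x_ij, for every j."""
    m = len(X)
    grad = [0.0] * len(theta)
    for xi, yi in zip(X, y):
        error = hypothesis(theta, xi) - yi
        for j, xij in enumerate(xi):
            grad[j] += error * xij / m
    return grad

def numeric_gradient(theta, X, y, eps=1e-6):
    """Central-difference approximation of each partial derivative of J."""
    grad = []
    for j in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[j] += eps
        down[j] -= eps
        grad.append((cost(up, X, y) - cost(down, X, y)) / (2.0 * eps))
    return grad

# Hypothetical toy data: the first feature is the constant 1 (intercept term).
X = [[1.0, 0.5], [1.0, 2.0], [1.0, -1.0]]
y = [0, 1, 0]
theta = [0.1, -0.2]

print(gradient(theta, X, y))          # analytic gradient from the derivation
print(numeric_gradient(theta, X, y))  # should match closely
```

If the two printed vectors agree to within the finite-difference error, the derived expression matches the cost function as coded here. The same `gradient` function could then drive a gradient descent update of the form $\theta_j := \theta_j - \alpha\frac{\partial J(\theta)}{\partial \theta_j}$, with the learning rate $\alpha$ chosen by the user.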
