【Question title】: Find the negative log-likelihood cost for logistic regression in Python and the gradient of the loss with respect to w, b
【Posted】: 2020-07-31 17:41:15
【Question】:

Formula for computing the cost function:

Formula for the gradient of the loss with respect to w and b:
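
The two formulas are not shown in the text. As a sketch, assuming the usual sigmoid activation and the shapes listed below (symbols such as $a^{(i)}$ and $\sigma$ are notation chosen here, not taken from the post), the standard negative log-likelihood cost and its gradients would be written as follows. These forms are consistent with the expected output given further down.

```latex
J = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log a^{(i)} + \left(1 - y^{(i)}\right) \log\left(1 - a^{(i)}\right) \right],
\qquad a^{(i)} = \sigma\left(w^{T} x^{(i)} + b\right)

\frac{\partial J}{\partial w} = \frac{1}{m} X (A - Y)^{T},
\qquad
\frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} \left(a^{(i)} - y^{(i)}\right)
```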

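Arguments:

  1. w -- weights, a numpy array of size (num_px * num_px * 3, 1)
  2. b -- bias, a scalar
  3. X -- data of size (num_px * num_px * 3, number of examples)
  4. Y -- true "label" vector (containing 0 if non-cat, 1 if cat) of size (1, number of examples)

Returns:

  1. cost -- negative log-likelihood cost for logistic regression
  2. dw -- gradient of the loss with respect to w, thus same shape as w
  3. db -- gradient of the loss with respect to b, thus same shape as b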
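
The shapes listed above can be checked with a short sketch. The values of `num_px` and `m` here are hypothetical, chosen only for illustration; the point is that `np.dot(w.T, X) + b` yields one pre-activation per example, matching the shape of Y.

```python
import numpy as np

num_px = 4                 # hypothetical image side length (illustration only)
m = 5                      # hypothetical number of examples
n = num_px * num_px * 3    # features per flattened image

w = np.zeros((n, 1))                        # weights, one column
b = 0.0                                     # bias, a scalar
X = np.random.rand(n, m)                    # one flattened image per column
Y = np.random.randint(0, 2, size=(1, m))    # labels in {0, 1}

Z = np.dot(w.T, X) + b     # (1, n) times (n, m) gives (1, m)
print(Z.shape)             # (1, 5)
print(Y.shape)             # (1, 5)
```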

My code:

import numpy as np

def sigmoid(z):
    """
    Compute the sigmoid of z

    Arguments:
    z -- A scalar or numpy array of any size.

    Return:
    s -- sigmoid(z)
    """

    ### START CODE HERE ### (≈ 1 line of code)
    s = None
    s = 1 / (1 + np.exp(-z))
    ### END CODE HERE ###

    return s




# GRADED FUNCTION: propagate

def propagate(w, b, X, Y):
    """
    Implement the cost function and its gradient for the propagation explained above



    Tips:
    - Write your code step by step for the propagation. np.log(), np.dot()
    """

    m = X.shape[1]

    # FORWARD PROPAGATION (FROM X TO COST)
    ### START CODE HERE ### (≈ 2 lines of code)
    A = None                                    # compute activation
    cost = None                                 # compute cost
    k = w * X + b  
    A = sigmoid(k)

    cost = (-Y * np.log(A) - (1 - Y) * np.log(1 - A)).mean() / m
    ### END CODE HERE ###

    # BACKWARD PROPAGATION (TO FIND GRAD)
    ### START CODE HERE ### (≈ 2 lines of code)
    dw = None
    db = None
    db = np.subtract(A , Y)
    dw = np.dot(X,db.T)/m
    db = np.sum(db)/m
    ### END CODE HERE ###

    # assert(dw.shape == w.shape)
    # assert(db.dtype == float)
    # cost = np.squeeze(cost)
    # assert(cost.shape == ())

    grads = {"dw": dw,
             "db": db}

    return grads, cost


w, b, X, Y = np.array([[1.],[2.]]), 2., np.array([[1.,2.,-1.],[3.,4.,-3.2]]), np.array([[1,0,1]])
grads, cost = propagate(w, b, X, Y)
print ("dw = " + str(grads["dw"]))
print ("db = " + str(grads["db"]))
print ("cost = " + str(cost))

My output:

dw = [[ 0.72851438  0.99581514]
 [ 1.5487967   2.38666712]]
db = 0.225798060825
cost = 1.04403235316

Expected output:

dw = [[ 0.99845601]
 [ 2.39507239]]
db = 0.00145557813678
cost = 5.801545319394553

Can anyone tell me why the dimensions of my dw differ from the expected output, and help me get the cost function right?

【Question comments】:

Tags: python logistic-regression backpropagation


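【Solution 1】:

There are a few small mistakes. For the cost, use -np.sum(Y*np.log(A) + (1-Y)*np.log(1-A)) / m instead of .mean(). Next, np.subtract(A, Y) can be replaced with a plain A - Y, since an explicit NumPy call is not needed for that. The code below also computes the activation as np.dot(w.T, X) + b rather than w * X + b. It works for me.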

def propagate(w, b, X, Y):
    """
    Implement the cost function and its gradient for the propagation explained above

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat) of size (1, number of examples)

    Return:
    cost -- negative log-likelihood cost for logistic regression
    dw -- gradient of the loss with respect to w, thus same shape as w
    db -- gradient of the loss with respect to b, thus same shape as b

    Tips:
    - Write your code step by step for the propagation. np.log(), np.dot()
    """

    m = X.shape[1]

    # FORWARD PROPAGATION (FROM X TO COST)
    ### START CODE HERE ### (≈ 2 lines of code)
    A = sigmoid(np.dot(w.T,X)+b)                            # compute activation
    cost = -np.sum(Y*np.log(A) + (1-Y)*np.log(1-A)) / m     # compute cost
    ### END CODE HERE ###

    # BACKWARD PROPAGATION (TO FIND GRAD)
    ### START CODE HERE ### (≈ 2 lines of code)
    dw = np.dot(X,(A-Y).T)/m
    db = np.sum(A-Y,axis=1)/m
    ### END CODE HERE ###

    assert(dw.shape == w.shape)
    assert(db.dtype == float)
    cost = np.squeeze(cost)
    assert(cost.shape == ())

    grads = {"dw": dw,
             "db": db}

    return grads, cost
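
As a quick check, the same computation can be run on the test values from the question. This is a minimal, self-contained sketch: `propagate_check` is a hypothetical name, and it uses `np.sum(A - Y) / m` without `axis=1` (an assumption made here so that db comes out as a scalar). The comparison values are the expected output quoted in the question.

```python
import numpy as np

def sigmoid(z):
    """Element-wise logistic function."""
    return 1 / (1 + np.exp(-z))

def propagate_check(w, b, X, Y):
    """Hypothetical helper: cost and gradients for logistic regression."""
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)          # activations, shape (1, m)
    cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m
    dw = np.dot(X, (A - Y).T) / m            # same shape as w
    db = np.sum(A - Y) / m                   # scalar
    return dw, db, cost

# Test values from the question
w = np.array([[1.], [2.]])
b = 2.
X = np.array([[1., 2., -1.], [3., 4., -3.2]])
Y = np.array([[1, 0, 1]])

dw, db, cost = propagate_check(w, b, X, Y)
print("dw = " + str(dw))
print("db = " + str(db))
print("cost = " + str(cost))
```

With `w * X + b`, as in the question, broadcasting produces a (2, 3) array instead of the (1, 3) row of activations, which is why dw came out as a 2x2 matrix there.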

【Comments】:

    【Solution 2】:
    dw = np.dot(X,db.T)/m 
    

    is wrong.

    Here you should multiply by the derivative of the activation function, i.e. the sigmoid, rather than by db:

    A = sigmoid(k)
    dA = np.dot((1-A)*A,dloss.T) # This is the derivative of a sigmoid function
    
    dw = np.dot(X,dA.T)
    

    The code is untested, but the solution will be along these lines. See here for how the loss is computed.
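
    For the cross-entropy loss used in the question, the chain rule through the sigmoid can be checked numerically. The sketch below (variable names such as `dL_dA` are chosen here for illustration) multiplies the derivative of the loss with respect to A by the sigmoid derivative A * (1 - A) and compares the result with A - Y. Under these assumptions the two agree, so the A - Y term already contains the sigmoid derivative.

```python
import numpy as np

def sigmoid(z):
    """Element-wise logistic function."""
    return 1 / (1 + np.exp(-z))

# Test values from the question
w = np.array([[1.], [2.]])
b = 2.
X = np.array([[1., 2., -1.], [3., 4., -3.2]])
Y = np.array([[1, 0, 1]])

A = sigmoid(np.dot(w.T, X) + b)

# Per-example loss: L = -(Y*log(A) + (1-Y)*log(1-A))
dL_dA = -(Y / A) + (1 - Y) / (1 - A)   # derivative of the loss w.r.t. A
dA_dz = A * (1 - A)                    # derivative of the sigmoid
dL_dz = dL_dA * dA_dz                  # chain rule

print(np.allclose(dL_dz, A - Y))       # True
```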

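    【Comments】:

    • I tried the hints you mentioned, but there is still an error
    • @D_Raja I suggest you study the theory carefully. To compute dA, we need the derivative of the loss function with respect to x (see the update). Also, computing the mean cost and then dividing it by m again makes no sense. .sum()/m is probably the correct implementation.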
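
    The point about dividing twice can be illustrated with a small sketch. The loss values below are made up for illustration; for a (1, m) row of per-example losses, `.mean()` already divides by m, so `.mean() / m` is smaller than the average by a further factor of m.

```python
import numpy as np

losses = np.array([[0.2, 1.5, 0.7, 0.4]])   # made-up per-example losses, shape (1, m)
m = losses.shape[1]

average = losses.sum() / m          # intended cost: sum over examples divided by m
double_div = losses.mean() / m      # mean already divides by m, so this divides twice

print(np.isclose(double_div, average / m))  # True
```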