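【Posted】: 2018-05-16 09:50:59
【Question】:
I am looking at code for the SVM loss and its derivative. I understand the loss, but I cannot understand how the gradient is computed in a vectorized way.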
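import numpy as np

def svm_loss_vectorized(W, X, y, reg):
    loss = 0.0
    dW = np.zeros(W.shape)  # initialize the gradient as zero
    num_train = X.shape[0]
    scores = X.dot(W)  # (N, C) class scores
    # score of the correct class for each example, shape (N,)
    yi_scores = scores[np.arange(scores.shape[0]), y]
    # hinge margins; the column vector broadcasts across the classes
    margins = np.maximum(0, scores - yi_scores[:, np.newaxis] + 1)
    margins[np.arange(num_train), y] = 0  # the correct class contributes no loss
    loss = np.mean(np.sum(margins, axis=1))
    loss += 0.5 * reg * np.sum(W * W)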
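I understand everything up to this point. What I do not understand is why, after this, we sum the binary matrix along each row and then subtract that sum: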
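    # 1 wherever a margin is violated; note that binary aliases (and overwrites) margins
    binary = margins
    binary[margins > 0] = 1
    # number of violated margins for each example
    row_sum = np.sum(binary, axis=1)
    # the correct-class entry of each row receives minus that count
    binary[np.arange(num_train), y] = -row_sum.T
    dW = np.dot(X.T, binary)
    # Average
    dW /= num_train
    # Regularize
    dW += reg * W
    return loss, dW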
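One way to see what the row sum is doing is to compare against an explicit loop. For a single example i, every wrong class j with a positive margin adds +x_i to column j of the gradient and -x_i to the column of the correct class y_i, so the correct-class column ends up with -x_i times the number of violated margins. The sketch below is only an illustration under that reading: the names `svm_grad_naive` and `svm_grad_vectorized` are hypothetical and not from the original post, the regularization term is left out, and the data is random.

```python
import numpy as np

def svm_grad_naive(W, X, y):
    """Loop form of the data-term gradient (hypothetical helper, no regularization)."""
    num_train, num_classes = X.shape[0], W.shape[1]
    dW = np.zeros_like(W, dtype=float)
    for i in range(num_train):
        scores = X[i].dot(W)
        for j in range(num_classes):
            if j == y[i]:
                continue
            # a wrong class j violates the margin
            if scores[j] - scores[y[i]] + 1 > 0:
                dW[:, j] += X[i]       # +x_i for the violating class
                dW[:, y[i]] -= X[i]    # -x_i for the correct class, once per violation
    return dW / num_train

def svm_grad_vectorized(W, X, y):
    """Same quantity using the binary-matrix trick (hypothetical helper)."""
    num_train = X.shape[0]
    scores = X.dot(W)
    correct = scores[np.arange(num_train), y][:, np.newaxis]
    margins = np.maximum(0, scores - correct + 1)
    margins[np.arange(num_train), y] = 0
    binary = (margins > 0).astype(float)
    # minus the per-row count of violations goes into the correct-class entry
    binary[np.arange(num_train), y] = -binary.sum(axis=1)
    return X.T.dot(binary) / num_train

# Small random problem: 5 examples, 4 features, 3 classes (arbitrary sizes).
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 4))
W = rng.standard_normal((4, 3))
y = rng.integers(0, 3, size=5)

same = np.allclose(svm_grad_naive(W, X, y), svm_grad_vectorized(W, X, y))
print(same)
```

Multiplying `X.T` by the binary matrix then performs all of those per-example additions and subtractions in a single matrix product, which is why the loop and the vectorized form should agree.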
【Discussion】: