Wrong predictions in Mini Batch Gradient Descent for Logistic Regression? (Answers)

【Question title】: Wrong predictions in Mini Batch Gradient Descent for Logistic Regression?
【Posted】: 2021-01-23 15:05:06
【Question】:

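I am trying to implement mini-batch gradient descent for logistic regression. However, when I test it on my dataset with labels {-1, 1}, my predictions are almost always all 1 or all -1, which gives a test score of about 50% (the true labels are split roughly 50/50 between -1 and 1), whereas the target is above 95%.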

Can anyone help me find the bug in my code below?

```python
def logistic(z):
    """ 
    Helper function
    Computes the logistic function 1/(1+e^{-x}) to each entry in input vector z.
    
    Args:
        z: numpy array shape (,d) 
    Returns:
       logi: numpy array shape (,d) each entry transformed by the logistic function 
    """
    logi = np.zeros(z.shape)
    logi = np.array([1 / (1+np.exp(-z[i])) for i in range(len(z))])
    assert logi.shape == z.shape
    return logi

class LogisticRegressionClassifier():

    def __init__(self):
        self.w = None


    def fit(self, X, y, w=None, lr=0.1, batch_size=16, epochs=10):
        """
        Run mini-batch Gradient Descent for logistic regression 
        use batch_size data points to compute gradient in each step.
   
        Args:
           X: np.array shape (n,d) dtype float32 - Features 
           y: np.array shape (,n) dtype int32 - Labels 
           w: np.array shape (,d) dtype float32 - Initial parameter vector
           lr: scalar - learning rate for gradient descent
           batch_size: number of elements to use in minibatch
           epochs: Number of scans through the data

        sets: 
           w: numpy array shape (,d) learned weight vector w
           history: list/np.array len epochs
        """
        if w is None: w = np.zeros(X.shape[1])
        history = []        
        n = np.size(X, 0)
        for i in range(epochs):
            b = batch_size
            X_ = np.copy(X)
            X_shuf = np.take(X_,np.random.permutation(X_.shape[0]),axis=0,out=X_)
            for i in range(n//b):
                sample = X_shuf[b*i:(i+1)*b]
                g = (1/b)*sum([-y[i]*sample[i,:]*sigmoid(-y[i]*np.dot(w,sample[i,:])) for i in range(b)])
                w = np.array(w - lr*g)
            history.append(w)
        self.w = w
        self.history = history
        return w


    def predict(self, X):
        """ Classify each data element in X

        Args:
            X: np.array shape (n,d) dtype float - Features 
        
        Returns: 
           p: numpy array shape (n, ) dtype int32, class predictions on X (0, 1)

        """
        z = np.dot(X,self.w.T)
        print(z)
        out = logistic(z)
        return out
    
    def score(self, X, y):
        """ Compute model accuracy  on Data X with labels y

        Args:
            X: np.array shape (n,d) dtype float - Features 
            y: np.array shape (n,) dtype int - Labels 

        Returns: 
           s: float, number of correct predictions divided by n.

        """
        s = 0
        n = np.size(X,0)
        pred = self.predict(X)
        pred_labels = []
        for i in range(n):
            if pred[i] > 0.5:
                pred_labels += [1]
            if pred[i] <= 0.5:
                pred_labels += [-1]
        for i in range(n):
            if pred_labels[i] == y[i]:
                s += 1
        return s / n
```

【Comments】:

Tags: python machine-learning regression gradient-descent

【Solution 1】:

You forgot to shuffle the labels together with the training data. If you have

[3, 1] [-1] 
[2, 3] [ 1]

then after shuffling only the training data, the labels no longer match the rows:

[2, 3] [-1] 
[3, 1] [ 1]
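
One way to keep rows and labels aligned is to draw a single permutation per epoch and index both `X` and `y` with it. The following is a minimal sketch of that idea, not the asker's class: `fit_minibatch`, `sigmoid`, the `seed` parameter and the synthetic data are all hypothetical names and values chosen for illustration. It also slices the labels per batch, because the comprehension in the question indexes `y` with the within-batch index `i`, so it only ever reads the first `batch_size` labels.

```python
import numpy as np

def sigmoid(z):
    """Elementwise logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_minibatch(X, y, lr=0.1, batch_size=16, epochs=10, seed=0):
    """Mini-batch gradient descent for logistic regression, labels in {-1, 1}.

    Hypothetical standalone helper (assumption), not the asker's fit method.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        perm = rng.permutation(n)          # one permutation per epoch
        X_shuf, y_shuf = X[perm], y[perm]  # same order for rows and labels
        for k in range(n // batch_size):
            Xb = X_shuf[k * batch_size:(k + 1) * batch_size]
            yb = y_shuf[k * batch_size:(k + 1) * batch_size]
            # Gradient of the mean of log(1 + exp(-y * w.x)) over the batch
            margins = yb * (Xb @ w)
            g = -(Xb * (yb * sigmoid(-margins))[:, None]).mean(axis=0)
            w = w - lr * g
    return w

# Synthetic, linearly separable data (assumption, for illustration only)
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(400, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)

w = fit_minibatch(X, y)
pred = np.where(sigmoid(X @ w) > 0.5, 1, -1)
accuracy = (pred == y).mean()
print(f"training accuracy: {accuracy:.3f}")
```

Indexing with `X[perm]` and `y[perm]` creates shuffled copies, so the original arrays stay untouched and there is no need for `np.copy` or the `out=` argument of `np.take`.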

【Comments】: