A simple way to perform matrix factorization with TensorFlow 2 (answers)

【Question title】: Simple way of performing Matrix Factorization with tensorflow 2
【Posted】: 2021-02-12 18:02:27
【Question】:

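I have been looking for a way to perform matrix factorization for the very simple, basic case shown below, but I have not found anything. I have only found complex and lengthy solutions, so here is the problem I want to solve: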

U x V = A

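I just want to know how to solve this equation in TensorFlow 2, where A is a known sparse matrix and U and V are two randomly initialized matrices. In other words, I want to find U and V such that their product is approximately equal to A.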
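Written out, and assuming the goal is to fit only the observed entries of A (which is what the mean squared error loss further down in this post does), the problem can be sketched as the following objective. The symbols Ω, u_i, v_j and k are notation introduced here for illustration: Ω is the set of index pairs where A has a value, u_i is row i of U, v_j is row j of V (V being stored transposed), and k is the embedding size.

```latex
\min_{U \in \mathbb{R}^{m \times k},\; V \in \mathbb{R}^{n \times k}}
\quad \frac{1}{|\Omega|} \sum_{(i,j) \in \Omega} \left( A_{ij} - u_i^{\top} v_j \right)^{2}
```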

For example, given these variables:


import numpy as np
import pandas as pd
import tensorflow as tf

# I use this function to build a toy dataset for the sparse matrix
def build_rating_sparse_tensor(ratings):

  indices = ratings[['U_num', 'V_num']].values 

  values = ratings['rating'].values

  return tf.SparseTensor(
                indices=indices,
                values=values,
                dense_shape=[ratings.U_num.max()+1, ratings.V_num.max()+1])

# here I create what will be the matrix A
ratings = (pd.DataFrame({'U_num': list(range(0,10_000))*30,
                        'V_num': list(range(0,60_000))*5,
                        'rating': np.random.randint(6, size=300_000)})
                       .sample(1000)
                       .drop_duplicates(subset=['U_num','V_num'])
                       .sort_values(['U_num','V_num'], ascending=[1,1]))


# Variables

A = build_rating_sparse_tensor(ratings)

# embeddings and init_stddev were not defined in the original post;
# the values below are placeholders
embeddings = 30
init_stddev = 0.1

U = tf.Variable(tf.random.normal(
        [A.shape[0], embeddings], stddev=init_stddev))

# this matrix would be transposed in the equation
V = tf.Variable(tf.random.normal(
        [A.shape[1], embeddings], stddev=init_stddev))


# loss function
def sparse_mean_square_error(sparse_ratings, user_embeddings, movie_embeddings):

  predictions = tf.reduce_sum(
                    tf.gather(user_embeddings, sparse_ratings.indices[:, 0]) *
                    tf.gather(movie_embeddings, sparse_ratings.indices[:, 1]),
                    axis=1)
  loss = tf.losses.mean_squared_error(sparse_ratings.values, predictions)
  return loss

Is it possible to do this with a specific loss function, optimizer, and learning schedule?

Thank you very much.

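【Comments】:

How important is it to you that A is sparse? Computing gradients on sparse matrices is complicated. You should also know that TensorFlow does not support computing gradients on integers (github.com/tensorflow/tensorflow/issues/20524)
@Lescurel Suppose A is a 10,000 x 500,000 matrix. I think it has to be sparse because of performance and RAM capacity. But if I am wrong, A does not have to be sparse.
No, for RAM usage a matrix as empty as this one should definitely be sparse. But that makes the problem harder. A plain training loop that computes gradients on U and V is very simple, and I can post it as an answer if you are interested, but that solution cannot take advantage of the sparsity.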

Tags: python tensorflow tensorflow2.0 recommendation-engine matrix-factorization

【Answer 1】:

A simple and direct way to do it with TensorFlow 2:

Note that the ratings are cast to float32. TensorFlow cannot compute gradients on integers; see https://github.com/tensorflow/tensorflow/issues/20524.
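To illustrate the point about integers, here is a minimal sketch comparing the gradient of x * x with respect to an integer tensor and a float32 tensor. The variable names are made up for this example. With current TensorFlow 2 releases the integer case typically logs a warning and returns None instead of a gradient.

```python
import tensorflow as tf

x_int = tf.constant(3)      # int32 tensor
x_float = tf.constant(3.0)  # float32 tensor

# Gradient with respect to an integer tensor
with tf.GradientTape() as tape:
    tape.watch(x_int)
    y_int = x_int * x_int
grad_int = tape.gradient(y_int, x_int)

# Same computation with a float32 tensor
with tf.GradientTape() as tape:
    tape.watch(x_float)
    y_float = x_float * x_float
grad_float = tape.gradient(y_float, x_float)

print("gradient w.r.t. integer tensor:", grad_int)
print("gradient w.r.t. float32 tensor:", grad_float)
```

This is why the rating values need to be floating point before they enter the loss.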

A = build_rating_sparse_tensor(ratings)
# cast the integer ratings to float32 so that gradients can be computed
values = tf.cast(A.values, tf.float32)
embeddings = 3000

U = tf.Variable(tf.random.normal([A.shape[0], embeddings]), dtype=tf.float32)
V = tf.Variable(tf.random.normal([embeddings, A.shape[1]]), dtype=tf.float32)

optimizer = tf.optimizers.Adam()

trainable_weights = [U, V]

for step in range(100):
    with tf.GradientTape() as tape:
        A_prime = tf.matmul(U, V)
        # keep only the entries of A_prime at the indices where A has a value
        A_prime_sparse = tf.gather_nd(A_prime, A.indices)
        # mean squared error over the observed entries only
        loss = tf.reduce_mean(tf.square(values - A_prime_sparse))
    grads = tape.gradient(loss, trainable_weights)
    optimizer.apply_gradients(zip(grads, trainable_weights))
    if step % 20 == 0:
        print(f"Training loss at step {step}: {float(loss):.4f}")

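We take advantage of the sparsity of A by computing the loss only on the entries of A that actually have a value. However, we still have to allocate two very large dense tensors, U and V, for the trainable weights. With numbers as large as those in your example, you may run into OOM errors.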
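As a rough back-of-the-envelope check, the sketch below estimates the float32 memory footprint of U, V, and the dense product U @ V. It assumes the toy dataset's nominal dimensions (10,000 x 60,000) and the embeddings = 3000 used above; the helper name dense_float32_gib is made up for this example. Gradients and the Adam optimizer's per-variable state come on top of these figures.

```python
BYTES_PER_FLOAT32 = 4


def dense_float32_gib(rows, cols):
    """Size in GiB of a dense float32 matrix with the given shape."""
    return rows * cols * BYTES_PER_FLOAT32 / 2**30


# Assumed sizes: nominal toy-dataset dimensions and embeddings = 3000
n_rows, n_cols, rank = 10_000, 60_000, 3_000

sizes = {
    "U": dense_float32_gib(n_rows, rank),
    "V": dense_float32_gib(rank, n_cols),
    "U @ V": dense_float32_gib(n_rows, n_cols),
}

for name, gib in sizes.items():
    print(f"{name}: {gib:.2f} GiB")
```

Under these assumptions the dense product is the largest single allocation, which is one reason to avoid materializing it when only a few entries of A are observed.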

It may be worth exploring another representation for your data.

【Comments】:

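Hi, thanks for your answer! I actually have a loss function that works (I will add it to the main post, since I cannot put code in this comment). I think it will perform better because it does not compute tf.matmul(U, V).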
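The loss function in the question avoids the dense product by gathering only the rows of U and V that correspond to observed entries. Below is a minimal sketch of a training loop built on that idea. The toy sizes, the learning rate, and the name observed_mse are assumptions chosen for illustration and do not come from the original posts.

```python
import numpy as np
import tensorflow as tf

# Toy problem (hypothetical sizes): a 50 x 80 matrix with 200 observed entries
rng = np.random.default_rng(0)
num_rows, num_cols, rank = 50, 80, 8
flat = rng.choice(num_rows * num_cols, size=200, replace=False)
rows = tf.constant(flat // num_cols, dtype=tf.int64)
cols = tf.constant(flat % num_cols, dtype=tf.int64)
# Ratings in 0..5, stored as float32 so gradients can flow
values = tf.constant(rng.integers(0, 6, size=200), dtype=tf.float32)

# V is stored as [num_cols, rank], i.e. transposed relative to U x V = A
U = tf.Variable(tf.random.normal([num_rows, rank], stddev=0.1))
V = tf.Variable(tf.random.normal([num_cols, rank], stddev=0.1))
optimizer = tf.keras.optimizers.Adam(learning_rate=0.05)


def observed_mse(values, rows, cols, U, V):
    """Mean squared error over observed entries, without forming U @ V^T."""
    predictions = tf.reduce_sum(tf.gather(U, rows) * tf.gather(V, cols), axis=1)
    return tf.reduce_mean(tf.square(values - predictions))


losses = []
for step in range(200):
    with tf.GradientTape() as tape:
        loss = observed_mse(values, rows, cols, U, V)
    grads = tape.gradient(loss, [U, V])
    optimizer.apply_gradients(zip(grads, [U, V]))
    losses.append(float(loss))
    if step % 50 == 0:
        print(f"Training loss at step {step}: {losses[-1]:.4f}")
```

Each step only touches one row of U and one row of V per observed entry, so the cost of the forward pass scales with the number of observed ratings rather than with the full size of A.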