Emm, I am just summarizing and sharing my experience of reading this paper. Neither "original" nor "reposted translation" quite fits, but since reposted translations require authorization I marked it as original; if this infringes anything I will switch the post to private.

Addressed Problem

This work addresses the aforementioned research problems by formalizing a neural network modelling approach for collaborative filtering. We focus on implicit feedback, which indirectly reflects users’ preference through behaviours like watching videos, purchasing products and clicking items.

  • explicit feedback (i.e., ratings and reviews)
  • implicit feedback (e.g., watching videos, purchasing products, and clicking items), which indirectly reflects users' preferences

Implicit feedback can be tracked automatically and is thus much easier to collect for content providers.

Problem Formulation

M: number of users
N: number of items
Y \in \mathbb{R}^{M \times N}: user–item interaction matrix
Here a value of 1 for y_{ui} indicates that there is an interaction between user u and item i; however, it does not mean u actually likes i. Similarly, a value of 0 does not necessarily mean u does not like i; it can be that the user is not aware of the item.

Notice: while observed entries at least reflect users' interest in items, the unobserved entries can be just missing data, and there is a natural scarcity of negative feedback.

The recommendation problem with implicit feedback is formulated as the problem of estimating the scores of unobserved entries in Y, which are used for ranking the items.
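As a concrete sketch, the binary matrix Y can be built directly from an implicit-feedback log; the log entries and matrix sizes below are illustrative, not from the paper:

```python
# Sketch: building the binary interaction matrix Y from implicit-feedback logs.
M, N = 3, 4                              # 3 users, 4 items (illustrative)
logs = [(0, 1), (0, 3), (1, 0), (2, 2)]  # (user, item) click/watch events

Y = [[0] * N for _ in range(M)]
for u, i in logs:
    Y[u][i] = 1    # observed interaction -- not necessarily "liked"

# Unobserved entries stay 0: missing data, not confirmed dislike.
print(Y)   # [[0, 1, 0, 1], [1, 0, 0, 0], [0, 0, 1, 0]]
```

Estimating scores for the zero entries of this matrix, and ranking items by them, is exactly the recommendation task described above.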

Goal

Learn \hat{y}_{ui} = f(u, i \mid \Theta)
\hat{y}_{ui}: the predicted score of interaction y_{ui}
\Theta: model parameters
f: function that maps model parameters to the predicted score, termed the interaction function (here, a neural network)

Related Work

Two types of objective function

  • pointwise loss: a natural extension of abundant work on explicit feedback; pointwise learning usually follows a regression framework, minimizing the squared loss between \hat{y}_{ui} and its target value y_{ui}.
    L_{sqr} = \sum_{(u,i) \in \mathcal{Y} \cup \mathcal{Y}^-} w_{ui} (\hat{y}_{ui} - y_{ui})^2
    where \mathcal{Y} denotes the set of observed interactions in Y, \mathcal{Y}^- denotes the set of negative instances, which can be all (or sampled from) unobserved interactions, and w_{ui} is a hyperparameter denoting the weight of training instance (u, i).
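The weighted squared loss above can be sketched as follows; the function name, the uniform weight, and the toy predictions are our own illustration, with negatives assumed to be pre-sampled from the unobserved entries:

```python
# Illustrative sketch of the weighted pointwise squared loss L_sqr.
def squared_loss(y_hat, positives, negatives, w=1.0):
    """Sum w_ui * (y_hat[u][i] - y_ui)^2 over observed instances (target 1)
    and sampled negative instances (target 0); here w_ui is uniform."""
    loss = 0.0
    for u, i in positives:
        loss += w * (y_hat[u][i] - 1.0) ** 2
    for u, i in negatives:
        loss += w * (y_hat[u][i] - 0.0) ** 2
    return loss

y_hat = [[0.9, 0.2], [0.1, 0.8]]   # toy predicted scores
print(squared_loss(y_hat, positives=[(0, 0), (1, 1)], negatives=[(0, 1)]))
# 0.01 + 0.04 + 0.04 = 0.09
```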

  • pairwise loss: the idea is that observed entries should be ranked higher than unobserved ones. As such, instead of minimizing the loss between \hat{y}_{ui} and y_{ui}, pairwise learning maximizes the margin between an observed entry \hat{y}_{ui} and an unobserved entry \hat{y}_{uj}.
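A common instance of pairwise learning is the BPR-style objective: for each triple (u, i, j) with i observed and j unobserved, push \hat{y}_{ui} above \hat{y}_{uj}. The sketch below is ours, not code from the paper:

```python
# Sketch of a pairwise (BPR-style) loss over (user, pos item, neg item) triples.
import math

def bpr_loss(triples, y_hat):
    """Negative log-likelihood of item i ranking above item j for user u."""
    loss = 0.0
    for u, i, j in triples:
        margin = y_hat[u][i] - y_hat[u][j]
        loss -= math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid(margin)
    return loss

y_hat = [[2.0, 0.5]]                 # toy scores for one user, two items
print(bpr_loss([(0, 0, 1)], y_hat))  # small loss: item 0 already ranked higher
```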

Proposed Loss
In what follows, we present a probabilistic approach for learning the pointwise NCF that pays special attention to the binary property of implicit data.
(Simply maximize the log likelihood: treating y_{ui} as a Bernoulli label with probability \hat{y}_{ui}, this reduces to the binary cross-entropy loss.)
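The binary cross-entropy that the log likelihood reduces to can be sketched as follows; the function name and toy values are ours:

```python
# Sketch: binary cross-entropy over observed (label 1) and sampled
# negative (label 0) instances, with y_hat as interaction probabilities.
import math

def bce_loss(pairs_and_labels, y_hat):
    loss = 0.0
    for (u, i), y in pairs_and_labels:
        p = y_hat[u][i]   # predicted probability of interaction
        loss -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return loss

y_hat = [[0.9, 0.2]]   # toy predictions for one user, two items
print(bce_loss([((0, 0), 1), ((0, 1), 0)], y_hat))
# -ln(0.9) - ln(0.8) ≈ 0.33
```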

Matrix Factorization

MF associates each user and item with a real-valued vector of latent features.
p_u: latent vector for user u
q_i: latent vector for item i
K: dimension of the latent space
\hat{y}_{ui} = f(u, i \mid p_u, q_i) = p_u^T q_i = \sum_{k=1}^{K} p_{uk} q_{ik}
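The inner-product prediction is just a sum of element-wise products of the two latent vectors; a toy K = 3 example (values are illustrative):

```python
# MF prediction as an inner product of latent vectors.
def mf_predict(p_u, q_i):
    return sum(p * q for p, q in zip(p_u, q_i))   # p_u^T q_i

p_u = [0.5, 1.0, -0.5]   # latent vector for user u
q_i = [1.0, 0.5, 2.0]    # latent vector for item i
print(mf_predict(p_u, q_i))   # 0.5 + 0.5 - 1.0 = 0.0
```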

Drawback:
MF can be deemed as a linear model of latent factors.
We use the Jaccard coefficient as the ground-truth similarity of two users that MF needs to recover.
(Figure 1: (a) the user–item interaction matrix; (b) the user latent space)
Let us first focus on the first three rows (users) in Figure 1a. It is easy to see that s23 (0.66) > s12 (0.5) > s13 (0.4). As such, the geometric relations of p1, p2, and p3 in the latent space can be plotted as in Figure 1b. Now consider a new user u4, whose input is given as the dashed line in Figure 1a. We have s41 (0.6) > s43 (0.4) > s42 (0.2), meaning that u4 is most similar to u1, followed by u3, and lastly u2. However, if an MF model places p4 closest to p1 (the two options are shown in Figure 1b with dashed lines), p4 will end up closer to p2 than to p3, which unfortunately incurs a large ranking loss.
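For reference, the Jaccard coefficient between two users is the size of the intersection of their interacted-item sets over the size of the union. A minimal sketch (the item sets below are illustrative, not the ones from Figure 1a):

```python
# Jaccard similarity between two users' interacted-item sets.
def jaccard(a, b):
    return len(a & b) / len(a | b)

u_a = {1, 2, 3}   # items user a interacted with (illustrative)
u_b = {2, 3}      # items user b interacted with (illustrative)
print(jaccard(u_a, u_b))   # |{2, 3}| / |{1, 2, 3}| = 2/3 ≈ 0.66
```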

NEURAL COLLABORATIVE FILTERING


  • Since this work focuses on the pure collaborative filtering setting, we use only the identity of a user and an item as the input feature, transforming it to a binarized sparse vector with one-hot encoding. Note that with such a generic feature representation for inputs, our method can be easily adjusted to address the cold-start problem by using content features to represent users and items.
  • Above the input layer is the embedding layer; it is a fully connected layer that projects the sparse representation to a dense vector.
  • The user embedding and item embedding are then fed into a multi-layer neural architecture
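The embedding layer can be read as a matrix product: multiplying a one-hot user vector by the embedding matrix P simply selects user u's row. A toy sketch (sizes and values are ours):

```python
# The embedding layer as one-hot input times embedding matrix = row lookup.
M, K = 3, 2
P = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]   # M x K user embedding matrix

u = 1
one_hot = [1 if j == u else 0 for j in range(M)]           # sparse input
p_u = [sum(one_hot[j] * P[j][k] for j in range(M)) for k in range(K)]
print(p_u)   # [0.3, 0.4] -- identical to the direct row lookup P[u]
```

In practice frameworks implement this as a direct lookup rather than the full matrix product, but the two are equivalent.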

Generalized Matrix Factorization(GMF)

(Figure: the NeuMF architecture, fusing GMF and MLP)
GMF (left-hand side): essentially MF, with the latent vectors coming from the fully connected embedding layer; the element-wise product p_u ⊙ q_i is passed through a weighted output layer with an activation function.
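A minimal sketch of the GMF prediction, assuming a sigmoid output activation; the weight vector h and embedding values here are illustrative:

```python
# Sketch of GMF: element-wise product of embeddings, weighted by h, then sigmoid.
import math

def gmf_predict(p_u, q_i, h):
    z = [p * q for p, q in zip(p_u, q_i)]          # p_u ⊙ q_i
    logit = sum(hk * zk for hk, zk in zip(h, z))   # h^T (p_u ⊙ q_i)
    return 1.0 / (1.0 + math.exp(-logit))          # sigmoid output

p_u, q_i, h = [0.5, 1.0], [1.0, 0.5], [1.0, 1.0]
print(gmf_predict(p_u, q_i, h))   # sigmoid(1.0) ≈ 0.731
```

With h fixed to all-ones and an identity activation, this reduces exactly to the MF inner product, which is why it is called a generalized MF.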

Pre-training

  • initialization plays an important role for the convergence and performance of deep learning models.
  • we propose to initialize NeuMF using the pretrained models of GMF and MLP.
  • We first train GMF and MLP with random initializations until convergence. We then use their model parameters as the initialization for the corresponding parts of NeuMF’s parameters.
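In the paper, the two pretrained output-layer weight vectors are concatenated with a trade-off hyperparameter α, h = [α h_GMF; (1 − α) h_MLP]. A sketch of that warm-start step, with illustrative parameter values and a dict-based parameter layout of our own:

```python
# Sketch: fuse pretrained GMF and MLP output weights into NeuMF's
# output layer via the trade-off alpha (values are illustrative).
alpha = 0.5
gmf = {"h": [0.2, 0.4]}   # pretrained GMF output-layer weights
mlp = {"h": [0.6, 0.8]}   # pretrained MLP output-layer weights

neumf_h = [alpha * w for w in gmf["h"]] + [(1 - alpha) * w for w in mlp["h"]]
print(neumf_h)   # [0.1, 0.2, 0.3, 0.4]
```

The embedding and hidden-layer weights are copied over directly; only the output layer needs this weighted concatenation, since it is shared between the two parts.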
