Transfer Learning on Graphs (Part 3)

Unsupervised Domain Adaptive Graph Convolution Networks: Notes on the UDA-GCN Algorithm

Basic Idea

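Existing methods for cross-network node classification have the following problems:
(1) GCN lacks a grasp of global consistency.
(2) It is unclear how to combine global and local information properly.
(3) Domain adaptation methods pay little attention to the information in the target network.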

This paper proposes unsupervised domain adaptive graph convolutional networks (UDA-GCN). The method addresses cross-network node classification by modeling the local and global consistency relationships of a graph and by combining source information, domain information and target information in a unified deep learning framework. UDA-GCN has three main components:
(1) At the data structure level, a dual graph convolutional network component captures the local and global consistency relationships of the graph while the node embeddings are trained.
(2) At the representation learning level, an inter-graph attention mechanism is proposed to combine the local and global representations.
(3) At the domain adaptive learning level, an adaptive learning approach jointly exploits source information, domain information and target information to learn domain-invariant and semantic representations effectively.

Notation

[Figure: notation table]

Algorithm Framework

[Figure: UDA-GCN algorithm framework]
[Figure: simplified algorithm framework]

Node Embedding Model
Local Consistency Network ($Conv_A$).
This network is a GCN. The output of its $i$-th layer is
$$Conv_A^{(i)}(X)=Z^{(i)}=\sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} Z^{(i-1)} W^{(i)}\right)$$
where $Z^{(0)}=X$ and, as in a standard GCN, $\tilde{A}=A+I$ is the adjacency matrix with self-loops and $\tilde{D}_{i,i}=\sum_{j}\tilde{A}_{i,j}$.
This yields the embedding matrices $Z_A^s$ and $Z_A^t$, which carry local information.
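The propagation rule of the local consistency network can be sketched in NumPy as follows. This is a minimal illustration and not the authors' implementation: it assumes a dense adjacency matrix, ReLU as the activation $\sigma$, and a randomly initialized weight matrix with a hypothetical hidden size of 4.

```python
import numpy as np

def normalize_adjacency(A):
    """Return D~^{-1/2} (A + I) D~^{-1/2} for a dense adjacency matrix A."""
    A_tilde = A + np.eye(A.shape[0])          # add self-loops
    deg = A_tilde.sum(axis=1)                 # D~_{ii} = sum_j A~_{ij}
    d_inv_sqrt = 1.0 / np.sqrt(deg)           # deg >= 1 because of the self-loops
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(A_hat, Z_prev, W):
    """One propagation step Z = ReLU(A_hat @ Z_prev @ W) (ReLU is an assumption)."""
    return np.maximum(A_hat @ Z_prev @ W, 0.0)

# Toy example: a 3-node path graph 0 - 1 - 2 with 2-dimensional features.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
X = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])
rng = np.random.default_rng(0)
W1 = rng.normal(size=(2, 4))                  # hypothetical hidden size 4

A_hat = normalize_adjacency(A)
Z1 = gcn_layer(A_hat, X, W1)                  # Z^{(1)}, computed from Z^{(0)} = X
```

In the toy graph, node 0 has degree 2 once the self-loop is added, so `A_hat[0, 0]` equals 1/2. Stacking further calls to `gcn_layer` with new weight matrices gives deeper networks.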
Global Consistency Network ($Conv_p$).
A PPMI (positive pointwise mutual information) matrix $P$ is first constructed to encode global information. The layer-wise propagation rule is then
$$Conv_p^{(i)}(X)=Z^{(i)}=\sigma\left(D^{-\frac{1}{2}} P D^{-\frac{1}{2}} Z^{(i-1)} W^{(i)}\right)$$
where $D_{i,i}=\sum_{j} P_{i,j}$ and $Z^{(0)}=X$.
This yields the embedding matrices $Z_p^s$ and $Z_p^t$, which carry global information.
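A rough sketch of how a PPMI matrix could be built and normalized is given below. The notes do not say how the co-occurrence frequencies are obtained, so the helper `cooccurrence_from_walks` is a hypothetical stand-in that sums powers of the random-walk transition matrix instead of sampling walks. It also assumes a graph without isolated nodes.

```python
import numpy as np

def cooccurrence_from_walks(A, num_steps=2):
    """Expected co-occurrence counts for random walks of up to num_steps steps.

    Assumption: summing powers of the transition matrix replaces sampled
    random walks. Requires every node to have at least one neighbor.
    """
    T = A / A.sum(axis=1, keepdims=True)       # row-stochastic transition matrix
    F = np.zeros_like(T)
    T_k = np.eye(A.shape[0])
    for _ in range(num_steps):
        T_k = T_k @ T                          # k-step transition probabilities
        F += T_k
    return F

def ppmi(F):
    """Positive pointwise mutual information of a frequency matrix F."""
    total = F.sum()
    row = F.sum(axis=1, keepdims=True)
    col = F.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore"):
        pmi = np.log(F * total / (row * col))  # log(0) = -inf is clipped below
    return np.maximum(pmi, 0.0)

def normalize_ppmi(P):
    """Return D^{-1/2} P D^{-1/2} with D_{ii} = sum_j P_{ij}."""
    deg = P.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0                          # leave all-zero rows at zero
    d_inv_sqrt[nonzero] = 1.0 / np.sqrt(deg[nonzero])
    return P * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

# Toy example: a 4-node path graph 0 - 1 - 2 - 3.
A = np.array([[0., 1., 0., 0.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [0., 0., 1., 0.]])
F = cooccurrence_from_walks(A, num_steps=2)
P = ppmi(F)
P_hat = normalize_ppmi(P)                      # used in place of the normalized adjacency
```

Nodes 0 and 3 never co-occur within two steps, so `P[0, 3]` is zero. The resulting `P_hat` plays the same role in $Conv_p$ that the normalized adjacency matrix plays in $Conv_A$.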
Inter-Graph Attention
The source network and the target network are fed into the node embedding model, which produces four embedding matrices: $Z_A^s$, $Z_p^s$, $Z_A^t$ and $Z_p^t$. An attention mechanism then aggregates the embeddings produced by the different networks into a unified representation.
Using $X^s$ and $X^t$ as the keys of the attention mechanism, two attention coefficients $att_A^k$ and $att_p^k$ are computed:
$$att_A^k = f(Z_A^k, JX^k)$$
$$att_p^k = f(Z_p^k, JX^k)$$
where $k \in \{s,t\}$ and $J$ is a shared weight matrix that gives $X^k$ the same dimension as the outputs $Z_A^k$ and $Z_p^k$.
Next, a softmax layer normalizes the weights:
$$att_A^k=\frac{\exp(att_A^k)}{\exp(att_A^k)+\exp(att_p^k)}$$
$$att_p^k=\frac{\exp(att_p^k)}{\exp(att_A^k)+\exp(att_p^k)}$$
The final outputs $Z^s$ and $Z^t$ are:
$$Z^s= att_A^s Z_A^s + att_p^s Z_p^s$$
$$Z^t= att_A^t Z_A^t + att_p^t Z_p^t$$
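The attention step for a single network can be sketched as below. The scoring function $f$ is not specified in these notes, so this sketch assumes a row-wise inner product that gives one score per node. The projection is written as `X @ J` to match the row-per-node layout. The function name and the random inputs are illustrative only.

```python
import numpy as np

def inter_graph_attention(Z_A, Z_p, X, J):
    """Combine the local (Z_A) and global (Z_p) embeddings of one network.

    Assumption: f is a row-wise inner product, giving one score per node.
    """
    key = X @ J                                         # project X to the embedding size
    score_A = np.sum(Z_A * key, axis=1, keepdims=True)  # f(Z_A, JX), shape (n, 1)
    score_p = np.sum(Z_p * key, axis=1, keepdims=True)  # f(Z_p, JX), shape (n, 1)

    # Softmax over the two scores; subtracting the max avoids overflow.
    m = np.maximum(score_A, score_p)
    e_A, e_p = np.exp(score_A - m), np.exp(score_p - m)
    att_A = e_A / (e_A + e_p)
    att_p = e_p / (e_A + e_p)

    return att_A * Z_A + att_p * Z_p, att_A, att_p

rng = np.random.default_rng(0)
n, d, h = 5, 3, 4                                       # nodes, input dim, embedding dim
X = rng.normal(size=(n, d))
J = rng.normal(size=(d, h))                             # shared projection matrix
Z_A = rng.normal(size=(n, h))
Z_p = rng.normal(size=(n, h))

Z, att_A, att_p = inter_graph_attention(Z_A, Z_p, X, J)
```

The same function, with the same `J`, would be applied once to the source network and once to the target network. The two weights sum to one for every node, so each row of `Z` is a convex combination of the corresponding local and global embeddings.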
Domain Adaptive Learning for Cross-Domain Node Classification
Source Classifier
The source classifier classifies the nodes of the source network and is trained with a cross-entropy loss.
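As a small illustration, the cross-entropy loss over labeled source nodes can be written as follows. The function names are hypothetical, and the classifier outputs are assumed to be unnormalized logits with one row per node.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def source_cross_entropy(logits, labels):
    """Mean cross-entropy over labeled source nodes."""
    probs = softmax(logits)
    n = logits.shape[0]
    picked = probs[np.arange(n), labels]      # probability of the true class
    return -np.mean(np.log(picked + 1e-12))   # small constant guards against log(0)

# Two nodes, three classes, uninformative logits: the loss equals log(3).
logits = np.zeros((2, 3))
labels = np.array([0, 2])
loss = source_cross_entropy(logits, labels)
```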

Domain Classifier
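The domain classifier predicts which domain a node representation comes from; training against it pulls the feature distributions of the two networks closer. Adversarial training is implemented with a gradient reversal layer (GRL), and the classifier uses a cross-entropy loss.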
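A gradient reversal layer acts as the identity in the forward pass and flips the sign of the gradient in the backward pass. The encoder is therefore updated to increase the domain classifier's loss while the classifier itself tries to decrease it. The sketch below spells out the two passes by hand in NumPy, with no autograd framework assumed; the class name and the scaling factor `lam` are illustrative.

```python
import numpy as np

class GradientReversal:
    """Identity in the forward pass; scales gradients by -lam in the backward pass."""

    def __init__(self, lam=1.0):
        self.lam = lam                        # hypothetical name for the reversal strength

    def forward(self, z):
        return z                              # representations pass through unchanged

    def backward(self, grad_output):
        return -self.lam * grad_output        # reversed gradient flows to the encoder

grl = GradientReversal(lam=0.5)
z = np.array([[1.0, -2.0],
              [0.5, 3.0]])
out = grl.forward(z)

# Pretend the domain classifier sent back a gradient of ones.
grad_from_domain_classifier = np.ones_like(z)
grad_to_encoder = grl.backward(grad_from_domain_classifier)
```

In an autograd framework the same behavior is usually obtained with a custom function whose backward pass negates the incoming gradient.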

Target Classifier
The target classifier $f_t$ is trained with an entropy loss.
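Because the target network has no labels, an entropy loss pushes the target classifier toward confident predictions instead of matching ground truth. A minimal sketch, assuming the classifier outputs logits with one row per target node and using hypothetical function names:

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def target_entropy_loss(logits):
    """Mean Shannon entropy of the predicted class distributions."""
    probs = softmax(logits)
    per_node = -np.sum(probs * np.log(probs + 1e-12), axis=1)  # guard against log(0)
    return np.mean(per_node)

uniform_logits = np.zeros((2, 3))               # maximally uncertain predictions
peaked_logits = np.array([[100.0, 0.0, 0.0]])   # almost one-hot prediction

high = target_entropy_loss(uniform_logits)      # equals log(3)
low = target_entropy_loss(peaked_logits)        # close to 0
```

Minimizing this quantity lowers the uncertainty of predictions on target nodes, which is how target information enters training without labels.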

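Joint Optimization
The overall loss function combines the source classification loss, the domain classification loss and the target entropy loss.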
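One plausible way to write the combined objective is a weighted sum of the three losses. The symbols $\mathcal{L}_S$, $\mathcal{L}_{DA}$ and $\mathcal{L}_T$, the source labels $Y^s$, and the balance weights $\gamma_1$ and $\gamma_2$ are notation assumed here for illustration and are not taken from these notes.

```latex
\mathcal{L}(Z^s, Y^s, Z^t)
  = \mathcal{L}_S(Z^s, Y^s)
  + \gamma_1 \, \mathcal{L}_{DA}(Z^s, Z^t)
  + \gamma_2 \, \mathcal{L}_T(Z^t)
```

Here $\mathcal{L}_S$ stands for the cross-entropy loss of the source classifier, $\mathcal{L}_{DA}$ for the cross-entropy loss of the domain classifier, and $\mathcal{L}_T$ for the entropy loss of the target classifier.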

The overall loss function is optimized with standard backpropagation, so all parameters are updated jointly.
[Figure: UDA-GCN training algorithm]

Experiments

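The experiments use three classic citation networks as datasets. UDA-GCN is compared with AdaGCN in the experiments and shows improved performance.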