2. Method

The goal is to learn an unconditional generative model that captures the internal statistics of a single training image $x$.

Unlike texture generation, the images considered here are general natural images.

2.1. Multi-scale architecture

SinGAN: Learning a Generative Model from a Single Natural Image(ICCV19)
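For the input image $x$, build a pyramid $\left\{ x_0, \cdots, x_N \right\}$ with a matching pyramid of generators $\left\{ G_0, \cdots, G_N \right\}$, where $x_n$ is $x$ downsampled by a factor of $r^n$ and $r \gt 1$ is a hyperparameter. Each $G_n$ is paired with a discriminator $D_n$.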
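As a small illustration of the pyramid definition, the sketch below lists the spatial size of each $x_n$ for a given $r$ and $N$. The function name `pyramid_sizes`, the rounding rule, and the example values (a 256x256 image, `r=2`, `N=3`) are assumptions made for this example, not values taken from the paper.

```python
def pyramid_sizes(height, width, r, num_scales):
    """Return [(h_0, w_0), ..., (h_N, w_N)], where scale n is shrunk by r**n.

    Scale 0 is the finest (original size); scale N = num_scales is the coarsest.
    """
    if r <= 1:
        raise ValueError("r must be greater than 1")
    sizes = []
    for n in range(num_scales + 1):
        factor = r ** n
        # Rounding and clamping to at least one pixel is an assumption of this sketch.
        h = max(1, round(height / factor))
        w = max(1, round(width / factor))
        sizes.append((h, w))
    return sizes


# Hypothetical example values, chosen only to make the sizes easy to read.
sizes = pyramid_sizes(256, 256, r=2, num_scales=3)
for n, (h, w) in enumerate(sizes):
    print(f"x_{n}: {h} x {w}")
```

A factor closer to 1 gives more scales between the coarsest and finest image, so each generator has a smaller jump in resolution to handle.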

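Training starts at the coarsest scale $x_N$, where $G_N$ maps white Gaussian noise $z_N$ to an image $\tilde{x}_N$:
$$\tilde{x}_N = G_N(z_N) \qquad (1)$$
$\tilde{x}_N$ captures the general layout of the image and the global structure of its objects; the subsequent generators $G_n$ ($n \lt N$) progressively add finer details.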

As shown in Figure 5, $G_n$ takes two inputs: (1) white Gaussian noise $z_n$, and (2) the upsampled version of the image generated at the previous, coarser scale, $\left( \tilde{x}_{n+1} \right)\uparrow^r$:
$$\tilde{x}_n = G_n\left( z_n, \left( \tilde{x}_{n+1} \right)\uparrow^r \right), \quad n \lt N \qquad (2)$$
More specifically, $G_n$ performs a residual operation: it adds a learned correction to the upsampled image,
$$\tilde{x}_n = \left( \tilde{x}_{n+1} \right)\uparrow^r + \psi_n\left( z_n + \left( \tilde{x}_{n+1} \right)\uparrow^r \right) \qquad (3)$$
where $\psi_n$ is a ConvNet made of 5 blocks, each of the form Conv(3x3)-BatchNorm-LeakyReLU.
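The coarse-to-fine sampling in Eq. (1) to (3) can be sketched in plain Python. This is only an illustration of the data flow: images are nested lists, upsampling is nearest-neighbour, and each $\psi_n$ is replaced by a toy stand-in function rather than a trained ConvNet. The helper names (`upsample`, `gaussian_noise`, `sample`, `toy_psi`) and the choice to treat $G_N$ as $\psi_N$ applied directly to the noise are assumptions of this sketch.

```python
import random


def upsample(img, out_h, out_w):
    """Nearest-neighbour resize of a 2-D list to (out_h, out_w)."""
    in_h, in_w = len(img), len(img[0])
    return [[img[i * in_h // out_h][j * in_w // out_w] for j in range(out_w)]
            for i in range(out_h)]


def gaussian_noise(h, w, rng, sigma=1.0):
    """White Gaussian noise map of size h x w."""
    return [[rng.gauss(0.0, sigma) for _ in range(w)] for _ in range(h)]


def add(a, b):
    """Element-wise sum of two equally sized 2-D lists."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sample(psis, sizes, rng):
    """Generate one sample; psis[n] and sizes[n] are indexed by scale n (N = coarsest)."""
    N = len(sizes) - 1
    h, w = sizes[N]
    # Eq. (1): the coarsest generator maps pure noise to an image.
    x = psis[N](gaussian_noise(h, w, rng))
    for n in range(N - 1, -1, -1):
        h, w = sizes[n]
        up = upsample(x, h, w)            # upsampled output of the coarser scale
        z = gaussian_noise(h, w, rng)     # fresh noise z_n
        x = add(up, psis[n](add(z, up)))  # Eq. (3): residual update
    return x


def toy_psi(t):
    """Stand-in for a trained psi_n (hypothetical): just scales its input."""
    return [[0.1 * v for v in row] for row in t]


sizes = [(8, 8), (4, 4), (2, 2)]  # scales 0 (finest) to 2 (coarsest)
image = sample([toy_psi] * 3, sizes, random.Random(0))
print(len(image), len(image[0]))
```

Because each scale only adds a residual on top of the upsampled image, a $\psi_n$ that outputs zeros simply passes the coarser result through unchanged.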

2.2. Training

Training proceeds from the coarsest scale to the finest scale. Once a GAN has been trained, it is kept fixed.

For the $n$-th GAN, the loss consists of an adversarial term and a reconstruction term:
$$\underset{G_n}{\min}\ \underset{D_n}{\max}\ \mathcal{L}_{adv}(G_n, D_n) + \alpha \mathcal{L}_{rec}(G_n) \qquad (4)$$

Adversarial loss
The WGAN-GP loss is used.
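For reference, the standard WGAN-GP objective can be written at scale $n$ as follows, with the sign chosen so that $D_n$ maximizes it as in Eq. (4). The gradient penalty weight $\lambda$, the interpolated sample $\hat{x}_n$, and the mixing coefficient $\epsilon$ are notation from the general WGAN-GP formulation and are not defined in these notes, so treat this as a sketch of the usual form rather than the exact expression used by the authors.

```latex
\begin{aligned}
\mathcal{L}_{adv}(G_n, D_n)
  &= \mathbb{E}\left[ D_n(x_n) \right]
   - \mathbb{E}\left[ D_n(\tilde{x}_n) \right]
   - \lambda\, \mathbb{E}\left[ \left( \left\| \nabla_{\hat{x}_n} D_n(\hat{x}_n) \right\|_2 - 1 \right)^2 \right], \\
\hat{x}_n &= \epsilon\, x_n + (1 - \epsilon)\, \tilde{x}_n, \qquad \epsilon \sim U[0, 1]
\end{aligned}
```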

Reconstruction loss
There must exist a specific set of noise maps that reconstructs the original image $x$.
To this end, a set $\left\{ z_N^{rec}, z_{N-1}^{rec}, \cdots, z_0^{rec} \right\} = \left\{ z^*, 0, \cdots, 0 \right\}$ is chosen in advance, where $z^*$ is a fixed noise map, and the images generated from it are denoted $\left\{ \tilde{x}_N^{rec}, \tilde{x}_{N-1}^{rec}, \cdots, \tilde{x}_0^{rec} \right\}$.

Then, for $n \lt N$:
$$\mathcal{L}_{rec} = \left\| G_n\left( 0, \left( \tilde{x}_{n+1}^{rec} \right)\uparrow^r \right) - x_n \right\|^2 \qquad (5)$$
For $n = N$: $\mathcal{L}_{rec} = \left\| G_N(z^*) - x_N \right\|^2$
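The reconstruction chain behind Eq. (5) can be sketched as follows: start from $z^*$ at the coarsest scale, then feed zero noise plus the upsampled previous reconstruction at every finer scale, and compare each result with $x_n$. Everything here is illustrative: images are nested lists, upsampling is nearest-neighbour, and the generator signature `g(z, prev)` and the toy generators are assumptions of this sketch, not the authors' implementation.

```python
def upsample(img, out_h, out_w):
    """Nearest-neighbour resize of a 2-D list to (out_h, out_w)."""
    in_h, in_w = len(img), len(img[0])
    return [[img[i * in_h // out_h][j * in_w // out_w] for j in range(out_w)]
            for i in range(out_h)]


def squared_l2(a, b):
    """Squared L2 distance between two equally sized 2-D lists."""
    return sum((u - v) ** 2 for ra, rb in zip(a, b) for u, v in zip(ra, rb))


def reconstruction_losses(generators, pyramid, z_star):
    """Return [L_rec at scale 0, ..., L_rec at scale N].

    generators[n](z, prev) produces the image at scale n; pyramid[n] is x_n.
    Scale N (the last entry) is the coarsest and receives z_star with prev=None.
    """
    N = len(pyramid) - 1
    losses = [0.0] * (N + 1)
    x_rec = generators[N](z_star, None)          # n = N: G_N(z*)
    losses[N] = squared_l2(x_rec, pyramid[N])
    for n in range(N - 1, -1, -1):
        h, w = len(pyramid[n]), len(pyramid[n][0])
        up = upsample(x_rec, h, w)               # upsampled coarser reconstruction
        zeros = [[0.0] * w for _ in range(h)]    # z_n^rec = 0 for n < N
        x_rec = generators[n](zeros, up)         # Eq. (5): G_n(0, up)
        losses[n] = squared_l2(x_rec, pyramid[n])
    return losses


# Toy stand-ins for trained generators (hypothetical).
def toy_coarse(z, prev):
    return z


def toy_fine(z, prev):
    return [[p + v for p, v in zip(rp, rz)] for rp, rz in zip(prev, z)]


x1 = [[1.0, 2.0], [3.0, 4.0]]   # coarsest image x_N
x0 = upsample(x1, 4, 4)         # finest image x_0, built so the toy chain is exact
print(reconstruction_losses([toy_fine, toy_coarse], [x0, x1], z_star=x1))
```

In actual training the scales are handled one at a time, so when scale $n$ is being trained, $\tilde{x}_{n+1}^{rec}$ comes from generators that are already fixed.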
