Pix2Pix GAN (CVPR 2017)
2021/6/28 23:21:43
![image-20210627163327806](/upload/202106/28/202106282321370101.png)
1. Motivation
Definition of image-to-image translation
- We define automatic image-to-image translation as the task of translating one possible representation of a scene into another.
- Our goal in this paper is to develop a common framework for all these problems.
The loss function needs careful design: training with Euclidean distance alone tends to produce blurry results.
- This is because Euclidean distance is minimized by averaging all plausible outputs, which causes blurring.
- Earlier papers have focused on specific applications, and it has remained unclear how effective image-conditional GANs can be as a general-purpose solution for image-to-image translation.
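The blurring argument above can be checked on a toy case. The sketch below is not from the paper: it assumes a single pixel with a hypothetical set of equally plausible target values and searches for the prediction that minimizes the mean squared (Euclidean) error. The minimizer lands on the average of the targets, a value that matches none of them, which is what shows up as blur in a full image.

```python
from statistics import mean

# Hypothetical toy data: one pixel whose plausible ground-truth values are
# either dark (0.0) or bright (1.0), e.g. two equally valid colorizations.
plausible = [0.0, 0.0, 1.0, 1.0, 1.0]


def l2_cost(prediction, targets):
    """Mean squared error between one prediction and all plausible targets."""
    return sum((prediction - t) ** 2 for t in targets) / len(targets)


# Brute-force search over candidate pixel values in [0, 1].
candidates = [i / 100 for i in range(101)]
best_l2 = min(candidates, key=lambda p: l2_cost(p, plausible))

print("L2 minimizer:", best_l2)
print("mean of plausible outputs:", mean(plausible))
```

The minimizer coincides with the mean of the plausible outputs rather than with any single sharp answer.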
2. Contribution
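- The first contribution is showing that a conditional GAN (cGAN) works as a single approach across many tasks, with reasonable results.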
- Our primary contribution is to demonstrate that on a wide variety of problems, conditional GANs produce reasonable results.
- The second contribution is a simple framework.
- Our second contribution is to present a simple framework sufficient to achieve good results, and to analyze the effects of several important architectural choices.
Unlike previous work, Pix2Pix GAN uses a U-Net in the generator G and a PatchGAN classifier in the discriminator D.
- Unlike past work, for our generator we use a “U-Net”-based architecture.
- And for our discriminator we use a convolutional “PatchGAN” classifier, which only penalizes structure at the scale of image patches.
3. Method
![image-20210628193929793](/upload/202106/28/202106282321373246.png)
3.1 Objective
The cGAN objective is given by Equation 1, where x is the condition:
![image-20210628194244007](/upload/202106/28/202106282321375760.png)
The final objective is:
![image-20210628194949001](/upload/202106/28/202106282321378290.png)
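As a rough illustration of how the objective combines an adversarial term with an L1 term, here is a minimal sketch on plain Python lists. The function names are hypothetical, the discriminator outputs are assumed to be probabilities in (0, 1), and the weight `LAMBDA = 100.0` is an assumed setting, not something stated in these notes.

```python
import math

LAMBDA = 100.0  # assumed weight on the L1 term


def cgan_loss(d_real, d_fake):
    """E[log D(x, y)] + E[log(1 - D(x, G(x, z)))].

    d_real: discriminator probabilities for real (x, y) pairs.
    d_fake: discriminator probabilities for generated (x, G(x, z)) pairs.
    D tries to maximize this value; G tries to minimize it.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def l1_loss(target, generated):
    """Mean absolute difference between target y and output G(x, z)."""
    return sum(abs(t - g) for t, g in zip(target, generated)) / len(target)


def generator_objective(d_fake, target, generated, lam=LAMBDA):
    """Part of the objective that depends on G: adversarial term + lam * L1."""
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return fake_term + lam * l1_loss(target, generated)


print(cgan_loss([0.5], [0.5]))
print(generator_objective([0.5], [1.0, 0.0], [0.5, 0.5]))
```

With a large weight, the L1 term dominates the generator's objective numerically, while the adversarial term is what pushes the output toward sharp, realistic detail.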
For the noise z, the authors use dropout:
- Instead, for our final models, we provide noise only in the form of dropout, applied on several layers of our generator at both training and test time.
![image-20210628194928724](/upload/202106/28/202106282321383476.png)
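The dropout-as-noise choice can be sketched as follows. This is an illustrative helper, not the authors' code: the function name, the drop rate of 0.5, and the use of Python's `random.Random` are all assumptions. The point is that the same function is called at test time as well, so the generator stays stochastic instead of switching dropout off for inference.

```python
import random


def dropout(values, rate=0.5, rng=random):
    """Zero each activation with probability `rate`; rescale the survivors.

    Unlike the usual convention, this is meant to stay active at test time
    too, so that it acts as the noise source z of the generator.
    """
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


activations = [1.0] * 8

# Two "test-time" passes over the same input with different random states.
pass_a = dropout(activations, rng=random.Random(0))
pass_b = dropout(activations, rng=random.Random(1))
print(pass_a)
print(pass_b)
```

Each surviving activation is scaled by 1 / (1 - rate) so the expected value of the layer output is unchanged.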
3.2 Network architecture
The input and output images can be seen as renderings of the same underlying structure that differ only in surface appearance.
- In addition, for the problems we consider, the input and output differ in surface appearance, but both are renderings of the same underlying structure.
- Therefore, structure in the input is roughly aligned with structure in the output.
For the generator, the authors follow U-Net and add skip connections:
- To give the generator a means to circumvent the bottleneck for information like this, we add skip connections, following the general shape of a “U-Net”.
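To make the effect of the skip connections concrete, the sketch below traces tensor shapes through a U-Net-shaped encoder-decoder. It is a shape calculation only, under stated assumptions: the channel widths in `ENCODER_CHANNELS`, the 256×256 input, stride-2 downsampling at every encoder layer, and decoder layers that output as many channels as the encoder layer they mirror are all illustrative choices, not details given in these notes.

```python
def unet_shapes(image_size, encoder_channels):
    """Return (channels, spatial size) after each encoder and decoder stage."""
    encoder = []
    size = image_size
    for channels in encoder_channels:
        size //= 2  # each stride-2 convolution halves the spatial size
        encoder.append((channels, size))

    decoder = []
    # Walk back up from the bottleneck. Decoder stage i upsamples to the
    # resolution of encoder stage i - 1 and concatenates that stage's
    # activations along the channel axis (the skip connection).
    for i in range(len(encoder) - 1, 0, -1):
        skip_channels, skip_size = encoder[i - 1]
        upsampled_channels = skip_channels  # assumed mirror-image width
        decoder.append((upsampled_channels + skip_channels, skip_size))
    return encoder, decoder


ENCODER_CHANNELS = [64, 128, 256, 512, 512, 512, 512, 512]  # assumed widths
encoder, decoder = unet_shapes(256, ENCODER_CHANNELS)

print("bottleneck:", encoder[-1])
for channels, size in decoder:
    print(f"decoder stage: {channels} channels at {size}x{size}")
```

The bottleneck has a spatial size of 1×1 in this configuration, so without the concatenated encoder activations all spatial layout would have to pass through that single position.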
3.3 Markovian discriminator (PatchGAN)
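The authors point out that although L1 and L2 losses make the generated images blurry and fail to capture high-frequency detail, they do capture the low frequencies accurately. The GAN discriminator therefore only needs to model high-frequency structure, while the L1 term handles the low frequencies.
- Although these losses fail to encourage high-frequency crispness, in many cases they nonetheless accurately capture the low frequencies.
To model high-frequency structure, it is enough to restrict attention to local image patches. The authors therefore design a PatchGAN discriminator that penalizes structure at the patch scale: it classifies each N×N patch of the image as real or fake.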
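The patch size N of such a discriminator is simply the receptive field of one output unit of a small convolutional stack. The sketch below computes it for an assumed layer list (five 4×4 convolutions, the first three with stride 2, padding 1); this configuration is an assumption chosen for illustration, and the helper names are hypothetical. It also shows one common way to turn the per-patch scores into a single decision, by averaging them.

```python
# Assumed discriminator layout: (kernel size, stride) per convolution.
LAYERS = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]


def receptive_field(layers):
    """Side length N of the input patch seen by one output unit."""
    rf = 1
    for kernel, stride in reversed(layers):
        rf = (rf - 1) * stride + kernel
    return rf


def output_size(input_size, layers, padding=1):
    """Side length of the grid of per-patch scores for a square input."""
    size = input_size
    for kernel, stride in layers:
        size = (size + 2 * padding - kernel) // stride + 1
    return size


def patch_decision(patch_scores):
    """Average the per-patch real/fake scores into one output."""
    flat = [score for row in patch_scores for score in row]
    return sum(flat) / len(flat)


print("patch size N:", receptive_field(LAYERS))
print("score grid side for a 256x256 input:", output_size(256, LAYERS))
print("decision:", patch_decision([[1.0, 0.0], [1.0, 1.0]]))
```

Because the patches overlap and the stack is fully convolutional, the same discriminator can be applied to images of other sizes without changing its weights.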
4. Experiment
4.1 Dataset
![image-20210628212414111](/upload/202106/28/202106282321389286.png)
4.2 Evaluation metrics
- We employ two tactics. First, we run “real vs. fake” perceptual studies on Amazon Mechanical Turk (AMT).
- Second, we measure whether or not our synthesized cityscapes are realistic enough that off-the-shelf recognition system can recognize the objects in them.
- AMT perceptual studies
- “FCN-score”
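The second tactic amounts to running a pretrained segmenter on the synthesized image and comparing its predicted label map with the labels the image was generated from. A minimal sketch of two segmentation scores that could be used for this comparison is given below, on tiny hypothetical label maps; the function names and data are illustrative, and the segmenter itself is not modeled.

```python
def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted class matches the ground truth."""
    pairs = [(p, t) for pr, tr in zip(pred, truth) for p, t in zip(pr, tr)]
    return sum(p == t for p, t in pairs) / len(pairs)


def mean_class_iou(pred, truth, num_classes):
    """Intersection-over-union per class, averaged over classes that occur."""
    ious = []
    for c in range(num_classes):
        inter = union = 0
        for pr, tr in zip(pred, truth):
            for p, t in zip(pr, tr):
                inter += (p == c and t == c)
                union += (p == c or t == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Hypothetical 2x2 label maps: labels used to synthesize the image, and the
# labels a segmenter recovers from the synthesized image.
truth = [[0, 0], [1, 1]]
pred = [[0, 1], [1, 1]]

print("per-pixel accuracy:", pixel_accuracy(pred, truth))
print("mean class IoU:", mean_class_iou(pred, truth, num_classes=2))
```

A more realistic synthesized image lets the segmenter recover more of the original labels, so both scores rise.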
4.3 Analysis of the generator architecture
![image-20210628213539180](/upload/202106/28/202106282321396732.png)
![image-20210628213533548](/upload/202106/28/202106282321401317.png)
4.4 Analysis of the objective function
![image-20210628212654986](/upload/202106/28/202106282321413663.png)
4.5 From PixelGANs to PatchGANs to ImageGANs
![image-20210628214924897](/upload/202106/28/202106282321416347.png)
![image-20210628223522272](/upload/202106/28/202106282321421278.png)
4.6 Semantic segmentation
![image-20210628220151258](/upload/202106/28/202106282321423573.png)
![image-20210628223803959](/upload/202106/28/202106282321431547.png)