Global Context-Aware Progressive Aggregation Network for Salient Object Detection Notes
2021/11/9 6:10:00
Global Context-Aware Progressive Aggregation Network for Salient Object Detection
Facts
- Because of the pyramid-like structure of CNNs, high-level features help locate salient objects roughly, while low-level features help refine their boundaries.
- Traditional approaches such as FCN-based methods simply combine semantic information with appearance information, which is insufficient and ignores the different contributions of different features.
- Most previous works ignore global context information, which captures the relationships among multiple salient regions. Take the figure of the girl playing ping-pong as an example: most other methods attend to the ping-pong bat while missing the ping-pong ball, which is related to the bat.
Structure
GCPANet consists of four parts:
- FIA (Feature Interweaved Aggregation)
- SR (Self Refinement)
- HA (Head Attention)
- GCF (Global Context Flow)
Feature Interweaved Aggregation
Benefits
Combines low-level and high-level features so that each compensates for the other's weaknesses.
Additionally uses global context information to help understand the relationships between different objects (the ping-pong ball, for example), which helps generate a more complete and accurate saliency map.
What's more, global context information helps alleviate the effect of feature dilution.
Function
To fully integrate the three kinds of features mentioned above: low-level features, high-level features, and global context features.
Implementation
High & Low Level Features
To better fuse up-sampled high-level features with low-level features, the paper uses element-wise multiplication instead of concatenation, which helps strengthen the response of salient objects and suppress background noise.
Specifically, the paper defines:
\[\mathbf W^t_h = upsample(conv_2(\mathbf f^t_h)) \]
\[\mathbf f^t_{hl}=\delta(\mathbf W^t_h\odot \mathbf{\widetilde f}_l^t) \]
\[\mathbf W_l^t = conv_3(\mathbf{\widetilde f}_l^t) \]
\[\mathbf f_{lh}^t = \delta(\mathbf W_l^t\odot upsample(\mathbf f_h^t)) \]
Global Context Features
Introduce the global context features \(\mathbf f_{g}^t\) at each stage.
\[\mathbf W_g^t=upsample(conv_4(\mathbf f_g^t)) \]
\[\mathbf f_{gl}^t=\delta(\mathbf W_g^t \odot \mathbf{\widetilde f}_l^t) \]
Output
Concatenate the three features and pass them through a \(3\times 3\) convolution layer to obtain the output.
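A minimal PyTorch sketch of the whole FIA module may make the three branches concrete. It is not the authors' code. The class name `FIASketch`, the layer `conv_l` that produces the compressed low-level feature, the shared channel count, the 3×3 kernels, bilinear up-sampling, and ReLU for \(\delta\) are all assumptions made for illustration. It fuses the high-to-low, low-to-high, and global-to-low features by multiplication, then concatenates \(\mathbf f_{hl}^t\), \(\mathbf f_{lh}^t\), \(\mathbf f_{gl}^t\) and applies a final convolution.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FIASketch(nn.Module):
    """Illustrative sketch of Feature Interweaved Aggregation (hypothetical names)."""

    def __init__(self, channels: int = 64):
        super().__init__()
        # Assumption: every input already has `channels` channels.
        self.conv_l = nn.Conv2d(channels, channels, 3, padding=1)  # f_l -> compressed f~_l
        self.conv_2 = nn.Conv2d(channels, channels, 3, padding=1)  # mask from high-level
        self.conv_3 = nn.Conv2d(channels, channels, 3, padding=1)  # mask from low-level
        self.conv_4 = nn.Conv2d(channels, channels, 3, padding=1)  # mask from global context
        self.conv_5 = nn.Conv2d(3 * channels, channels, 3, padding=1)  # fuse concatenation

    def forward(self, f_l, f_h, f_g):
        size = f_l.shape[2:]  # spatial size of the low-level feature

        def upsample(x):
            return F.interpolate(x, size=size, mode="bilinear", align_corners=False)

        f_l_tilde = self.conv_l(f_l)
        # High-level mask applied to low-level features.
        f_hl = F.relu(upsample(self.conv_2(f_h)) * f_l_tilde)
        # Low-level mask applied to up-sampled high-level features.
        f_lh = F.relu(self.conv_3(f_l_tilde) * upsample(f_h))
        # Global-context mask applied to low-level features.
        f_gl = F.relu(upsample(self.conv_4(f_g)) * f_l_tilde)
        return self.conv_5(torch.cat([f_hl, f_lh, f_gl], dim=1))
```

The output keeps the spatial resolution of the low-level input, so the module can be stacked stage by stage in a decoder.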
\[\mathbf f_a^t = conv_5(concat(\mathbf f_{hl}^t,\mathbf f_{lh}^t,\mathbf f_{gl}^t)) \]
Self Refinement
Function
To reduce the contradictory response of different layers.
Implementation
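As a rough PyTorch sketch of the two SR equations given in this section: the input is first compressed by a convolution, then multiplied by a mask and shifted by a bias, followed by ReLU. The class name `SelfRefinementSketch`, the 3×3 kernels, and the choice to produce the mask and bias with one convolution whose output is split in half are assumptions for illustration, not something these notes specify.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SelfRefinementSketch(nn.Module):
    """Illustrative sketch of Self Refinement (hypothetical names)."""

    def __init__(self, in_channels: int = 64, channels: int = 64):
        super().__init__()
        self.conv_6 = nn.Conv2d(in_channels, channels, 3, padding=1)
        # Assumption: a single conv emits both the mask W and the bias b.
        self.conv_wb = nn.Conv2d(channels, 2 * channels, 3, padding=1)

    def forward(self, f_in):
        f_tilde = self.conv_6(f_in)                      # f~ = conv_6(f_in)
        w, b = torch.chunk(self.conv_wb(f_tilde), 2, dim=1)
        return F.relu(w * f_tilde + b)                   # f_out = ReLU(W * f~ + b)
```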
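\[\mathbf{\widetilde f} = conv_6(\mathbf f_{in}) \]
\[\mathbf f_{out} = \delta(\mathbf W\odot \mathbf{\widetilde f}+\mathbf b) \]
where \(\mathbf W\) and \(\mathbf b\) are a mask and a bias computed from \(\mathbf{\widetilde f}\) by convolution.
Head Attention (HA)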
Function
To select important and representative features from the output of the top layers, which usually contains much redundant information.
Location
As mentioned above, it is placed right after the top layers to process their output.
Implementation
- Apply a convolution layer to the input feature maps \(\mathbf F\) to obtain a compressed feature representation \(\mathbf{\widetilde F}\) with 256 channels.
- Generate a mask \(\mathbf W\) and a bias \(\mathbf b\), then compute
\[\mathbf F_1 = \delta(\mathbf W\odot \mathbf{\widetilde F}+\mathbf b) \]
where \(\delta\) denotes the ReLU activation function.
- Use average pooling to down-sample \(\mathbf F\) into a channel-wise feature vector \(\mathbf f\).
- Apply two successive fully connected layers to \(\mathbf f\) to get an output vector \(\mathbf y\).
- Compute the final output \(\mathbf F_{out} = \mathbf F_1 \odot \mathbf y\).
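The steps above can be sketched in PyTorch as follows. This is only an illustration under assumptions: the class name `HeadAttentionSketch`, the default of 2048 input channels, the 3×3 kernels, the single convolution that yields both mask and bias, and the ReLU/sigmoid activations around the two fully connected layers are not stated in these notes. The pooling is applied to the input \(\mathbf F\), as the list describes.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class HeadAttentionSketch(nn.Module):
    """Illustrative sketch of Head Attention (hypothetical names)."""

    def __init__(self, in_channels: int = 2048, channels: int = 256):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, channels, 3, padding=1)
        # Assumption: one conv emits both the mask W and the bias b.
        self.conv_wb = nn.Conv2d(channels, 2 * channels, 3, padding=1)
        self.fc_1 = nn.Linear(in_channels, channels)
        self.fc_2 = nn.Linear(channels, channels)

    def forward(self, feat):
        f_tilde = self.conv(feat)                         # compress to 256 channels
        w, b = torch.chunk(self.conv_wb(f_tilde), 2, dim=1)
        f_1 = F.relu(w * f_tilde + b)                     # F_1 = ReLU(W * F~ + b)
        f_vec = feat.mean(dim=(2, 3))                     # global average pooling of F
        # Assumption: ReLU between the FC layers and a sigmoid gate at the end.
        y = torch.sigmoid(self.fc_2(F.relu(self.fc_1(f_vec))))
        return f_1 * y[:, :, None, None]                  # channel-wise re-weighting
```

The vector \(\mathbf y\) acts as a per-channel gate, so channels judged less informative are scaled toward zero.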
Global Context Flow (GCF)
Function
To better understand the relationship between different salient objects, and to alleviate the effect of feature dilution.
Implementation
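A minimal PyTorch sketch of the GCF equations in this section, assuming that \(\mathbf f_{gap}\) is the global average pooling of the top-layer feature \(\mathbf f_{top}\), that \(\sigma\) is the sigmoid and \(\delta\) is ReLU, and that one such module is instantiated per stage \(t\). The class name `GlobalContextFlowSketch` and the 3×3 kernel are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GlobalContextFlowSketch(nn.Module):
    """Illustrative sketch of one Global Context Flow stage (hypothetical names)."""

    def __init__(self, in_channels: int = 256, channels: int = 256):
        super().__init__()
        self.fc_3 = nn.Linear(in_channels, channels)
        self.fc_4 = nn.Linear(channels, channels)
        self.conv_10 = nn.Conv2d(in_channels, channels, 3, padding=1)

    def forward(self, f_top):
        # Assumption: f_gap is the global average pooling of f_top.
        f_gap = f_top.mean(dim=(2, 3))
        y = torch.sigmoid(self.fc_4(F.relu(self.fc_3(f_gap))))  # y^t
        f_tilde = self.conv_10(f_top)                            # f~^t
        return f_tilde * y[:, :, None, None]                     # f_g^t
```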
\[\mathbf y^t = \sigma \circ fc_4 \circ \delta \circ fc_3 (\mathbf f_{gap}) \]
\[\mathbf{\widetilde f}^t = conv_{10}(\mathbf f_{top}) \]
\[\mathbf f_g^t = \mathbf{\widetilde f}^t \odot \mathbf y^t \]
Results
Outperforms 12 other state-of-the-art methods on 6 benchmark datasets.
An ablation study demonstrates the effectiveness of the four main parts of GCPANet.
My Experiments
I used the BJTU HPC platform to run the code.
So many troubles :<
- Had trouble connecting to the server over SSH.
  sol: The platform supports WinSCP, which can pass the password to PuTTY, so I can SSH to the server indirectly.
- Failed to pass a parameter to test.py due to the restrictions of the BJTU HPC platform.
  sol: Replace sys.argv[1] with the parameter I'm trying to pass. A better solution would be to write a start.py.
- Failed to locate files because the working directory is redirected to "jobs/xxx".
  sol: Add os.chdir('/data/home/u20281202/SOD/GCPANet-master/') at the beginning of test.py.
- Failed to load ResNet50.
  sol: Upload resnet50-19c8e357.pth to the model folder and modify the initialize function:
  def initialize(self):
      self.load_state_dict(torch.load('./model/resnet50-19c8e357.pth'), strict=False)
  (I missed the leading dot just when I thought the run was going to succeed, only to find the MAE was way too large.)
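The start.py idea could look roughly like the sketch below, which avoids editing test.py for the argument and working-directory problems. It is an untested assumption about how such a launcher might be written, using only the standard library. The helper name `launch` is hypothetical, and the argument value in the commented call is a placeholder, since these notes do not say what test.py expects.

```python
import os
import runpy
import sys


def launch(script, args, workdir):
    """Run `script` as __main__ from `workdir` with sys.argv = [script, *args]."""
    old_argv, old_cwd = sys.argv, os.getcwd()
    os.chdir(workdir)                 # undo the scheduler's redirect to its job folder
    sys.argv = [script, *args]        # what the script will see as its command line
    try:
        # run_path executes the file and returns its resulting global namespace.
        return runpy.run_path(script, run_name="__main__")
    finally:
        sys.argv = old_argv
        os.chdir(old_cwd)


# Hypothetical call inside start.py; the argument is a placeholder:
# launch("test.py", ["<argument>"], "/data/home/u20281202/SOD/GCPANet-master/")
```

Because the launcher restores `sys.argv` and the working directory afterwards, it can call several scripts in sequence from one job.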
Observations