Edge-Preserved-Universal-Pooling

This work rethinks how pooling is done in CNNs. An exhaustive analysis of edge-preserving pooling options for classification, segmentation and autoencoders has been carried out. Two novel pooling approaches are presented: Laplacian-Gaussian Concatenation with Attention (LGCA) pooling and Wavelet-based Approximate-Detailed coefficient Concatenation with Attention (WADCA) pooling. The results suggest that the proposed pooling approaches outperform both conventional pooling and blur pooling for classification, segmentation and autoencoders. In terms of average binary classification accuracy (cats vs. dogs), the proposed LGCA approach outperforms normal pooling and blur pooling by 4% and 2% (MobileNetV2), 3% and 4% (DenseNet121), and 3% and 0.5% (ResNet50) respectively. The proposed WADCA approach outperforms normal pooling and blur pooling by 5% and 3% (MobileNetV2), 2% and 3% (DenseNet121), and 2% and 0.17% (ResNet50) respectively. It is also observed that edge-preserving pooling brings no significant benefit in segmentation tasks, possibly due to the high-resolution to low-resolution translation, whereas for convolutional autoencoders high-resolution reconstruction has been observed with LGCA pooling.

Laplacian-Gaussian Concatenation with Attention (LGCA)


In the LGCA pooling approach, Gaussian and Laplacian filtering operations are performed on the input feature maps; the two filtered outputs are concatenated and passed through an attention network. Since the channel dimension is doubled by the concatenation, the output of the attention block is passed through a convolution layer to reduce it back to the original channel dimension of the feature map. The attention network removes redundancy among channels and focuses the overall network on the most relevant channels in the concatenated output. The architecture for the LGCA approach is as shown above.
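The structure described above can be sketched as a PyTorch module. This is a minimal, hypothetical illustration, not the repository's ConMax/ConAvg code: the Gaussian and Laplacian kernels are fixed 3x3 filters, the attention block is assumed to be SE-style channel attention, and the final downsampling is approximated by a stride-2 average pool.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LGCAPoolSketch(nn.Module):
    """Hypothetical sketch of LGCA pooling: Gaussian + Laplacian filtering,
    channel concatenation, SE-style channel attention (an assumption), a 1x1
    convolution back to the input channel count, then stride-2 downsampling."""

    def __init__(self, inc: int):
        super().__init__()
        gauss = torch.tensor([[1., 2., 1.], [2., 4., 2.], [1., 2., 1.]]) / 16.0
        lap = torch.tensor([[0., 1., 0.], [1., -4., 1.], [0., 1., 0.]])
        # One fixed depthwise kernel per input channel.
        self.register_buffer("gauss", gauss.expand(inc, 1, 3, 3).clone())
        self.register_buffer("lap", lap.expand(inc, 1, 3, 3).clone())
        self.inc = inc
        # SE-style channel attention over the 2*inc concatenated channels.
        hidden = max(2 * inc // 4, 1)
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * inc, hidden, 1), nn.ReLU(),
            nn.Conv2d(hidden, 2 * inc, 1), nn.Sigmoid(),
        )
        # 1x1 convolution to restore the original channel dimension.
        self.reduce = nn.Conv2d(2 * inc, inc, kernel_size=1)

    def forward(self, x):
        g = F.conv2d(x, self.gauss, padding=1, groups=self.inc)  # Gaussian branch
        l = F.conv2d(x, self.lap, padding=1, groups=self.inc)    # Laplacian branch
        cat = torch.cat([g, l], dim=1)                           # 2*inc channels
        cat = cat * self.attn(cat)                               # channel attention
        out = self.reduce(cat)                                   # back to inc channels
        return F.avg_pool2d(out, kernel_size=2, stride=2)        # downsample

pool = LGCAPoolSketch(64)
y = pool(torch.randn(1, 64, 32, 32))  # spatial dims halved, channels preserved
```

Any channel-attention variant could be substituted for the SE block; the essential structure is filter, concatenate, attend, 1x1-reduce, downsample.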

Implementation for Replacing Traditional Max Pooling

```python
model.maxpool = ConMax(inc, device, size)
```

where:

- `inc` is the number of input channels
- `device` is the device on which the `torch.Tensor` is allocated
- `size` is the Gaussian kernel size
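A drop-in replacement of this kind has to reproduce the output shape of the pooling layer it replaces. The stand-in `ConMax` below is hypothetical and only mirrors the `(inc, device, size)` signature from the snippet above; the check shows the shape contract against a typical `nn.MaxPool2d(3, stride=2, padding=1)` layer.

```python
import torch
import torch.nn as nn

class ConMax(nn.Module):
    """Hypothetical stand-in mirroring the repository's ConMax(inc, device, size)
    signature; it must match the output shape of the replaced max-pool layer."""
    def __init__(self, inc, device, size):
        super().__init__()
        self.pool = nn.MaxPool2d(kernel_size=size, stride=2, padding=size // 2)
        self.to(device)

    def forward(self, x):
        return self.pool(x)

x = torch.randn(1, 64, 56, 56)
ref = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)          # layer being replaced
pool = ConMax(inc=64, device=torch.device("cpu"), size=3)       # drop-in replacement
```

With matching shapes, `model.maxpool = ConMax(...)` can be assigned on any torchvision-style model without touching the rest of the architecture.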

Implementation for Replacing Traditional Average Pooling

```python
model.avgpool = ConAvg(inc, device, size)
```

where:

- `inc` is the number of input channels
- `device` is the device on which the `torch.Tensor` is allocated
- `size` is the Gaussian kernel size

Implementation for Replacing Traditional Strided Convolution if it is followed by ReLU

This:

```python
model.conv = nn.Conv2d(inc, outc, filter, stride, padding)
model.relu = nn.ReLU()
```

can be replaced by:

```python
model.conv = nn.Conv2d(inc, outc, filter, stride - 1, paddingn)
model.relu = ConConv(outc, device, size)
```

where:

- `inc` is the number of input channels
- `outc` is the number of output channels
- `device` is the device on which the `torch.Tensor` is allocated
- `filter` is the filter size
- `size` is the Gaussian kernel size
- `paddingn` equals 1 if `padding` is 1; otherwise it equals `padding - 1`
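The stride arithmetic above can be checked numerically. `BlurPoolSketch` below is a hypothetical stand-in for ConConv: it applies the folded-in ReLU, a fixed 3x3 Gaussian low-pass filter, and stride-2 subsampling, which is what makes the unstrided convolution plus ConConv shape-compatible with the original strided convolution.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BlurPoolSketch(nn.Module):
    """Hypothetical stand-in for ConConv: ReLU, Gaussian low-pass, stride-2 subsample."""
    def __init__(self, outc):
        super().__init__()
        k = torch.tensor([[1., 2., 1.], [2., 4., 2.], [1., 2., 1.]]) / 16.0
        self.register_buffer("kernel", k.expand(outc, 1, 3, 3).clone())
        self.outc = outc

    def forward(self, x):
        x = F.relu(x)  # the replaced ReLU happens before the blur-and-subsample
        return F.conv2d(x, self.kernel, stride=2, padding=1, groups=self.outc)

inc, outc, filt, stride, padding = 16, 32, 3, 2, 1
x = torch.randn(1, inc, 56, 56)

# Original: strided convolution followed by ReLU.
orig = nn.ReLU()(nn.Conv2d(inc, outc, filt, stride, padding)(x))

# Replacement: convolution with stride - 1 and paddingn, then the pooling module.
paddingn = 1 if padding == 1 else padding - 1
repl = BlurPoolSketch(outc)(nn.Conv2d(inc, outc, filt, stride - 1, paddingn)(x))
```

For the usual `stride = 2`, the replacement convolution runs unstrided and the pooling module performs the downsampling, so both paths produce the same output shape.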

Implementation for Replacing Traditional Strided Convolution if it is followed by Batch Normalization

This:

```python
model.conv = nn.Conv2d(inc, outc, filter, stride, padding)
model.bn = nn.BatchNorm2d(outc)
```

can be replaced by:

```python
model.conv = nn.Conv2d(inc, outc, filter, stride - 1, paddingn)
model.bn = ConConvb(outc, device, size)
```

where:

- `inc` is the number of input channels
- `outc` is the number of output channels
- `device` is the device on which the `torch.Tensor` is allocated
- `filter` is the filter size
- `size` is the Gaussian kernel size
- `paddingn` equals 1 if `padding` is 1; otherwise it equals `padding - 1`

Implementation for Replacing 1x1 Strided Convolution if it is followed by Batch Normalization

This:

```python
model.conv = nn.Conv2d(inc, outc, 1, stride)
model.bn = nn.BatchNorm2d(outc)
```

can be replaced by:

```python
model.conv = nn.Conv2d(inc, outc, 1, stride - 1)
model.bn = ConConv1(outc, device)
```

where:

- `inc` is the number of input channels
- `outc` is the number of output channels
- `device` is the device on which the `torch.Tensor` is allocated
Wavelet based approximate-detailed coefficient concatenation with attention (WADCA)


In the WADCA approach, a 2D DWT based on the Haar wavelet is employed to decompose the feature maps into low- and high-frequency components. The high-frequency (detailed) coefficients are zeroed, and a 2D IDWT is applied to the approximate coefficients concatenated with these zeros. Then, as in LGCA, the feature maps are passed through an attention network followed by a convolution layer. The architecture for the WADCA approach is as shown above.
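The decompose / zero-the-details / reconstruct step can be written out directly. This is a hypothetical illustration of a single-level 2D Haar DWT and IDWT, not the repository's HaarMax/HaarAvg code; with the detail subbands zeroed, the reconstruction reduces to replicating each 2x2 block average, i.e. a low-pass version of the input.

```python
import torch

def haar_dwt2(x):
    """Single-level 2D Haar DWT on (N, C, H, W) with even H and W.
    Returns the approximate (LL) and detail (LH, HL, HH) subbands."""
    a = x[..., 0::2, 0::2]
    b = x[..., 0::2, 1::2]
    c = x[..., 1::2, 0::2]
    d = x[..., 1::2, 1::2]
    ll = (a + b + c + d) / 2
    lh = (a - b + c - d) / 2
    hl = (a + b - c - d) / 2
    hh = (a - b - c + d) / 2
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2 (perfect reconstruction)."""
    a = (ll + lh + hl + hh) / 2
    b = (ll - lh + hl - hh) / 2
    c = (ll + lh - hl - hh) / 2
    d = (ll - lh - hl + hh) / 2
    n, ch, h, w = ll.shape
    out = torch.zeros(n, ch, 2 * h, 2 * w, dtype=ll.dtype, device=ll.device)
    out[..., 0::2, 0::2] = a
    out[..., 0::2, 1::2] = b
    out[..., 1::2, 0::2] = c
    out[..., 1::2, 1::2] = d
    return out

x = torch.randn(1, 3, 8, 8)
ll, lh, hl, hh = haar_dwt2(x)
# Zero the high-frequency (detail) coefficients, as in WADCA, then reconstruct.
zeros = torch.zeros_like(lh)
lowpass = haar_idwt2(ll, zeros, zeros, zeros)
```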

Implementation for Replacing Traditional Max Pooling

```python
model.maxpool = HaarMax(inc, device)
```

where:

- `inc` is the number of input channels
- `device` is the device on which the `torch.Tensor` is allocated

Implementation for Replacing Traditional Average Pooling

```python
model.avgpool = HaarAvg(inc, device)
```

where:

- `inc` is the number of input channels
- `device` is the device on which the `torch.Tensor` is allocated

Implementation for Replacing Traditional Strided Convolution if it is followed by ReLU

This:

```python
model.conv = nn.Conv2d(inc, outc, filter, stride, padding)
model.relu = nn.ReLU()
```

can be replaced by:

```python
model.conv = nn.Conv2d(inc, outc, filter, stride - 1, paddingn)
model.relu = HaarConv(outc, device)
```

where:

- `inc` is the number of input channels
- `outc` is the number of output channels
- `filter` is the filter size
- `device` is the device on which the `torch.Tensor` is allocated
- `paddingn` equals 1 if `padding` is 1; otherwise it equals `padding - 1`

Implementation for Replacing Traditional Strided Convolution if it is followed by Batch Normalization

This:

```python
model.conv = nn.Conv2d(inc, outc, filter, stride, padding)
model.bn = nn.BatchNorm2d(outc)
```

can be replaced by:

```python
model.conv = nn.Conv2d(inc, outc, filter, stride - 1, paddingn)
model.bn = HaarConvb(outc, device)
```

where:

- `inc` is the number of input channels
- `outc` is the number of output channels
- `filter` is the filter size
- `device` is the device on which the `torch.Tensor` is allocated
- `paddingn` equals 1 if `padding` is 1; otherwise it equals `padding - 1`

Implementation for Replacing 1x1 Strided Convolution if it is followed by Batch Normalization

This:

```python
model.conv = nn.Conv2d(inc, outc, 1, stride)
model.bn = nn.BatchNorm2d(outc)
```

can be replaced by:

```python
model.conv = nn.Conv2d(inc, outc, 1, stride - 1)
model.bn = HaarConv1(outc, device)
```

where:

- `inc` is the number of input channels
- `outc` is the number of output channels
- `device` is the device on which the `torch.Tensor` is allocated

Demonstration of DenseNet121 with different pooling layers using Grad-CAM

[Figures: the original image, followed by Grad-CAM visualizations with traditional pooling, with LGCA, and with WADCA.]
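Visualizations like the ones above can be reproduced with a minimal hand-rolled Grad-CAM using forward/backward hooks. The sketch below demonstrates it on a small stand-in CNN; for the figures above, the same function would be pointed at DenseNet121 and its final dense block as the target layer.

```python
import torch
import torch.nn as nn

def grad_cam(model, target_layer, x, class_idx=None):
    """Minimal Grad-CAM: weight the target layer's activations by the
    spatially averaged gradients of the chosen class score."""
    acts, grads = {}, {}
    h1 = target_layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
    h2 = target_layer.register_full_backward_hook(
        lambda m, gi, go: grads.update(g=go[0]))
    try:
        logits = model(x)
        if class_idx is None:
            class_idx = logits.argmax(dim=1)          # highest-scoring class
        score = logits.gather(1, class_idx.view(-1, 1)).sum()
        model.zero_grad()
        score.backward()
    finally:
        h1.remove()
        h2.remove()
    weights = grads["g"].mean(dim=(2, 3), keepdim=True)      # GAP of gradients
    cam = torch.relu((weights * acts["a"]).sum(dim=1))       # weighted sum + ReLU
    cam = cam / (cam.amax(dim=(1, 2), keepdim=True) + 1e-8)  # normalize to [0, 1]
    return cam

# Stand-in CNN (for the figures above this would be DenseNet121, with its
# final dense block as target_layer).
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10),
)
cam = grad_cam(model, model[2], torch.randn(1, 3, 32, 32))
```

The returned heatmap has the spatial resolution of the target layer and is typically upsampled to the input size and overlaid on the image.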
