MAX POOLING (2024)

The pooling operation involves sliding a two-dimensional filter over each channel of the feature map and summarising the features lying within the region covered by the filter.
For a feature map of dimensions nh x nw x nc, the output of a pooling layer has dimensions

((nh - f)/s + 1) x ((nw - f)/s + 1) x nc

(each division is rounded down when it is not exact)

where

nh - height of the feature map
nw - width of the feature map
nc - number of channels in the feature map
f - size of the filter
s - stride length
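As a quick sanity check, the output-shape formula above can be coded up directly (the helper name pooled_dims is ours, for illustration):

```python
import math

def pooled_dims(nh, nw, nc, f, s):
    """Output shape of a pooling layer with filter size f and stride s (no padding)."""
    out_h = math.floor((nh - f) / s) + 1
    out_w = math.floor((nw - f) / s) + 1
    return out_h, out_w, nc

# e.g. a 4x4 single-channel feature map, 2x2 filter, stride 2
print(pooled_dims(4, 4, 1, 2, 2))  # (2, 2, 1)
```

The floor accounts for windows that would run off the edge when f and s do not divide the input evenly.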

A common CNN model architecture is to have a number of convolution and pooling layers stacked one after the other.

Why Pooling Layers?

  • Pooling layers are used to reduce the dimensions of the feature maps. Thus, it reduces the number of parameters to learn and the amount of computation performed in the network.
  • The pooling layer summarises the features present in a region of the feature map generated by a convolution layer. So, further operations are performed on summarised features instead of precisely positioned features generated by the convolution layer. This makes the model more robust to variations in the position of the features in the input image.

Types of Pooling:

  1. MaxPooling
  2. Average Pooling
  3. Global Pooling

Max Pooling

Max pooling is a pooling operation that selects the maximum element from the region of the feature map covered by the filter. Thus, the output after max-pooling layer would be a feature map containing the most prominent features of the previous feature map.

Average Pooling

Average pooling computes the average of the elements present in the region of feature map covered by the filter. Thus, while max pooling gives the most prominent feature in a particular patch of the feature map, average pooling gives the average of features present in a patch.

Global Pooling

Global pooling reduces each channel in the feature map to a single value. Thus, an nh x nw x nc feature map is reduced to a 1 x 1 x nc feature map. This is equivalent to using a filter of dimensions nh x nw, i.e. the dimensions of the feature map itself.
Further, it can be either global max pooling or global average pooling.
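In NumPy terms, global pooling is just a reduction over the two spatial axes; a minimal sketch with a small made-up feature map:

```python
import numpy as np

# a tiny 2 x 2 x 3 feature map (nh x nw x nc)
fmap = np.arange(12, dtype=float).reshape(2, 2, 3)

# collapse the spatial axes, leaving one value per channel
gmax = fmap.max(axis=(0, 1))   # global max pooling, shape (nc,)
gavg = fmap.mean(axis=(0, 1))  # global average pooling, shape (nc,)
print(gmax)  # [ 9. 10. 11.]
print(gavg)  # [4.5 5.5 6.5]
```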

MaxPooling is a down-sampling operation often used in Convolutional Neural Networks (CNNs) to reduce the spatial dimensions of the input volume. It is a form of pooling layer, and it helps in retaining the most important information while discarding less important details. MaxPooling is typically applied after convolutional layers in a CNN.

The basic idea behind MaxPooling is to divide the input image into non-overlapping rectangular regions and, for each region, output the maximum value. This operation is performed independently for each channel in the input.

Here’s a simple explanation of how MaxPooling works:

Input Region:

  • The input image is divided into small regions (usually 2x2 or 3x3).
  • For each region, the maximum value is computed.

Output Feature Map:

  • The maximum value for each region is taken and forms the output of that region.
  • The result is a down-sampled version of the input, with reduced spatial dimensions.

Mathematically, if we denote the input as X and the output as Y, the MaxPooling operation can be defined as:

Y[i,j,k]=max(X[2i:2i+2,2j:2j+2,k])

where i and j iterate over the height and width dimensions of the input, and k iterates over the channels.

Common choices for the size of the pooling window are 2x2 or 3x3, and the stride (the step size when moving the pooling window) is often set to be equal to the size of the window for non-overlapping pooling.
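The definition above translates directly into NumPy; here is a minimal from-scratch sketch (the helper name max_pool is ours, for illustration):

```python
import numpy as np

def max_pool(x, f=2, s=2):
    """Naive max pooling over an (H, W, C) array, with no padding."""
    h, w, c = x.shape
    out_h = (h - f) // s + 1
    out_w = (w - f) // s + 1
    y = np.empty((out_h, out_w, c), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            for k in range(c):
                # maximum over the f x f window at stride offset (i*s, j*s)
                y[i, j, k] = x[i*s:i*s+f, j*s:j*s+f, k].max()
    return y
```

Framework implementations vectorize this, but the explicit loops make the window and stride arithmetic easy to follow.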

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import MaxPooling2D

# define the input image as a single-channel 4x4 array
image = np.array([[2, 2, 7, 3],
                  [9, 4, 6, 1],
                  [8, 5, 2, 4],
                  [3, 1, 2, 6]], dtype=float)
image = image.reshape(1, 4, 4, 1)  # (batch, height, width, channels)

# define a model containing just a single max pooling layer
model = Sequential([MaxPooling2D(pool_size=2, strides=2)])

# generate the pooled output
output = model.predict(image)

# print the pooled feature map
output = np.squeeze(output)
print(output)

[[9. 7.]
 [8. 6.]]

Let’s go through a simple example of MaxPooling with a 2x2 pooling window. Consider a small 4x4 input matrix:

X = [[ 1  5  2  7]
     [ 3  6  4  8]
     [ 9 13 11 15]
     [10 14 12 16]]

Now, let’s apply 2x2 MaxPooling to this input matrix. The pooling operation involves moving a 2x2 window across the input and, for each window, taking the maximum value. The output matrix, Y, will have reduced spatial dimensions.

Y[i,j]=max(X[2i:2i+2,2j:2j+2])

Let’s calculate Y step by step:

  1. For i=0 and j=0:

Y[0,0] = max(X[0:2, 0:2]) = max([[1, 5], [3, 6]]) = 6

  2. For i=0 and j=1:

Y[0,1] = max(X[0:2, 2:4]) = max([[2, 7], [4, 8]]) = 8

  3. For i=1 and j=0:

Y[1,0] = max(X[2:4, 0:2]) = max([[9, 13], [10, 14]]) = 14

  4. For i=1 and j=1:

Y[1,1] = max(X[2:4, 2:4]) = max([[11, 15], [12, 16]]) = 16

The resulting output matrix Y is:

Y = [[ 6  8]
     [14 16]]
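The four steps above can be reproduced in a couple of lines of NumPy, using a reshape trick that groups the rows and columns into 2x2 blocks and takes the maximum of each block:

```python
import numpy as np

# the 4x4 input X from the worked example
X = np.array([[ 1,  5,  2,  7],
              [ 3,  6,  4,  8],
              [ 9, 13, 11, 15],
              [10, 14, 12, 16]])

# 2x2 max pooling with stride 2: axes 1 and 3 index within each block
Y = X.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(Y)  # [[ 6  8]
          #  [14 16]]
```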

Max pooling offers several benefits in the context of CNNs:

  • Feature Invariance: Max pooling helps the model become invariant to small translations of features, meaning the network can recognize a feature even when its position shifts slightly within the image.
  • Dimensionality Reduction: By downsampling the input, max pooling significantly reduces the number of parameters and computations in the network, thus speeding up the learning process and reducing the risk of overfitting.
  • Noise Suppression: Max pooling helps to suppress noise in the input data. By taking the maximum value within the window, it emphasizes the presence of strong features and diminishes the weaker ones.

In practice, max pooling layers are placed after convolutional layers in a CNN. After a convolutional layer extracts features from the input image, the max pooling layer reduces the spatial size of the convolved feature map, keeping only the most salient information. This process is repeated for multiple convolutional and pooling layers, allowing the network to learn a hierarchy of features at various levels of abstraction.

Max pooling is a simple yet effective technique that has been instrumental in the success of CNNs in various applications, particularly in image and video recognition tasks. Its ability to reduce the computational burden while maintaining the essential features has made it a staple component in deep learning architectures.

Despite its benefits, max pooling is not without its challenges. One criticism is that it can sometimes be too aggressive, discarding potentially useful information that could be important for the classification task. Moreover, max pooling is a fixed operation and does not learn from the data, unlike convolutional layers that have learnable parameters.

As a result, some modern CNN architectures have started to move away from traditional max pooling layers, using alternatives like strided convolutions for downsampling or incorporating learnable pooling operations that can adapt to the data.


from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Conv2D(filters=64, kernel_size=3),  # activation is None
    layers.MaxPool2D(pool_size=2),
    # More layers follow
])

A MaxPool2D layer is much like a Conv2D layer, except that it uses a simple maximum function instead of a kernel, with the pool_size parameter analogous to kernel_size. Unlike a convolutional layer, however, a MaxPool2D layer has no trainable weights.
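One quick way to confirm this (the 28x28x3 input shape here is arbitrary, chosen just for illustration):

```python
from tensorflow.keras import layers

conv = layers.Conv2D(filters=64, kernel_size=3)
pool = layers.MaxPool2D(pool_size=2)

# building a layer creates its weight variables, if it has any
conv.build((None, 28, 28, 3))
pool.build((None, 28, 28, 3))

print(len(conv.weights))  # 2 (kernel + bias)
print(len(pool.weights))  # 0 -- nothing to learn
```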

Recall the extraction sequence from the last lesson: MaxPool2D performs the Condense step.


Notice that after applying the ReLU function (Detect) the feature map ends up with a lot of “dead space,” that is, large areas containing only 0’s (the black areas in the image). Having to carry these 0 activations through the entire network would increase the size of the model without adding much useful information. Instead, we would like to condense the feature map to retain only the most useful part — the feature itself.

This in fact is what maximum pooling does. Max pooling takes a patch of activations in the original feature map and replaces them with the maximum activation in that patch.


When applied after the ReLU activation, it has the effect of “intensifying” features. The pooling step increases the proportion of active pixels to zero pixels.
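This effect is easy to demonstrate with a mock feature map (the random values and 8x8 size are our own, for illustration): after ReLU, any 2x2 patch containing at least one active pixel produces an active pooled pixel, so the active fraction can only grow.

```python
import numpy as np

rng = np.random.default_rng(0)
# a mock feature map: mostly negative pre-activations with a few strong ones
fmap = rng.normal(loc=-1.0, scale=1.0, size=(8, 8))

relu = np.maximum(fmap, 0)  # Detect step: ReLU zeroes out the negatives
# Condense step: 2x2 max pooling via a reshape trick
pooled = relu.reshape(4, 2, 4, 2).max(axis=(1, 3))

print(np.mean(relu > 0))    # fraction of active pixels before pooling
print(np.mean(pooled > 0))  # fraction after pooling -- at least as large
```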

Translation Invariance

We called the zero-pixels “unimportant”. Does this mean they carry no information at all? In fact, the zero-pixels carry positional information. The blank space still positions the feature within the image. When MaxPool2D removes some of these pixels, it removes some of the positional information in the feature map. This gives a convnet a property called translation invariance. This means that a convnet with maximum pooling will tend not to distinguish features by their location in the image. ("Translation" is the mathematical word for changing the position of something without rotating it or changing its shape or size.)

Watch what happens when we repeatedly apply maximum pooling to a feature map containing two nearby dots.


The two dots in the original image became indistinguishable after repeated pooling. In other words, pooling destroyed some of their positional information. Since the network can no longer distinguish between them in the feature maps, it can’t distinguish them in the original image either: it has become invariant to that difference in position.

In fact, pooling only creates translation invariance in a network over small distances, as with the two dots in the image. Features that begin far apart will remain distinct after pooling; only some of the positional information was lost, but not all of it.
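A small sketch of this effect (the dot positions and 8x8 map are our own, for illustration): two separate activations survive the first rounds of pooling as distinct cells, but eventually collapse into one.

```python
import numpy as np

def pool2(x):
    """One round of 2x2 max pooling with stride 2."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# a feature map with two separated "dots"
fmap = np.zeros((8, 8))
fmap[3, 2] = 1.0
fmap[4, 5] = 1.0

x = fmap
counts = []
for step in range(3):
    x = pool2(x)
    counts.append(np.count_nonzero(x))
    print(x.shape, counts[-1])
```

After three rounds the map has shrunk to 1x1 and the two dots have merged into a single activation: the network can no longer tell them apart.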


This invariance to small differences in the positions of features is a nice property for an image classifier to have. Just because of differences in perspective or framing, the same kind of feature might be positioned in various parts of the original image, but we would still like for the classifier to recognize that they are the same.

Other Pooling Layers

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import AveragePooling2D

# define the input image as a single-channel 4x4 array
image = np.array([[2, 2, 7, 3],
                  [9, 4, 6, 1],
                  [8, 5, 2, 4],
                  [3, 1, 2, 6]], dtype=float)
image = image.reshape(1, 4, 4, 1)

# define a model containing just a single average pooling layer
model = Sequential([AveragePooling2D(pool_size=2, strides=2)])

# generate the pooled output
output = model.predict(image)

# print the pooled feature map
output = np.squeeze(output)
print(output)

[[4.25 4.25]
 [4.25 3.5 ]]
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GlobalMaxPooling2D
from tensorflow.keras.layers import GlobalAveragePooling2D

# define the input image as a single-channel 4x4 array
image = np.array([[2, 2, 7, 3],
                  [9, 4, 6, 1],
                  [8, 5, 2, 4],
                  [3, 1, 2, 6]], dtype=float)
image = image.reshape(1, 4, 4, 1)

# define gm_model containing just a single global-max pooling layer
gm_model = Sequential([GlobalMaxPooling2D()])

# define ga_model containing just a single global-average pooling layer
ga_model = Sequential([GlobalAveragePooling2D()])

# generate pooled outputs
gm_output = gm_model.predict(image)
ga_output = ga_model.predict(image)

# print pooled outputs
gm_output = np.squeeze(gm_output)
ga_output = np.squeeze(ga_output)
print("gm_output: ", gm_output)
print("ga_output: ", ga_output)

gm_output:  9.0
ga_output:  4.0625

This concludes our basic overview of the max pooling layer in CNN architectures.
