You may wonder why convolution layers are so useful in neural networks. Convolution layers have essentially two advantages over fully connected layers:
1. Parameter Sharing:
Let's say we have a 32x32x3 input image and convolve it with six 5x5 filters (stride 1, no padding). This results in an output of dimensions 28x28x6. Because each filter spans all three input channels, it contributes 5x5x3 = 75 weights and 1 bias value, so the convolution layer has only 6 x 76 = 456 parameters.
In contrast, a fully connected layer would require around 14 million parameters (3,072 x 4,704 weights) to compute a 28x28x6 activation matrix from a 32x32x3 input matrix.
The reason a convolutional layer has so few parameters is parameter sharing: a feature detector that is useful in one part of the image is probably also useful in another part, so the same filter weights are reused at every position. This applies both to low-level features, like edges, and to high-level features, like the eye of a cat.
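The arithmetic behind this comparison can be checked with a short sketch. The helper names below are hypothetical and chosen for illustration. The sketch assumes stride 1, no padding, one bias per filter, and filters that span all three input channels.

```python
def conv_output_shape(in_h, in_w, filter_h, filter_w, num_filters):
    """Output shape of a convolution with stride 1 and no padding."""
    return (in_h - filter_h + 1, in_w - filter_w + 1, num_filters)


def conv_param_count(filter_h, filter_w, in_channels, num_filters):
    """Each filter has filter_h * filter_w * in_channels weights plus one bias."""
    weights_per_filter = filter_h * filter_w * in_channels
    return num_filters * (weights_per_filter + 1)


def dense_param_count(num_inputs, num_outputs):
    """One weight per input/output pair plus one bias per output."""
    return num_inputs * num_outputs + num_outputs


out_shape = conv_output_shape(32, 32, 5, 5, 6)
conv_params = conv_param_count(5, 5, 3, 6)

num_inputs = 32 * 32 * 3
num_outputs = out_shape[0] * out_shape[1] * out_shape[2]
dense_params = dense_param_count(num_inputs, num_outputs)

print("conv output shape:", out_shape)
print("conv parameters:", conv_params)
print("fully connected parameters:", dense_params)
```

The convolution layer's parameter count does not depend on the image size at all, only on the filter size, the number of input channels, and the number of filters. The fully connected count grows with the product of input and output sizes.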
2. Sparsity of Connections:
In a convolution layer of a convolutional neural network (CNN), each output value depends on only a small number of input values, namely those under the filter window, rather than on every input value. This property is known as sparsity of connections. Sparse connections help reduce overfitting during training and keep the network significantly smaller, typically without hurting accuracy.
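To make the sparsity concrete, here is a minimal sketch that lists which input positions feed a single output value. The function name `receptive_field` is hypothetical. As an assumption, it reuses the 32x32x3 image and 5x5 filter setting with stride 1 and no padding.

```python
def receptive_field(out_row, out_col, filter_size=5, in_channels=3):
    """Input (row, col, channel) positions that one output value depends on.

    Assumes stride 1 and no padding, so the filter window for output
    (out_row, out_col) starts at input position (out_row, out_col).
    """
    return [
        (out_row + dr, out_col + dc, ch)
        for dr in range(filter_size)
        for dc in range(filter_size)
        for ch in range(in_channels)
    ]


total_inputs = 32 * 32 * 3           # every value in the input image
field = receptive_field(0, 0)        # inputs used by the top-left output value
connections_conv = len(field)        # connections per output in the conv layer
connections_dense = total_inputs     # a fully connected output uses every input
fraction = connections_conv / total_inputs

print("inputs per conv output:", connections_conv)
print("inputs per fully connected output:", connections_dense)
print("fraction of inputs used:", round(fraction, 4))
```

Under these assumptions, each convolution output touches only a few percent of the inputs that a fully connected output would touch.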