What is a CNN?

A Convolutional Neural Network (CNN), also called a ConvNet, is a class of deep feedforward artificial neural network (ANN) used mostly for analyzing visual imagery. It is trained on a labeled dataset and then predicts labels for new, unseen inputs. Because it shares weights and looks at small local regions of the input, a CNN copes well with the curse of dimensionality that high-dimensional data such as images brings. CNNs are widely used for image recognition, image classification, image captioning and object detection. They gained immense popularity in 2012, when AlexNet, a network designed by Alex Krizhevsky and his colleagues, won the ImageNet challenge. In just three years the field advanced from the 8-layer AlexNet to the 152-layer ResNet. CNNs also come in handy for recommendation systems, tasks where context matters, and natural language processing (NLP). The key job of the network is to pass the input through all of its layers and so detect the underlying features automatically. In short, a CNN uses convolution to separate out the different features of a picture for analysis and prediction.

CNN Basics

The main strength of a CNN is that it makes prediction and identification efficient, using far fewer weights than a fully dense network of comparable size. CNNs are among the most popular topics in the vast field of deep learning. They are usually trained on very large datasets, and in general more data gives better accuracy. When data is scarce, techniques such as data augmentation can expand the dataset, and transfer learning can reuse a model already trained on a larger one.

The power of a CNN is that it detects distinctive features in images by itself, without hand-crafted feature engineering. A popular example is the Cats and Dogs dataset, where the network picks up the features automatically and classifies each picture as a dog or a cat.

Curse of Dimensionality

Images are the most common input to these models. They are presented as arrays of pixel values, so every pixel in every channel is one dimension of the input. As resolution and complexity increase, the number of features per image grows, and so does the number of weights to be trained. In a fully connected network the weight count is the number of inputs multiplied by the number of units, so it quickly becomes too large to train well. CNNs keep this in check by sharing a small set of filter weights across the whole image, and the weights themselves are trained with gradient descent.
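To get a feel for how fast the weight count grows with image size, here is a minimal sketch. It assumes a single fully connected layer of 100 units looking at every pixel of an RGB image; the function name and the sizes are made up for illustration.

```python
def dense_weights(height, width, channels, n_units):
    """Weights needed when every input value connects to every unit."""
    return height * width * channels * n_units

# Hypothetical square RGB images of growing resolution, 100 units each time.
counts = {side: dense_weights(side, side, 3, 100) for side in (32, 128, 512)}

for side, count in counts.items():
    print(f"{side}x{side} image -> {count:,} weights")
# 32x32 image -> 307,200 weights
# 128x128 image -> 4,915,200 weights
# 512x512 image -> 78,643,200 weights
```

Making the image 16 times wider multiplies the weights by 256, which is why a plain dense network becomes impractical on large images.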


Factors affecting efficiency

  • Filters

The efficiency of a CNN model can be tuned through several settings, and filters are one of the key ones. Each convolutional layer slides its filters over its input to produce the values passed on to the next layer. The values of these filters are not fixed; they are weights that start out random and are learned during training. They are mainly trained on labeled images and improve with training. The filters are what detect the key distinctive features in the images. Applying each filter to the input gives one feature map, and the stack of feature maps from all the filters forms the output of the first convolutional layer.
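To make the idea of a filter producing a feature map concrete, here is a small sketch in plain Python. The filter values are hand-picked for illustration (an assumption; in a real CNN they would be learned), and the operation is the sliding multiply-and-sum that deep learning libraries call convolution, with stride 1 and no padding.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(k_h):
                for dj in range(k_w):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map

# A tiny 4x4 "image": dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked filter that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, vertical_edge)
for row in feature_map:
    print(row)
# [0, 2, 0]
# [0, 2, 0]
# [0, 2, 0]
```

The feature map is non-zero only in the column where the edge sits, which is what "detecting a feature" means in practice.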

Besides the number of filters, three other important CNN hyperparameters are kernel size, stride and padding. All of these affect the efficiency of the CNN model.

  • Kernel Size

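Kernel size is the first thing that affects the efficiency and performance of a CNN model. It is the width and height of each filter, for example 3 x 3, and it determines how many weights the layer has to train. For a fully connected layer the number of weights is

 Weights = n_inputs * n_outputs

For a convolutional layer the number of weights is k_w * k_h * c_in * c_out, where k_w and k_h are the kernel width and height, c_in is the number of input channels and c_out is the number of output channels (filters). The bias adds another c_out parameters, one per filter.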
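The k_w * k_h * c_in * c_out count can be checked with a few lines of Python. This is only a sketch: the function names are made up, and it assumes the common convention of one bias per output channel (or per output unit in a fully connected layer).

```python
def conv_layer_params(k_w, k_h, c_in, c_out, use_bias=True):
    """Trainable parameters in one convolutional layer."""
    weights = k_w * k_h * c_in * c_out
    biases = c_out if use_bias else 0  # assumed: one bias per filter
    return weights + biases

def dense_layer_params(n_inputs, n_outputs, use_bias=True):
    """Trainable parameters in one fully connected layer."""
    weights = n_inputs * n_outputs
    biases = n_outputs if use_bias else 0  # assumed: one bias per unit
    return weights + biases

# 64 filters of size 3x3 applied to a 3-channel (RGB) input.
print(conv_layer_params(3, 3, 3, 64))   # 1792
# A fully connected layer mapping 128 inputs to 10 outputs.
print(dense_layer_params(128, 10))      # 1290
```

Note that the convolutional count does not depend on the image size at all, only on the kernel size and the channel counts.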

  • Stride

The second important setting for building an efficient CNN is stride. Stride is the number of pixels by which the filter shifts over the input matrix at each step. The larger the value, the further the filter moves each time and the faster it covers the input. Stride can take different values, but the most common is stride = 1. A stride greater than 1 decreases the overall size of the feature map and reduces the amount of information passed to the next layer.
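The effect of stride is easy to see by listing where the filter lands along one side of the input. The sketch below assumes no padding; the function name and the sizes are chosen only for illustration.

```python
def window_starts(input_size, kernel_size, stride):
    """Positions where the filter's left edge lands along one dimension."""
    return list(range(0, input_size - kernel_size + 1, stride))

# A 3-wide filter moving across a 7-wide input.
for stride in (1, 2, 3):
    starts = window_starts(7, 3, stride)
    print(f"stride={stride}: starts={starts}, output width={len(starts)}")
# stride=1: starts=[0, 1, 2, 3, 4], output width=5
# stride=2: starts=[0, 2, 4], output width=3
# stride=3: starts=[0, 3], output width=2
```

Each start position gives one value in the feature map, so a larger stride means fewer positions and a smaller output.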

  • Padding

Without padding, a convolution gives an output with fewer pixels than its input, and the pixels at the border are covered by fewer filter positions than those in the middle, which can lower accuracy. To prevent this, a number of pixels are added around the input image before it is filtered, so that the input and output have a similar number of pixels. Padding has a value, most often zero, and that value is added to the input image on each side. Because the kernel scans the whole image, padding adds a small extra frame around the image to give the kernel extra room at the edges.
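Here is a minimal sketch of zero padding together with the usual output-size rule, (n + 2p - k) / s + 1 rounded down, where n is the input size, k the kernel size, s the stride and p the padding. The helper names are made up, and zero is assumed as the padding value.

```python
def zero_pad(image, pad):
    """Surround a 2D list with a frame of zeros, `pad` pixels thick."""
    width = len(image[0]) + 2 * pad
    top = [[0] * width for _ in range(pad)]
    middle = [[0] * pad + list(row) + [0] * pad for row in image]
    bottom = [[0] * width for _ in range(pad)]
    return top + middle + bottom

def output_size(n, k, stride=1, pad=0):
    """Output length along one dimension of a convolution."""
    return (n + 2 * pad - k) // stride + 1

padded = zero_pad([[1, 2], [3, 4]], 1)
for row in padded:
    print(row)
# [0, 0, 0, 0]
# [0, 1, 2, 0]
# [0, 3, 4, 0]
# [0, 0, 0, 0]

print(output_size(5, 3))          # 3: the output shrinks without padding
print(output_size(5, 3, pad=1))   # 5: one pixel of padding keeps the size
```

With a 3 x 3 kernel and stride 1, a padding of 1 is what keeps the output the same size as the input.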

CNN Architecture (5 Layers)

The CNN architecture consists of five kinds of layers: convolutional layer, pooling layer, fully connected input layer, fully connected layer and fully connected output layer.

  • Convolutional layer: The convolutional layer is the backbone of any working CNN model. This is where the filters scan the image region by region and create the feature maps that later classification relies on.
  • Pooling layer: Pooling, also known as down-sampling, brings down the overall dimensions of the feature maps. The information from each convolutional layer is reduced to only the most necessary data. Convolution followed by pooling is usually repeated several times.
  • Fully connected input layer: This is also known as flattening. The outputs of the last pooling layer are flattened into a single vector so that they can be used as the input to the next layer.
  • Fully connected layer: Once the features have been extracted, this layer applies its learned weights to the inputs and computes a score for each possible label.
  • Fully connected output layer: This is the final layer of the CNN model. It holds the probability of each label and assigns a class to the image.
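The layers after convolution can be sketched end to end in plain Python. Everything here is illustrative: the feature map, the weights and the biases are made-up numbers (a trained network would have learned them), max pooling is assumed as the pooling type, and softmax is assumed as the way the output layer turns scores into class probabilities.

```python
import math

def max_pool2d(feature_map, size=2):
    """Pooling layer: keep the largest value in each size x size window."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled

def flatten(feature_map):
    """Fully connected input layer: 2D map to a single vector."""
    return [value for row in feature_map for value in row]

def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum per output unit."""
    return [sum(x * w for x, w in zip(inputs, w_row)) + b
            for w_row, b in zip(weights, biases)]

def softmax(scores):
    """Output layer: turn scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up 4x4 feature map, as if produced by a convolutional layer.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 2],
    [2, 0, 3, 4],
]

pooled = max_pool2d(feature_map)   # [[4, 2], [2, 5]]
vector = flatten(pooled)           # [4, 2, 2, 5]

# Hypothetical weights for two classes (index 0 and index 1).
weights = [[0.1, 0.0, 0.0, 0.2],
           [0.0, 0.3, 0.1, 0.0]]
biases = [0.0, 0.0]

probabilities = softmax(dense(vector, weights, biases))
predicted_class = probabilities.index(max(probabilities))
print(pooled, vector, predicted_class)
```

In a full model the convolution and pooling steps would be repeated several times before flattening, as described in the list above.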

Uses of CNN

There are multiple benefits to using a CNN as a state-of-the-art neural network. It can be used in various fields and performs major tasks such as facial recognition, analyzing documents, understanding climate, image recognition and object identification. Deep learning has helped science advance enormously, and the CNN is one of its most popular models because it offers high performance and efficiency.