Image Classification:
In image classification, a supervised machine learning algorithm looks at the input image and predicts a class label of that image (binary or multi-class).
For example, a binary dog classifier algorithm will predict dog or no-dog class label for a given input image.
Image Classification with Localization:
In image classification with localization, the supervised algorithm would predict class as well as the bounding box around the object in the image. The term 'localization' refers to where the object is in the image.
Let's say we are talking about the classification of vehicles with localization. In this case, the algorithm will predict a) the class of vehicles, and b) coordinates of the bounding box around the vehicle object in the image.
Object Detection:
In the object detection problem, we have multiple objects in the image, and we expect the algorithm to detect all of them and put a bounding box around each of them.
For instance, if we are doing object detection for home monitoring applications, we need to detect and localize tables, chairs, laptops, cups, etc.
Classification and classification with localization usually have one object in the image. However, in object detection problem, the input image may have multiple objects of different types. These three problems are closely related to each other as we move from classification to classification with localization and object detection.
In computer vision, object detection has made massive progress over the last decade, especially in applications such as face recognition, video surveillance, vehicle detection, online image classification, object detection in the manufacturing industry, etc.