Object Localization

Object localization is a core problem in computer vision that extends beyond classification by not only detecting the presence of an object, but also predicting its spatial location with a bounding box. In this project, you will start with a simplified setting: detecting whether an image contains a single object of a predefined category, and if so, predicting its bounding box coordinates. Then, you will deal with multiple objects per image, requiring the model to identify and localize each instance separately. Finally, you will generalize further to handle multiple object categories, completing the transition to object detection. At each stage, you will assess how the increased complexity affects model design, training, and evaluation. This progression will give you hands-on experience with the challenges of building object detection systems.
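The project text does not prescribe an evaluation metric. One common choice for judging a predicted bounding box is intersection over union (IoU) with the ground-truth box. The following is a minimal sketch under stated assumptions: boxes are given as `(x_min, y_min, x_max, y_max)` tuples in pixel coordinates, and the function name `iou` is chosen here for illustration only.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Assumed box format (not specified by the project text):
    (x_min, y_min, x_max, y_max).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle; zero if the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Guard against degenerate boxes with zero total area.
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = (0, 0, 2, 2)
    ground_truth = (1, 1, 3, 3)
    print(iou(predicted, ground_truth))
```

In the single-object setting, a prediction could be counted as correct when the presence decision matches the label and the IoU exceeds a chosen threshold. With multiple objects per image, the same function can be reused to match each predicted box to a ground-truth box.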

| Difficulty | Suggested Tutorials |
| --- | --- |
| Medium | Neural Networks, Convolutional Networks, Transfer Learning |


Grading

The project will be graded based on the following criteria. Points for each activity are awarded based on quality and completeness (partial credit possible).

| Activity | Points (max) |
| --- | --- |
| Build and evaluate a model for single-class, single-object localization | 10 |
| Extend the model to handle multiple objects per image | 5 |
| Extend the model to handle multiple categories | 2 |
| Presentation (clarity & demo) | 3 |
| Total | 0-20 |