Object Localization

Object detection is a core task in computer vision focused on both localizing and classifying objects within an image. In this project, you will start with a simplified setting: detecting whether an image contains a single object of a predefined category, and if so, predicting its bounding box. Next, you will extend this to images with multiple objects of the same category, requiring the model to detect and localize each instance. Finally, you will handle objects from multiple classes, completing the object detection pipeline. At each stage, you will assess how the increased complexity affects model design, training, and evaluation. This progression will give you hands-on experience with the challenges of building an object detection system.
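When you evaluate predicted bounding boxes, one commonly used measure is intersection over union (IoU), the overlap area of the predicted and ground-truth boxes divided by the area of their union. The project does not prescribe a metric, so treat the following as a minimal sketch rather than the required approach. The `(x_min, y_min, x_max, y_max)` box convention, the helper names `box_area` and `iou`, and the example coordinates are all assumptions made for illustration.

```python
from typing import Tuple

# Assumed box convention: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def box_area(box: Box) -> float:
    """Area of an axis-aligned box; empty or inverted boxes count as 0."""
    x_min, y_min, x_max, y_max = box
    return max(0.0, x_max - x_min) * max(0.0, y_max - y_min)


def iou(box_a: Box, box_b: Box) -> float:
    """Intersection over union of two axis-aligned boxes, in [0, 1]."""
    # Corners of the overlapping rectangle (inverted if the boxes are disjoint).
    intersection = (
        max(box_a[0], box_b[0]),
        max(box_a[1], box_b[1]),
        min(box_a[2], box_b[2]),
        min(box_a[3], box_b[3]),
    )
    inter_area = box_area(intersection)
    union_area = box_area(box_a) + box_area(box_b) - inter_area
    # Guard against division by zero when both boxes are degenerate.
    return inter_area / union_area if union_area > 0 else 0.0


predicted = (10.0, 10.0, 50.0, 50.0)     # hypothetical model output
ground_truth = (30.0, 30.0, 70.0, 70.0)  # hypothetical annotation
score = iou(predicted, ground_truth)     # 400 / 2800, about 0.143
```

A prediction is often counted as correct when its IoU with the ground truth exceeds a chosen threshold such as 0.5. The threshold is a design decision you should state and justify in your evaluation.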

Difficulty	Suggested Tutorials
Hard	Neural Networks, Convolutional Networks, Transfer Learning

Requirements: A GPU with at least 8 GB of memory is recommended. Training on a CPU is possible but significantly slower.

Grading

The project will be graded based on the following criteria. Points for each activity are awarded based on quality and completeness (partial credit possible).

Activity	Points (max)
Presentation (clarity & demo)	3
~~~~~ Part 1: single object ~~~~~
Data pipeline	2
Model architecture	2
Loss function + Training	2
Performance evaluation	2
~~~~~ Part 2: multiple objects ~~~~~
Data pipeline	2
Model architecture	2
Loss function + Training	3
Performance evaluation	2
Total	0-20