Choice 1: Density Map

In this project, you will build a convolutional neural network that estimates the number of people in a crowded scene by predicting their spatial density across the image. Your implementation will follow the approach described in the paper “CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes” by Li et al. (2018). CSRNet uses a pretrained VGG16 backbone as its frontend and appends dilated convolutional layers as its backend, which enlarges the receptive field without further reducing spatial resolution. The model is trained to generate a density map whose integral (in practice, the sum of its pixel values) gives the estimated number of people in the input image.
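To make the link between a density map and a count concrete, the following is a minimal sketch of how a ground-truth density map could be built from head annotations using NumPy. The function names `gaussian_kernel` and `make_density_map`, the fixed `sigma`, and the example coordinates are illustrative assumptions, not part of the project specification; the kernel width in particular is a design choice for your preprocessing pipeline.

```python
import numpy as np


def gaussian_kernel(size: int, sigma: float) -> np.ndarray:
    """Return a normalized (size x size) Gaussian kernel; size must be odd."""
    ax = np.arange(size) - size // 2
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()


def make_density_map(shape, points, sigma=4.0):
    """Build a (height, width) density map from (row, col) head positions.

    Hypothetical helper: a fixed sigma is assumed here for brevity.
    """
    height, width = shape
    density = np.zeros((height, width), dtype=np.float64)
    radius = int(3 * sigma)
    kernel = gaussian_kernel(2 * radius + 1, sigma)
    for row, col in points:
        row, col = int(round(row)), int(round(col))
        if not (0 <= row < height and 0 <= col < width):
            continue  # skip annotations that fall outside the image
        # Clip the kernel window to the image borders.
        r0, r1 = max(row - radius, 0), min(row + radius + 1, height)
        c0, c1 = max(col - radius, 0), min(col + radius + 1, width)
        patch = kernel[r0 - (row - radius): r1 - (row - radius),
                       c0 - (col - radius): c1 - (col - radius)]
        # Renormalize so each person contributes exactly 1, even near a border.
        density[r0:r1, c0:c1] += patch / patch.sum()
    return density


# Three made-up head positions in a 64 x 96 image.
heads = [(10, 12), (40, 70), (2, 95)]
density = make_density_map((64, 96), heads)
estimated_count = density.sum()  # sums to the number of annotated heads
```

Because each annotated head adds a blob that sums to one, summing the whole map recovers the number of people. A network trained to regress such maps can therefore be read out as a counter by summing its output.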

| Difficulty | Suggested Tutorials |
| --- | --- |
| Easy | Convolutional Networks, Transfer Learning |

Requirements: A GPU with at least 8GB of memory is recommended. Training on a CPU is possible but will be significantly slower.

Density-Based Counting


Grading

The project will be graded based on the following criteria. Points for each activity are awarded based on quality and completeness (partial credit possible).

| Activity | Points (max) |
| --- | --- |
| Preprocessing pipeline | 4 |
| CSRNet architecture | 4 |
| Loss function | 2 |
| Training (with frozen frontend) | 2 |
| Fine-tuning the entire network | 2 |
| Performance evaluation | 3 |
| Presentation (clarity & demo) | 3 |
| Total | 0-20 |