Crowd Counting


Crowd counting is a computer vision task that aims to estimate how many people appear in an image and how they are spatially distributed. Unlike standard object detection, it must handle highly congested scenes in which individuals overlap or are too small to be recognized individually, so traditional detection methods often fail.

The goal of this project is to implement a deep learning–based crowd counting system using PyTorch. You will choose one of the following models to reproduce.

Choice 1: CSRNet. A network that estimates a density map summing to the crowd count.
Choice 2: P2PNet. A network that directly predicts head locations.

These models illustrate two popular approaches to crowd counting. CSRNet formulates the task as continuous density estimation: the network outputs a density map whose pixel values sum to the total count, so each pixel holds a fractional share of a person. P2PNet instead treats it as a localization problem, predicting individual head positions directly, so the count is the number of predicted points. The individual project descriptions outline the objectives, implementation steps, and expected outcomes for each approach.
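To make the difference between the two formulations concrete, here is a minimal NumPy sketch of how a count is read off in each case. It is an illustration under stated assumptions, not either model's actual pipeline: the helper names (`density_map_from_points`, `count_from_density`, `count_from_points`), the fixed Gaussian width `sigma=4.0`, the 0.5 confidence threshold, and the toy head coordinates are all hypothetical choices made for this example.

```python
import numpy as np

def density_map_from_points(points, height, width, sigma=4.0):
    """Build a density map by placing one unit-mass Gaussian per annotated head.

    `points` holds (x, y) pixel coordinates. The fixed `sigma` is an assumption
    of this sketch; real pipelines may pick the kernel width differently.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    density = np.zeros((height, width), dtype=np.float64)
    for x, y in points:
        g = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
        density += g / g.sum()  # normalise so each person contributes exactly 1
    return density

def count_from_density(density):
    """Density-estimation view: the count is the sum over all pixels."""
    return float(density.sum())

def count_from_points(scores, threshold=0.5):
    """Localization view: the count is the number of confident point predictions."""
    return int((np.asarray(scores) > threshold).sum())

# Toy annotations: three heads, two of them close together.
heads = [(12.0, 20.0), (40.5, 33.0), (41.0, 35.5)]
density = density_map_from_points(heads, height=64, width=64)
density_count = count_from_density(density)

# Toy confidence scores for four candidate points (hypothetical values).
point_count = count_from_points([0.9, 0.8, 0.2, 0.7])

print(density_count, point_count)
```

In the density view, the two nearby heads merge into one blob, yet the sum still accounts for both. In the point view, each person must be recovered as a separate prediction, which yields locations as well as a count.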

CSRNet vs P2PNet