Dataset
The dataset was manually curated using TinEye Multicolr Search, a tool for searching Creative Commons images on Flickr by up to five chosen colors. It contains 16,632 images, each retrieved with a query that combines 1–3 colors from a fixed palette of 11 colors. Each image is labeled with the colors used in its query. The color palette is shown below.
Red | Orange | Yellow | Green |
Cyan | Blue | Violet | Pink |
White | Gray | Black |
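Because each query uses between one and three palette colors, the number of possible labels can be counted directly. The short sketch below enumerates the combinations with Python's itertools. It assumes that the order of colors in a query does not matter and that no color is repeated, which is an interpretation of the description rather than something it states explicitly.

```python
from itertools import combinations

# The 11 palette colors listed above.
PALETTE = ["Red", "Orange", "Yellow", "Green", "Cyan", "Blue",
           "Violet", "Pink", "White", "Gray", "Black"]

def color_queries(palette, max_colors=3):
    """Return every unordered combination of 1..max_colors distinct colors."""
    queries = []
    for k in range(1, max_colors + 1):
        queries.extend(combinations(palette, k))
    return queries

queries = color_queries(PALETTE)
print(len(queries))  # 11 + 55 + 165 = 231
```

Under these assumptions the count is 231, which matches the number of classes reported when the dataset is loaded.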
Download the images
The dataset is split into three parts due to the size of the images. Download the following files and unzip them into a folder of your choice.
You can use any unzip tool that supports multi-part archives, such as 7-Zip.
Load the dataset
Assuming you have unzipped the files into a folder called .data, you can load the dataset using the ImageFolder class from the TorchVision library. The dataset is organized into subfolders, each containing the images for one color combination. The subfolder names are the color names, and the images are in JPEG format.
from torchvision.datasets import ImageFolder
# ImageFolder treats every subfolder of the root directory as one class.
dataset = ImageFolder('.data/images/')
print("Number of images:", len(dataset))
print("Number of classes:", len(dataset.classes))
Number of images: 16632
Number of classes: 231
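ImageFolder assigns a single integer class to each subfolder, so a model that should predict the individual colors needs the class names converted back into per-color labels. The following is a minimal sketch of that conversion. The separator used in the subfolder names ("-" here), the lowercase color spelling, and the helper name class_to_multihot are all assumptions made for illustration; inspect dataset.classes and adjust them to the actual folder names.

```python
# Palette colors in a fixed order; the lowercase spelling is an assumption.
PALETTE = ["red", "orange", "yellow", "green", "cyan", "blue",
           "violet", "pink", "white", "gray", "black"]

def class_to_multihot(class_name, separator="-"):
    """Convert a folder name such as 'red-blue' into an 11-element 0/1 list.

    The separator is hypothetical; check dataset.classes for the real one.
    """
    colors = [c.strip().lower() for c in class_name.split(separator)]
    unknown = [c for c in colors if c not in PALETTE]
    if unknown:
        raise ValueError(f"Unrecognized color(s) in {class_name!r}: {unknown}")
    # One entry per palette color: 1 if it appears in the class name, else 0.
    return [1 if color in colors else 0 for color in PALETTE]

print(class_to_multihot("red-blue"))
# [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
```

For a loaded sample (image, label), the multi-hot target would then be class_to_multihot(dataset.classes[label]).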
Let’s visualize some of the images.
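One possible way to produce such a preview is sketched below. It assumes matplotlib is installed and that the dataset was created without a transform, so each item is a PIL image paired with an integer class index. The helper names pick_indices and show_samples are illustrative and not part of TorchVision.

```python
import random

def pick_indices(n_items, k=8, seed=0):
    """Pick up to k distinct random indices in range(n_items), reproducibly."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_items), min(k, n_items)))

def show_samples(dataset, k=8, cols=4, seed=0):
    """Plot k random images from an ImageFolder-style dataset in a grid."""
    # Imported here so that pick_indices can be used without matplotlib.
    import matplotlib.pyplot as plt

    indices = pick_indices(len(dataset), k, seed)
    rows = -(-len(indices) // cols)  # ceiling division
    fig, axes = plt.subplots(rows, cols, figsize=(3 * cols, 3 * rows),
                             squeeze=False)
    for ax in axes.flat:
        ax.axis("off")  # hide axes, including any unused grid cells
    for ax, idx in zip(axes.flat, indices):
        image, label = dataset[idx]  # PIL image and integer class index
        ax.imshow(image)
        ax.set_title(dataset.classes[label], fontsize=8)
    plt.tight_layout()
    plt.show()
```

Calling show_samples(dataset) displays eight randomly chosen images, each titled with its class name; changing seed selects a different sample.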



