Troubleshooting

  • Monitoring. In addition to tracking the training and validation losses, it is useful to monitor the number of correct triplets per batch on both the training and validation sets (a counting sketch is given after this list). The percentage of correct triplets should increase as the loss decreases, indicating that the model is learning. If a high percentage of triplets are already correct, the model is no longer learning anything from them, and you should modify your triplet creation pipeline to produce harder triplets.

  • Negative sampling. When creating triplets, make sure that the negative samples are plausible and preserve the statistics of the dataset. You may want to use only a given percentage of hard triplets in your batches (a batch-level sampling sketch is given after this list). This is highly dependent on your problem, but if your triplets are not realistic, you might end up training your network to distinguish between samples that you will never encounter in a realistic scenario.

  • Setting the margin. If the loss decreases at the beginning of training but the number of correct triplets does not increase, you may have set too large a margin. A good practice is to set the margin so that, when training starts, some of the triplets are already correct by chance. A commonly used value is 0.1.

  • Distance computation. During retrieval at the evaluation stage, be sure to use the same distance function as the one used in the loss during training: if the loss uses the Euclidean distance, use the Euclidean distance during retrieval as well (a retrieval sketch is given after this list).
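
As a sketch of the monitoring idea in the first bullet (and of the margin check in the third), the helper below counts the fraction of triplets in a batch that already satisfy the margin under the Euclidean distance. It assumes you already have anchor, positive, and negative embedding tensors of shape (batch_size, embedding_dim); the function name and the choice of PyTorch are illustrative, not part of the original pipeline.

```python
import torch

def fraction_correct_triplets(anchor, positive, negative, margin=0.1):
    """Fraction of triplets with d(anchor, positive) + margin < d(anchor, negative),
    i.e. triplets whose triplet loss is already zero."""
    d_pos = torch.norm(anchor - positive, p=2, dim=1)  # Euclidean anchor-positive distances
    d_neg = torch.norm(anchor - negative, p=2, dim=1)  # Euclidean anchor-negative distances
    return (d_pos + margin < d_neg).float().mean().item()

# Logged next to the loss, e.g. with loss_fn = torch.nn.TripletMarginLoss(margin=0.1, p=2):
# loss = loss_fn(anchor, positive, negative)
# frac = fraction_correct_triplets(anchor, positive, negative, margin=0.1)
```

Tracked on both training and validation batches, this fraction should start above zero purely by chance (a quick sanity check on the margin) and grow as the loss decreases.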
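
For the negative-sampling bullet, one possible way to mix only a given percentage of hard triplets into a batch is sketched below: each anchor is paired either with its hardest (closest) in-batch negative or with a random negative. The function name, the hard_frac parameter, and the assumption that every batch contains at least two classes are all illustrative.

```python
import torch

def mixed_negative_indices(embeddings, labels, hard_frac=0.5):
    """For each sample, return the index of an in-batch negative: the hardest
    (closest) negative with probability `hard_frac`, otherwise a random one.
    Assumes every sample has at least one in-batch negative."""
    dists = torch.cdist(embeddings, embeddings, p=2)        # pairwise Euclidean distances
    neg_mask = labels.unsqueeze(0) != labels.unsqueeze(1)   # True where labels differ
    hard_idx = dists.masked_fill(~neg_mask, float("inf")).argmin(dim=1)
    rand_idx = torch.multinomial(neg_mask.float(), num_samples=1).squeeze(1)
    use_hard = torch.rand(len(labels), device=labels.device) < hard_frac
    return torch.where(use_hard, hard_idx, rand_idx)
```

Tuning hard_frac controls how aggressive the mining is while keeping a share of ordinary negatives, so the batch statistics stay closer to those of the real dataset.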
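
Finally, a minimal retrieval sketch for the last bullet, assuming the model was trained with a Euclidean-distance triplet loss (p=2): the same L2 distance is used to rank gallery embeddings against each query. The function name and the k parameter are illustrative.

```python
import torch

def retrieve_top_k(query_emb, gallery_emb, k=5):
    """Rank gallery items for each query with the same Euclidean (L2) distance
    as the training loss, and return the indices of the k nearest items."""
    dists = torch.cdist(query_emb, gallery_emb, p=2)    # (num_queries, num_gallery)
    return dists.topk(k, dim=1, largest=False).indices  # smallest distances first
```

If the loss had instead been defined on cosine distance, the ranking here would have to use cosine distance as well; mixing the two is a common source of degraded retrieval metrics.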