The density map estimator is then trained as a standard pixel-wise regression problem with an L2 loss. In contrast to the pixel-wise L2 loss, the Bayesian loss (BL) aggregates the predicted density map into a dot prediction and applies a point-wise loss function between the ground-truth dot annotations and this aggregated dot prediction.
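
To make the contrast concrete, the following is a minimal NumPy sketch of the two losses, not a reference implementation. The function names `l2_pixel_loss` and `bayesian_loss`, the bandwidth `sigma`, the Gaussian-likelihood posterior used for the aggregation, and the L1 distance to a target count of one per annotation are all assumptions made for illustration, following the commonly described formulation of BL.

```python
import numpy as np


def l2_pixel_loss(pred, target):
    """Pixel-wise L2 loss: sum of squared differences between two density maps."""
    return float(np.sum((pred - target) ** 2))


def bayesian_loss(pred, points, sigma=8.0):
    """Point-wise loss on an aggregated dot prediction (illustrative sketch).

    pred   : (H, W) predicted density map
    points : (N, 2) ground-truth dot annotations as (row, col)
    sigma  : assumed Gaussian bandwidth (hypothetical default)
    Returns the loss and the per-annotation expected counts.
    """
    h, w = pred.shape
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    pixels = np.stack([rows.ravel(), cols.ravel()], axis=1).astype(float)  # (M, 2)
    pts = np.asarray(points, dtype=float)                                  # (N, 2)

    # Squared distance between every annotation and every pixel: (N, M).
    d2 = ((pts[:, None, :] - pixels[None, :, :]) ** 2).sum(axis=-1)

    # Posterior of each annotation given a pixel (softmax over annotations),
    # computed with a max shift for numerical stability.
    logits = -d2 / (2.0 * sigma ** 2)
    logits -= logits.max(axis=0, keepdims=True)
    post = np.exp(logits)
    post /= post.sum(axis=0, keepdims=True)

    # Aggregated dot prediction: expected count assigned to each annotation.
    expected = post @ pred.ravel()                                         # (N,)

    # Point-wise loss: each annotation should account for exactly one object.
    return float(np.abs(1.0 - expected).sum()), expected
```

In this sketch the posteriors sum to one over the annotations at every pixel, so the expected counts add up to the total mass of the predicted density map. The supervision is therefore applied per annotation rather than per pixel, and no ground-truth density map has to be constructed.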