A statistical theory of cold posteriors, semi-supervised learning and out-of-distribution detection
- Date: Jun 21, 2021
- Time: 03:00 PM - 04:00 PM (Local Time Germany)
- Speaker: Laurence Aitchison
- Senior Lecturer in Machine Learning and Computational Neuroscience, University of Bristol, UK
- Location: Zoom
Image classification datasets such as CIFAR-10 and ImageNet are carefully curated to exclude ambiguous or difficult-to-classify images. Remarkably, this curation process can be used to understand three very different areas in deep learning: semi-supervised learning, out-of-distribution detection and the cold posterior effect. We develop a generative model of dataset curation in which multiple annotators label every image, and an image is included in the dataset only if all annotators agree; if any annotator disagrees, the image is excluded. This directly explains the "cold posterior effect", where artificially reducing uncertainty in the Bayesian posterior over neural network weights gives better test performance (ICLR 2021; arxiv.org/abs/2008.05912). In addition, if we marginalise over the class label, we obtain a semi-supervised learning objective mirroring entropy minimization and pseudo-labelling, which allows us to use unlabelled points to improve the performance of a classifier (very early version: arxiv.org/abs/2008.05913). Finally, the curation process itself gives us insight into out-of-distribution detection, where we explicitly detect test points that are far from the training data, as our predictions might be inaccurate in those regions (arxiv.org/abs/2102.12959).
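As a rough illustration of the consensus-curation idea described in the abstract (a sketch based on the abstract's description, not necessarily the speaker's exact formulation), suppose $S$ annotators independently label an image $x$ by sampling from a classifier's predictive distribution $p(y \mid x)$. The chance that all of them agree on a given label $y$, and the chance that the image survives curation at all, then take the following assumed forms:

```latex
% Minimal sketch, assuming S independent annotators drawing labels from p(y | x).
% Probability that all S annotators agree on a particular label y:
\[
  P(\text{all } S \text{ annotators pick } y \mid x) \;=\; p(y \mid x)^{S},
  \qquad
  \log p(y \mid x)^{S} \;=\; S \,\log p(y \mid x),
\]
% which, used as a likelihood, behaves like a tempered ("cold") likelihood
% with temperature T = 1/S, consistent with the cold posterior effect.
%
% Marginalising over the unobserved consensus label gives the probability
% that an unlabelled image would be included in the curated dataset:
\[
  P(x \text{ is included}) \;=\; \sum_{y} p(y \mid x)^{S},
\]
% an objective that is largest for confident (low-entropy) predictions,
% mirroring entropy minimization and pseudo-labelling.
```

Under this sketch, labelled curated points contribute a cold-posterior-style likelihood, while unlabelled curated points contribute the marginal term, which is what links the curation story to semi-supervised learning and, for points unlikely to survive curation, to out-of-distribution detection.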