Highly Efficient Representation and Active Learning Framework and Its Application to Imbalanced Medical Image Classification

Published 14 Feb 2022

Heng Hao, Hankyu Moon, Sima Didari, Jae Oh Woo, Patrick Bangert

We propose a highly data-efficient active learning framework for image classification. The framework applies two components in sequence: (1) unsupervised representation learning with a Convolutional Neural Network and (2) the Gaussian Process (GP) method, to achieve highly data- and label-efficient classification. Both components are also less sensitive to the prevalent and challenging problem of class imbalance, thanks to (1) features learned without labels and (2) the Bayesian nature of the GP. The uncertainty estimates provided by the GP enable active learning: samples are ranked by uncertainty, and those with higher uncertainty are selected for labeling. We apply this combination to two severely imbalanced cases, COVID-19 chest X-ray classification and Nerthus colonoscopy classification. We demonstrate that labeling at most 10% of the data is enough to reach the accuracy obtained by training on all available labels. We also applied our model architecture and proposed framework to a broader class of datasets with expected success.
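The uncertainty-ranked labeling loop described above can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and not the authors' implementation: synthetic imbalanced features stand in for the CNN representations, scikit-learn's GaussianProcessClassifier stands in for the GP model, predictive entropy stands in for the uncertainty score, and the helper names (predictive_entropy, active_learning_loop) and the round and batch sizes are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF


def predictive_entropy(proba):
    """Entropy of each row of class probabilities; higher means less certain."""
    p = np.clip(proba, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)


def active_learning_loop(features, oracle_labels, initial_idx, rounds=5, batch=10):
    """Repeatedly fit a GP classifier and label the most uncertain pool samples.

    `oracle_labels` plays the role of the human annotator: a label is only
    read for indices that have been moved into the labeled set.
    """
    labeled = [int(i) for i in initial_idx]
    labeled_set = set(labeled)
    pool = [i for i in range(len(features)) if i not in labeled_set]

    for _ in range(rounds):
        gp = GaussianProcessClassifier(kernel=1.0 * RBF(1.0), random_state=0)
        gp.fit(features[labeled], oracle_labels[labeled])

        # Rank the unlabeled pool by uncertainty and take the top `batch`.
        scores = predictive_entropy(gp.predict_proba(features[pool]))
        picked = {pool[int(i)] for i in np.argsort(scores)[-batch:]}

        labeled.extend(sorted(picked))
        pool = [i for i in pool if i not in picked]

    # Final fit on everything labeled so far.
    gp = GaussianProcessClassifier(kernel=1.0 * RBF(1.0), random_state=0)
    gp.fit(features[labeled], oracle_labels[labeled])
    return gp, labeled


# Synthetic stand-in for learned image features: roughly a 90/10 class split.
X, y = make_classification(
    n_samples=300, n_features=8, weights=[0.9, 0.1], random_state=0
)

# Seed the labeled set with a few samples from each class so the GP sees both.
seed_idx = np.concatenate([np.where(y == c)[0][:5] for c in (0, 1)])

model, labeled_idx = active_learning_loop(X, y, seed_idx, rounds=5, batch=10)
print(f"labeled {len(labeled_idx)} of {len(X)} samples")
print(f"accuracy on all samples: {model.score(X, y):.3f}")
```

In this sketch only the seed samples and the samples selected in each round are ever labeled, which is the sense in which the approach is label-efficient; how closely the resulting accuracy approaches the fully labeled case depends on the data and the features.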

This video is from the NeurIPS 2021 Data-centric AI workshop proceedings.