Contrasting the Profiles of Easy and Hard Observations in a Dataset

Published 14 Feb 2022

Camila Castro Moreno, Pedro Yuri Arbs Paiva, Gustavo H. Nunes, Ana Carolina Lorena

To support data-centric analyses, it is important to identify and characterize which observations in a dataset are hard or easy to classify. This paper employs meta-learning strategies to describe the main differences between observations that are easy and hard to classify in a dataset. Intervals of values for the significant meta-features that assess the hardness levels of the observations are extracted and contrasted. This meta-knowledge allows for characterizing the hardness profile of a dataset and for obtaining insights into the main sources of difficulty it poses, as shown in experiments using two super-classes of the CIFAR-100 dataset with different hardness levels.
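As a rough illustration of what contrasting intervals of a meta-feature between easy and hard observations could look like, here is a minimal sketch. It is not the authors' procedure: the hardness scores, the meta-feature values, the 0.5 threshold, the use of the interquartile range as the interval, and the function names `interval` and `contrast_profiles` are all assumptions made for this example.

```python
from statistics import quantiles


def interval(values):
    """Return the interquartile interval (Q1, Q3) of a list of values.

    Using the interquartile range as the "interval" is an assumption of
    this sketch, not something stated in the abstract.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return (q1, q3)


def contrast_profiles(hardness, meta_feature, threshold=0.5):
    """Split observations into easy and hard groups by a hardness threshold
    (hypothetical) and return the meta-feature interval of each group."""
    easy = [m for h, m in zip(hardness, meta_feature) if h < threshold]
    hard = [m for h, m in zip(hardness, meta_feature) if h >= threshold]
    return {"easy": interval(easy), "hard": interval(hard)}


# Toy, made-up data: one hardness score and one meta-feature value
# per observation.
hardness = [0.1, 0.2, 0.15, 0.3, 0.8, 0.9, 0.7, 0.95]
meta_feature = [0.0, 0.2, 0.1, 0.2, 0.6, 0.8, 0.6, 1.0]

profiles = contrast_profiles(hardness, meta_feature)
print(profiles)
```

If the two intervals barely overlap, as in this toy data, the meta-feature separates easy from hard observations and hints at one source of difficulty; heavily overlapping intervals would suggest the meta-feature is not informative for that dataset.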

This video is from the NeurIPS 2021 Data-centric AI workshop proceedings.
