Data-centric AI



      Contrasting the Profiles of Easy and Hard Observations in a Dataset

      Published on
      Camila Castro Moreno, Pedro Yuri Arbs Paiva, Gustavo H. Nunes, Ana Carolina Lorena

      To support data-centric analyses, it is important to identify and characterize which observations in a dataset are hard or easy to classify. This paper employs meta-learning strategies to describe the main differences between the observations that are easy and those that are hard to classify in a dataset. Intervals of significant meta-feature values that assess the hardness levels of the observations are extracted and contrasted. This meta-knowledge allows for characterizing the hardness profile of a dataset and obtaining insights into the main sources of difficulty it poses, as shown in experiments using two super-classes of the CIFAR-100 dataset with different hardness levels.
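
      The idea of contrasting meta-feature intervals for easy and hard observations can be illustrated with a minimal sketch. It is not the authors' procedure: the hardness scores, the single meta-feature, the 0.5 threshold, the use of interquartile ranges as "intervals", and all function names are assumptions made only for illustration.

```python
from statistics import quantiles


def interquartile_interval(values):
    """Return (Q1, Q3) for a list of numbers (needs at least two values)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q3


def contrast_profiles(hardness, meta_feature, threshold=0.5):
    """Split observations into easy/hard by a hypothetical hardness
    threshold and summarise one meta-feature by its interquartile
    interval in each group."""
    easy = [m for h, m in zip(hardness, meta_feature) if h < threshold]
    hard = [m for h, m in zip(hardness, meta_feature) if h >= threshold]
    return {
        "easy": interquartile_interval(easy),
        "hard": interquartile_interval(hard),
    }


# Toy, made-up data: one hardness score and one meta-feature value
# per observation (not taken from the paper or from CIFAR-100).
hardness = [0.1, 0.2, 0.0, 0.3, 0.8, 0.9, 0.7, 0.6]
meta_feature = [0.10, 0.20, 0.15, 0.25, 0.70, 0.90, 0.80, 0.60]

profiles = contrast_profiles(hardness, meta_feature)
for group, (low, high) in profiles.items():
    print(f"{group}: meta-feature interval [{low:.3f}, {high:.3f}]")
```

      In this toy setting, intervals that do not overlap suggest that the meta-feature separates easy from hard observations; with real data one would presumably also need a significance test before reading anything into the contrast.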

      This video is from the NeurIPS 2021 Data-centric AI workshop proceedings.




      © 2022 Data-centric AI