Camila Castro Moreno, Pedro Yuri Arbs Paiva, Gustavo H. Nunes, Ana Carolina Lorena
To support data-centric analyses, it is important to identify and characterize which observations in a dataset are hard or easy to classify. This paper employs meta-learning strategies to describe the main differences between the easy and hard observations in a dataset. Intervals of values for the significant meta-features that assess the hardness level of the observations are extracted and contrasted. This meta-knowledge allows characterizing the hardness profile of a dataset and gaining insight into the main sources of difficulty it poses, as shown in experiments using two super-classes of the CIFAR-100 dataset with different hardness levels.
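
As a rough illustration of contrasting meta-feature intervals between easy and hard observations, the sketch below computes one well-known hardness measure, k-Disagreeing Neighbors (kDN), and summarizes its values per group with percentile intervals. This is a minimal sketch under stated assumptions, not the procedure used in the paper: the choice of kDN as the meta-feature, the percentile-based intervals, the boolean `is_hard` labels, and the helper names `kdn` and `contrast_intervals` are all hypothetical and introduced here only for illustration.

```python
import numpy as np


def kdn(X, y, k=5):
    """k-Disagreeing Neighbors: for each observation, the fraction of its
    k nearest neighbors (Euclidean distance) that carry a different label."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise squared Euclidean distances between all observations.
    dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    # Exclude each observation from its own neighborhood.
    np.fill_diagonal(dist, np.inf)
    neighbors = np.argsort(dist, axis=1)[:, :k]
    return (y[neighbors] != y[:, None]).mean(axis=1)


def contrast_intervals(meta_feature, is_hard, lower=25, upper=75):
    """Summarize a meta-feature separately for easy and hard observations.

    Assumption: `is_hard` is a boolean array obtained elsewhere, for example
    by thresholding a hardness score. The intervals are plain percentile
    ranges, used here as a stand-in for the interval extraction step.
    """
    meta_feature = np.asarray(meta_feature, dtype=float)
    is_hard = np.asarray(is_hard, dtype=bool)
    result = {}
    for name, mask in (("easy", ~is_hard), ("hard", is_hard)):
        low, high = np.percentile(meta_feature[mask], [lower, upper])
        result[name] = (float(low), float(high))
    return result
```

Calling `contrast_intervals(kdn(X, y), is_hard)` returns one interval per group; intervals that barely overlap suggest that the meta-feature separates easy from hard observations in that dataset.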