NeurIPS Data-Centric AI Workshop

Call for Papers

The ML community has a strong track record of building and using datasets for AI systems. But this endeavor is often artisanal: painstaking and expensive. The community lacks high-productivity, efficient open data engineering tools that make building, maintaining, and evaluating datasets easier, cheaper, and more repeatable. The core challenge, then, is to accelerate dataset creation and iteration while increasing the efficiency of use and reuse by democratizing data engineering and evaluation.

If 80 percent of machine learning work is data preparation, then ensuring data quality is the most important work of a machine learning team and therefore a vital research area. Human-labeled data has increasingly become the fuel and compass of AI-based software systems, yet innovative efforts have mostly focused on models and code. The growing focus on the scale, speed, and cost of building and improving datasets has affected quality, which is nebulous and often circularly defined, since the annotators are the source of both data and ground truth [Riezler, 2014]. The development of tools to make repeatable and systematic adjustments to datasets has also lagged. While dataset quality remains the top concern, the ways in which it is measured in practice are poorly understood and sometimes simply wrong. A decade later, we see some cause for concern: fairness and bias issues in labeled datasets [Goel and Faltings, 2019], quality issues in datasets [Crawford and Paglen, 2019], limitations of benchmarks [Kovaleva et al., 2019, Welty et al., 2019], reproducibility concerns in machine learning research [Pineau et al., 2018, Gundersen and Kjensmo, 2018], lack of documentation and replication of data [Katsuno et al., 2019], and unrealistic performance metrics [Bernstein, 2021].

We need a framework for excellence in data engineering, and one does not yet exist. In the first-to-market rush with data, the maintainability, reproducibility, reliability, validity, and fidelity of datasets are often overlooked. We want to turn this way of thinking on its head and highlight examples, case studies, and methodologies for excellence in data collection. Building an active research community focused on Data-Centric AI is an important part of defining the core problems and creating ways to measure progress in machine learning through data quality tasks.

Submission Instructions

We welcome short papers (1-2 pages) and long papers (4 pages) addressing one or more of the topics of interest below. All papers must be formatted according to the NeurIPS 2021 Formatting Instructions. Papers will be peer-reviewed by the program committee, and accepted papers will be presented as lightning talks during the workshop. If you have any questions about submission, please first check the FAQ link below. Email us at neurips-data-centric-ai@googlegroups.com only if your question is not answered in the FAQ or if you experience problems with the submission site.

Submission FAQ Submission Link

Topics of Interest

The Data-Centric AI workshop invites position papers from researchers and practitioners on topics that include, but are not limited to, the following:

New Datasets in areas:

Speech, vision, manufacturing, medical, recommendation/personalization
Science: https://www.mgi.gov/

Tools & methodologies for accelerating open-source dataset iteration:

Tools that quantify and accelerate time to source and prepare high quality data
Tools that ensure that the data is labeled consistently, such as label consensus
Tools that make improving data quality more systematic
Tools that automate the creation of high quality supervised learning training data from low quality resources, such as forced alignment in speech recognition
Tools that produce consistent and low noise data samples, or remove labeling noise or inconsistencies from existing data
Tools for controlling what goes into the dataset and for making high level edits efficiently to very large datasets, e.g. adding new words, languages, or accents to speech datasets with thousands of hours
Search methods for finding suitably licensed datasets based on public resources
Tools for creating training datasets for small data problems, or for rare classes in the long tail of big data problems
Tools for timely incorporation of feedback from production systems into datasets
Tools for understanding dataset coverage of important classes, and editing them to cover newly identified important cases
Dataset importers that allow easy combination and composition of existing datasets
Dataset exporters that make the data consumable for models and interface with model training and inference systems, such as webdataset
System architectures and interfaces that enable composition of dataset tools, such as MLCube, Docker, and Airflow

Algorithms for working with limited labeled data and improving label efficiency:

Data selection techniques such as active learning and core-set selection for identifying the most valuable examples to label.
Semi-supervised learning, few-shot learning, and weak supervision methods for maximizing the power of limited labeled data.
Transfer learning and self-supervised learning approaches for developing powerful representations that can be used for many downstream tasks with limited labeled data.
Novelty and drift detection to identify when more data needs to be labeled.

Responsible AI development:

Fairness, bias, and diversity evaluation and analysis for datasets and modeling/algorithms
Tools for green AI hardware-software system design and evaluation
Scalable, reliable training methods and systems
Tools, methodologies, and techniques for private, secure machine learning training
Efforts toward reproducible AI, such as data cards, model cards

PST	EST	UTC	Agenda
8:30 AM	11:30 AM	4:30 PM	Andrew Ng - Opening Remarks
8:45 AM	11:45 AM	4:45 PM	Lora Aroyo - Workshop Overview
9:00 AM	12:00 PM	5:00 PM	Keynote: Michael Bernstein - HCI and Crowdsourcing for DCAI
9:15 AM	12:15 PM	5:15 PM	Invited Talk: Past/Future of data centric AI with Olga Russakovsky
9:25 AM	12:25 PM	5:25 PM	Lightning Talks: Benchmarking
10:25 AM	1:25 PM	6:25 PM	Invited Talk: Peter Mattson - DataPerf - Benchmarking Data Centric AI
10:40 AM	1:40 PM	6:40 PM	Lightning Talks: Theory and Challenge Problems in Data Centric AI
11:20 AM	2:20 PM	7:20 PM	Invited Talk: Douwe Kiela - FAIR Dynabench
11:30 AM	2:30 PM	7:30 PM	Lightning Talks: Responsibility and Ethics
12:10 PM	3:10 PM	8:10 PM	Q&A Panel with Morning Speakers
12:50 PM	3:50 PM	8:50 PM	Break to watch video recordings

PST	EST	UTC	Agenda
1:20 PM	4:20 PM	9:20 PM	Keynote: Alex Ratner & Chris Ré - The Future of Data Centric AI
1:35 PM	4:35 PM	9:35 PM	Invited Talk: D Sculley - Data Debt
1:45 PM	4:45 PM	9:45 PM	Lightning Talks: Datasets and Data Synthesis
2:45 PM	5:45 PM	10:45 PM	Invited Talk: Curtis Northcutt
2:55 PM	5:55 PM	10:55 PM	Lightning Talks: Data Quality and Iteration
3:40 PM	6:40 PM	11:40 PM	Invited Talk: Anima Anandkumar
3:50 PM	6:50 PM	11:50 PM	Lightning Talks: Data Labeling
4:30 PM	7:30 PM	12:30 AM	Q&A Panel session with afternoon speakers
5:10 PM	8:10 PM	1:10 AM	Break to watch video recordings

Accepted Papers

Title	Authors (* corresponding)	Link
A Hybrid Bayesian Model to Analyse Healthcare Data	Pourshahrokhi, Narges*; Kouchaki, Samaneh; Kober, Kord; Miaskowski, Christine; Barnaghi, Payam	Link
How should human translation coexist with NMT? Efficient tool for building high quality parallel corpus	Park, Chanjun*; Lee, Seolhwa; Moon, Hyeonseok; Eo, Sugyeong; Seo, Jaehyung; Lim, Heuiseok	Link
A New Tool for Efficiently Generating Quality Estimation Datasets	Eo, Sugyeong; Park, Chanjun*; Seo, Jaehyung; Moon, Hyeonseok; Lim, Heuiseok	Link
Automatic Knowledge Augmentation for Generative Commonsense Reasoning	Seo, Jaehyung*; Park, Chanjun; Eo, Sugyeong; Moon, Hyeonseok; Lim, Heuiseok	Link
Tabular Engineering with Automunge	Teague, Nicholas*	Link
A Probabilistic Framework for Knowledge Graph Data Augmentation	Chauhan, Jatin*; Gupta, Priyanshu; Minervini, Pasquale	Link
FedHist: A Federated-First Dataset for Learning in Healthcare	Khan, Usmann*
A First Look Towards One-Shot Object Detection with SPOT for Data-Efficient Learning	Chakraborty, Ria*; Popli, Madhur; Lamba, Rachit; Verma, Rishi	Link
YMIR: A Rapid Data-centric Development Platform for Vision Applications	Huang, Phoenix X.; Hu, Wenze*; Brendel, William; Chandraker, Manmohan; Li, Li-Jia; Wang, Xiaoyu	Link
Towards better data discovery and collection with flow-based programming	Paleyes, Andrei*; Cabrera, Christian; Lawrence, Neil D	Link
CircleNLU: A Tool for building Data-Driven Natural Language Understanding System	Hoang, Vu*	Link
Using Synthetic Images To Uncover Population Biases In Facial Landmarks Detection	Shadmi, Ran*; Laserson, Jonathan; Elbaz, Gil
Challenges of Working with Materials R&D Data	Kubie, Lenore*; Kroenlein, Kenneth
PyHard: a novel tool for generating hardness embeddings to support data-centric analysis	Paiva, Pedro Yuri Arbs*; Smith-Miles, Kate; Valeriano, Maria; Lorena, Ana	Link
AirSAS: Controlled Dataset Generation for Physics-Informed Machine Learning	Cowen, Benjamin*; Park, J. Daniel; Blanford, Thomas E.; Goehle, Geoff; Brown, Daniel C.	Link
Open-Sourcing Generative Models for Data-driven Robot Simulations	Bamani, Eran*; Sintov, Avishai; Azulay, Osher; Gurevich, Anton
Few-Shot Image Classification Challenge On-Board OPS-SAT	Derksen, Dawa*; Meoni, Gabriele; Lecuyer, Gurvan; Mergy, Anne; Maertens, Marcus; Izzo, Dario	Link
Dialectal Voice : An Open-Source Voice Dataset and Automatic Speech Recognition model for Moroccan Arabic dialectal	Allak, Anass*; Naira, Abdou Mohamed; Imade, Benelallam; Kamel, Gaanoun	Link
DAG Card is the new Model Card	Tagliabue, Jacopo*; Tuulos, Ville; Greco, Ciro; Dave, Valay	Link
SCIMAT: Science and Mathematics Dataset	Kollepara, Neeraj; Chatakonda, Snehith K; Kumar, Pawan*	Link
Towards Systematic Evaluation in Machine Learning through Automated Stress Test Creation	Madras, David*; Zemel, Richard
Annotation Quality Framework - Accuracy, Credibility, and Consistency	Lavitas, Liliya*; Lee, Allen; Redfield, Olivia; Fletcher, Daniel; Eck, Matthias; Janardhanan, Sunil	Link
Ontolabeling: Re-Thinking Data Labeling For Computer Vision	Croce, Nicola*; Nieto, Marcos	Link
Natural Adversarial Objects	Lau, Felix*; Harrison, Sasha; Subramani, Nishant; Kim, Aerin; Branson, Elliot R; Liu, Rosanne
No News is Good News: A Critique of the One Billion Word Benchmark	Ngo, Helen*; Frosst, Nicholas; Madeira Araújo, João G; Hui, Jeff
A Data-Centric Approach for Training Deep Neural Networks with Less Data	Motamedi, Mohammad*; Sakharnykh, Nikolay; Kaldewey, Tim	Link
Finding Label Errors in Autonomous Vehicle Data With Learned Observation Assertions	Kang, Daniel*; Arechiga, Nikos; Pillai, Sudeep; Bailis, Peter D; Zaharia, Matei	Link
Single-Click 3D Object Annotation on LiDAR Point Clouds	Nguyen, Trung Duc*; Hua, Binh-Son; Nguyen, Thanh; Phung, Dinh	Link
Decreasing Annotation Burden of Pairwise Comparisons with Human-in-the-Loop Sorting: Application in Medical Image Artifact Rating	Jang, Ikbeom*; Danley, Garrison; Chang, Ken; Kalpathy-Cramer, Jayashree	Link
Effect of Radiology Report Labeler Quality on Deep Learning Models for Chest X-Ray Interpretation	Jain, Saahil*; Smit, Akshay; Ng, Andrew; Rajpurkar, Pranav	Link
A Data-Centric Image Classification Benchmark	Schmarje, Lars*; Liao, Yuan-Hong; Koch, Reinhard	Link
Diagnosing severity levels of Autism Spectrum Disorder with Machine Learning	Cinque, Marcello; Moscato, Vincenzo; Postiglione, Marco*; Riccio, Maria Pia	Link
Sampling To Improve Predictions For Underrepresented Observations In Imbalanced Data	Kjærsgaard, Rune D.*; Grønberg, Manja; Clemmensen, Line	Link
Automatic Data Quality Evaluation for Text Classification	Li, Jiazheng*	Link
Building Legal Datasets	Soh, Jerrold*	Link
Comparing Data Augmentation and Annotation Standardization to Improve End-to-end Spoken Language Understanding Models	Nicolich-Henkin, Leah*; Nakatani, Taichi; Trozenski, Zach; Whiteman, Joel; Susanj, Nathan	Link
DiagnosisQA: A semi-automated pipeline for developing clinician validated diagnosis specific QA datasets.	Mishra, Shreya; Awasthi, Raghav; Papay, Frankie; Maheshwari, Kamal; Cywinski, Jacek; Khanna, Ashish; Mathur, Piyush*	Link
Influence of human-expert labels on a neonatal seizure detector based on a convolutional neural network	Borovac, Ana*; Runarsson, Thomas P; Guðmundsson, Steinn; Thorvardsson, Gardar	Link
Feminist Curation of Text for Data-centric AI	Bartl, Marion*; Leavy, Susan	Link
Challenges and Solutions to build a Data Pipeline to Identify Anomalies in Enterprise System Performance	Huang, Xiaobo*; Banerjee, Amitabha; Chen, Chien-Chia; Huang, Chengzhi; Chuang, Tzu Yi; Srivastava, Abhishek; Cheveresan, Razvan
Human-inspired Data Centric Computer Vision	Tsutsui, Satoshi*; Crandall, David; Yu, Chen	Link
Utilizing Driving Context to Increase the Annotation Efficiency of Imbalanced Gaze Image Data	Rehm, Johannes*; Gundersen, Odd Erik; Bach, Kerstin; Reshodko, Irina	Link
Unleashing the Power of Industrial Big Data through Scalable Manual Labeling	Paes Leao, Bruno*; Fradkin, Dmitriy; Lan, Tu; Wang, Jianhui	Link
nferX: a case study on data-centric NLP in biomedicine	Chang, David*; Mathew, Vineet; Kogler, Lorenzo; Jin, Roger; Rao, Krishna; Raghunathan, Bharathwaj; Ip, Wui; Doctor, Zainab; Pawlowski, Colin; Rajesekharan, Ajit	Link
On Data-centric Myths	Marcu, Antonia*; Prugel-Bennett, Adam	Link
All in one Data Cleansing Tool	Sairaman, Sri Aravind*; Vailoppilly, Arun Prasad; Sakthivel, Ramkumar; Kumar, Resham Sundar; BDSV, Vignesh; G, Aravind	Link
Contrasting the Profiles of Easy and Hard Observations in a Dataset	Moreno, Camila C*; Paiva, Pedro; Nunes, Gustavo; Lorena, Ana	Link
A concept for fitness-for-use evaluation in Machine Learning pipelines	Jonietz, David*	Link
Vietnamese Speech-based Question Answering over Car Manuals	Vo, Tin Duy*; Luong, Manh; Minh Le, Duong; Tran, Hieu Minh; Do, Nhan; Nguyen, Duy; Nguyen, Thien; Bui, Hung; Nguyen, Dat Quoc; Phung, Dinh
Self-supervised Semi-supervised Learning for Data Labeling and Quality Evaluation	Bai, Haoping*; Cao, Meng; Huang, Ping; Shan, Jiulong	Link
Towards a Taxonomy of Graph Learning Datasets	Liu, Renming; Cantürk, Semih; Wenkel, Frederik; Sandfelder, Dylan; Kreuzer, Devin; Little, Anna; McGuire, Sarah; Perlmutter, Michael; O'Bray, Leslie; Rieck, Bastian; Hirn, Matthew; Wolf, Guy; Rampášek, Ladislav*	Link
Addressing Content Selection Bias in Creating Datasets for Hate Speech Detection	Rahman, Md Mustafizur; Balakrishnan, Dinesh; Murthy, Dhiraj; Kutlu, Mucahid; Lease, Matthew*	Link
Lhotse: a speech data representation library for the modern deep learning ecosystem	Żelasko, Piotr*; Povey, Daniel; Trmal, Jan; Khudanpur, Sanjeev	Link
Bridging the gap between AI and the life sciences: towards a standardized multi-omics data type	Herbsthofer, Laurin; Oberhuber, Monika; Prietl, Barbara; López García, Pablo*	Link
Increasing Data Diversity with Iterative Sampling to Improve Performance	Çavuşoğlu, Devrim*; Eryüksel, Oğulcan; Altınuç, Sinan O	Link
Data preparation for training CNNs: Application to vibration-based condition monitoring	Yaghoubi, Vahid*; Cheng, Liangliang; Van Paepegem, Wim; Kersemans, Mathias	Link
Bridging the gap to real-world for network intrusion detection systems with data-centric approach	de Carvalho Bertoli, Gustavo*; Alves Pereira Jr, Lourenço; Verri, Filipe; Santos, Aldri; Saotome, Osamu	Link
Highly Efficient Representation and Active Learning Framework and Its Application to Imbalanced Medical Image Classification	Hao, Heng*; Moon, Hankyu; Didari, Sima; Woo, Jae Oh; Bangert, Patrick	Link
Evaluating Machine Learning Models for Internet Network Security with Data Slices	Toman, Pamela*; Yadgaran, Elisha; Papadimitriou, Christina; Isaksen, Aaron; Kraning, Matt	Link
AutoDQ: Automatic Data Quality for Financial Data	Villarreal-Vasquez, Miguel*; Buford, John; Dhingra, Prashant; Yin, Fenglin
Data Cards: Purposeful and Transparent Documentation for Responsible AI	Pushkarna, Mahima*; Zaldivar, Andrew	Link
3D ImageNet: A data collection and labeling tool for Depth and RGB Images	Singh, Gurjeet*; Patrick, Chiang; Zhou, Sifan; Qian, James	Link
Combining Data-driven Supervision with Human-in-the-loop Feedback for Entity Resolution	Yin, Wenpeng*; Heinecke, Shelby; Li, Vena; Keskar, Nitish Shirish; Jones, Michael; Shi, Shouzhong; Georgiev, Stanislav; Milich, Kurt; Esposito, Joseph; Xiong, Caiming	Link
IMDB-WIKI-SbS: An Evaluation Dataset for Crowdsourced Pairwise Comparisons	Pavlichenko, Nikita; Ustalov, Dmitry*	Link
Exploiting Proximity Search and Easy Examples to Select Rare Events	Kang, Daniel*; Derhacobian, Alex; Tsuji, Kaoru; Hebert, Trevor; Bailis, Peter D; Fukami, Tadashi; Hashimoto, Tatsunori; Sun, Yi; Zaharia, Matei	Link
Fantastic Data and How to Query Them	Tran, Trung-Kien*; Le-Tuan, Anh; Nguyen Duc, Manh; Yuan, Jicheng; Le Phuoc, Danh	Link
Whose Ground Truth? Accounting for Individual and Collective Identities Underlying Dataset Annotation	Denton, Emily*; Diaz, Mark; Kivlichan, Ian D; Prabhakaran, Vinodkumar; Rosen, Rachel	Link
Two Approaches to Building Dialogue Systems for People on the Spectrum	Firsanova, Victoria*	Link
What can Data-Centric AI Learn from Data and ML Engineering?	Polyzotis, Alkis*; Zaharia, Matei
Ground-Truth, Whose Truth? - Examining the Challenges with Annotating Toxic Text Datasets	Arhin, Kofi*; Baldini, Ioana; Wei, Dennis; Natesan Ramamurthy, Karthikeyan; Singh, Moninder	Link
Towards a Shared Rubric for Dataset Annotation	Greene, Andrew M*	Link
LSH methods for data deduplication in a Wikipedia artificial dataset	Ciro, Juan Manuel; Galvez, Daniel; Schlippe, Tim; Kanter, David	Link
Annotation Inconsistency and Entity Bias in MultiWOZ	Qian, Kun*; Beirami, Ahmad; Lin, Zhouhan; De, Ankita; Geramifard, Alborz; Yu, Zhou; Sankar, Chinnadhurai
Seg-Diff: Checkpoints Are All You Need	Brewster, Grant*; Yuan, Bodi; Hooker, Sara; Cao, Chen; Yuan, Zhiqiang
AutoDC: Automated data-centric processing	Liu, Zac Yung-Chun*; Roychowdhury, Shoumik; Tarlow, Scott; Nair, Akash; Badhe, Shweta; Shah, Tejas	Link
Engineering AI Tools for Systematic and Scalable Quality Assessment in Magnetic Resonance Imaging	Zou, Yukai; Jang, Ikbeom*	Link
FinRL-Meta: A Universe of Near-Real Market Environments for Data-Driven Deep Reinforcement Learning in Quantitative Finance	Liu, Xiao-Yang*; Rui, Jingyang; Gao, Jiechao; Yang, Liuqing; Yang, Hongyang; Wang, Zhaoran; Wang, Christina Dan; Guo, Jian	Link
Data Augmentation for Intent Classification	Chen, Derek*; Yin, Claire	Link
InfiniteForm: A synthetic, minimal bias dataset for fitness applications	Weitz, Andrew*; Bent, Brinnae; Colucci, Lina; Primas, Sidney	Link
Who Decides if AI is Fair? The Labels Problem in Algorithmic Auditing	Mishra, Abhilash*; Gorana, Yash	Link
Data-Centric AI Requires Rethinking Data Notion	Hajij, Mustafa*; Zamzmi, Ghada; Natesan Ramamurthy, Karthikeyan; Guzman Saenz, Aldo	Link
Exploiting Domain Knowledge for Efficient Data-centric Session-based Recommendation model	Mishra, Mayank*; Singhal, Rekha	Link
Topological Deep Learning	Hajij, Mustafa*; Natesan Ramamurthy, Karthikeyan; Guzman Saenz, Aldo; Istvan, Kyle	Link
Fix your Model by Fixing your Datasets	Sanyal, Atindriyo*; Vyas, Nidhi Kaushik; Chatterji, Vikram; Epstein, Ben; Demir, Nikita; Corletti, Anthony
Data Expressiveness and Its Use in Data-centric AI	Sharma, Parichit*; Kurban, Hasan; Dalkilic, Mehmet	Link
Debiasing Pre-Trained Sentence Encoders With WordDropouts on Fine-Tuning Data	Panda, Swetasudha*; Wick, Michael; Kobren, Ariel
Towards a Framework for Data Excellence in Data-Centric AI: Lessons from the Semantic Web	Seneviratne, Oshani*; Hassanzadeh, Oktie; Gruen, Daniel; McCusker, Jamie P; McGuinness, Deborah
Sim2Real Docs: Domain Randomization for Documents in Natural Scenes using Ray-traced Rendering	Huang, Austin V.*	Link
Homogenization of Existing Inertial-Based Datasets to Support Human Activity Recognition	Amrani, Hamza; Micucci, Daniela; Mobilio, Marco*; Napoletano, Paolo	Link
Can machines learn to see without visual databases?	Betti, Alessandro; Gori, Marco; Melacci, Stefano*; Pelillo, Marcello; Roli, Fabio	Link
Augment & Valuate : A Data Enhancement Pipeline for data-centric AI	Lee, Youngjune*; Kwon, Oh Joon; Lee, Haeju; Kim, Joonyoung; Lee, Kangwook; Kim, Kee-Eung	Link
Simultaneous Improvement of ML Model Fairness and Performance by Identifying Bias in Data	Chaudhari, Bhushan; Agarwal, Aakash*; Bhowmik, Tanmoy	Link
Data Agnostic Image Annotation	Mohamed Nishar, Abbaas Alif*; T V, Sethuraman; Rahman, Md Rashed; Gruteser, Marco; Mandayam, Narayan; Dana, Kristin; Jain, Shubham; Ashok, Ashwin
LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs	Schuhmann, Christoph; Vencu, Richard; Beaumont, Romain; Kaczmarczyk, Robert; Mullis, Clayton; Jitsev, Jenia; Komatsuzaki, Aran*	Link
Small Data in NLU: Proposals towards a Data-Centric Approach	Zarcone, Alessandra*; Lehmann, Jens; Habets, Emanuel	Link
On Biased Systems and Data	Vieira, Daniel*
Data vast and low in variance: Augment machine learning pipelines with dataset profiles to improve data quality without sacrificing scale	Herman, Bernease R*; Leybzon, Danny; Broomall, Jamie
CogALex 2.0: Impact of Data Quality on Lexical-Semantic Relation Prediction	Lang, Christian; Wachowiak, Lennart; Heinisch, Barbara; Gromann, Dagmar*	Link
A Data-Centric Behavioral Machine Learning Platform to Reduce Health Inequalities	Tang, Dexian; Frances, Guillem; Perianez, Africa*	Link
