Alkis Polyzotis, Matei Zaharia
Data-centric AI is a new and exciting research topic in the AI community, but many organizations already build and maintain various “data-centric” applications whose goal is to produce high-quality data. These range from traditional business data processing applications (e.g., “how much should we charge each of our customers this month?”) to production ML systems such as recommendation engines. The fields of data and ML engineering have arisen in recent years to manage these applications, and both include many interesting novel tools and processes. In this paper, we discuss several lessons from data and ML engineering that could be interesting to apply in data-centric AI, based on our experience building data and ML platforms that serve thousands of applications at a range of organizations.