Automated machine learning company DataRobot today announced the latest release of the DataRobot Enterprise AI Platform, which adds automated feature engineering and introduces AI Catalog, a service that lets enterprise teams search customer data in order to build and deploy AI models or make predictions.
AI Catalog utilizes tech from Cursor, a company acquired in February by DataRobot for an undisclosed amount. Cursor was born out of experiences at LinkedIn, where Cursor cofounder and DataRobot VP of product management Adam Weinstein led an analytics team. The challenge at that time was helping data scientists and engineers find scattered data in a large organization and then understand how to make data searchable and sharable to enable team collaboration.
AI Catalog will seek to achieve the same goals.
The update will also include automated feature engineering, which creates features that enhance machine learning models by drawing on related or secondary data sets. Automatic feature generation will grow more powerful as AI Catalog grows, Weinstein told VentureBeat.
“We actually don’t want you to even have to do that [identify secondary data sets] in the long run. [Today] I think there’s sort of this chicken and egg problem, but once users start using the catalog and the data is populated there, we can actually look to that catalog, automatically identify those data sets, and do the whole thing without any user assistance,” he said.
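As a rough illustration of what deriving features from a secondary data set can look like, the sketch below joins a transactions table onto a customer table and computes simple aggregates. It is not DataRobot's implementation: the data, the column names, and the add_aggregate_features helper are all hypothetical and exist only for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical primary data set: one row per customer, with the label to predict.
customers = [
    {"customer_id": 1, "churned": 0},
    {"customer_id": 2, "churned": 1},
    {"customer_id": 3, "churned": 0},
]

# Hypothetical secondary data set: zero or more transaction rows per customer.
transactions = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 40.0},
    {"customer_id": 2, "amount": 5.0},
]


def add_aggregate_features(primary, secondary, key, value):
    """Join secondary rows onto primary rows by `key` and derive simple aggregates."""
    # Group the secondary values by the join key.
    grouped = defaultdict(list)
    for row in secondary:
        grouped[row[key]].append(row[value])

    enriched = []
    for row in primary:
        values = grouped.get(row[key], [])
        enriched.append({
            **row,
            f"{value}_count": len(values),
            f"{value}_sum": sum(values),
            # None marks "no secondary rows", which is different from a mean of zero.
            f"{value}_mean": mean(values) if values else None,
        })
    return enriched


features = add_aggregate_features(customers, transactions, "customer_id", "amount")
for row in features:
    print(row)
```

A production system would generate and screen many such candidate features automatically. The point here is only that the secondary data set has to be identified and joined before any of that can happen, which is the step Weinstein expects the catalog to take over.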
Automated feature engineering can also help organizations enforce governance by applying common standards and definitions, such as a single shared definition of customer churn.
Data scientists will also be able to use the Enterprise AI Platform with Apache Spark SQL to combine multiple data sets from Hadoop, disparate text, or other sources in AI Catalog.
“I can actually combine all those within DataRobot without leaving the platform, using Spark SQL. And then we’ll emit a new data set that you can then use for projects to create new models or create predictions,” Weinstein said.
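The workflow Weinstein describes, combining several sources with SQL and emitting a new data set, can be sketched in a few lines. The example below uses Python's built-in sqlite3 module purely as a stand-in for Spark SQL so that it runs anywhere. The table names, columns, and values are invented for illustration, and nothing here reflects DataRobot's actual interface.

```python
import sqlite3

# In-memory database standing in for the sources registered in a catalog.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (customer_id INTEGER, plan TEXT);
    CREATE TABLE support_tickets (customer_id INTEGER, severity INTEGER);
    INSERT INTO accounts VALUES (1, 'basic'), (2, 'pro');
    INSERT INTO support_tickets VALUES (1, 3), (1, 1), (2, 2);
""")

# Combine both sources and materialize the result as a new data set.
conn.execute("""
    CREATE TABLE combined AS
    SELECT a.customer_id,
           a.plan,
           COUNT(t.customer_id) AS ticket_count,
           MAX(t.severity)      AS max_severity
    FROM accounts AS a
    LEFT JOIN support_tickets AS t ON t.customer_id = a.customer_id
    GROUP BY a.customer_id, a.plan
""")

# The new table can now be read back like any other data set.
rows = conn.execute("SELECT * FROM combined ORDER BY customer_id").fetchall()
for row in rows:
    print(row)
```

The join and aggregation are ordinary SQL, but this sketch does not verify that the statements run unchanged on Spark SQL.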
The platform update also includes MLOps, a service introduced last month that takes existing DataRobot services for AI and combines them with tools from machine learning operations company ParallelM, which was acquired by DataRobot in June. The service operates with Apache Spark and Kubernetes and comes with tools designed to help organizations deploy models in production — such as a dashboard for automatically identifying systems that need to be retrained to improve performance.
Despite heavy investment in AI talent and many insisting that we now live in an AI world, most businesses still struggle to deploy AI in production. According to a November 2018 PricewaterhouseCoopers survey, only 4% of business executives reported successfully deploying AI systems.
Earlier this month, DataRobot raised a $206 million funding round, bringing its total funding raised to more than $400 million.