Data Science as a Service | Kumo AI Full Walkthrough

The video provides a detailed walkthrough of Kumo AI, a data science as a service platform that leverages graph neural networks to simplify and accelerate complex predictive analytics tasks such as customer lifetime value prediction, personalized product recommendations, and purchase forecasting using e-commerce data. It demonstrates how Kumo streamlines data integration, model training, and prediction deployment through an intuitive interface and a SQL-like Predictive Query Language, enabling users to generate actionable business insights quickly without requiring deep data science expertise.

The walkthrough runs end to end on an e-commerce example and covers analytics tasks typically undertaken by data scientists. The presenter explains how Kumo can predict customer lifetime value, generate personalized product recommendations, and forecast purchase behavior over a future period, such as the next 30 days. The example dataset is the H&M e-commerce dataset, which includes three main tables: customers, transactions, and articles (products). The challenge lies in connecting these large tables and extracting meaningful business predictions from them, which is the work Kumo is designed to streamline.

A key technical highlight is the use of graph neural networks (GNNs) for modeling the relationships between customers, transactions, and products. GNNs are particularly effective here because they capture network effects and temporal dynamics, and they can address the cold start problem, in which new customers with limited data still receive reasonable predictions based on their similarity to existing customers. Kumo abstracts away the complexity of data ingestion, cleansing, preprocessing, and model training. According to the presenter, this lets users complete in hours what might otherwise take weeks or months, often with better predictive performance.
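To make the cold start argument concrete, here is a minimal sketch of mean-aggregation message passing on a toy customer-article graph. It is not Kumo's model: real GNNs use learned weights and richer features, and every identifier and feature value below is made up for illustration. The point is only that a customer with a single purchase picks up signal from articles bought by similar customers after a few rounds.

```python
from collections import defaultdict

# Toy purchase edges (customer, article); all identifiers are hypothetical.
purchases = [
    ("alice", "jeans"), ("alice", "tshirt"),
    ("bob", "jeans"), ("bob", "jacket"),
    ("new_user", "jeans"),  # cold-start customer with a single purchase
]

# Undirected adjacency over customers and articles.
neighbors = defaultdict(set)
for customer, article in purchases:
    neighbors[customer].add(article)
    neighbors[article].add(customer)

# Initial features: one-hot vectors for articles, zeros for customers.
features = {node: [0.0, 0.0, 0.0] for node in neighbors}
features["jeans"] = [1.0, 0.0, 0.0]
features["tshirt"] = [0.0, 1.0, 0.0]
features["jacket"] = [0.0, 0.0, 1.0]


def message_pass(feats):
    """One synchronous round: each node averages its own and its neighbors' features."""
    updated = {}
    for node, nbrs in neighbors.items():
        group = [feats[node]] + [feats[n] for n in sorted(nbrs)]
        updated[node] = [sum(col) / len(group) for col in zip(*group)]
    return updated


state = features
for _ in range(3):  # tshirt -> alice -> jeans -> new_user is three hops
    state = message_pass(state)

print(state["new_user"])
```

After three rounds the second and third components of `new_user` are nonzero, even though that customer never bought a t-shirt or a jacket: the signal arrived through the shared `jeans` node.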

The video then walks through the practical setup of Kumo, including obtaining an API key, connecting to data sources like Google BigQuery, and uploading the H&M dataset. The presenter demonstrates how to define source tables and their relationships within Kumo, creating a graph that links customers to transactions and transactions to articles. This setup allows Kumo to understand the data structure and prepare for training the GNN. The user interface and metadata exploration features in Kumo are also showcased, providing insights into the dataset’s schema and contents.
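The table-and-relationship setup described above can be pictured as a small schema graph. The sketch below uses plain Python dataclasses rather than the Kumo SDK, so the `Table`, `Edge`, and `validate` names are hypothetical; the column names (`customer_id`, `article_id`, `t_dat`) are assumed from the public H&M dataset and may differ from what the video uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Table:
    name: str
    primary_key: Optional[str] = None   # None for fact tables such as transactions
    time_column: Optional[str] = None   # event timestamp, if the table has one


@dataclass(frozen=True)
class Edge:
    src_table: str     # table holding the foreign key
    foreign_key: str   # column in src_table
    dst_table: str     # table whose primary key is referenced


tables = {
    "customers": Table("customers", primary_key="customer_id"),
    "articles": Table("articles", primary_key="article_id"),
    "transactions": Table("transactions", time_column="t_dat"),
}

# customers <- transactions -> articles
edges = [
    Edge("transactions", "customer_id", "customers"),
    Edge("transactions", "article_id", "articles"),
]


def validate(tables, edges):
    """Return a list of problems; an empty list means every edge is consistent."""
    problems = []
    for edge in edges:
        if edge.src_table not in tables or edge.dst_table not in tables:
            problems.append(f"unknown table in {edge}")
            continue
        if tables[edge.dst_table].primary_key != edge.foreign_key:
            problems.append(f"{edge.foreign_key} is not the primary key of {edge.dst_table}")
    return problems


print(validate(tables, edges))
```

Checking the links up front in this way catches a mislabeled key before any training job is submitted.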

Next, the presenter introduces Kumo’s Predictive Query Language (PQL), a SQL-like syntax for defining prediction tasks. Three main use cases are demonstrated: predicting customer value over the next 30 days, generating personalized product recommendations (top 10 likely purchases per customer), and forecasting purchase volume for active customers. Each PQL query is validated, and Kumo generates an optimized model training plan. The training jobs are submitted asynchronously, with progress monitored via the Kumo dashboard. Once training completes, predictions are stored in BigQuery tables for further analysis.
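As a rough illustration of what such queries express, the sketch below lists approximate PQL strings for the three use cases and then computes, in plain Python, the label that the first query describes. The PQL strings are recalled from Kumo's public documentation and are an assumption: they have not been checked against the video or the current PQL reference. The window boundaries, column names, and toy transactions are likewise assumed.

```python
from collections import defaultdict
from datetime import date, timedelta

# Approximate PQL (unverified assumption; consult the PQL reference for exact syntax).
PQL_CUSTOMER_VALUE = (
    "PREDICT SUM(transactions.price, 0, 30, days) FOR EACH customers.customer_id"
)
PQL_RECOMMENDATIONS = (
    "PREDICT LIST_DISTINCT(transactions.article_id, 0, 30, days) "
    "RANK TOP 10 FOR EACH customers.customer_id"
)
PQL_PURCHASE_VOLUME = (
    "PREDICT COUNT(transactions.*, 0, 30, days) FOR EACH customers.customer_id "
    "WHERE COUNT(transactions.*, -30, 0, days) > 0"
)

# Toy transactions: (customer_id, t_dat, price). Values are made up.
transactions = [
    ("c1", date(2020, 9, 1), 20.0),
    ("c1", date(2020, 9, 20), 15.0),
    ("c2", date(2020, 9, 5), 50.0),
    ("c1", date(2020, 10, 15), 99.0),  # falls outside the 30-day window
]


def spend_in_window(rows, anchor, days=30):
    """Sum price per customer for anchor < t_dat <= anchor + days."""
    end = anchor + timedelta(days=days)
    totals = defaultdict(float)
    for customer_id, t_dat, price in rows:
        if anchor < t_dat <= end:
            totals[customer_id] += price
    return dict(totals)


labels = spend_in_window(transactions, anchor=date(2020, 8, 31))
print(labels)
```

A model trained for the customer-value task learns to predict this kind of windowed sum from data available before the anchor date.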

Finally, the video shows how to query and combine prediction results to identify the most valuable customers, their likely purchases, and expected transaction volumes. By joining prediction outputs with original customer and product data, actionable insights are generated that can be used for targeted marketing or personalized recommendations on e-commerce platforms. The presenter emphasizes that Kumo not only accelerates the data science workflow but also democratizes advanced analytics, enabling engineers and product teams to leverage world-class predictive models without deep expertise in graph neural networks or extensive data science experience.
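The joining step can be sketched with an in-memory SQLite database standing in for BigQuery. The prediction table names (`pred_value`, `pred_recs`), their columns, and all rows are hypothetical; the actual output schema Kumo writes may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id TEXT PRIMARY KEY, age INTEGER);
CREATE TABLE articles (article_id TEXT PRIMARY KEY, prod_name TEXT);
-- Hypothetical prediction outputs, one table per prediction task.
CREATE TABLE pred_value (customer_id TEXT, predicted_spend REAL);
CREATE TABLE pred_recs (customer_id TEXT, article_id TEXT, score REAL);

INSERT INTO customers VALUES ('c1', 34), ('c2', 27);
INSERT INTO articles VALUES ('a1', 'Slim jeans'), ('a2', 'Basic tee');
INSERT INTO pred_value VALUES ('c1', 120.5), ('c2', 48.0);
INSERT INTO pred_recs VALUES ('c1', 'a1', 0.9), ('c1', 'a2', 0.4), ('c2', 'a2', 0.7);
""")

# Most valuable customers first, each with their recommended products.
rows = conn.execute("""
SELECT v.customer_id, c.age, v.predicted_spend, a.prod_name, r.score
FROM pred_value AS v
JOIN customers AS c ON c.customer_id = v.customer_id
JOIN pred_recs AS r ON r.customer_id = v.customer_id
JOIN articles AS a ON a.article_id = r.article_id
ORDER BY v.predicted_spend DESC, r.score DESC
""").fetchall()

for row in rows:
    print(row)
```

The same join shape carries over to BigQuery SQL, with the table names replaced by the dataset-qualified tables that hold the predictions.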