Description
- Managing Real-Time Context Data
- Data Transformation and Persistence using Apache NiFi
- Setting up a Google Cloud Environment
- Creating a Dataproc Cluster and connecting it to Jupyter Notebook
- Using the Google Cloud Storage service
- Submitting a PySpark Job on Dataproc
- Modelling a Machine Learning Solution with PySpark for Multi-Class Classification (a PySpark sketch follows this list)
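To make the final agenda items concrete, here is a minimal sketch of the kind of PySpark script one might submit to Dataproc. It is not the webinar's actual notebook: it assumes labelled, preprocessed data has already been persisted as a CSV to a hypothetical Cloud Storage bucket, and it trains a multi-class logistic regression with Spark ML. The bucket, file, and column names are placeholders.

```python
# A minimal sketch, not the webinar's material: read preprocessed,
# labelled data from Cloud Storage and train a multi-class classifier.
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import StringIndexer, VectorAssembler
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.evaluation import MulticlassClassificationEvaluator

spark = SparkSession.builder.appName("multiclass-demo").getOrCreate()

# Dataproc clusters read gs:// paths directly through the Cloud Storage
# connector; the bucket, path, and "label" column are placeholders.
df = spark.read.csv("gs://your-bucket/processed/data.csv",
                    header=True, inferSchema=True)

# Index the string label, assemble the remaining columns into a feature
# vector, and fit a logistic regression (multinomial for >2 classes).
feature_cols = [c for c in df.columns if c != "label"]
pipeline = Pipeline(stages=[
    StringIndexer(inputCol="label", outputCol="label_idx"),
    VectorAssembler(inputCols=feature_cols, outputCol="features"),
    LogisticRegression(labelCol="label_idx", featuresCol="features"),
])

train, test = df.randomSplit([0.8, 0.2], seed=42)
model = pipeline.fit(train)

# Evaluate on the held-out split.
accuracy = MulticlassClassificationEvaluator(
    labelCol="label_idx", predictionCol="prediction", metricName="accuracy"
).evaluate(model.transform(test))
print(f"Test accuracy: {accuracy:.3f}")

spark.stop()
```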
Data processing is key to ensuring the performance of Machine Learning models. Data is commonly collected and stored in its raw format, however, and post-processing is required before insights can be drawn from it. What if all of this could be automated and managed through pipelines?
This webinar demonstrates not only how to collect data in real time, transform it, and persist it with Draco (the FIWARE Generic Enabler built on Apache NiFi) so that it is ready for further use, but also how to build an end-to-end AI service with PySpark hosted in the cloud.
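Once a script like the one above has been uploaded to Cloud Storage, it can also be submitted to an existing Dataproc cluster programmatically rather than through the console. The sketch below uses the google-cloud-dataproc client library; the project, region, cluster name, and file path are all placeholder values, not the webinar's environment.

```python
# A hedged sketch of submitting the PySpark script above as a job on an
# existing Dataproc cluster; all identifiers below are placeholders.
from google.cloud import dataproc_v1

project_id = "your-project"
region = "europe-west1"
cluster_name = "your-cluster"

# The Dataproc API is regional, so point the client at the right endpoint.
job_client = dataproc_v1.JobControllerClient(
    client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
)

job = {
    "placement": {"cluster_name": cluster_name},
    "pyspark_job": {
        "main_python_file_uri": "gs://your-bucket/jobs/multiclass_demo.py"
    },
}

# Submit the job and block until the driver finishes.
operation = job_client.submit_job_as_operation(
    request={"project_id": project_id, "region": region, "job": job}
)
response = operation.result()
print(f"Job finished with state: {response.status.state.name}")
```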