Real-Time Data Analytics Workshop: From Batch to Stream

210 min • Intermediate • 09:00-12:30 • Max 20 participants • Sydney

Workshop • Real-time Analytics • Mage AI • Confluent • ClickHouse • Snowflake • Apache Iceberg • Starburst • Streamlit • Claude AI • Data Pipelines

Description

Building Modern Data Pipelines with Mage, Confluent, ClickHouse, and Snowflake

Abstract

This comprehensive 3-4 hour hands-on workshop is designed to take participants of all skill levels through the journey of building modern data pipelines. We'll start with batch processing and gradually move to real-time streaming, using real-world NSW Transport incident data.

Learning Objectives

By the end of this workshop, you will:

✅ Understand the difference between batch and streaming data processing
✅ Build a batch pipeline using Mage AI and Apache Iceberg
✅ Query data lakes using Starburst (replacing Athena)
✅ Create real-time streaming pipelines with Confluent Kafka
✅ Store and analyze streaming data in ClickHouse
✅ Build interactive dashboards with Snowflake and Streamlit
✅ Integrate AI/ML using Claude API and Snowflake Cortex AI
✅ Deploy production-ready data solutions

🏗️ What We'll Build Today

Module 1: Batch Pipeline
[NSW Transport API] → [Mage Batch Pipeline] → [Apache Iceberg] → [Starburst Analytics]

Module 2: AI-Powered Analytics
[Iceberg + Streaming Data] → [Snowflake] → [Streamlit App] → [Claude AI Assistant]

Module 3: Real-Time Streaming
[NSW Transport API] → [Mage Streaming] → [Confluent Kafka] → [ClickHouse] → [Real-time Dashboard]
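
As a rough illustration of the batch versus streaming distinction that the workshop opens with, the sketch below aggregates the same records both ways in plain Python. It is not workshop material: the `INCIDENTS` records, their field names, and both function names are hypothetical and are not taken from the NSW Transport API, Mage, or Kafka.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List

# Hypothetical incident records; field names are illustrative only and
# do not reflect the real NSW Transport API payload.
INCIDENTS: List[Dict[str, str]] = [
    {"id": "1", "category": "crash"},
    {"id": "2", "category": "roadwork"},
    {"id": "3", "category": "crash"},
]


def batch_counts(records: List[Dict[str, str]]) -> Dict[str, int]:
    """Batch: the whole dataset is available, so aggregate it in one pass."""
    return dict(Counter(r["category"] for r in records))


def streaming_counts(
    records: Iterable[Dict[str, str]],
) -> Iterator[Dict[str, int]]:
    """Streaming: update running state per event and emit a snapshot each time."""
    running: Counter = Counter()
    for record in records:
        running[record["category"]] += 1
        yield dict(running)


if __name__ == "__main__":
    print("batch result:", batch_counts(INCIDENTS))
    for snapshot in streaming_counts(iter(INCIDENTS)):
        print("running result:", snapshot)
```

The batch function answers once after all data has arrived, while the streaming generator produces an up-to-date answer after every event; once the stream is exhausted, the final snapshot matches the batch result.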

Key Takeaways

Understand batch vs streaming data processing
Build batch pipelines with Mage AI and Apache Iceberg
Query data lakes using Starburst
Create real-time streaming pipelines with Confluent Kafka
Store and analyze streaming data in ClickHouse
Build interactive dashboards with Snowflake and Streamlit
Integrate AI/ML using Claude API and Snowflake Cortex AI
Deploy production-ready data solutions

Prerequisites

Basic understanding of data engineering concepts
Familiarity with Python
Laptop with internet access for cloud services

Required Materials

Laptop with internet access
Python 3.8+ installed
Cloud accounts (provided for the workshop)

Facilitator

Peter Hanssens (Facilitator)

Founder, DataEngBytes

I am the founder of DataEngBytes and six data engineering meetups across Australia and New Zealand, as well as the Sydney Serverless meetup. I am both the principal consultant and founder of Cloud Shuttle, a data cloud engineering consultancy based in Sydney, Australia. I've been recognised as an AWS Serverless Hero for my community work. I am passionate about the intersection of serverless and data engineering. In my spare time, I coach my son's soccer team and enjoy family holidays to various ski fields around the world.
