
Learn more, do more, and share more
he/him
Kamesh Sampath is Lead Developer Advocate at Snowflake. He focuses on learning more, doing more, and sharing more with the data community.
I care about this problem because I've spent too many years doing work that should not exist. Every new data source shows up with a slightly different schema, and suddenly smart builders are stuck writing custom ETL just to make data usable. Column names change, formats drift, pipelines break. We tell ourselves this is normal, but it quietly drains time, energy, and morale across data teams. At some point, that stopped feeling like an engineering problem and started feeling like a design failure.

In this talk, I'll show how we can use AI to understand data instead of just generating it, and what that looks like in real systems. Instead of constantly fixing schema drift by hand, we can design pipelines that understand data at a semantic level and adapt safely as it evolves.

This session is built around a live, code-driven demonstration. Using Apache NiFi for orchestration, LLMs for schema understanding, and Apache Iceberg for schema evolution, I'll walk through an end-to-end pipeline that ingests messy, inconsistent data and turns it into a single analytics-ready table without hand-written mappings.

This is not a theoretical AI talk. It's a practical, code-first demonstration of how to stop rebuilding ETL and start building systems that evolve with the business. If you're tired of rewriting the same pipelines and explaining why one small schema change broke everything, this talk is for you.
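To make the idea of "semantic schema mapping" concrete, here is a minimal sketch of the pattern the pipeline relies on. It is not the talk's actual code: the LLM call is stubbed out with a lookup table (in the real pipeline, a model would propose the column mapping), and all names such as `CANONICAL_SCHEMA`, `propose_mapping`, and `normalize` are illustrative. The point is the shape of the flow: drifted source columns get mapped to a stable canonical schema, and genuinely new columns are kept so a later schema-evolution step (e.g. an Iceberg `ADD COLUMN`) can absorb them.

```python
# Illustrative sketch: mapping a messy source record onto a canonical schema.
# The LLM step is stubbed; in a real pipeline the model would propose the mapping.

CANONICAL_SCHEMA = {"customer_name": str, "order_total": float, "order_date": str}

def propose_mapping(source_columns):
    """Stand-in for an LLM call that matches source column names to canonical ones.
    Here a static lookup table plays the model's role."""
    known = {
        "cust_nm": "customer_name", "customer": "customer_name",
        "amt": "order_total", "total": "order_total",
        "dt": "order_date",
    }
    return {col: known[col] for col in source_columns if col in known}

def normalize(record, mapping):
    """Rename drifted columns to canonical names.
    Unmapped columns are kept so a schema-evolution step can add them later;
    canonical columns missing from this source are filled with None."""
    out = {mapping.get(col, col): value for col, value in record.items()}
    for col in CANONICAL_SCHEMA:
        out.setdefault(col, None)
    return out

messy = {"cust_nm": "Ada", "amt": 42.0, "loyalty_tier": "gold"}
normalized = normalize(messy, propose_mapping(messy.keys()))
print(normalized)
# → {'customer_name': 'Ada', 'order_total': 42.0, 'loyalty_tier': 'gold', 'order_date': None}
```

Because every source funnels through the same canonical schema, downstream tables stay queryable even as upstream names drift, and only truly new columns trigger a (safe, additive) schema change.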