Trino and Starburst are well known for their performance and scalability when running analytical queries, but they are equally capable platforms for running data engineering transformation pipelines. This talk will explain how fault-tolerant execution mode adds robustness and reliability to the long-running jobs that are common in transformation pipelines. These pipelines can be written in SQL, but there are Python options as well. Using the PyStarburst and Ibis frameworks, the talk will show how data engineers can implement their transformation logic with familiar DataFrame APIs. Finally, these options will be reviewed for their applicability to Trino and Starburst clusters.
Lead Developer Advocate, Starburst Data
Lester Martin leads the DevRel function at Starburst. He is a seasoned developer advocate, trainer, blogger, and data engineer focused on data pipelines & data lake analytics using Starburst, Trino, Iceberg, Hive, Spark, Flink, Kafka, NiFi, NoSQL databases, and, of course, classical RDBMSs. Check out Lester's blog at https://lestermartin.blog.