Overview
Explore the practical applications of Gallia, a schema-aware data transformation library for Scala, in this 40-minute conference talk from Scala Days 2023 in Seattle. Discover how Gallia emphasizes practicality, readability, and scalability: it aims to make data transformations easier to write, to produce code that domain experts can read, and to process big data using Apache Spark RDDs. Learn about the library's internal workings, including its two directed acyclic graphs (DAGs), one for schema processing and one for data processing. Watch live coding demonstrations showcasing use cases for both small and large datasets. Gain insights into Gallia's strengths and weaknesses, its latest features such as Avro/Parquet support, and its future development plans. Understand how Gallia compares to alternative tools such as pandas and Apache Spark, and determine when it might be the right choice for your data transformation needs.
Syllabus
Introduction
What is Gallia
Getting started
Goals
When to use it
Team behind Gallia
Case classes
IO support
Target selection
Live coding
Adding additional data
Transformation
Basic processing
Spark Context
Writing RDDs
Scaling
Summary
Feedback
Taught by
Scala Days Conferences