Explore how data ingestion and analysis can be automated for the Federal Aviation Administration's System Wide Information Management (SWIM) Program in this 26-minute conference talk. See how a cloud-based solution built on Azure Databricks, Apache Kafka, and Spark-XML processes public SWIM datasets, including STDDS (SWIM Terminal Data Distribution System), TFMS (Traffic Flow Management System), and TBFM (Time Based Flow Management). Learn how to extract insights such as flight numbers, delays, and passenger impacts, and how the analysis could be extended to predict scenarios involving flights, airports, and passenger behavior. Find out how Infrastructure as Code and configuration management tools make the environment flexible enough to deploy anywhere. Understand the architecture, implementation process, and lessons learned from this project, which supports the FAA's NextGen goals and helps meet data-sharing requirements for the National Airspace System.
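To make the idea of pulling flight identifiers and delays out of XML messages concrete, here is a minimal sketch in plain Python. It is not the talk's implementation, which uses Spark-XML on Azure Databricks. The message layout, the element names (`flight`, `scheduledArrival`, `estimatedArrival`), and the helper `arrival_delays` are all hypothetical and far simpler than the real STDDS, TFMS, or TBFM schemas.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical, heavily simplified message. Real SWIM feeds use much richer schemas.
SAMPLE_XML = """
<flights>
  <flight id="ABC123">
    <scheduledArrival>2024-01-01T10:00:00</scheduledArrival>
    <estimatedArrival>2024-01-01T10:25:00</estimatedArrival>
  </flight>
  <flight id="XYZ789">
    <scheduledArrival>2024-01-01T11:00:00</scheduledArrival>
    <estimatedArrival>2024-01-01T10:55:00</estimatedArrival>
  </flight>
</flights>
"""


def arrival_delays(xml_text):
    """Return {flight id: arrival delay in minutes} for every <flight> element.

    A negative value means the flight is expected to arrive early.
    """
    root = ET.fromstring(xml_text.strip())
    delays = {}
    for flight in root.iter("flight"):
        scheduled = datetime.fromisoformat(flight.findtext("scheduledArrival"))
        estimated = datetime.fromisoformat(flight.findtext("estimatedArrival"))
        delays[flight.get("id")] = (estimated - scheduled).total_seconds() / 60
    return delays


if __name__ == "__main__":
    for flight_id, minutes in arrival_delays(SAMPLE_XML).items():
        print(f"{flight_id}: {minutes:+.0f} min")
```

In a pipeline like the one the talk describes, the same per-record extraction would presumably run over messages arriving through Kafka and be parsed at scale by Spark rather than one string at a time.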
Overview
Syllabus
Introduction
Infrastructure
Future State Architecture
Databricks Architecture
Lessons Learned
Taught by
Databricks