Overview
Explore how to orchestrate Apache Spark jobs with Argo Workflows in a cloud-native environment in this conference talk. Discover the challenges of managing dependencies in large computational workloads and learn how Kubernetes and Argo Workflows address them in distributed environments. Gain insights into the architecture, resource management, and workflow definitions needed to run Spark jobs on Kubernetes. Watch demonstrations of provisioning Spark and Argo Workflows, and understand the scaling and stability advantages they offer over traditional local or cloud environments. Evaluate the pros and cons of this approach to determine whether it suits your data processing needs.
Syllabus
Automating Cloud-native Spark Jobs with Argo Workflows - Caelan Urquhart & Darko Janjić, Pipekit
Taught by
Linux Foundation