Explore the challenges and solutions for packaging Python code in distributed computing environments through this conference talk. Dive into various methods for deploying Python code to compute clusters, examining the role of Python's pickling feature and self-contained executables. Learn about the complexities of shipping code to large-scale clusters with thousands of nodes running TensorFlow or Spark jobs. Discover how to run a PySpark job that uses S3 storage, with PEX as the self-contained executable artifact. Gain insights into generalizing these concepts to different job types, virtual environments, and distributed storage systems. Walk away with an understanding of Python packaging challenges for distributed applications and practical code samples applicable to your own projects.
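
To make the pickling point concrete, here is a minimal sketch that is not taken from the talk. It uses only the standard library and assumes `json.dumps` as a stand-in for any importable function a driver might send to workers. It shows that `pickle` stores such a function as a reference (module name plus function name) rather than as code, which is why the code itself still has to be present on every node.

```python
import json
import pickle

# Stand-in for "a function the driver wants workers to run".
# Any importable, module-level function behaves the same way.
func = json.dumps

# pickle serializes the function by reference: the payload contains the
# module name and the function name, not the function's bytecode.
payload = pickle.dumps(func)
has_module_name = b"json" in payload
has_function_name = b"dumps" in payload

# Unpickling imports the module and looks the name up again, so the
# receiving process must already have that module installed.
restored = pickle.loads(payload)

print(has_module_name, has_function_name, restored is func)
```

Because the payload holds only names, a worker that lacks the module cannot unpickle it. A self-contained artifact such as a PEX file addresses that gap by shipping the code and its dependencies together.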