Scala - The Unpredicted Lingua Franca for Data Science

Overview

Explore the unexpected rise of Scala as a prominent language for data science in this conference talk from Scala Days New York 2016. Discover how the increasing volume of data and the emergence of distributed technologies have shifted the landscape of data science languages. Learn about the growing importance of Apache Spark and its impact on different communities. Examine the advantages of using Scala for machine learning and data analysis, including its interactivity, live reactivity, charting capabilities, and robustness. Follow along as the speakers demonstrate practical examples using tools like the Spark Notebook and Docker, showcasing a productive and reproducible environment for data scientists. Gain insights into the evolving role of data scientists within heterogeneous teams and the importance of agility in integrating their work into larger platforms. Compare Scala with traditional languages like Python, R, and MATLAB, and understand why Scala is becoming an essential tool for modern data science workflows.

Syllabus

Intro
Why Data Science
Big Data
Business
Hadoop
Spark
Big Data Computing
Why Scala
Mathematica Notebook
Python vs Scala
Examples
Scala Case Classes
Scala DataFrames API
Scala SQL Query
Distributed Data Science
Team
What is missing
What we can do
Other initiatives