Overview
Explore the new features and improvements of DataSource V2 for the Spark Cassandra Connector in this 39-minute talk. See how gains in speed, flexibility, and usability benefit Cassandra and Spark integration. Learn how Spark can now understand Cassandra's internal clustering and how to manipulate the Cassandra catalog directly from Spark, along with other highlights. Topics include the RDD-backed DataSourceV1, catalog features, catalog architecture, setting up and using catalogs, inspecting table metadata, managing multiple catalogs for one or more clusters, creating tables and keyspaces, and using Cassandra table options. The talk also covers partitioning, TTL/Writetime support, InClause to Join functionality, and Astra / Java Driver 4.0 integration. It concludes with a look at future plans and an invitation to contribute to the open-source project.
Syllabus
Intro
Russell Spitzer
DataSourceV1 - RDD Backed
Catalog Features: Infinite Catalogs Allowed
Catalog Architecture: Catalog Connects to Cassandra
Catalog Setup
Setting up and Using a Catalog
Inspecting Table Metadata
Setting up Multiple Catalogs for One Cluster: A More Complicated Example
Setting up Multiple Catalogs for Multiple Clusters
Create Tables - Create Keyspaces
All Cassandra Table Options are Available
Partitioning
TTL/Writetime Support
InClause to Join
Astra / Java Driver 4.0
Future Plans
OSS needs you!
Taught by
Databricks