Overview
Learn how to unify AI and analytics workflows by running Presto SQL queries on a vector data lake stored in the Lance format in this 20-minute Presto Foundation conference talk. Discover how recent advances in generative AI, LLMs, computer vision, and robotics have created unprecedented demand for computational power and data management solutions. Explore an approach that removes the need for separate data silos and query systems by combining Presto's distributed analytical capabilities with the Lance format's vector data lake implementation. See how to run large-scale OLAP queries and data transformations directly on the AI datasets used for search, retrieval, and training, without format conversions or complex Python scripts. Understand how this unified solution delivers a 10x performance improvement on real-time search queries while remaining compatible with Presto's rich set of compute kernels, simplifying data management and reducing infrastructure costs.
Syllabus
Bridging the Divide: Running Presto SQL on a Vector Data Lake powered by Lance
Taught by
Presto Foundation