Efficient CSV Parsing - On the Complexity of Simple Things

DSDSD - Dutch Seminar on Data Systems Design via YouTube

Overview

In this technical talk from the Dutch Seminar on Data Systems Design, Pedro Holanda, a software engineer at DuckDB Labs and early DuckDB contributor, explains how DuckDB parses CSV files efficiently. The talk covers parallel parsing algorithms, CSV buffer management, and state machine transitions, and compares several parsing implementations. It also examines why handling diverse CSV file formats is hard, including exotic edge cases. Holanda holds a PhD from CWI's Database Architectures group and has expertise in indexes for interactive data analysis. The discussion is advanced and aimed at data system practitioners and researchers interested in database management system technology.
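
For readers unfamiliar with the state machine idea mentioned in the overview, the following is a minimal sketch in Python. It is not taken from the talk and is not DuckDB's implementation. The state names, the `parse_csv` function, and its parameters are hypothetical, and the sketch assumes a single-character delimiter, double-quote quoting with doubled quotes as the escape, and input that fits in memory as one string.

```python
from enum import Enum, auto


class State(Enum):
    FIELD_START = auto()      # at the beginning of a field
    UNQUOTED = auto()         # inside a field that did not open with a quote
    QUOTED = auto()           # inside a quoted field
    QUOTE_IN_QUOTED = auto()  # saw a quote inside a quoted field


def parse_csv(text, delimiter=",", quote='"'):
    """Split text into rows of string fields, one character at a time."""
    rows, row, field = [], [], []
    state = State.FIELD_START
    for ch in text:
        if state is State.QUOTED:
            # Delimiters and newlines are ordinary data inside quotes.
            if ch == quote:
                state = State.QUOTE_IN_QUOTED
            else:
                field.append(ch)
            continue
        if state is State.QUOTE_IN_QUOTED and ch == quote:
            # A doubled quote is an escaped literal quote.
            field.append(quote)
            state = State.QUOTED
            continue
        # From here on the field is unquoted or its closing quote was seen.
        if ch == delimiter:
            row.append("".join(field))
            field = []
            state = State.FIELD_START
        elif ch == "\n":
            row.append("".join(field))
            rows.append(row)
            row, field = [], []
            state = State.FIELD_START
        elif ch == "\r":
            pass  # simplification: drop carriage returns outside quotes
        elif ch == quote and state is State.FIELD_START:
            state = State.QUOTED
        else:
            field.append(ch)
            state = State.UNQUOTED
    # Flush the last record when the input lacks a trailing newline.
    if state is not State.FIELD_START or field or row:
        row.append("".join(field))
        rows.append(row)
    return rows


sample = 'id,note\n1,"said ""hi"", then left"\n2,"multi\nline"\n'
for record in parse_csv(sample):
    print(record)
```

Even this small sketch shows why the problem is harder than it looks: a newline inside quotes is data rather than a record boundary, so a parser cannot find row boundaries by splitting on newlines alone, which complicates dividing a file among parallel workers.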

Syllabus

We hold bi-weekly talks on Friday afternoons until 5 PM CET, for and by researchers and practitioners who design and implement data systems. The objective is to establish a new forum for the Dutch Data Systems community to unite, foster collaborations between its members, and bring in high-quality international speakers. We invite all researchers working on related topics, especially PhD students, to join the events. They are an excellent opportunity to receive early feedback from researchers in your field.

Taught by

DSDSD - Dutch Seminar on Data Systems Design
