

Large-Scale Data Extraction, Structuring and Matching Using Python and Spark

EuroPython Conference via YouTube

Overview

This EuroPython 2017 conference talk covers large-scale data extraction, structuring, and matching with Python and Apache Spark. It walks through the challenges of working in a big data environment: unzipping compressed archives at scale, extracting the relevant files from them, and pulling metadata out of XML and PDF files. It then shows how to match meta-information across different data collections, with a focus on scientific publications and user profiles spread over various repositories and platforms.
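To make the pipeline described above more concrete, here is a minimal single-machine sketch using only the Python standard library. It is not the speaker's implementation: the XML schema (`<record>`, `<title>`, `<author>`), the function names, and the exact-match-on-normalized-title rule are all assumptions chosen for illustration. PDF extraction is left out.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def make_archive(files):
    """Build a zip archive in memory from {name: bytes}; used only for the demo."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def extract_xml_members(archive_bytes):
    """Yield (member_name, raw_bytes) for every .xml file in a zip archive."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        for name in zf.namelist():
            if name.lower().endswith(".xml"):
                yield name, zf.read(name)


def parse_metadata(xml_bytes):
    """Read title and authors from one record (hypothetical schema)."""
    root = ET.fromstring(xml_bytes)
    title = (root.findtext("title") or "").strip()
    authors = [a.text.strip() for a in root.findall("author") if a.text]
    return {"title": title, "authors": authors}


def normalize(title):
    """Lowercase, replace non-alphanumerics with spaces, collapse whitespace."""
    cleaned = "".join(c if c.isalnum() else " " for c in title.lower())
    return " ".join(cleaned.split())


def match_by_title(left, right):
    """Pair records from two collections whose normalized titles are equal."""
    index = {normalize(r["title"]): r for r in right if normalize(r["title"])}
    pairs = []
    for rec in left:
        key = normalize(rec["title"])
        if key and key in index:
            pairs.append((rec, index[key]))
    return pairs


if __name__ == "__main__":
    archive = make_archive({
        "batch/rec1.xml": b"<record><title>Deep Learning: A Review</title>"
                          b"<author>A. Smith</author></record>",
        "batch/readme.txt": b"not a metadata file",
    })
    publications = [parse_metadata(data) for _, data in extract_xml_members(archive)]
    profiles = [{"title": "deep learning - a review", "authors": []}]
    print(match_by_title(publications, profiles))
```

Because `extract_xml_members` and `parse_metadata` take raw bytes and keep no shared state, they could in principle be applied per archive on a cluster, for example over the `(path, bytes)` pairs returned by PySpark's `SparkContext.binaryFiles`. Whether the talk does it this way is not stated in this overview.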

Syllabus

Deep Kayal - Large-scale data extraction, structuring and matching using Python and Spark

Taught by

EuroPython Conference
