High Volume PDF Text Extraction Using Python Open-Source Tools

EuroPython Conference via YouTube

Overview

Explore high-volume PDF text extraction with open-source Python tools in this EuroPython 2023 conference talk. Learn why extracting information from large volumes of PDF documents matters for corporate decision-making and long-term forecasting. Discover how to tackle the challenges of processing unstructured data and integrating OCR capabilities. Gain insights into achieving both high performance and maximum extraction detail with an open-source toolset designed for Big Data scenarios. Understand the "need for speed" in text extraction and how to recreate structured information from millions of pages of documents.

Syllabus

High Volume PDF Text Extraction using Python Open-Source Tools — Harald Lieder

Taught by

EuroPython Conference
