Spark NLP - State-of-the-Art Natural Language Processing at Scale

Overview

Explore Spark NLP in this 30-minute conference talk from Databricks on state-of-the-art natural language processing at scale with Apache Spark. Learn how Spark NLP integrates with Spark ML pipelines to enable distributed, zero-copy NLP and ML workflows. The talk covers core NLP algorithms, including lemmatization, part-of-speech tagging, dependency parsing, named entity recognition, spell checking, and sentiment detection. It explains how BERT embeddings are implemented for named entity recognition and looks at post-BERT embeddings for multilingual and multi-domain natural language understanding. Live demonstrations show how to build common NLP pipelines with PySpark in notebooks, and recent accuracy benchmarks are compared against state-of-the-art results. The talk also covers design best practices for building NLP, ML, and deep learning pipelines on Spark for language processing in data science systems.