Overview
Explore advanced Natural Language Processing techniques in this hands-on deep-dive session on Spark NLP, an open-source library built on Apache Spark. Learn to implement state-of-the-art NLP tasks, including document classification, named entity recognition, sentiment analysis, spell checking and correction, grammar understanding, question answering, and translation. Edit and run executable Python notebooks while working through recent advances in deep learning and transfer learning, from BERT-based embeddings to T5 and MarianNMT transformer models. See how Spark NLP scales and performs: it runs natively on Apache Spark clusters and can take advantage of modern Intel and Nvidia processors. Understand why the library is widely adopted in enterprise environments and how it balances accuracy, speed, and scalability for language understanding tasks.
Syllabus
Introduction
What is Spark NLP
NLP Libraries
NLP Pipeline
Spark NLP P3
NLP Library Website
Live Demos
Spell Correction
Accuracy
Open Source Library
Academic Paper
Optimal Accuracy
Transfer Learning
MultiClass Classification
Language Classifier
Context Spell Checker
Speed and Accuracy
Size
Optimized Builds
Scaling
Benchmarks
Pipeline
Pipeline Code
Speed
Document Classifier
Fit Pipeline
More Examples
Spark NLP for Healthcare
Spark OCR
Taught by
Databricks