What you'll learn:
- Understand the basic concepts of natural language processing, such as part-of-speech tagging, lemmatization, stemming, named entity recognition, and stop words
- Understand further concepts, such as dependency parsing, tokenization, and word and sentence similarity
- Load texts from the Internet to apply natural language processing techniques
- Visualize the most frequent terms using a word cloud
- Implement text summarization and keyword search
- Learn how to represent texts using Bag of Words and TF-IDF
- Implement sentiment analysis using the NLTK (Natural Language Toolkit) library, TF-IDF, and the spaCy library
Natural Language Processing (NLP) is a subarea of Artificial Intelligence that aims to make computers capable of understanding human language, both written and spoken. Practical applications include machine translation between languages, text-to-speech and speech-to-text conversion, chatbots, automatic question answering (Q&A) systems, automatic generation of image descriptions, subtitle generation for videos, and sentiment classification of sentences, among many others. Learning this area can be the key to building real solutions for present and future needs!
With that in mind, this course is designed for those who want to start or advance a career in Natural Language Processing, using the Python programming language and the spaCy and NLTK (Natural Language Toolkit) libraries. spaCy was developed with a focus on production use, so it can support applications that process large volumes of data. It can be used to extract information, understand natural language, and preprocess texts for later use in deep learning models.
The course is divided into three parts:
In the first part, you will learn the most basic natural language processing concepts, such as part-of-speech tagging, lemmatization, stemming, named entity recognition, stop words, dependency parsing, word and sentence similarity, and tokenization.
In the second part, you will learn more advanced topics, such as writing a preprocessing function, word clouds, text summarization, keyword search, bag of words, TF-IDF (Term Frequency - Inverse Document Frequency), and cosine similarity. We will also simulate a chatbot that can answer questions about any subject you want!
Finally, in the third and last part of the course, we will create a sentiment classifier using a real Twitter dataset! We will implement the classifier using NLTK, TF-IDF, and the spaCy library.
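To give a flavor of the second part, here is a minimal sketch of how TF-IDF and cosine similarity can be combined to pick the sentence that best matches a question, which is roughly the idea behind a simple retrieval-style chatbot. It uses only the Python standard library rather than spaCy or NLTK, and the function names (`tokenize`, `tf_idf_vectors`, `cosine_similarity`, `best_match`), the smoothed IDF formula, and the sample sentences are assumptions made for illustration, not the course's actual code.

```python
import math
from collections import Counter

PUNCTUATION = ".,!?;:"

def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]

def tf_idf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumed variant): log((1 + N) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * idf[t] for t, c in counts.items()})
    return vectors, idf

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def best_match(question, docs):
    """Return the index of the document most similar to the question."""
    vectors, idf = tf_idf_vectors(docs)
    q_tokens = tokenize(question)
    q_counts = Counter(q_tokens)
    # Terms that never occur in the documents are ignored.
    q_vec = {t: (c / len(q_tokens)) * idf[t] for t, c in q_counts.items() if t in idf}
    scores = [cosine_similarity(q_vec, v) for v in vectors]
    return max(range(len(docs)), key=scores.__getitem__)

# Hypothetical sample sentences standing in for a loaded text.
docs = [
    "spaCy is a library for natural language processing.",
    "A word cloud shows the most frequent terms.",
    "Cosine similarity compares two vectors.",
]
answer_index = best_match("What is a word cloud?", docs)
print(docs[answer_index])
```

In practice, a preprocessing step (stop-word removal, lemmatization) would usually run before vectorizing, and a library implementation of TF-IDF would replace the hand-written one.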
This can be considered a first course in natural language processing, and after completing it you can move on to more advanced materials. If you have never heard of natural language processing, this course is for you! By the end, you will have the practical background to develop some simple projects and to take more advanced courses. During the lectures, the code is implemented step by step in Google Colab, so you should not need to install or configure any software on your local machine.