What you'll learn:
- Tackle new challenges by understanding the underlying method and approach to take
- Scrape static webpages
- Scrape websites that use JavaScript
- Extract all sorts of data from websites
- Know what to look for and how to approach parsing a website
- Gather data from all over the internet
- Use recursive algorithms to search through website content
Web scraping is the art of picking data out of a website by reading its HTML code and identifying patterns that locate the data you want. This data can then be gathered and later used for your own analysis.
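To make the idea of "identifying patterns" concrete, here is a minimal sketch that uses only Python's standard library rather than the tools taught in the course. The sample HTML, the `price` class name, and the helper names are all made up for illustration; the pattern being exploited is simply "every price sits in a `<span>` with the class `price`".

```python
from html.parser import HTMLParser

# Hypothetical page snippet: each price is wrapped in <span class="price ...">.
SAMPLE_HTML = """
<ul>
  <li><span class="name">Widget</span> <span class="price">9.99</span></li>
  <li><span class="name">Gadget</span> <span class="price sale">4.50</span></li>
</ul>
"""


class ClassTextCollector(HTMLParser):
    """Collect the text of every <tag> element carrying a given CSS class."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self._inside = False  # True while we are inside a matching element
        self.items = []

    def handle_starttag(self, tag, attrs):
        # The class attribute may hold several space-separated class names.
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.css_class in classes:
            self._inside = True
            self.items.append("")

    def handle_endtag(self, tag):
        if tag == self.tag:
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.items[-1] += data


def extract_by_class(html, tag, css_class):
    """Return the text of all matching elements, in document order."""
    collector = ClassTextCollector(tag, css_class)
    collector.feed(html)
    return collector.items


prices = extract_by_class(SAMPLE_HTML, "span", "price")
print(prices)  # ['9.99', '4.50']
```

The sketch deliberately ignores nested elements of the same tag; a real parser library handles those cases for you.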
In this course we will go over the basics of web scraping and crawling and learn how to extract data from websites, all guided by a worked example.
We'll start with the simpler task of scraping static websites. We'll do this using requests to download the website data and BeautifulSoup to parse it easily.
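As a rough sketch of what the static-page workflow can look like, the snippet below downloads a page with requests and pulls out its links with BeautifulSoup. The function names and the choice to extract links are assumptions made for illustration, not the course's own example, and both third-party packages (`requests` and `beautifulsoup4`) need to be installed.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url, timeout=10):
    """Download a page and return its HTML as text."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.text


def extract_links(html):
    """Return (text, href) pairs for every <a> tag that has an href."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        (a.get_text(strip=True), a["href"])
        for a in soup.find_all("a", href=True)
    ]


def scrape_links(url):
    """Fetch a static page and extract its links in one step."""
    return extract_links(fetch_html(url))
```

Calling `scrape_links("https://example.com")` (a placeholder URL) would return the links found on that page. Keeping the download and the parsing in separate functions makes it possible to try the parsing step on a saved HTML string without touching the network.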
Once we have the hang of the fundamentals we'll move on to dynamic websites that use JavaScript to render their content. In this section of the course we'll use Selenium to render the pages for us, which gives us the full page of information. We'll also learn to do commonly needed things like clicking on buttons (e.g. when a page has a pop-up) or sending text into a form, in case your scraper needs to perform searches or log in somewhere.
By the end of the course you should be able to go off on your own, pick out most common websites, and extract all the relevant data you may need using just Python code.