What you'll learn:
- Set up a Python environment
- Create and activate a virtual environment
- Build a Python script
- Prototype a Python script
- Extract data from a web page with a Python script
- Automatically save the extracted data
Python is an interpreted, high-level, general-purpose programming language. Its design philosophy emphasizes code readability through the use of significant indentation. Its language constructs and object-oriented approach aim to help programmers write clear, logical code for both small and large-scale projects.
Web scraping (data extraction) is the process of gathering information from the Internet. Even copying and pasting the lyrics of your favourite song is a form of web scraping! However, the words “web scraping” usually refer to a process that involves automation. Some websites don’t like it when automatic scrapers gather their data, while others don’t mind.
If you’re scraping a page respectfully for educational purposes, then you’re unlikely to have any problems. Still, it’s a good idea to do some research on your own and make sure that you’re not violating any Terms of Service before you start a large-scale project.
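One practical complement to reading the Terms of Service is checking a site's robots.txt file, which states the paths that automated clients are asked to avoid. The sketch below uses Python's standard `urllib.robotparser` module. The robots.txt content, the `my-scraper` user agent, the `example.com` URLs, and the `is_allowed` helper are all assumptions made up for illustration, and a robots.txt check does not replace reading the Terms of Service.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the example runs offline.
# A real file would normally be found at https://<site>/robots.txt.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "my-scraper" and the example.com URLs are placeholders.
print(is_allowed(SAMPLE_ROBOTS_TXT, "my-scraper", "https://example.com/songs/"))
print(is_allowed(SAMPLE_ROBOTS_TXT, "my-scraper", "https://example.com/private/data"))
```

With these sample rules, the first URL is permitted and the second is not, because it falls under the disallowed `/private/` path.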
You can scrape any site on the Internet that you can look at, but the difficulty of doing so depends on the site.
This course offers you an introduction to web scraping to help you understand the overall process. You can then apply the same process to any website you want to scrape.
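The overall process can be sketched in three steps: get the page's HTML, extract the pieces you care about, and save them. The example below is a minimal illustration that uses only Python's standard library and an inline HTML sample instead of a live download, so it runs without a network connection. The sample markup, the `title` class, and the `TitleParser`, `extract_titles`, and `to_csv` names are assumptions for illustration; the course itself may use different tools.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical page content; a real scraper would download this first.
SAMPLE_HTML = """
<html><body>
  <h2 class="title">First song</h2>
  <h2 class="title">Second song</h2>
  <p>Unrelated paragraph</p>
</body></html>
"""


class TitleParser(HTMLParser):
    """Collect the text of every <h2 class="title"> element."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs.
        if tag == "h2" and ("class", "title") in attrs:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title and data.strip():
            self.titles.append(data.strip())


def extract_titles(html):
    """Step 2: pull the relevant data out of the HTML."""
    parser = TitleParser()
    parser.feed(html)
    return parser.titles


def to_csv(titles):
    """Step 3: serialize the extracted data as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["title"])
    writer.writerows([title] for title in titles)
    return buffer.getvalue()


titles = extract_titles(SAMPLE_HTML)
print(to_csv(titles))
```

To save the result automatically, the CSV text could be written to a file instead of printed, for example with `open("titles.csv", "w", newline="")`.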
Before you write any Python code, you need to get to know the website that you want to scrape. That should be your first step for any web scraping project you want to tackle. You’ll need to understand the site structure to extract the information that’s relevant to you. Start by opening the site you want to scrape in your favourite browser.
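Alongside inspecting the page in your browser, a quick tally of the tags and CSS classes a page uses can hint at where the repeating, data-carrying elements live. The following is a rough sketch using the standard library's `html.parser`. The sample markup and the `StructureSummary` and `summarize` names are hypothetical, and it assumes the page's HTML is already available as a string.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical markup standing in for a page you have opened or saved.
SAMPLE_PAGE = """
<html><body>
  <div id="results">
    <div class="card"><h2 class="title">A</h2><p class="meta">x</p></div>
    <div class="card"><h2 class="title">B</h2><p class="meta">y</p></div>
  </div>
</body></html>
"""


class StructureSummary(HTMLParser):
    """Count how often each tag and each CSS class appears."""

    def __init__(self):
        super().__init__()
        self.tag_counts = Counter()
        self.class_counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.tag_counts[tag] += 1
        for name, value in attrs:
            # A class attribute may hold several space-separated names.
            if name == "class" and value:
                self.class_counts.update(value.split())


def summarize(html):
    """Return (tag_counts, class_counts) for the given HTML string."""
    summary = StructureSummary()
    summary.feed(html)
    return summary.tag_counts, summary.class_counts


tags, classes = summarize(SAMPLE_PAGE)
print(tags.most_common())
print(classes.most_common())
```

Classes that repeat, such as `card` in this sample, often mark the elements that hold one record each, which makes them natural targets when you write the extraction step.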