Class Central Classrooms beta
YouTube videos curated by Class Central.
Classroom Contents
Solving Real World Data Science Tasks With Python Beautiful Soup - Movie Dataset Creation
- 1 - Video overview
- 2 - Check out DataCamp! sponsored
- 3 - Setup
- 4 - Task #1: Scrape the infobox from Toy Story 3 wiki page save in Python dictionary
- 5 - Task #2: Scrape infobox for all movies in List of Disney Films save as list of dictionaries
- 6 - Robots.txt: Are you allowed to scrape a site?
- 7 - Task #2: Scrape infobox for all movies in List of Disney Films save as list of dictionaries
- 8 - Save & Load dataset checkpoint JSON file
- 9 - Task #3: Clean our data!
- 10 - Task #3.1: Strip out all references [1], [2], etc. from HTML
- 11 - Task #3.2: Split up the long strings
- 12 - Task #3.3: Examine errors we are getting
- 13 - Task #3.4: Convert “Running time” field to an integer
- 14 - Task #3.5: Convert “Budget” & “Box office” fields to floats
- 15 - Task #3.6: Convert dates into datetime objects
- 16 - Saving our data again using Pickle
- 17 - Task #4: Attach IMDB, Metascore, and Rotten Tomatoes scores to dataset working with APIs
- 18 - Task #5: Save final dataset as a JSON file and as a CSV file