YouTube videos curated by Class Central.
Classroom Contents
Solving Real World Data Science Tasks With Python Beautiful Soup - Movie Dataset Creation
- 1 - Video overview
- 2 - Check out DataCamp! (sponsored)
- 3 - Setup
- 4 - Task #1: Scrape the infobox from the Toy Story 3 wiki page and save it in a Python dictionary
- 5 - Task #2: Scrape the infobox for all movies in the List of Disney Films and save them as a list of dictionaries
- 6 - Robots.txt: Are you allowed to scrape a site?
- 7 - Task #2: Scrape the infobox for all movies in the List of Disney Films and save them as a list of dictionaries
- 8 - Save and load a dataset checkpoint as a JSON file
- 9 - Task #3: Clean our data!
- 10 - Task #3.1: Strip out all references ([1], [2], etc.) from the HTML
- 11 - Task #3.2: Split up the long strings
- 12 - Task #3.3: Examine errors we are getting
- 13 - Task #3.4: Convert “Running time” field to an integer
- 14 - Task #3.5: Convert “Budget” & “Box office” fields to floats
- 15 - Task #3.6: Convert dates into datetime objects
- 16 - Saving our data again using Pickle
- 17 - Task #4: Attach IMDB, Metascore, and Rotten Tomatoes scores to the dataset (working with APIs)
- 18 - Task #5: Save the final dataset as a JSON file and as a CSV file
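
The cleaning steps named in items 10 and 13 to 15 can be sketched with the Python standard library alone. This is a minimal illustration, not the code from the video: the function names, the sample strings, and the assumed input formats (for example "$200 million" or "June 18, 2010") are all hypothetical.

```python
import re
from datetime import datetime


def strip_references(text):
    """Remove bracketed citation markers such as [1] or [23]."""
    return re.sub(r"\[\d+\]", "", text).strip()


def running_time_to_int(value):
    """Return the first integer found in a string like '103 minutes', else None."""
    match = re.search(r"\d+", value)
    return int(match.group()) if match else None


def money_to_float(value):
    """Convert strings like '$200 million' or '$1,500,000' to a float in dollars.

    Assumes a single amount with an optional word multiplier; returns None
    when no number is found.
    """
    multipliers = {"thousand": 1e3, "million": 1e6, "billion": 1e9}
    match = re.search(
        r"(\d[\d,]*(?:\.\d+)?)\s*(thousand|million|billion)?",
        value,
        re.IGNORECASE,
    )
    if not match:
        return None
    number = float(match.group(1).replace(",", ""))
    word = match.group(2)
    return number * multipliers[word.lower()] if word else number


def parse_date(value, fmt="%B %d, %Y"):
    """Parse a date written like 'June 18, 2010' (assumed format)."""
    return datetime.strptime(value, fmt)


# Hypothetical raw record, shaped like a scraped infobox entry
raw = {
    "Directed by": "Lee Unkrich[1]",
    "Running time": "103 minutes",
    "Budget": "$200 million",
    "Release date": "June 18, 2010",
}

cleaned = {
    "Directed by": strip_references(raw["Directed by"]),
    "Running time (int)": running_time_to_int(raw["Running time"]),
    "Budget (float)": money_to_float(raw["Budget"]),
    "Release date (datetime)": parse_date(raw["Release date"]),
}
print(cleaned)
```

Real infobox values are messier than these samples (ranges, multiple currencies, several dates per field), so each helper would likely need extra cases before it could run over a full dataset.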