In the capstone, you will work on a real-world project that requires you to apply skills from the entire data science pipeline: preparing, organizing, and transforming data; constructing a model; and evaluating results. Through a collaboration with Coursolve, each capstone project is associated with partner stakeholders who have a vested interest in your results and are eager to deploy them in practice. These projects will not be straightforward, and the outcome is not prescribed -- you will need to tolerate ambiguity and negative results! But we believe the experience will be rewarding and will better prepare you for data science projects in practice.
Overview
Syllabus
- Project A: Blight Fight
- In this project, you will build a model to predict when a building is likely to be condemned. The data is real, the problem is real, and the impact is real.
- Week 2: Derive a list of buildings
- You are given sets of incidents with location information. You will need to make some assumptions in order to group these incidents by location and identify specific buildings.
- Week 3: Construct a training dataset
- Construct a training set by associating each of your buildings with a ground truth label derived from the permit data.
- Week 4: Train and evaluate a simple model
- Use a trivial feature set to train and evaluate a simple model.
- Week 5: Feature Engineering
- Derive additional features and retrain to improve the performance of your model.
- Week 6: Final Report
- Submit your final report for grading.
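To give a feel for the first few weeks of the project, here is a minimal Python sketch of one possible approach, not the course's prescribed method. It assumes that incidents and permits are dictionaries with `lat` and `lon` fields, that two records belong to the same building when their coordinates round to the same four decimal places (roughly 11 m of latitude), and that any matching permit record marks a building as condemned. The field names, coordinates, incident types, and the count-threshold "model" are all hypothetical and are only there for illustration.

```python
from collections import defaultdict


def building_key(lat, lon, precision=4):
    """Round coordinates so that nearby records share one key.

    Assumption: records on the same rounded grid cell are one building.
    """
    return (round(lat, precision), round(lon, precision))


def group_incidents(incidents, precision=4):
    """Week 2 idea: group incidents by location to derive buildings."""
    buildings = defaultdict(list)
    for incident in incidents:
        key = building_key(incident["lat"], incident["lon"], precision)
        buildings[key].append(incident)
    return dict(buildings)


def label_buildings(buildings, permits, precision=4):
    """Week 3 idea: label a building True if a permit record matches it.

    Assumption: every permit record here indicates a condemned building.
    """
    condemned = {building_key(p["lat"], p["lon"], precision) for p in permits}
    return {key: key in condemned for key in buildings}


def incident_counts(buildings):
    """Week 4 idea: a trivial feature, the number of incidents per building."""
    return {key: len(records) for key, records in buildings.items()}


def threshold_accuracy(features, labels, threshold):
    """Fraction of buildings where 'count >= threshold' matches the label."""
    correct = sum((features[k] >= threshold) == labels[k] for k in features)
    return correct / len(features)


# Made-up example records (hypothetical coordinates and incident types).
incidents = [
    {"lat": 42.33141, "lon": -83.04582, "type": "violation"},
    {"lat": 42.33139, "lon": -83.04581, "type": "complaint"},
    {"lat": 42.36010, "lon": -83.07000, "type": "complaint"},
]
permits = [{"lat": 42.33140, "lon": -83.04580}]

buildings = group_incidents(incidents)
labels = label_buildings(buildings, permits)
features = incident_counts(buildings)
accuracy = threshold_accuracy(features, labels, threshold=2)
print(len(buildings), "buildings, accuracy:", accuracy)
```

This sketch scores the rule on the same records it was tuned on, so the number it prints is not a real evaluation. In the project you would hold out part of the labeled buildings for testing, and you would likely need a more careful grouping rule than simple coordinate rounding.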
Taught by
Bill Howe