Overview
Syllabus
Intro
CRISP-DM - The Cross Industry Standard Process for Data Mining is a process model with six phases that naturally describes the data science life cycle. It's like a set of guardrails to help you plan, organize, and implement your data science project. 1. Business understanding 2. Data understanding
Business Understanding Focus on understanding the objectives and requirements of the project. 1. Determine business objectives: You should first thoroughly understand, from a business perspective, what the customer really wants to accomplish (CRISP-DM Guide), and then define business success criteria
Data Understanding This phase drives the focus to identify, collect, and analyze the data sets that can help you accomplish the project goals. It has four tasks
Data Preparation This phase, which is often referred to as "data munging", prepares the final data set(s) for modeling. It has five tasks: 1. Select data: Determine which data sets will be used and document reasons for
Modeling Here you'll likely build and assess various models based on several different modeling techniques. It has four tasks
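As a rough illustration of building and assessing several models, the sketch below fits two deliberately simple techniques and scores each on a held-out set. The data, the two baseline techniques, and the choice of mean absolute error as the metric are all assumptions made for this example; they are not part of CRISP-DM or the course material.

```python
from statistics import mean, median

# Hypothetical toy data: training values and a small held-out set.
train = [10.0, 12.0, 11.0, 13.0, 50.0]
holdout = [11.0, 12.0, 14.0]


def fit_mean(values):
    """Technique 1: always predict the training mean."""
    m = mean(values)
    return lambda: m


def fit_median(values):
    """Technique 2: always predict the training median."""
    m = median(values)
    return lambda: m


def mean_absolute_error(predict, actuals):
    """Assess a fitted model on held-out values."""
    return mean(abs(predict() - y) for y in actuals)


# Build one model per technique, then assess each on the same holdout set.
techniques = {"mean_baseline": fit_mean, "median_baseline": fit_median}
scores = {
    name: mean_absolute_error(fit(train), holdout)
    for name, fit in techniques.items()
}
best = min(scores, key=scores.get)
print(scores, best)
```

On this toy data the median baseline scores better because one outlier in the training values pulls the mean away from the holdout values. A real project would compare richer techniques with a proper validation design.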
Evaluation The Evaluation phase looks more broadly at which model best meets the business needs and what to do next. It has three tasks: 1. Evaluate results: Do the models meet the business success criteria? Which one(s) should we approve for the business? 2. Review process: Review the work accomplished
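The "Evaluate results" question can be sketched as a filter over candidate models followed by a ranking. Everything in this example is hypothetical: the candidate names, the precision and cost figures, and the two thresholds standing in for business success criteria defined during Business Understanding.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    precision: float      # technical quality measure (assumed)
    monthly_cost: float   # operating cost in currency units (assumed)


# Hypothetical business success criteria.
MIN_PRECISION = 0.80
MAX_MONTHLY_COST = 1000.0

candidates = [
    Candidate("model_a", 0.91, 1500.0),  # accurate but too expensive
    Candidate("model_b", 0.85, 400.0),   # meets both criteria
    Candidate("model_c", 0.70, 100.0),   # cheap but not accurate enough
]


def meets_criteria(c: Candidate) -> bool:
    """True when the candidate satisfies every business success criterion."""
    return c.precision >= MIN_PRECISION and c.monthly_cost <= MAX_MONTHLY_COST


# Keep only the models that meet the criteria, then pick the most precise one.
approved = [c for c in candidates if meets_criteria(c)]
recommended = max(approved, key=lambda c: c.precision, default=None)
print([c.name for c in approved], recommended)
```

The point of the sketch is that the model with the best technical score (model_a here) is not necessarily the one approved for the business.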
Deployment A model is not particularly useful unless the customer can access its results. The complexity of this phase varies widely. It has four tasks: 1. Plan deployment: Develop and document a plan for deploying the model ... avoid issues during the operational phase (or post-project phase) of a model
TDSP - Microsoft's Team Data Science Process. Launched in 2016, TDSP is "an agile, iterative data science methodology to deliver predictive analytics solutions and intelligent applications efficiently." Microsoft explains that "TDSP helps improve team collaboration and learning. It contains a distillation of the best practices and structures from Microsoft and others in the industry that facilitate the successful implementation of data science initiatives."
Taught by
PASS Data Community Summit