Learn the key concepts and skills behind one of the most important elements of data science: data mining.
Syllabus
Introduction
- Python for data mining
- What you should know
- Exercise files
- Tools for data mining
- The CRISP-DM data mining model
- Privacy, copyright, and bias
- Validating results
- Dimensionality reduction overview
- Handwritten digits dataset
- PCA
- LDA
- t-SNE
- Challenge: PCA
- Solution: PCA
- Clustering overview
- Penguin dataset
- Hierarchical clustering
- K-means
- DBSCAN
- Challenge: K-means
- Solution: K-means
- Classification overview
- Spambase dataset
- KNN
- Naive Bayes
- Decision trees
- Challenge: KNN
- Solution: KNN
- Association analysis overview
- Groceries dataset
- Apriori
- Eclat
- FP-Growth
- Challenge: Apriori
- Solution: Apriori
- Time-series mining
- Air Passengers dataset
- Time-series decomposition
- ARIMA
- MLP
- Challenge: Decomposition
- Solution: Decomposition
- Text mining overview
- Iliad dataset
- Sentiment analysis: Binary classification
- Sentiment analysis: Sentiment scoring
- Word pairs
- Challenge: Sentiment scoring
- Solution: Sentiment scoring
- Next steps
Taught by
Barton Poulson