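This course provides complete coverage of the core techniques in data mining, including data preprocessing, classification, clustering, regression, association, recommendation, ensemble learning, and evolutionary computation. It seeks the best balance among breadth, depth, and enjoyment, presenting in a lively and humorous way the core ideas and key techniques of data mining, along with important topics rarely covered in related courses and textbooks. The course is suited to students of all majors, as well as engineers and technical professionals, who are interested in big data and data science. Rather than pursuing purely theoretical derivation, it combines theory with practice so that students acquire knowledge that is living, useful, and truly their own, especially the research methods and ways of thinking used in data analysis.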
Despite the large volume of data mining papers and tutorials available on the web, aspiring data scientists find it surprisingly difficult to locate an overview that blends clarity, technical depth and breadth with enough amusement to make big data analytics engaging. This course does just that.
Each module starts with an interesting real-world example that gives rise to the specific research question of interest.
Students are then presented with a general idea of how to tackle this problem along with some intuitive and straightforward approaches.
Finally, a number of representative algorithms are introduced along with concrete examples that show how they function in practice.
While theoretical analysis sometimes overcomplicates things for students, here it’s applied to help them better understand the key features of the techniques.