Overview
Course Description
Building an analytical platform on Alibaba Cloud can improve how you ingest, analyze, and present clear metrics from Big Data. This course teaches engineers how to use Alibaba Cloud Big Data products. It covers basic distributed system theory and Alibaba Cloud's core products, such as MaxCompute, DataWorks, and E-MapReduce, as well as a bundle of ecosystem tools.
To earn an official Alibaba Cloud certificate, please join the Cloud Native courses on the Academy's website:
Big Data Analysis Specialty: https://edu.alibabacloud.com/course/317
Machine Learning Specialty: https://edu.alibabacloud.com/course/318
Alibaba Cloud Big Data - Data Integration: https://edu.alibabacloud.com/certification/clouder_bigdatainteg
Syllabus
- Intro to Hadoop
- Apache Hadoop is a collection of open-source software utilities that facilitate using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model. This module will give you an introduction to Hadoop's features.
- Hadoop on Alibaba Cloud
- This module will dive deeper into the functions of Hadoop and particularly their applications through the Alibaba Cloud Platform. Courses will touch upon the uses of E-MapReduce, Hive, and Spark.
- Big Data Product Overview
- Get an overview of all of Alibaba Cloud's Big Data products, their different architectures, and their use scenarios.
- MaxCompute Basic
- MaxCompute (previously known as ODPS) is a general-purpose, fully managed, multi-tenant data processing platform for large-scale data warehousing. MaxCompute supports various data importing solutions and distributed computing models, enabling users to effectively query massive datasets, reduce production costs, and ensure data security. Learn the basics of using this product in this module.
- MaxCompute SQL
- MaxCompute SQL is used for offline batch computing and computing scenarios that involve gigabytes, terabytes, or exabytes of data. MaxCompute is suitable for batch jobs that process large volumes of data. Learn more about the MaxCompute SQL language and its uses in this module.
- MaxCompute UDF
- MaxCompute User-Defined Functions help users customize their data engine to produce useful results. Learn how to develop functions and apply them to MaxCompute in this module.
- MaxCompute Security
- MaxCompute is designed to handle security issues in multi-tenant scenarios by using symmetric AccessKey pairs. MaxCompute security measures help meet the requirements for multi-user collaboration, data sharing, data confidentiality, and data security. Learn more in this module.
- DataWorks Basic
- DataWorks is a Big Data platform product launched by Alibaba Cloud. It provides one-stop Big Data development, data permission management, and offline job scheduling. The processes of data acquisition, processing, and monitoring are all explained in this module.
- Data Visualization
- Displaying your data in a clear and concise way is the key final step to making your data work for you. This module explains different types of graphing methods and includes a demo that walks users through creating their first graphs.
- PAI Overview
- The Platform for Artificial Intelligence (PAI) helps users design machine learning algorithms that learn from large sets of data and become more accurate and useful over time. This module covers PAI's basic architecture and best practices.
Taught by
Derek Meng