What you'll learn:
- Understand the differences between processing data using CPU and GPU
- Use cuDF as a replacement for pandas for GPU-accelerated processing
- Write code that manipulates DataFrames with cuDF
- Use CuPy as a replacement for NumPy for GPU-accelerated processing
- Use cuML as a replacement for scikit-learn for GPU-accelerated processing
- Implement a complete machine learning project using cuDF and cuML
- Compare the performance of classic Python libraries that run on the CPU with RAPIDS libraries that run on the GPU
- Implement projects with Dask for parallel and distributed processing
- Integrate Dask with cuDF and cuML for GPU performance
Data science and machine learning are among the largest computational sectors in the world, where modest improvements in the accuracy of analytical models can translate into billions of dollars on the bottom line. Data scientists constantly train, evaluate, iterate on, and optimize models to achieve accurate results and strong performance. With NVIDIA's RAPIDS platform, work that used to take days can in some cases be completed in minutes, which makes building and deploying high-value models easier and more agile. In data science, additional computational power means faster and more effective insights. RAPIDS uses NVIDIA CUDA to accelerate the entire data science and model training workflow by running it on graphics processing units (GPUs).
In this course, you will learn what you need to take your machine learning applications to the next level! Some of the topics covered are listed below:
- Using the cuDF, CuPy, and cuML libraries instead of pandas, NumPy, and scikit-learn, so that data is processed and machine learning algorithms run with high performance on the GPU.
- Comparing the performance of classic Python libraries with RAPIDS. In some experiments conducted during the classes, we achieved speedups exceeding 900x. In other words, with certain datasets and algorithms, RAPIDS can be 900 times faster!
- Creating a complete, step-by-step machine learning project using RAPIDS, from data loading to predictions.
- Using Dask for task parallelism across multiple GPUs or CPUs, integrated with RAPIDS for better performance.
Throughout the course, we will use the Python programming language and Google Colab, an online notebook environment. This way, you don't need a local GPU to follow the classes, as we will use the free hardware provided by Google.
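
A speedup figure such as the 900x mentioned in the topics is the ratio of CPU time to GPU time for the same task. The sketch below shows one way such a comparison could be measured, using only the Python standard library. The helper names `time_call` and `speedup` and the two stand-in workloads are hypothetical illustrations, not course code and not part of RAPIDS.

```python
import time
from statistics import median


def time_call(fn, repeats=5, warmup=1):
    """Return the median wall-clock time in seconds of calling fn()."""
    for _ in range(warmup):
        fn()  # warm-up calls are executed but not timed
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return median(samples)


def speedup(baseline_seconds, accelerated_seconds):
    """Return how many times faster the accelerated run is than the baseline."""
    if accelerated_seconds <= 0:
        raise ValueError("accelerated_seconds must be positive")
    return baseline_seconds / accelerated_seconds


# Hypothetical stand-in workloads: both compute the sum of squares below N.
# In a real comparison these would be the same task written once with a
# CPU library and once with its GPU counterpart.
N = 200_000


def slow_sum():
    total = 0
    for i in range(N):
        total += i * i
    return total


def fast_sum():
    # Closed form for 0^2 + 1^2 + ... + (N-1)^2
    return (N - 1) * N * (2 * N - 1) // 6


if __name__ == "__main__":
    baseline = time_call(slow_sum)
    accelerated = time_call(fast_sum)
    print(f"baseline:    {baseline:.6f} s")
    print(f"accelerated: {accelerated:.6f} s")
    if accelerated > 0:  # guard against a coarse clock reporting zero
        print(f"speedup:     {speedup(baseline, accelerated):.1f}x")
```

In a Colab notebook, the two callables would instead wrap the same operation implemented with a CPU library and with its RAPIDS counterpart. The warm-up call is included because the first GPU call typically carries one-time initialization cost that would otherwise distort the measurement.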