Learn how to scale R for big data processing with Hadoop and Spark in this 1-hour 10-minute tutorial. Set up a Spark cluster with R installed, wrangle data stored in HDFS using R, and build and deploy machine learning models on large datasets. See how Microsoft R Server enables distributed computing in R, how to run native R code on the cluster over SSH, and how to set up RStudio Server on a cluster. Explore techniques for manipulating data in HDFS, building models on large-scale data, and deploying those models to elastically scaled web services that serve predictions. Gain practical skills for overcoming R's traditional limitations with big data and for using R across the entire data science workflow.