Tools for Data Science

IBM via Coursera

Details

Provider

Coursera

Pricing

Free Online Course (Audit)

Languages

English

Certificate

Paid Certificate Available

Duration & workload

18 hours 48 minutes

Sessions

On-Demand

Level

Beginner

Subtitles

Arabic, French, Portuguese, Italian, German, Russian, English, Spanish, Farsi, Thai, Indonesian, Kazakh, Hindi, Swedish, Korean, Greek, Chinese, Ukrainian, Japanese, Polish, Dutch, Turkish, Hungarian, Bengali, Pashto, Urdu, Azerbaijani

Overview

To succeed in Data Science, you need to be skilled with the tools that Data Science professionals use in their jobs. This course teaches you about the popular tools in Data Science and how to use them. You will become familiar with the Data Scientist’s tool kit, which includes Libraries & Packages, Data Sets, Machine Learning Models, and Kernels, as well as the various open source, commercial, Big Data, and cloud-based tools. You will work with Jupyter Notebooks, JupyterLab, RStudio IDE, Git, GitHub, and Watson Studio. You will understand what each tool is used for, which programming languages it can execute, and its features and limitations. This course provides plenty of hands-on experience to develop your skills with these Data Science tools. With the tools hosted in the cloud on Skills Network Labs, you will be able to test each tool and follow instructions to run simple code in Python, R, or Scala. Toward the end of the course, you will create a final project with a Jupyter Notebook. You will demonstrate your proficiency in preparing a notebook, writing Markdown, and sharing your work with your peers.

Syllabus

Overview of Data Science Tools

In this module, you will learn about the different types and categories of tools that data scientists use and popular examples of each. You will also become familiar with Open Source, Cloud-based, and Commercial options for data science tools.

Languages of Data Science

For users who are just starting their data science journey, the range of programming languages can be overwhelming. So, which language should you learn first? This module covers the criteria that determine which language you should learn. You will learn the benefits of Python, R, SQL, and other common languages such as Java, Scala, C++, JavaScript, and Julia. You will explore how you can use these languages in Data Science. You will also look at some sites where you can find more information about the languages.

Packages, APIs, Data Sets, and Models

In this module, you will learn about the various libraries used in data science. In addition, you will learn what an API is and how a REST API works in terms of requests and responses. Later in the module, you will explore open data sets on the Data Asset eXchange. Finally, you will learn how to use a machine learning model to solve a problem and how to navigate the Model Asset eXchange.
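The REST request and response cycle mentioned in this module can be illustrated with a minimal sketch in Python. It is not taken from the course materials: the `/datasets` resource, the `DatasetHandler` class, the `fetch_datasets` function, and the data set names are all hypothetical, and the server runs locally only so that the example is self-contained. It uses only the Python standard library.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class DatasetHandler(BaseHTTPRequestHandler):
    """Hypothetical REST resource: GET /datasets returns a JSON document."""

    def do_GET(self):
        if self.path == "/datasets":
            # The response body is JSON, a common format for REST APIs.
            body = json.dumps({"datasets": ["iris", "titanic"]}).encode("utf-8")
            self.send_response(200)  # status line: 200 OK
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404, "Not Found")

    def log_message(self, format, *args):
        pass  # keep the example quiet; by default each request is logged


def fetch_datasets():
    """Start a local server, send one GET request, and return the response parts."""
    server = HTTPServer(("127.0.0.1", 0), DatasetHandler)  # port 0: pick a free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # The request: an HTTP method, a resource path, and optional headers.
        conn = HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/datasets", headers={"Accept": "application/json"})
        # The response: a status code, headers, and a body.
        response = conn.getresponse()
        status = response.status
        content_type = response.getheader("Content-Type")
        payload = json.loads(response.read().decode("utf-8"))
        conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return status, content_type, payload


if __name__ == "__main__":
    print(fetch_datasets())
```

The client sends a method and a resource path, and the server answers with a status code, headers, and a body. A real data science API would follow the same pattern, with its own URLs and, typically, authentication.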

Jupyter Notebooks and JupyterLab

Jupyter Notebook lets a Data Scientist record data experiments and results in a form that others can reuse. This module introduces Jupyter Notebook and JupyterLab. You will learn how to work with different kernels in a Notebook session and about the basic Jupyter architecture. In addition, you will identify the tools in an Anaconda Jupyter environment. Finally, the module gives an overview of cloud-based Jupyter environments and their data science features.
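To make the link between a notebook and its kernel concrete, here is a small sketch, not from the course, that builds a minimal notebook document as plain JSON in Python. A notebook file (`.ipynb`) is a JSON document, and in the version 4 notebook format the `kernelspec` entry in its metadata names the kernel that runs its code cells. The function name `build_notebook` and the cell contents are hypothetical, and `ir` is assumed here as the kernel name commonly registered by the R kernel.

```python
import json


def build_notebook(kernel_name="python3", display_name="Python 3", language="python"):
    """Return a minimal notebook (format version 4.4) as a Python dict."""
    return {
        "nbformat": 4,
        "nbformat_minor": 4,
        "metadata": {
            # The kernelspec tells Jupyter which kernel should run the code cells.
            "kernelspec": {
                "name": kernel_name,
                "display_name": display_name,
                "language": language,
            }
        },
        "cells": [
            {
                # Markdown cells hold formatted notes about the experiment.
                "cell_type": "markdown",
                "metadata": {},
                "source": ["# My experiment\n", "Notes about the data go here."],
            },
            {
                # Code cells hold source code; outputs are filled in when run.
                "cell_type": "code",
                "metadata": {},
                "execution_count": None,
                "outputs": [],
                "source": ["print('hello')"],
            },
        ],
    }


if __name__ == "__main__":
    notebook = build_notebook()
    # Writing this JSON text to a file ending in .ipynb gives a notebook file.
    print(json.dumps(notebook, indent=1))
```

Calling `build_notebook("ir", "R", "R")` would describe the same notebook for an R kernel. Switching kernels in a Notebook session changes the interpreter that runs the code cells, while the Markdown cells are rendered the same way.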

RStudio & GitHub

R is a statistical programming language and a powerful tool for data processing and manipulation. This module starts with an introduction to R and RStudio. You will learn about the different R visualization packages and how to create charts using the plot function. In addition, Distributed Version Control Systems (DVCS) have become critical tools in software development and key enablers for social and collaborative coding. While there are many distributed version control systems, Git is among the most popular. Later in the module, you will develop the essential conceptual and hands-on skills to work with Git and GitHub. You will start with an overview of Git and GitHub, followed by creating a GitHub account and a project repository, adding files to it, and committing your changes using the web interface. Next, you will become familiar with Git workflows involving branches, pull requests (PRs), and merges. You will also complete a project at the end to apply and demonstrate your newly acquired skills.

Create and Share your Jupyter Notebook

In this module, you will work on a final project to demonstrate some of the skills learned in the course. You will also be tested on your knowledge of the components and tools in a Data Scientist's toolkit covered in the previous modules.

[Optional] IBM Watson Studio

Watson Studio is a collaborative platform for the data science community and is used by Data Analysts, Data Scientists, Data Engineers, Developers, and Data Stewards to analyze data and construct models. In this module, you will learn about Watson Studio and IBM Cloud Pak for Data as a Service. Then you will create an IBM Watson Studio service and a project in Watson Studio. After creating the project, you will create a Jupyter notebook and load a data file. You will also explore the different templates and kernels in a Jupyter notebook. Finally, you will connect your Watson Studio account to GitHub and publish the notebook to GitHub. Note: This part of the course is optional, and you are not required to complete the lab provided in this week of the course.

Taught by

Polong Lin

Reviews

1.3 rating, based on 3 Class Central reviews

4.5 rating at Coursera based on 29280 ratings

Anonymous

There are big problems with the course: the videos for IBM Watson Studio are outdated, the platform was updated a long time ago, and no one has corrected the instructions and videos. Moreover, payments are not refundable, so it was just a waste of money and time. Because of the old instructions, you can't pass the assignment and earn the certificate.

Anonymous

The class was sloppy, poorly organized, failed to provide any learning materials, failed to teach foundational concepts, and was overly-focused on IBM products. The worst online class I have ever taken.

Anonymous

This class is like a laundry list of tools. While I could see the utility in awareness of tools in the field, it's difficult to retain anything useful. Also, the class is very focused on IBM products.

Tags

Data Science Tools
