Overview
Interested in increasing your knowledge of the Big Data landscape? This course is for those new to data science and interested in understanding why the Big Data Era has come to be. It is for those who want to become conversant with the terminology and the core concepts behind big data problems, applications, and systems. It is for those who want to start thinking about how Big Data might be useful in their business or career. It provides an introduction to one of the most common frameworks, Hadoop, that has made big data analysis easier and more accessible -- increasing the potential for data to transform our world!
At the end of this course, you will be able to:
* Describe the Big Data landscape, including examples of real-world big data problems and the three key sources of Big Data: people, organizations, and sensors.
* Explain the V’s of Big Data (volume, velocity, variety, veracity, valence, and value) and why each impacts data collection, monitoring, storage, analysis and reporting.
* Get value out of Big Data by using a 5-step process to structure your analysis.
* Identify what are and what are not big data problems and be able to recast big data problems as data science questions.
* Provide an explanation of the architectural components and programming models used for scalable big data analysis.
* Summarize the features and value of core Hadoop stack components including the YARN resource and job management system, the HDFS file system and the MapReduce programming model.
* Install and run a program using Hadoop!
This course is for those new to data science. No prior programming experience is needed, although the ability to install applications and utilize a virtual machine is necessary to complete the hands-on assignments.
Hardware Requirements:
(A) Quad Core Processor (VT-x or AMD-V support recommended), 64-bit; (B) 8 GB RAM; (C) 20 GB disk free.
How to find your hardware information: (Windows): Open System by clicking the Start button, right-clicking Computer, and then clicking Properties; (Mac): Open Overview by clicking on the Apple menu and clicking “About This Mac.”
Most computers with 8 GB RAM purchased in the last 3 years will meet the minimum requirements. You will need a high-speed internet connection because you will be downloading files up to 4 GB in size.
Software Requirements:
This course relies on several open-source software tools, including Apache Hadoop. All required software can be downloaded and installed free of charge. Software requirements include: Windows 7+, Mac OS X 10.10+, Ubuntu 14.04+ or CentOS 6+; VirtualBox 5+.
Syllabus
- Welcome
- Welcome to the Big Data Specialization! We're excited for you to get to know us and we're looking forward to learning about you!
- Big Data: Why and Where
- Data -- it's been around (even digitally) for a while. What makes data "big" and where does this big data come from?
- Characteristics of Big Data and Dimensions of Scalability
- You may have heard of the "Big Vs". We'll give examples and descriptions of the commonly discussed 5. But, we want to propose a 6th V and we'll ask you to practice writing Big Data questions targeting this V -- value.
- Data Science: Getting Value out of Big Data
- We love science and we love computing, don't get us wrong. But the reality is we care about Big Data because it can bring value to our companies, our lives, and the world. In this module we'll introduce a 5-step process for approaching data science problems.
- Foundations for Big Data Systems and Programming
- Big Data requires new programming frameworks and systems. For this course, we don't require programming knowledge or experience -- but we do want to give you a grounding in some of the key concepts.
- Systems: Getting Started with Hadoop
- Let's look at some details of Hadoop and MapReduce. Then we'll go "hands on" and actually perform a simple MapReduce task using a Docker container. Pay attention - as we'll guide you in "learning by doing" in diagramming a MapReduce task as a Peer Review.
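For readers who want a feel for the MapReduce programming model before starting the hands-on module, here is a minimal sketch of the classic word-count task in plain Python. It is not the course's assignment and does not use Hadoop; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. It simulates on a single machine the three stages that a framework such as Hadoop would distribute across a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on a list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input, not taken from the course materials.
    print(word_count(["big data is big", "data is everywhere"]))
    # prints {'big': 2, 'data': 2, 'is': 2, 'everywhere': 1}
```

In a real cluster the map and reduce functions run in parallel on different nodes and the framework handles the shuffle; the sketch keeps everything in one process so the data flow is easy to follow.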
Taught by
Natasha Balac
Reviews
2.8 rating, based on 34 Class Central reviews
4.6 rating at Coursera based on 10893 ratings
-
It sounds very harsh, but this "course" was a joke. I was able to complete everything on one afternoon. That was not really tricky because all you get is some basic and fuzzy facts about the subject "Big Data" and the economy related to it. Consequently, that's what's tested in order to pass the course: remembering facts of questionable value.
If we look at the videos' quality, my verdict is the same. I really recommend that the lecturers take a closer look at the books of Garr Reynolds or Nancy Duarte. Presenting ugly and unfeasible slides instead of teaching using an adequate approach really makes me sad.
-
It is a very short course; you can do it in one day without a problem, even though it is supposed to be a 3-week course. Nevertheless, it has some interesting things, but I really expect the subsequent courses to have more content.
-
The whole specialisation is a joke. Not only was the course material for the introductory course way too easy, the lecturers also had a hard time illustrating their points clearly.
I really regretted having paid for the first two courses.
-
This so-called course was totally devoid of any real content. It was billed as being a 3 week introduction of 5-6 hours per week but took less than 3 hours in total. Really not up to Coursera's usual standards & not fit for purpose. Avoid.
-
Total waste of time & money. Course was supposed to take 5-6 hours per week but actually only took about 45 mins.
To be avoided.
-
It's a very simple introduction. Definitely won't take you the full three weeks to complete. Perhaps a few days at the most.
-
Now:
The UCSD course team has massively changed the course contents and structure, including changing instructors, videos and all the assessments. This course is now a pretty solid introduction to big data landscape concepts and technology. I'm happy to see the improvement and the learning value that this course has added in response to the terrible feedback it got from its first launch.
1 year ago:
This course is really terrible. I think even a 5th grader can go through the entire course by himself/herself.
Little value, little technical skills taught (rated 1 star back then)
-
They say that it needs no programming background - agreed. However, don't fall into the trap. Merely copy-pasting a command does not get you to the next level. You need to understand the fundamentals in anything you do. This course isn't that. I completed week 1, got all excited for the next course (paid 80 $) and then started running into all sorts of infrastructure issues in Cloudera (cannot do this, cannot do that, cannot start HBase etc). Tried all sorts of forums but again-- you don't know what you are doing and why you are doing it. Lost interest.
Please don't take this course. It's a huge trap.
-
First weeks were too simple, as others have noted here. Final project was impossible to complete. I ran into problem after problem with the technology and googling the issues made it clear that many others encounter the same problems. They should have students start installing the software earlier in the course and provide documentation about handling common issues. I am glad I did not pay for this course.
-
I think this class would be fine for someone with a non-technical background, but I found it far too simple as a professional programmer and was hoping for a more in-depth treatment.
-
Very fundamental course. If you are looking to get a very high-level overview then it's ok, but you will not learn much hands-on big data with this course. I would say this course is a quick way to get an overview of big data and MapReduce. However, if you are looking for anything in-depth, like how MapReduce works and how to design or implement MapReduce for your own example, then this course is not useful. Also, you can probably learn these things by reading about them through blogs or videos on YouTube; it's not worth spending money on this course. I enrolled in the big data specialisation, for which this was the first course out of the total 6 courses, so I am hoping the other courses will cover the content in detail.
-
This course is very basic. There is no explanation of MapReduce from a programming perspective, and the MapReduce assignment is a joke. But they are going to release an improved version of the Big Data specialization by June 6, 2017.
-
A bit drawn out on general concepts, and could use more explanation about what a MapReduce application is. After completing this course I watched some Youtube videos to fill in the missing gaps.