In-Depth Review: The Analytics Edge from MIT on edX
Detailed review by Class Central user Ilya Rudyak of a course that will make you get your hands dirty with real-world data science problems and scenarios.
Review by Ilya Rudyak. Ilya is a financial and accounting consulting manager at a big consulting company. He has degrees in mathematics and economics, as well as an MBA and a CPA. He is currently studying computer science and programming and looks forward to being part of the new economic environment. His MOOC transcript.
THIS COURSE IS ABOUT REAL LIFE STORIES
You’ve probably heard about Moneyball, eHarmony, the Framingham Heart Study, Twitter, IBM Watson, and Netflix. Do you want to know more about all of this? This course is by far the best way I know to get such information.
Personally I was really scared when watching how IBM Watson plays… And now I know how it works and what will happen in the near future.
WHAT IS ANALYTICS EDGE?
Well, we know a lot of terms: big data, data science, data analytics, etc. What exactly is the analytics edge? Truth be told, you’ll study some very basic statistical methods: regression, clustering and trees. That’s all. But these methods really work (and I mean REALLY work).
Just one example. If you are interested in wine, you’ve probably heard of Robert Parker, the most famous wine critic. What can a regression on one variable do (from the course slides)?
Parker: 1986 is “very good to sometimes exceptional”;
Regression: 1986 is mediocre; 1989 will be “the wine of the century” and 1990 will be even better!
In wine auctions, 1989 sold for more than twice the price of 1986; 1990 sold for even higher prices!
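To make "regression on one variable" concrete, here is a minimal sketch in Python (the course itself uses R). The numbers are made up for illustration and are not the course's wine data; the variable names `temps` and `prices` are assumptions of this sketch, not taken from the slides.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y move together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: one predictor (say, growing-season temperature)
# and one outcome (say, a price index). NOT real wine data.
temps = [15.0, 16.0, 17.0, 18.0]
prices = [6.0, 6.5, 7.0, 7.5]

intercept, slope = fit_line(temps, prices)

# Predict the outcome for a new, unseen value of the predictor.
predicted = intercept + slope * 17.5
print(intercept, slope, predicted)
```

The point is only that a fitted line lets you score a new vintage from a single measured quantity, before any critic has tasted it.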
THIS COURSE IS FROM MIT (SO IT’S HARD)
If you’re like me and prefer to study by doing, this course is for you. Endless problem sets, many of them based on real data, will definitely help you with this.
Personally, I find MIT courses hard. I have managed to complete courses from other universities, but with MIT courses I usually had problems. In this course each individual assignment is not that difficult (though it may be harder if you don’t read the detailed guidelines), but there are a lot of them (and a lot, and a lot). So unless you are comfortable with a workload of 5-10 hours per week, you should think twice before taking this brilliant course.
VISUALIZATION
For a long time there were some serious problems with visualization. Why? I don’t know, but now we have a real renaissance in this area. After the course I can plot a cloud of words based on their frequency (finally). But personally I was really excited when the TA plotted all of the Boston area on a simple chart.
What is in the picture? An example of a visualization you may never have thought of before. Here we try to use clustering for disease detection (there are a lot of medical applications in this course).
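For readers who have not seen clustering before, here is a rough sketch of the idea in Python (the course uses R). It uses k-means, which is my choice for illustration; the review does not say which clustering method the lecture applies. The data points and the starting centers are made up.

```python
def kmeans(points, initial_centers, iterations=10):
    """Plain k-means: assign each point to its nearest center, then
    move each center to the mean of the points assigned to it."""
    centers = list(initial_centers)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            # Squared Euclidean distance from p to every center.
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[dists.index(min(dists))].append(p)
        # Recompute each center; keep the old one if its cluster is empty.
        centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical two-feature measurements for six patients (made-up numbers).
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2),
          (5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]

# Starting centers are chosen by hand here; real code would pick them
# more carefully, since k-means is sensitive to initialization.
centers, clusters = kmeans(points, initial_centers=[(1.0, 1.0), (5.0, 5.0)])
print(centers)
```

With data this cleanly separated, the two groups fall out immediately; on real medical data the interesting work is choosing the features and the number of clusters.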
IT’S BETTER TO KNOW R IN ADVANCE
There are several popular tools and languages out there: Matlab, Octave, Python and R. R is a high-level language (many algorithms are already implemented) and is usually a good fit for non-CS people. So it’s no big surprise that it is used in this course for business students. You’ll get everything you need to know within the course, and it’s doable to finish the course without prior knowledge of R (that was my case). But even if you know some other programming languages, it can be difficult to grasp R on the first attempt: it differs a lot from the usual languages (not to mention that it’s functional). Personally I prefer to read a book on a subject, but I wasn’t able to do that during the course because it was really intensive. So my advice: go study R in advance.
KAGGLE COMPETITION
Probably one of the best parts of the course is the Kaggle competition: you’ll come to understand the gap between guided problem sets and real-life situations. Don’t be discouraged if you can’t get into the top 100 on your first attempt. It’s not that easy. I spent about 50% of my time on this competition, and this is in a course with extremely large problem sets! You can spend a lot less, but it’s really exciting to build real-world models in a non-guided environment. My main advice: start early, since there is a limit on the number of submissions per day. Also get involved in the forums from the beginning; you’ll avoid a lot of mistakes and get some invaluable insights.
IT’S NOT A CS OR MATH COURSE
This is actually a course for business students. If you’re interested in the math background, go to the Stanford course on statistical learning (there you’ll also get an interview with one of the creators of the R language). In the last two lessons you’ll get some information about linear programming. Well, it’s not programming at all (and dynamic programming is not what the name suggests either). What is it? If you’re interested, you can read about the Nobel Prize winner on the left (there are actually two winners on this subject) and about secret optimization methods during WW2. It so happened that I studied economics in a group with this professor, so for me this part of the course was very basic. I would say these methods were the first wave of mathematical methods in economics. Now we have the next major paradigm shift.