In-Depth Review: Data Analysis Take it to the Max() from Delft University
Detailed review by Class Central user Adelyne Chan on a course that gets deep into Excel spreadsheets and explores visualizations and Python integration.
Review by Adelyne Chan. Adelyne is a part-time MSc student studying bioinformatics and a part-time MOOC addict. She has a first degree in cell biology and is looking forward to beginning her PhD in cancer biology, but finds that her life has been greatly enriched by the variety of MOOCs taken so far. The best part of her MOOC experience so far has been learning how to program, as well as interacting with the wider online community.
This course is offered by the Spreadsheet Research team at the Delft University, and therefore is targeted at covering the topics required to help students design better spreadsheets in the future. As reiterated various times throughout the course: “The average lifespan of a spreadsheet is 5 years, during which time it is used by 12 different people” – thus emphasising the requirement not only for accurate spreadsheet design but also for the detection of errors should they arise when the spreadsheet is used by individuals other than the one who designed it.
Introduction to Data Analysis: Take It to the MAX()
WHY?
The topic of spreadsheets is an extremely interesting one. Nearly everyone who uses a computer these days uses spreadsheets in one way or another, yet in the MOOC world courses which touch on spreadsheet design and use are much rarer than those offered for the various programming languages, even though far fewer computer users write their own programs. This could be because spreadsheets have traditionally been seen, and indeed this is the way I had previously viewed them, as a fairly self-explanatory tool rather than one which requires instruction in order to operate. Interestingly, this course was not targeting individuals who had no knowledge of spreadsheet use at all, as its prerequisites involved being familiar with a spreadsheet environment (although familiarity in this respect refers to performing fairly basic tasks, as the example given was the ability to use the SUM function in Excel).
As a bioinformatics student, often the tasks which I need to complete require the use of more specialised software or programming tools in order to solve, but spreadsheets remain an effective way of carrying out simple calculations and data analysis. The universality of spreadsheets in various fields has not been diminished by the development of more specialised tools, which are often field-specific or even problem specific. Despite being fairly comfortable with the use of spreadsheets for my own purposes, I was interested to see what sorts of skills would be taught in this course, and to see how these could be incorporated into better spreadsheet design for the future.
I was also interested in one of the items on the syllabus, the integration of Python (my favourite programming language) with spreadsheets (the software which I most frequently use to share data analysis with family and colleagues) and I was interested to learn how the versatility of Python could be applied to further improve spreadsheet use.
FELIENNE HERMANS
In the first week, the MOOC lead Felienne Hermans introduced herself and her team in the Delft University Spreadsheet Lab who research methods of software engineering which can then be applied to the design of spreadsheets (see also Felienne’s personal website).
SKILLS
I have been using Microsoft Excel for quite a number of years now and consider myself comfortable with the basic workings of a spreadsheet, which seemed to fit the “who is this course for” to a tee. Most of my Microsoft Excel skills prior to this course were self / informally taught, and I thought it would be fun to take a course in which the various capabilities of Excel would be systematically discussed.
SUCCESS
The stated prerequisite for this course is “only a basic understanding of Excel”. While that is certainly sufficient to complete the first 6 weeks of the course (which are based entirely in Excel), and thus to get a high enough score for a certificate, I don’t think it is sufficient to understand all the topics covered in the syllabus.
The last two topics – DataNitro (a shareware used to integrate Python with Excel) and Neo4j (a community-licensed graphing tool for visualizing relationships between rows in Excel) – take the user outside the familiar realms of Excel itself. While the description of this, and the prospect of working past Excel’s inbuilt limitations without learning a whole new tool, certainly sounds very exciting, the extremely poor level of teaching in these latter modules led instead to a fair bit of frustration. This is indeed a pity, as at the midpoint of the course I felt that this course was the best MOOC I’d ever taken, only to give up midway through the final week because I simply could not get the external software to work.
These problems don’t seem to have been limited to me: numerous threads about them were started in the Discussion Forums. Some of the issues could be solved by Staff. However, most of the Teaching Assistants were rude and dismissive of the problems, choosing instead to provide generic solutions which often did not address the specific problem faced.
THE COURSE
Felienne Hermans is one of the more enthusiastic MOOC tutors I have seen out there, and the course similarly hits the ground running with the presentation of the “Bacon Number Problem” even before the course start date and the promise that students who successfully complete the course would be able to solve this problem with ease. I particularly liked the way in which the course was presented, consisting of short and concise lectures covering the main topics and then exercises to practice the topics taught.
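For readers unfamiliar with the problem: an actor's Bacon number is the smallest number of co-starring links separating them from Kevin Bacon. The course approaches it with spreadsheets and the external tools covered later. The snippet below is not the course's method, only a minimal plain-Python sketch of the underlying idea (a breadth-first search over co-star links). The function name `bacon_numbers` and the toy film data are hypothetical and made up purely for illustration.

```python
from collections import deque


def bacon_numbers(cast_lists, start="Kevin Bacon"):
    """Return {actor: number of co-starring links from `start`}.

    cast_lists maps a film title to the list of actors in it.
    Actors with no chain of co-stars back to `start` are left out.
    """
    # Build an actor -> set of co-stars lookup from the cast lists.
    costars = {}
    for cast in cast_lists.values():
        for actor in cast:
            costars.setdefault(actor, set()).update(
                other for other in cast if other != actor
            )

    # Breadth-first search: the first time an actor is reached is
    # guaranteed to be via the shortest chain of films.
    distances = {start: 0}
    queue = deque([start])
    while queue:
        actor = queue.popleft()
        for other in costars.get(actor, ()):
            if other not in distances:
                distances[other] = distances[actor] + 1
                queue.append(other)
    return distances


# Hypothetical toy data, not taken from the course dataset.
films = {
    "Film A": ["Kevin Bacon", "Actor One"],
    "Film B": ["Actor One", "Actor Two"],
    "Film C": ["Actor Three"],
}
result = bacon_numbers(films)
```

In this toy example "Actor Two" ends up two links away from Kevin Bacon, while "Actor Three" never shares a film with anyone in the chain and so does not appear in the result at all.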
Data was provided for download (sometimes in a truncated version to reduce processing time) and students were expected to work through the techniques taught to answer the questions related to the dataset. The examples selected were interesting as they were very real world e.g. the register of a hardware store or scheduling in a dance studio, and the questions asked in the exercises were very practical (a question that you would expect to be interested in if you were in the shoes of the “data provider”) e.g. a hardware store owner may wish to group their products into those which provided a high profit margin against those which were not doing so well. There were also bonus exercises each week which brought together all the concepts taught so far in the course in a real world example often extracted from real requests for help on the web.
Unfortunately, the course started to go slightly downhill from the halfway point, as I found that the mode of instruction was not too tolerant about different people having different methods to achieve the same objective. For example, in the “testing” module, students were taught the ways (and the importance) of having check functions in their spreadsheet which would pick up and alert users to potential errors in the spreadsheet. While I agree that having these testing methods built into a spreadsheet is a good idea and good practice, there are many possible ways in which this testing can be achieved. Yet, the way in which the answer was to be entered allowed for only a single method i.e. the instructor’s method. Given the widely subjective nature of testing in general, it would probably have been more appropriate to provide students with error-laden spreadsheets and have them identify errors using whatever means possible.
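To make the idea of a check function concrete, here is a minimal sketch in Python rather than in Excel formulas. It is only one of the many possible approaches mentioned above and is not the instructor's method. The function name `check_totals`, the tolerance value and the sample figures are all assumptions made for illustration.

```python
def check_totals(rows, reported_total, tolerance=0.005):
    """Recompute a total from its line items and compare it with the
    total stored elsewhere in the sheet.

    rows is a list of (quantity, unit_price) pairs.
    Returns (ok, computed_total), where ok is False if the two totals
    differ by more than `tolerance`.
    """
    computed = sum(quantity * unit_price for quantity, unit_price in rows)
    ok = abs(computed - reported_total) <= tolerance
    return ok, computed


# Hypothetical line items from a shop register: (quantity, unit price).
sales = [(2, 4.50), (1, 12.00), (3, 1.25)]

# The stored total matches the recomputed one, so the check passes.
ok_good, total = check_totals(sales, 24.75)

# A mistyped stored total is flagged by the same check.
ok_bad, _ = check_totals(sales, 27.45)
```

The same reconciliation could be written as a single comparison cell in a spreadsheet. The point is that a total is derived in two independent ways and any disagreement is surfaced to whoever uses the sheet next.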
Another point to note is that one of the “selling points” of the course is learning how to integrate the use of spreadsheets (Excel) with the Python programming language. As mentioned earlier, this was one of the points which attracted me to taking this course. When we reached the “Python Week” I was extremely disappointed to find that this was achieved using an Excel plugin called DataNitro, which is not freely available, yet no mention of this was made earlier in the course. Kudos to the instructors for securing a trial for students for the duration of the course, allowing course completion without a purchase, but in the long run those who intend to apply this knowledge to their work problems would need to purchase the DataNitro package.
When this use of shareware instead of freeware was brought up by a fellow disappointed student on the Forums, the teaching assistant gave a rude and sarcastic response rather than a professional and diplomatic one, justifying this with the fact that “lucky for me, this is not my profession!” and that the response was merely “an attempt at a witty retort peer to peer”. While I agree that friendly peer to peer banter is a mainstay (and benefit) of MOOCs in general which engage a large audience on a common platform, a nominated Community Teaching Assistant should be expected to display a higher level of professionalism, at least within the realms of the course.
GRADING
Most of the quiz questions involved downloading a spreadsheet of provided data, in line with the “theme” of the week, and answering questions relating to the data itself. Questions of this nature generally have unambiguous answers and therefore there was not much room to quibble. Answer models were provided after the due date so that those who did not manage to obtain the answers on their own could see how they were derived and, using the provided formulas, re-attempt the problem on their own.
There were some questions, however, which seemed to hinge on opinion, e.g. whether Excel is the “appropriate” tool for achieving a particular task. Appropriateness here could be seen as subjective, and I did not necessarily agree with all of the provided answers.
TIME COMMITMENT
In the earlier weeks I spent around 5 hours per week on this course, which was within the suggested time commitment. However, as the quality of teaching deteriorated in the latter weeks I found myself losing track of the hours I spent on the course.
CONCLUSION
I really enjoyed the first 4 weeks of the course and found them incredibly useful, and I hope that they will help to improve my spreadsheet design in the future. I would thus recommend this course for other students who intend to selectively audit the material and learn only specific techniques. Moving towards the latter weeks of the course, following the material and exercises may become increasingly frustrating for students intending to fully engage with the material right till the end. I did get a certificate from this course, as by around Week 6 I had acquired a sufficiently high score to be awarded one. However, I still feel that I did not gain 8 weeks’ worth of knowledge from the time spent on this course. For students who wish to dip into the material without committing to the course, all of the lecture videos are available freely on YouTube.
WHAT NEXT?
This course is incredibly unique in that I don’t think I have come across another MOOC which provides modular teaching of Microsoft Excel in this manner. However, those who are interested in data analysis may be interested to dig deeper into using other programming languages, such as R (Johns Hopkins University has an entire specialization on data science provided on Coursera).
For those who enjoyed the use of DataNitro and are interested to explore the utility of Python outside of spreadsheets, I highly recommend the introductory course offered by MITx or the slightly more advanced (but more fun, in my opinion!) Coursera course offered by Rice University.