Data Science Survey: The Results Are In!
Posted on October 27, 2017 by Antonio Cangiano
Last week we ran a Data Science survey asking our community four simple questions. In this post, I’ll show you the results and provide a Jupyter notebook in case you want to play with the data yourself.
2,233 people participated in the survey. That is a sizable sample of our students, but it is not representative of the Data Science community in general. Among other factors, Cognitive Class’ catalog of courses influences who we attract to our site and, ultimately, who responded to the survey.
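To put the sample size in perspective, here is a minimal sketch of the textbook margin-of-error calculation for a proportion. It assumes a simple random sample and a 95% confidence level, neither of which is claimed by the survey itself; a self-selected online survey like this one does not strictly meet the random-sampling assumption, so treat the number as a rough lower bound on the uncertainty. The function name and its parameters are illustrative.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Normal-approximation margin of error for a sample proportion.

    Assumes a simple random sample (an assumption, not a property of
    this survey). proportion=0.5 is the worst case; z=1.96 corresponds
    to a 95% confidence level.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


respondents = 2233  # number of participants reported in the post
moe = margin_of_error(respondents)
print(f"Worst-case margin of error: {moe * 100:.2f} percentage points")
```

Under those assumptions the worst-case margin comes out at roughly two percentage points, which says something about precision within our own audience and nothing about how well that audience mirrors the wider community.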
DATA SCIENCE SURVEY Q1: WHAT’S YOUR LEVEL OF INTEREST FOR THE FOLLOWING TECHNOLOGIES?
We presented respondents with eight data-related technologies and asked them to express their level of interest in each of them. The chart below shows the results.
As expected, there is a high degree of interest (green bars) in Data Science, Big Data, and AI. Virtually everyone showed some degree of interest in these three categories.
Participants showed relatively low interest in hot technologies such as Blockchain, Virtual Reality, and Chatbots. I was somewhat surprised by this result. Though, as the author of our first Chatbot course and an enthusiast of cutting-edge technology, I might be biased.
Perhaps our learners are primarily professionals who do not yet have a concrete business application for these emerging, but still green, technologies. But this is just speculation, of course.
DATA SCIENCE SURVEY Q2: WHAT’S YOUR LEVEL OF INTEREST FOR THE FOLLOWING AREAS OF DATA SCIENCE?
Our second question drilled down into the Data Science field, asking about the level of interest in specific areas of Data Science.
The data shows a strong interest in all areas of Data Science, with the exception of Data Journalism, which received a lukewarm response. If you are interested in this topic, I highly recommend taking our Data Journalism course. Storytelling is underrated and I think it will benefit your Data Science career, even if you aren’t a journalist.
DATA SCIENCE SURVEY Q3: WHICH PROGRAMMING LANGUAGE FOR DATA SCIENCE ARE YOU MOST INTERESTED IN?
Our third question narrowed the scope further to the programming language of choice for Data Science.
Julia is actually a fantastic language for Data Science and I’d love to see it grow in popularity. Its performance characteristics alone are noteworthy. Unfortunately, it’s still somewhat niche in the Data Science community in general, and clearly among our students. (If you’d like to change this by authoring a course on the subject, feel free to get in touch with us.)
What’s interesting about this question is that we allowed an open-ended Other option. As a result, we saw firsthand the diversity of languages people use for Data Science. Our respondents also mentioned C#, Clojure, Perl, C, and a few other programming languages.
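Open-ended answers usually need a little cleanup before they can be counted, since the same language can be typed with different casing or stray spaces. The following is a minimal sketch of one way to tally them, not the method used in our notebook. The function name is hypothetical, and the sample entries are invented purely for illustration; they are not actual survey responses.

```python
from collections import Counter


def tally_other_answers(raw_answers):
    """Count free-text answers after trimming whitespace and ignoring case."""
    cleaned = (answer.strip().lower() for answer in raw_answers)
    # Skip entries that are empty once whitespace is removed.
    return Counter(answer for answer in cleaned if answer)


# Hypothetical free-text entries, made up for this example only.
sample = ["C#", " c# ", "Clojure", "perl", "Perl", "C", ""]
counts = tally_other_answers(sample)

for language, count in counts.most_common():
    print(language, count)
```

Lowercasing is a deliberately blunt normalization. Real free-text data would likely also need aliases merged by hand (for example, different spellings of the same language).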
DATA SCIENCE SURVEY Q4: WHICH DATA SCIENCE TOOL ARE YOU MOST INTERESTED IN?
Finally, we asked about the primary tool or IDE of choice.
Respondents could only pick their most used tool, so it’s not surprising to see Hadoop and Spark do so well among our respondents, who showed a clear inclination for Big Data.
RStudio is also fairly popular at 15.99%, a figure somewhat in line with the results of the previous question. The primary R tool is more popular than any single Python tool among our respondents.
Please note that there is no contradiction here. Python users simply had more choices available, splitting the vote among IBM Data Science Experience (IBM DSX for short), Anaconda, and Jupyter. Combined, over 35% of respondents selected Python tools as their primary tool for Data Science, which suggests that Python is at least twice as popular as R among our users.
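The "at least twice as popular" comparison comes down to simple arithmetic, sketched here using only the two figures quoted in this post. The per-tool shares for the individual Python tools are not listed in the text, so the combined Python share is treated as a lower bound of 35%; the constant and function names are illustrative.

```python
# Figures quoted in the post (percent of respondents).
RSTUDIO_SHARE = 15.99           # RStudio, the primary R tool
PYTHON_TOOLS_SHARE_MIN = 35.0   # "over 35%" combined, used as a lower bound


def popularity_ratio(share_a, share_b):
    """Return how many times larger share_a is than share_b."""
    if share_b <= 0:
        raise ValueError("share_b must be positive")
    return share_a / share_b


ratio = popularity_ratio(PYTHON_TOOLS_SHARE_MIN, RSTUDIO_SHARE)
print(f"Python tools vs. RStudio: at least {ratio:.2f}x")
```

Because 35% is a floor rather than the exact combined share, the true ratio is somewhat higher than the value printed here.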
There you have it. It will be interesting to see how these results change over time. In the meantime, feel free to play with the data yourself by using the Jupyter notebook created by my colleague Alex Aklson, author of the excellent Data Visualization with Python course.
If you enroll in his course, you’ll have access to our Labs environment to run the Data Science Survey notebook in the cloud, without having to install anything on your machine. Alternatively, you can sign up for a professional Data Science tool like IBM Data Science Experience.
WHERE TO LEARN MORE
Since most of our respondents showed a great deal of interest in Data Science with Python and Big Data, allow me to recommend a couple of resources for learning more about these topics:
- Applied Data Science with Python (Learning Path)
- Big Data Fundamentals (Learning Path)
Antonio Cangiano is a Software Developer and AI Advocate at IBM and the author of Technical Blogging (2nd Edition): Amplify your influence. He is passionate about the craft of programming, AI, online marketing, and entrepreneurship. Keep in touch by reading his programming blog and by following him on Twitter.