Of the many data roles in the growing job market for data professionals, which data career is right for you? To help you answer this question, in this article we will compare what Data Scientists, Data Analysts, and Data Engineers do, what skills are required for each, salary profiles, and how to get started in each of these fields. We will also look at some common attributes between these roles.
You may be familiar with the phrase “Data is the new oil” – which implies that data is a valuable commodity. For this commodity to be utilized and deliver value for an organization, it needs to be identified, extracted, refined, stored, and mined for insights.
Key roles in the Data Ecosystem
The lure of using data for a competitive edge is pushing organizations to become more data-driven in their operations and decisions, and this has created enormous job opportunities across data-related professions. In the video below (from the course Introduction to Data Engineering) you will become familiar with the key roles in the data ecosystem and the skills they require.
Summarizing the video, the key data domains include Data Engineering, Data Analytics, and Data Science. Data Engineering entails managing data throughout its lifecycle and includes the tasks of designing, building, and maintaining data infrastructures. These data infrastructures can include relational and NoSQL databases, Big Data repositories and processing engines such as Hadoop and Spark, and data pipelines for transforming and moving data between these data platforms.
Data Analytics involves finding the right data in these data systems, cleaning it for the purposes of the required analysis, and mining data to create reports and visualizations. And Data Scientists take Data Analytics even further by performing deeper analysis on the data and developing predictive models to solve more complex data problems.
Skills for Data Professions
Each of these roles requires some unique skills, but there are also common skills that all data professionals need, although the level of proficiency required may vary. For example, all these professions require familiarity with various data sources and data formats, some programming knowledge for processing data using languages such as Python or R, and the ability to access data using different tools and languages like SQL.
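As a rough illustration of the shared skill of accessing data with SQL from a language like Python, here is a minimal sketch using Python's built-in sqlite3 module. The orders table, its columns, and the rows are hypothetical and invented purely for this example; a real project would connect to whatever database the organization actually uses.

```python
import sqlite3

# Hypothetical table and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("East", 120.0), ("West", 80.0), ("East", 30.0)],
)

# SQL does the aggregation; Python only collects the result.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
conn.close()

# Turn the (region, total) pairs into a dictionary for later processing.
totals_by_region = dict(rows)
print(totals_by_region)
```

The same pattern of querying with SQL and then processing the result in a programming language applies across all three roles; what differs is how much of the surrounding infrastructure each role is expected to build and manage.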
Data Engineers need to know a lot more about underlying computing environments and data repository architectures. They not only need to be able to query these repositories but also load data into them and manage them. So they need to be familiar with management and administrative tools.
Data Analysts, on the other hand, focus on querying and analysis. They need to use tools including spreadsheets like Excel, as well as reporting and dashboarding tools like Tableau, Power BI, and Cognos Analytics. Modern Data Analysts also use tools such as Jupyter Notebooks and RStudio, which offer richer programmatic capabilities than the reporting tools.
Data Scientists perform deeper analysis, so they need greater proficiency in tools like Jupyter and rely less on reporting tools. They also develop machine learning models and therefore need to be familiar with various machine learning libraries and algorithms.
One trait that is common to all these professions is the ability to keep up with evolving technologies and tools in their domains. For example, while 20 years ago it may have been sufficient for a Data Engineer to have knowledge of relational databases and data warehouses, today a Data Engineer also needs to work with NoSQL databases, big data systems, and data lakes. Similarly, while a Data Analyst from a decade ago could work with just CSV files and relational tables, today a Data Analyst also needs to work with JSON data and query NoSQL databases.
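To make the difference between CSV and JSON data concrete, here is a small sketch using only Python's standard library. The customer records are hypothetical sample data invented for this example. It shows one reason JSON takes extra handling: values can be nested, so they need to be flattened before they fit a table-like shape.

```python
import csv
import io
import json

# Hypothetical sample data, invented for illustration only.
csv_text = "customer,city\nAda,London\nLin,Taipei\n"
json_text = (
    '[{"customer": "Ada", "address": {"city": "London"}},'
    ' {"customer": "Lin", "address": {"city": "Taipei"}}]'
)

# CSV is already tabular: each line maps directly to one flat row.
csv_rows = list(csv.DictReader(io.StringIO(csv_text)))

# JSON can nest values, so the nested "address" object has to be
# flattened before the records fit the same row shape.
json_rows = [
    {"customer": record["customer"], "city": record["address"]["city"]}
    for record in json.loads(json_text)
]

# Both sources now describe the same two flat rows.
print(csv_rows == json_rows)
```

Real JSON sources are often more deeply nested or irregular than this, which is part of why working with them is a distinct skill from working with CSV files.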
In addition to technical skills, employers highly value soft skills such as effective communication and teamwork. While presentation skills are valuable for all these professions, Data Analysts in particular need to be really good storytellers. And some domain knowledge will help you stand out even more. For instance, a Data Scientist developing a model to detect credit card fraud would benefit greatly from being familiar with the credit card industry, the terms and language used, and the processes involved.
Salary Profiles and Mobility
Now let’s talk about the pay scale. Due to the rapid growth in the data ecosystem, all data professionals are in high demand and enjoy much better starting salaries than most other professions. At the time of writing, according to Indeed.com, the average base salary for a Data Analyst in the United States is US $72,945. A Data Scientist, on the other hand, starts at an average of US $121,050, and a Data Engineer at US $130,287.
It is important to keep in mind that these are average starting salaries. There are also performance-based bonuses, and, if you develop specialized skills that are in greater demand, you can start much higher. Of course, it also depends on where in the US or in the world you are based. And in each of these fields it is very much possible to acquire greater skills and responsibilities and move up the ladder.
For example, a junior Data Scientist who demonstrates a firm technical grasp and works well with others can move on to being a team leader or even a manager who is responsible for leading a team of Data Scientists. It is also not uncommon to move between various data-related roles. A Data Scientist who develops greater proficiency with machine learning and deep learning skills could take on the role of a Machine Learning Engineer (which has an average US starting salary of $149,924).
The chart below summarizes some of the key points covered so far in this article:
Developing the Skills for each Data Role
Now let’s get to how you can develop the skills for these data-driven professions. The traditional channel for most higher-paying jobs has been to get a college or university degree in a related subject. And while many employers still rely on hiring through these traditional channels, the shortage of professionals in information technology and data-related roles has led employers to explore alternative pools of skilled talent, such as graduates of certificate and diploma programs.
For instance, Tesla is expanding operations in many locations globally. Elon Musk recently tweeted that "Over 10,000 people are needed for Giga Texas just through 2022!" And the job postings include positions for those with Data skills. Elon Musk also tweeted that “You do not have to have a college degree to work for Tesla.”
This brings us to the work IBM has been doing with partners like Coursera. The Cognitive Class mission has always been to make in-demand technical skills accessible to everyone. To assist with this mission, IBM has been working with Coursera to release several data-related programs that affordably equip learners worldwide with skills for Data Analytics, Data Science, and Data Engineering.
These include multiple self-paced online programs and professional certificates that can be completed within 6-9 months such as the:
All of these programs are suitable for anyone in any part of the world, regardless of whether you have a degree or not. And it does not matter whether you have a prior data or programming background, because these programs are designed to equip you with the key skills required for the respective domain. All that is required is a passion for self-learning online and basic computer literacy.
If you are interested in any of these data careers, I highly recommend you review these programs on Coursera and decide if they are the right fit for you. Note that each of these programs costs only US $39 per month and those with a need can qualify for financial aid.