How can I become a Data Engineer?

In recent months, I have seen an increasing interest in career shifts to data engineering. This is not surprising given that there is a severe shortage of data engineers. Data engineering represents a tremendous opportunity for data people looking for career growth. With this trend in mind, I interviewed a well-known data management expert to get a perspective on what it takes to become a successful data engineer.

I spoke with Dave Wells, who directs the Data Management Practice at Eckerson Group, to understand what it really takes to be successful in this field.

Jennifer Hay: What type of skills and qualities are needed to be successful as a Data Engineer?

Dave Wells: Data Engineering combines software engineering and database engineering skills to design, build, deploy, and maintain databases, data pipelines, and data services. A range of skills is essential to becoming a data engineer: architecture, modeling, programming, and technology skills. This role also requires a clear understanding of core data engineering processes that include pipeline design and development, data services design and development, and database engineering.

Data Engineers are technical designers and developers, much like Software Engineers, but with different objectives and challenges. Due to the scope and complexity of data types, databases, and data management technologies, the field encompasses a wide range of disciplines from architecture to coding and testing. Understanding the scope and complexity of data engineering is essential both for data engineers and for those who work with them.

Jennifer Hay: When someone is thinking about transitioning from software engineering to data engineering, what should they know about the differences and similarities?

Dave Wells: The two roles are similar in that both are highly technical disciplines focused on designing and building things. They differ in what they build. Software Engineers build applications. Data Engineers build databases, data pipelines, and data services. Both require programming skills, with some overlapping technologies and some technologies that are unique to each role.

Jennifer Hay: In terms of deliverables, how are data engineering and software engineering responsibilities intertwined?

Dave Wells: There are three primary areas where data engineers work, and two of them are closely related to software engineering.

Data pipeline development depends on the software engineering skills of the data engineer. Data pipelines are made up of software components: operations that manipulate and transform data, organized sequentially as workflow and data flow.
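To make the idea of sequentially organized operations concrete, here is a minimal Python sketch of a pipeline. It is an illustration only, not something taken from the interview: the step names (`drop_incomplete`, `normalize_names`, `add_name_length`), the `run_pipeline` helper, and the sample records are all hypothetical, and a production pipeline would typically run on a dedicated framework rather than plain functions.

```python
from functools import reduce
from typing import Callable, Iterable

# Hypothetical types for illustration: a record is a plain dict, and a
# step is any function that takes records and returns records.
Record = dict
Step = Callable[[Iterable[Record]], Iterable[Record]]


def drop_incomplete(records: Iterable[Record]) -> Iterable[Record]:
    """Filter step: keep only records with a non-empty 'name'."""
    return (r for r in records if r.get("name"))


def normalize_names(records: Iterable[Record]) -> Iterable[Record]:
    """Transform step: trim whitespace and title-case the 'name' field."""
    return ({**r, "name": r["name"].strip().title()} for r in records)


def add_name_length(records: Iterable[Record]) -> Iterable[Record]:
    """Enrichment step: derive a new field from an existing one."""
    return ({**r, "name_length": len(r["name"])} for r in records)


def run_pipeline(records: Iterable[Record], steps: list[Step]) -> list[Record]:
    """Apply each step in order; the output of one step feeds the next."""
    return list(reduce(lambda data, step: step(data), steps, records))


# Made-up sample input.
raw = [
    {"id": 1, "name": "  ada lovelace "},
    {"id": 2, "name": ""},
    {"id": 3, "name": "grace hopper"},
]

clean = run_pipeline(raw, [drop_incomplete, normalize_names, add_name_length])
```

Each step is a small, independently testable component, and the order of the list passed to `run_pipeline` defines the data flow. Because every step builds new dictionaries, the original `raw` records are left unchanged.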

Database development depends on database engineering skills for database modeling, design, specification, and construction. Traditional relational data modeling techniques work well for RDBMS databases. For NoSQL data and unstructured data, new modeling techniques are emerging.

Data services development uses software engineering skills. Whether using RPC, SOAP, REST, or other protocols, the functions and operations of a data service are software that is created by coding in a programming language.

Jennifer Hay: Within data-driven and analytics focused organizations, both data engineering and data science skills are in high demand. Can you describe the differences and similarities between these two disciplines?

Dave Wells: Data engineer and data scientist are distinctly different roles, yet there is some overlap when it comes to skills and responsibilities. The main difference is in the deliverables, the things that they build. Data engineers build processes to create the databases and datasets used by data scientists. In contrast, data scientists use statistical and algorithmic skills to build analytic models.

Become a Data Engineer

In smaller organizations these roles are sometimes performed by the same individual. Data engineers organize and optimize data for a variety of analytic use cases. Data scientists work to find the meaning in data with attention to discovery, diagnosis, and prediction. While engineers and scientists have some overlap of skill areas, the real differences are determined primarily by the unique skills of each role. I use a Venn Diagram to illustrate the differences and the overlaps.

Your readers can also access a Data Engineering Skills Assessment and Gap Analysis tool that I created. The instructions are straightforward. Just enter information about your experience and education and your organization’s level of need for each skill. The tool will calculate skill level and skills gap information. High skill areas show your strengths. Wide gaps show the areas to pursue more education and experience. Narrow gaps show the areas where you’ll make the greatest contributions to a data engineering team.

Jennifer Hay: As a technical resume writer for Data Engineers, Data Scientists, and Software Engineers, I see tremendous value in taking your assessment. There are a variety of ways that it could be used by my clients, as well as by others who want to transition to data-related careers. For example, they could use it to make decisions about their career path and to develop a training plan to meet those goals.

Beyond that, it could be used in a resume to describe a person’s data capabilities. The goal of a technical resume is to show your current experience, as well as to demonstrate how you can transition to different roles and responsibilities. It could provide a clear message that the individual understands the data analytics space, along with their willingness to take a deep dive into learning more. In a Data Engineer or Data Scientist resume, it is all about the messaging.

Dave Wells: I think self-assessment is always a healthy thing. The field of data management is evolving at a rapid pace, and staying current is important to career growth. Today it isn’t enough to demonstrate your knowledge for a particular role. You also need to show working knowledge and understanding of the roles with which you collaborate. If your goal is Data Engineering, you need to demonstrate some understanding of what Data Scientists do. If your goal is Data Science, you need to demonstrate understanding of what Data Engineers do. If you apply for a Data Engineering job, it is reasonable to expect that Data Scientists will be among the interviewers. Let your resume be the setup for that interview discussion.

Jennifer Hay: As always, it is a pleasure to hear your perspective about data and information careers. It is an exciting time to work in Business Analytics and Data Analytics. I am already working on an article about how to be a successful Data Scientist and hope that you will join me in that discussion.


I originally wrote about data scientist careers several years ago, and although the role has since evolved and is now more clearly defined, many of the same skills are transferable. I’ll use that article as the foundation for my next exploration into what it takes to be a very successful Data Scientist.