The Difference Between ML Engineers and Data Scientists

Author: Murphy | View: 30005 | Time: 2025-03-22 19:31:19

A role that has emerged in the tech space over the past few years is the machine learning engineer (MLE). People often confuse MLEs with data scientists; however, the two roles are quite distinct, and I will explain how in this article to give you some clarity if you are thinking of making the switch!

Data Scientist

Let's start by defining a data scientist. Having worked as one for over three years, I feel I am adequately placed to do so!

Data science is a broad term nowadays, and it means different things at different companies; it's ambiguous, to say the least! (I often complain about this.)

A data scientist at one company could be doing purely analytical work and setting metrics, whereas, at another company, they could be building and deploying machine learning models.

This is why I always recommend that prospective applicants read the job description thoroughly, to ensure they understand what they are signing up for and that it is in line with what they want to do.

Generally, a data scientist will do a mix of analytical and modelling work and be relatively close to the business side. This would involve finding opportunities by speaking to stakeholders and senior managers, and scoping potential projects.

I like to think of data scientists as the "linchpin" between the business and tech side.

In the diagram below, a data scientist will generally own the first four stages:

Problem – Converting the business problem into a modelling/algorithm approach. This involves finding opportunities by speaking to stakeholders, managers, etc.
Data – Making sure we have the data available for the problem. Can't solve the problem if the data doesn't exist!
EDA – Exploring the data to see any patterns or interesting insights. This can sometimes be done by a data analyst at specific companies.
PoC Model – Build an initial model to measure potential value. This is usually done in a notebook environment or very roughly in a modelling repo.
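To make the PoC stage concrete, here is a minimal sketch of what "measuring potential value" can look like: compare a simple model against a naive baseline on held-out data. Everything in it is an assumption for illustration, including the synthetic `spend` and `revenue` data, the helper names, and the choice of a one-feature linear regression. In practice you would likely reach for a modelling library in a notebook rather than hand-rolled code.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for one feature: y = slope * x + intercept."""
    x_mean, y_mean = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted values."""
    return statistics.fmean([abs(a - p) for a, p in zip(actual, predicted)])


# Hypothetical synthetic data: spend drives revenue, plus noise.
rng = random.Random(42)
spend = [rng.uniform(0, 100) for _ in range(200)]
revenue = [3.0 * x + 20.0 + rng.gauss(0, 10) for x in spend]

# Simple holdout split: first 150 rows to train, last 50 to evaluate.
train_x, test_x = spend[:150], spend[150:]
train_y, test_y = revenue[:150], revenue[150:]

# Baseline: always predict the training mean.
train_mean = statistics.fmean(train_y)
baseline_mae = mean_absolute_error(test_y, [train_mean] * len(test_y))

# PoC model: one-feature linear regression.
slope, intercept = fit_line(train_x, train_y)
model_mae = mean_absolute_error(test_y, [slope * x + intercept for x in test_x])

print(f"baseline MAE: {baseline_mae:.1f}, PoC model MAE: {model_mae:.1f}")
```

The gap between the two errors is the kind of number you would take back to stakeholders when arguing whether the project is worth productionising.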

Something missing from the diagram is the communication and presentations you will need to make to stakeholders throughout this process. Data scientists frequently present their results to many non-tech people and keep them up-to-date with all the latest happenings.

Again, the lines may be blurred in some companies, but typically, the larger the company, especially tech companies, the more established these roles are.

In startups, you can also expect to be a "full-stack data scientist", whereby you do pretty much all the tech stuff for the company, like web development, data analysis, model building and deployment. Here is a great article explaining this role if you are interested.

I have linked a few articles below detailing more information about what being a data scientist is like.

Common Misconceptions About Data Science

Behind The Scenes: Explaining My Work As A Data Scientist

Navigating the Realities of Being A Data Scientist

Machine Learning Engineer

Data scientists and machine learning engineers work very closely and even overlap in certain areas. However, the main distinction is that MLEs are responsible for model deployment and monitoring.

I have seen cases in industry where someone builds a model in a Jupyter Notebook or leaves it in some PoC state. The model may be very good, but it is useless to the business because it cannot make real-time decisions, i.e. it's not in production.

This is exactly where MLEs come in. They help bring the models "to life" and ensure they generate business value. To do this, they are often well-versed in both software engineering best practices and the machine learning and modelling side.

MLEs normally own the last three stages of the diagram below:

PoC Model – Build an initial model to measure potential value.
Model Deployment – Turn the PoC model into productionisable code and deploy. This involves writing proper production code and using something like AWS to make it live.
Model Maintenance – Monitor the model to ensure it is doing what is expected. MLEs would frequently be "on-call" to fix anything that may break in production.
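As an illustration of the maintenance stage, one common monitoring check is to compare the distribution of a feature (or of model scores) in production against what the model saw in training. The sketch below uses the Population Stability Index for this. The function name, the bin count, the `eps` floor for empty bins, and the sample data are all assumptions for the example, not a description of any particular company's setup.

```python
import math


def population_stability_index(expected, actual, n_bins=10, eps=1e-6):
    """Compare two samples of one numeric feature; larger PSI means more drift."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant feature

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            # Clamp so live values outside the training range land in the edge bins.
            idx = min(max(int((v - lo) / width), 0), n_bins - 1)
            counts[idx] += 1
        # Floor each proportion at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical samples: training scores, then two batches of live scores.
training_scores = [i / 100 for i in range(100)]
live_same = [i / 100 for i in range(100)]
live_shifted = [0.5 + i / 200 for i in range(100)]  # squeezed into the upper half

psi_same = population_stability_index(training_scores, live_same)
psi_shifted = population_stability_index(training_scores, live_shifted)
print(f"PSI (no drift): {psi_same:.3f}, PSI (shifted): {psi_shifted:.3f}")
```

A check like this would typically run on a schedule and alert whoever is on call when the value crosses a threshold the team has agreed on. The threshold itself is a judgment call and depends on the model.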

The overlap with data scientists often comes at the PoC model stage, as shown above, since both roles have good knowledge of building models.

These steps can be broken down further into areas like:

Model optimisation – this can be either algorithmic or runtime performance.
Deciding the best deployment strategy – what architecture is behind the deployment, which cloud provider to use, etc.
Model testing – building unit tests, CI/CD pipelines and live testing using A/B or shadow systems.
Containerisation – using tools like Docker or Kubernetes to ensure the model works across various machines.
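To give a flavour of the "shadow system" idea from the testing item above: the live model keeps answering requests, while a candidate model receives the same inputs on the side and its outputs are only logged for later comparison. This is a minimal sketch under assumptions. The two toy models, the `serve_with_shadow` and `agreement_rate` helpers, and the in-memory log are hypothetical stand-ins for a real serving stack.

```python
def serve_with_shadow(request, live_model, shadow_model, shadow_log):
    """Return the live model's prediction; record the shadow model's on the side."""
    live_prediction = live_model(request)
    try:
        shadow_prediction = shadow_model(request)
        shadow_log.append(
            {"request": request, "live": live_prediction, "shadow": shadow_prediction}
        )
    except Exception as exc:
        # A failing shadow model must never affect what the user receives.
        shadow_log.append(
            {"request": request, "live": live_prediction, "error": repr(exc)}
        )
    return live_prediction


def agreement_rate(shadow_log):
    """Share of logged requests where both models gave the same answer."""
    compared = [row for row in shadow_log if "shadow" in row]
    if not compared:
        return 0.0
    return sum(row["live"] == row["shadow"] for row in compared) / len(compared)


# Hypothetical models: flag a transaction amount as suspicious or not.
def live_model(amount):
    return amount > 100


def shadow_model(amount):
    return amount > 80


log = []
responses = [serve_with_shadow(a, live_model, shadow_model, log) for a in (50, 90, 120, 200)]
rate = agreement_rate(log)
print(responses, rate)
```

The design point is that users only ever see the live model's output, so the candidate can be evaluated on real traffic without any risk to the business.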

This is not to say that MLEs don't conduct model research or look for ways to improve the model's accuracy and performance; in many companies, this is part of the workflow. However, they are more focussed on getting the model working for the business as efficiently as possible.

Another essential thing to note is that MLE roles are generally more challenging to get for several reasons.

There are fewer positions because only established tech companies offer or have MLE roles.
You usually need a few years of experience as a data scientist or software engineer before you transition. It's not really an entry-level job.
You need to be well-versed in both machine learning and software engineering. These two domains are whole jobs in themselves. Naturally, you will be better at one or the other, but you should be competent at both.

Key Differences

Below is a table showing the key skills for each job.

The below table shows the key technologies for each job.

These are not hard and fast rules. The whole data and ML space is relatively new, so the distinction between roles varies greatly.

Again, in some companies, you may use certain technologies as a data scientist that an MLE would use, or vice versa. The tables are just a general guideline.

Summary & Further Thoughts

If you are stuck between becoming a data scientist or a machine learning engineer, I hope this article gave you more clarity. The critical thing to remember is that a machine learning engineer is more about model deployment and software engineering, whereas data scientists do more analysis and initial model development.

Let me know which one you would rather be!
