Navigating the Realities of Being A Data Scientist
On the surface, being a data scientist may seem like all sunshine and rainbows (at least I think that is the impression my posts give!). High pay, great benefits, flexible hours, and interesting work are some of the things that come to mind when thinking about a data science job.
Heck, I even wrote a whole article on why data science is a cool job!
While these are all definitely true, every job has some hidden struggles behind the scenes, and data science is no exception. Don't get me wrong, it's a fantastic job, and I absolutely love the field, but not every day is completely glorious.
That's why in this article, I want to dive into several realities or principles you must accept as a data scientist and ones I face regularly. Hopefully, this will help anyone reading this post who wants to be a data scientist to decide if this field is really for them.
Dealing with Constant Change
Unless you have been hiding under a rock, it's no secret that AI took the world by storm last year. Previously, most people would have rolled their eyes if you mentioned AI, but now ChatGPT is a household name.
The rise of generative AI is just one example of how dynamic data science and machine learning are. When I started working as a data scientist in September 2021, text-to-image generation (DALL-E), GitHub Copilot, and ChatGPT were not generally available. These tools are now the ones I use daily!
As of writing, OpenAI has just released Sora, which does text-to-video.
The problem is that now you feel obliged to learn this new technology to keep up to date and in the loop with the latest developments. This can be tough, especially if you haven't got much time outside of work hours to up-skill in these areas.
Obviously, the popularity of generative AI is quite an extreme example. However, I can give you one closer to home.
Last year, I carried out a project to migrate one of our models to a different package, as the existing one didn't work on Apple M1 chips. I won't go into the specifics, but the package didn't work because most computers have Intel chips, which have a different instruction set architecture from the Apple "M" series. Anyway, halfway through the migration, the existing package released Apple M1 support, rendering all my work redundant.
As annoying as that was, it does show that data science and technology change rapidly. You'll never truly be on top of everything. Even package updates!
You can't "complete" data science. There are so many topics that have weekly breakthroughs. Therefore, it is impossible to learn everything, no matter how hard you try!
Depending on who you are, this is either scary or fascinating. I love learning, and data science is a field where you learn for your whole life. This blog serves as an example of that, and I am only just scratching the surface.
What most people recommend is to have so-called "T-shaped" skills or knowledge. This is where you have a broad but shallow understanding of areas in or around your field, good enough to collaborate with others, combined with real depth and specialty in a couple of areas.
Most senior or principal data scientists I know have T-shaped knowledge. They are typically experts in areas like optimization, forecasting, or recommendation systems, but also have a good understanding of general machine learning and software engineering principles. This is the approach I am taking, but I still need to pick my specialism!
Check this Forbes article if you want to learn more:
Continual Imposter Syndrome
With this constant change in the field, imposter syndrome is REAL for data scientists. I get this feeling at least once every two weeks. I obviously can't speak for other practitioners, but from anecdotal conversations, others get a similar feeling from time to time.
If you are unfamiliar with what imposter syndrome is, Wikipedia explains it:
Impostor syndrome is a psychological occurrence in which people doubt their skills, talents, or accomplishments and have a persistent internalised fear of being exposed as frauds.
Open up LinkedIn or Medium and you are immediately bombarded with posts about a deep reinforcement learning algorithm someone is building to predict the weather, or a generative AI forecasting algorithm to predict stock market returns. These are hypothetical examples, but they stem from some truth.
It's hard not to look at these projects and think these people are ahead of you. It often feels like people are creating things all the time, which can make you feel like a failure if you are not at the cutting edge of most domains.
It's important to realize that this is not the case and that no one truly knows everything. One data scientist may be a specialist in deep learning but know very little about combinatorial optimization. People tend to talk about the topics they know best and stay quiet about the ones they know least. This is just human nature.
For example, in my current job, the data scientists are split into cross-functional teams by domain. My team is forecasting, so naturally, I am developing expertise in this area. Every week, there is a data science meeting where the data scientists present some of their work. I can't begin to explain the imposter syndrome I feel during these presentations. I start thinking, "These people are so much better than me", "How am I here?", and so on. But like I just said, they are experts in their own area and work on it every day. People probably have the same feeling when I present on forecasting.
The main point is to be prepared to feel like you don't know enough, and maybe even like a failure sometimes. This is ok, as many other data scientists feel the same way from time to time.
Imposter syndrome can be detrimental to your mental health. If you want some actionable tips on overcoming it in data science, the LinkedIn thread linked below has some great ones!
Ambiguous Job Definition
Data science is still quite a loose term and no one is exactly sure what it means or the work it encompasses. This leads to inconsistent job advertisements and makes it difficult for others to comprehend what you actually do.
It is crystal clear what your job is if you are a divorce lawyer. Likewise, if you are a doctor or dentist. However, this couldn't be more different for the data science field.
A data scientist at one company may be doing completely different work from a data scientist at another company. Some roles use machine learning regularly, whereas others are more analytical, and you may even find yourself doing some software engineering.
For example, even though I am a data scientist who works primarily on machine learning projects, many of the recruiter messages I receive are for data scientist jobs in analytics. Sure, the title is the same, but the role is very different.
It also depends on the structure of the company and the people they have. If you are the only data professional, then be prepared to do almost everything from data engineering to analytics. Some companies want to have a "data scientist" as it looks cool and is on trend.
From experience, many people see a data scientist as just another version of a software engineer or just that "tech" guy, who can do everything. This is not the case all the time of course.
This ambiguity leads to people not fully understanding what you do and not utilizing your skills effectively. Like most things, this can be good and bad.
The good:
- You learn a wide range of skills (data science, engineering, analytics, etc.) as people are not too sure what tasks to give you.
- You can define your own unique role at a company.
- This may lead to more responsibility as you are doing all things data.
The bad:
- No specialist skillset as you are doing too many things and wearing many hats.
- It might be harder to transition to other jobs as your current skills don't meet their expectations.
- Lack of mentorship and risk of being mismanaged if no other data professional is at your company.
This can make navigating the data landscape a bit of a minefield, and you need to be thorough when reading job descriptions and when explaining your role to other employees in your current organization.
There have even been journal papers written about the ambiguity of the data science profession and its role in the organization. This truly shows how big this issue is.
However, I feel over time this will become less of an issue as the title and skills will become more standardized. We may even see the rise of other professions and titles from the current data scientist role.
High Entry Requirements
The fundamental skills for a data scientist are maths, statistics, and programming. Now, on their own, these skills are hard to learn; there are literally full degrees and master's programs for these subjects. However, as a data scientist, you are expected to know these areas to a pretty high standard.
In other jobs, excelling in one of these areas would be your standout strength, but in the data science and machine learning field, this is the baseline most practitioners have.
According to this article by 365DataScience, 74% of current data scientists have a master's or PhD, which are postgraduate degrees. I can relate to this, as when I was applying for data science roles just out of university, several candidates had PhDs in STEM subjects. Even though I had a first-class master's in Physics, I still felt inferior.
This sets the bar for entry very high. Because everyone in the field has these fundamental abilities, it is quite hard to stand out from other data scientists. You have to be more dynamic and think outside the box to become an outlier.
Some ideas to stand out are:
- Become a great specialist in a certain domain such as forecasting, recommendation systems, or reinforcement learning.
- Develop excellent soft skills so that you can articulate complex algorithms. This is particularly useful if your stakeholders are non-technical.
- Have terrific business domain knowledge. For example, if you work in a bank, improve your finance knowledge by taking some certifications.
- Have a personal brand to show demonstrable interest in the field.
- Be a great teacher, so you can mentor and guide junior colleagues.
These extra skills take time to develop and may require you to work outside normal hours to really master them. For some people, this may not be ideal or possible, but putting in extra effort is how you get ahead of others.
I have a previous article where I explain some key ways to make your data science application stand out:
If you want to learn the required skills for data science and machine learning, I have a previous post detailing a roadmap you can follow!
Summary & Further Thoughts
Being a data scientist is a great job, but it doesn't come without its challenges. This post is not meant to put you off a career in data science or machine learning, but to be transparent about some of the harsh realities I face and that you may face in your career. I hope this article helped you learn more about the field and whether it is the right fit for you!
Another Thing!
I have a free newsletter, Dishing the Data, where I share weekly tips for becoming a better Data Scientist. There is no "fluff" or "clickbait," just pure actionable insights from a practicing Data Scientist.