5 Habits Senior Data Scientists Use to Boost Their Productivity

You're here because you're a data scientist looking to take your game to the next level.
It's not technical or soft skills. No, you already know all about those.
What you're looking for is a special upgrade. You know, like when Mario goes wild high from that mushroom juice.
Well, guess what? I've got just the thing for you!!
I started my data science career at Spotify as a mini Mario playing in a league of big data science Super Luigis (praying to God you get the reference).
And working closely with them made me realize one thing:
Success as a data scientist is greatly determined by your ability to carve out an environment conducive to success.
It's the key power-up that'll make you go vroom to your Super self!
And it's your lucky day today because I'm about to share with you five valuable practices I learned from senior data scientists at Spotify.
These habits have supercharged my productivity on data science projects, and they will do the same for you. More specifically, they will help you:
- Improve your efficiency, whether you're starting your career or looking to improve your work ethic.
- Better allocate your most precious resources – energy and time.
- Derive meaningful learnings from your projects to fast-track your way through future ones.
To do this properly, you're going to imagine that you've just been assigned an important super task: your newest data science project.
Your hypothetical mission is to conduct foundational research to assess the key factors behind growth in the product area that you're working on.
To make this as realistic as possible, don't forget that you're also juggling this project alongside at least one or two others.
I'll walk you through these tips in a way that touches on the entire lifecycle of your project so that you understand how to apply these 5 habits in your next real one.
We'll make sure that you're being efficient in every step of the process, avoiding wasting time, and ensuring your work is even reusable later!
Step #0. Craft a Dynamic Knowledge Document
Any good project starts with a deep understanding of the business objectives.
The best data scientists are the ones who can effectively reconcile business acumen with technical craft. After all, being a data scientist isn't about aligning two lines of code on a notebook or importing ML libraries.
It goes beyond.
A data scientist is just like Super Mario adapting to different environments, whether by throwing fireballs or shrinking to fit into tight tunnels.
Business understanding, aka domain expertise, is the weapon you should keep with you at all times.
Why it's so important
Picture this – I recently spent months working on one of the car features of the Spotify app. I got pretty overwhelmed running A/B tests on the thing because my knowledge of the feature was still in the crib.
I couldn't grasp the underlying subtleties related to the user experience because I lacked context awareness. I didn't know how users experienced the feature so I struggled to identify the best metrics to measure the performance of the upgrade.
Without the help of my fellow data science coworker – who had already spent months building a business understanding of the feature – I was just clueless (and useless).
Domain expertise represents your understanding of the specific field or product in which data is being analyzed.
It's the most foundational data science skill ever.
Knowing statistics, ML, or Python will take you nowhere if you fail to grasp the context in which these tools need to be applied.
After all, these are but tools in the service of an overarching business goal, of which you need a proper understanding if you hope to succeed.
If you don't understand how the product works, you can't make sense of the data no matter how fancy your tools are.
It's crucial for:
- Picking the right metrics
- Interpreting data accurately
- Making relevant business decisions
- Formulating impactful recommendations
But it's not something that falls upon you overnight.
It's a continuous process that takes time to build up from the first day you land in the company.
How you can do it yourself
One of the ways I was able to get up to speed with the car feature was by putting together a dynamic knowledge document.
This practice is crucial for documenting and tracking evolving ideas.
Not just in the beginning for storing new information but also for maintaining that expertise over time.
It also ensures you don't go back and forth nagging the same person to repeat the same information to you (I've been on the receiving end of that myself, sigh).
Here's a list of things that you should include in this document:
- Product Basics 101 – When reading through past research, make sure you document everything about how the product functions, who the end-users are, and their journey while interacting with the product. This type of information is useful for all future projects.
- Learnings – To make this document as useful as possible, make sure to keep track of everything you learn through your exchanges with relevant stakeholders like product managers, engineers, and designers.
- Sources – Sometimes stakeholders might ask you the origin of some key numbers. Make sure to include direct links next to every piece of knowledge you put down to be trackable for later reference.
- Personal Ideas & Hypotheses – Don't forget that being a data scientist is a creative journey. Whenever an idea pops up that can be worth exploring or experimenting on later, make sure it's in the doc.
- Big Questions – You need to truly understand your product, and ask yourself the big questions. For instance, product managers often say things like "we want to increase consumption" but sometimes, that's a short-term leading indicator. What do you think could be the long-term final goal? Write that down.
I know it's an investment, especially if you're a sloth like me, but trust me, it's worth the effort. It's your treasure chest!
This habit boosts productivity because it equips you with a constantly updated knowledge base that will help you make informed decisions quickly.
Step #1. QA Your Code

So now you've been assigned this research where you get to go full-on investigator mode. You also have an extensive knowledge base to guide your detective work.
Let's assume you've already put together your research plan, with all the steps and questions to explore.
You've started the process of writing those SQL queries to fetch the data you'll be analyzing. But before moving on to the analysis, you first need to check that those queries are legit.
Believe me, even the pros I work with at Spotify never skip this fundamental step of the process: QA-ing (quality-checking) their code.
Why this step is so crucial
Whether you're doing foundational research like in our scenario or creating metrics for your A/B tests (or anything else for that matter), the data extraction step of the pipeline is one of the most meaningful ones.
If you mess up even the smallest thing, all the insights and analysis ensuing from it might as well be trash.
Querying the data correctly is non-negotiable; you can't afford to mess anything up.
Even if you think you know everything about the data you're handling, that itself isn't enough.
Data scientists usually spend a long time building up queries. It's easy to drown in them without realizing it, so we're always at risk of:
- Forgetting some key elements
- Writing suboptimal code
- Misunderstanding the data documentation
So do your due diligence and get those numbers checked.
It's your responsibility to make sure you're delivering accurate and reliable data (and if you don't, just know it'll be your butt they'll be coming after when they find out about it).
How you can do it yourself
Step #1. Double-check things yourself
This too, is a non-negotiable.
You must always review your queries. Something I learned from my senior peers is to break things down into smaller steps.
Breaking things down reminds you of what you hope to answer with the data and lets you tick things off as you go – the key is to simplify the process so it stays efficient, e.g.:
- For every user in your data, you want to measure one specific metric – let's say "consumption" for the sake of our business case: how many minutes has each user played over the previous 7 days, for each day of the period you're measuring?
- Choose just one user from the population of interest and go through the steps of computing that metric with just that one user. Check as you go.
- Keep in mind the end goal. It becomes your reference point for when you get stuck. Don't hesitate to break things into smaller tasks if needed.
- When something looks off in the data, compute it manually and check whether the values are different from what you were getting initially.
- Don't forget to always double-check with the engineers your understanding of what the data is about. Sometimes that stuff can be more obscure than your street sewage.
- Finally, one quick win is to pull up the descriptive statistics card. Just plot the statistics of all columns; a df.describe() should do the trick with pandas (see the quick sketch below). Data profiling tools are great for this too!
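To make that concrete, here's a minimal sketch of what such a spot check could look like in pandas. The playback table, column names, and numbers are all made up for illustration – swap in whatever your own data extraction produces:

```python
import pandas as pd

# Toy playback data for a single user: one row per day with minutes played (made-up numbers)
plays = pd.DataFrame({
    "user_id": ["u1"] * 10,
    "date": pd.date_range("2024-01-01", periods=10, freq="D"),
    "minutes_played": [12, 0, 34, 5, 20, 18, 0, 41, 7, 25],
})

# The metric under QA: rolling 7-day consumption per user, one value per day
plays = plays.sort_values(["user_id", "date"])
plays["consumption_7d"] = (
    plays.groupby("user_id")["minutes_played"]
         .transform(lambda s: s.rolling(window=7, min_periods=1).sum())
)

# Spot check: recompute the latest value by hand for that one user and compare
manual_check = 5 + 20 + 18 + 0 + 41 + 7 + 25  # the last 7 days, added up manually
assert plays["consumption_7d"].iloc[-1] == manual_check, "Rolling sum doesn't match the manual check"

# Quick win: descriptive statistics on every column to catch anything that looks off
print(plays.describe())
```

The same logic applies when your metric lives in SQL: filter down to one user, walk through each CTE or subquery, and compare the result against a number you computed by hand.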
Step #2. Peer review

If you're unsure about the reliability of your query or intend to use it as the foundation for in-depth research, it might be a good idea to have someone else review your code with fresh eyes, as you may not have that perspective.
That's when you want to go to one of the data scientists in your team and ask them to review your code. This way, they'll be able to:
- Point out any inefficiency in your code.
- Identify mistakes you may have overlooked.
- Help you troubleshoot things if you're stuck.
The key here is to properly prepare the ground for the review by giving context to your peer. Explain what is important. Guide the reviewer and respect their time. Make the review doable and efficient.
This habit boosts productivity because you ensure the accuracy and reliability of your data analysis from the start.
Quality-checking your code helps you catch errors early, refine your queries, and build your work on a solid foundation. This way it'll prevent you from making costly mistakes and inefficiencies later on.
It happens to the best of us, and more than you may think!
Step #2. Store Your Queries!!
Think of this like storing extra lives in Mario's world.
It took me 2 data science degrees and 2 internships in Tech to start storing my queries in a proper place (not talking about notebooks).
Why it's a non-negotiable
You need to store the queries you use in the data extraction part of your project for 3 reasons:
- When you're weeks or months into your project, there might come a time when you may have to go back and double-check the rationale behind how you built your metrics.
- You may also want to add more features or even change the timeframe.
- Finally, those queries will be needed by you or one of your peers for other projects in the future (talking from experience).
You also need to store the queries that you write for the most common questions or metrics.
I can't even count the number of times I wrote the same old queries over and over again – from scratch – because I was too lazy to just store them somewhere.
So do your future self a favor and store your queries!
How you can do it yourself
Here's one of the approaches I picked up from my senior peers at Spotify:
- Create a repository on your enterprise GitHub for your project or your ad-hoc daily work.
- Create folders with specific and easily understandable names for you or your peers.
- Paste your queries in a .sql file in the appropriate folder.
- Make sure it's well-documented with comments. Trust me, you won't remember why you wrote your code the way you did months after it's over. It'll feel practically brand new.
- Push the files.
And if your company doesn't use GitHub, leave the company. No, I'm just kidding. Just create a folder on your computer and follow the same process as this one, but locally.
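Here's a minimal sketch of what reusing those stored queries can look like in Python. The folder layout, file name, and the {start_date}/{end_date} placeholders are all hypothetical – adapt them to whatever convention your team uses:

```python
from pathlib import Path

# Hypothetical repo layout: queries/<project>/<query_name>.sql
QUERY_DIR = Path("queries/growth_research")

def load_query(name: str, **params: str) -> str:
    """Read a stored .sql file and fill in templated parameters like {start_date}."""
    sql = (QUERY_DIR / f"{name}.sql").read_text()
    return sql.format(**params) if params else sql

# Months later: rerun the same consumption query over a different timeframe,
# instead of rewriting it from scratch
query = load_query("daily_consumption", start_date="2024-01-01", end_date="2024-03-31")
print(query)
```

Whether you then feed that string into BigQuery, Snowflake, or pandas is up to your stack – the point is that every query lives in one documented place instead of a dozen notebooks.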
This habit boosts productivity because saving your queries enables easy revisits, refinements, and reuse for future projects. It will ensure efficiency and continuity in your work over time. You're also investing in your future self's peace of mind.
Step #3. Dealing With Stakeholder Requests
Hey, just because you think you're on a high-priority mission to unearth those highly precious insights, doesn't mean stakeholders will leave you be.
Nope. They're most likely gonna come after you with all their might asking you completely unrelated questions or for data they need to do their job. And when they do, you'll have to be able to deliver on all fronts.
Why we're even talking about this
Because these requests are just not going anywhere.
Even if you run away, they'll still be there waiting for you like your crazy ex when you get back. So better learn how to handle them alongside other tasks early on (not talking about the ex).
I have to admit I still struggle with this even today because context-switching can be draining at times.
But I want to improve.
It just happens that many data scientists struggle with the demands of the unyielding context-switching monsters too, so we're not alone.
Research has shown that we are incapable of multitasking and it's one of the biggest productivity killers (Ikigai, Hector Garcia & Francesc Miralles).

But what can you do when it's beyond your control?
You learn how to manage it.
How you can handle it
Managing these requests is like going through different levels in Super Mario but at the same time.
Scary right? Well, it doesn't have to be.
My team at Spotify has been invested in fine-tuning our approach to tackling stakeholder requests efficiently. Having stakeholders follow specific guidelines has proven immensely helpful.
It might be trickier with a wider crowd, but it's still useful even if only your heaviest requesters follow the guidelines!
At Spotify, we use Slack channels for pretty much everything. And so, we've pinned instructions directly on my team's data science channel for stakeholders to follow. Those instructions ask them to provide us with:
- The context around the problem – This helps route the request to the right data scientist instead of it landing on someone who isn't the right person for the job.
- The impact having this information will have – This helps assess whether the request is worth following through on. Data scientists' time should go where it can have the most impact, so this lets us filter out low-impact tasks that would eat up a lot of our resources.
- The urgency, or any specific dates – This helps assess whether the request needs our attention now or can be put off until later, which minimizes context switching based on priority level.
So teach your stakeholders how to inform you of their needs.
Finally, it helps to have a system of taking turns in the data science team to handle those requests, e.g. on a weekly basis. This allows whoever is not the goalie of the week to focus directly on their high-impact projects.
This habit boosts productivity by teaching you to streamline stakeholder requests with clear guidelines and a rotation system. This way you'll be able to focus on high-impact tasks and minimize disruptive context switching.
Step #4. Conducting a Post-Mortem or a Retro
Warning: sensitive souls please refrain from checking Google images.
I'll spare you the trauma.
What we're talking about
Historically, post-mortem means an examination of a body to determine the cause of death, an autopsy.
But don't worry, we're not in the business of the macabre here, just the business world.
So in our context, you perform a post-mortem to go over your failures if the project wasn't a big success. It's an opportunity to analyze what went wrong and how you could do better in the future.
But if the project went well, then a retrospective is what you want to be doing to reflect on both your successes and your failures.
Why reflection should become a habit

Whether the experience was positive or negative, there's always a lesson to be learned from it.
Record what happened in as much detail as possible.
This gives you a chance to analyze what worked and what didn't, so you can improve when the same or similar situation comes around next time.
How you can do it more efficiently
This can usually happen in the form of a team meeting with all the involved stakeholders. The goal is to touch on the following points:
- What went well – Things like great engagement and collaboration, satisfaction with the results, smooth leadership, etc.
- What did not go well – For instance communication issues, failure to deliver on time, lack of transparency, technical blockers, an unclear timeline, etc.
- Learnings – Focusing on what each team member learned from the experience is a great idea to reflect on the whole experience from an individual standpoint.
- Action points – Next steps based on those learnings: what you should keep doing, what you should stop doing, and what you should improve.
Of course, I encourage you to go beyond these or adapt the template to fit specific projects.
Recently, we conducted a retrospective on the dashboards that we spent months building in Tableau last year, focusing on assessing their current performance.
The retro touched on points such as:
- Confidence – Were we satisfied with the quality of our dashboards?
- Ownership – Who will maintain the dashboards and data pipelines?
- Data documentation – How can we make the data pipelines more readable and maintainable?
- Dashboard documentation – What was the logic in how we built the dashboards? Is the process understandable by peers?
- Disclaimers – How do we make our stakeholders aware of anomalies? How do we keep that information up to date?
- Performance – Is performance likely to decrease drastically when adding deeper historical data?
How you conduct the retro is up to you. What matters is that you take the time to conduct one.
This way you make sure this experience is something you can grow from regardless of the project's performance.
At Spotify, we have a few go-to tools for reflecting on our projects and our collaboration, and I encourage you to find the ones that fit your team too.
This habit boosts productivity by encouraging you to analyze both successes and failures to continuously improve.
When you reflect on what went well and what didn't, you learn how to identify actionable steps for future projects, and ensure growth and efficiency.
The habit that conquers them all
The power of flow
Learn how to reach your state of flow.
When we're in flow, we're completely immersed in a specific task, free from any distraction. It's the ultimate way to find focus and stay in it.
Getting in a state of flow is the ultimate productivity skill that ensures quality work and efficient management of your time and energy.
Constant interruptions to your flow are mentally draining because you have to keep picking the context back up from where you left off.
How to achieve flow?
According to researcher Owen Schaffer, seven requirements need to be fulfilled to reach a state of flow:
- Knowing what to do
- Knowing how to do it
- Knowing how well you are doing
- Knowing where to go (if navigation is involved)
- Perceiving significant challenges
- Perceiving significant skills
- Being free from distractions
Be mindful of your time
Tools like Slack can be dangerous for your focus, so you need to be mindful of your time.
How?
By being proactive in planning focus days or slots in your schedule to enter the zone.
You can also use techniques like Pomodoro to properly manage your focus time.
Quick Recap
Just like Mario, we're all zipping through the different realms of the data science world. And just like in Mario's kingdom, power-ups come in handy no matter which realm you're doing your jumping jacks in.
So here's a quick reminder of the boosters that can supercharge your productivity and propel you toward your Super self.
- Craft a Dynamic Knowledge Document: Like Mario's versatile power-ups, equip yourself with domain expertise. Remember to continuously update this document to keep your knowledge fresh and relevant.
- QA Your Code: Quality-checking your code is as crucial as Mario checking for hidden traps. Break down your queries, double-check with peers, and ensure accuracy to avoid any pitfalls.
- Store Your Queries: Think of this like storing extra lives in Mario's world. Save your queries in an organized manner, so you're always prepared for future challenges without having to start from scratch.
- Dealing with Stakeholder Requests: Managing these requests is like going through different levels in the game but at the same time. Use clear communication and set up a system to handle these efficiently without losing focus on your main quest.
- Conduct a Post-Mortem or a Retro: After completing your project, take time to reflect. Analyze what worked and what didn't, much like reviewing a game replay, to continually improve your skills and strategies.
Learn how to flow.