Strategic Data Analysis (Part 1)

Author:Murphy  |  View: 22499  |  Time: 2025-03-23 12:28:43

This is part of a series on Strategic Data Analysis.

  • → Strategic Data Analysis (Part 1)
  • Strategic Data Analysis (Part 2): Descriptive Questions
  • Strategic Data Analysis (Part 3): Diagnostic Questions
  • Strategic Data Analysis (Part 4): Predictive Questions ← Coming soon!
  • Strategic Data Analysis (Part 5): Prescriptive Questions ← Coming soon!


In my 10 years of working with data in various capacities, I've noticed how much focus there is on learning quantitative techniques for data analysis. I have spent thousands of hours building my knowledge of everything from statistics to machine learning to economics and beyond. However, I have found very little guidance on the strategic approach to answering business questions with data analysis. I have also encountered many junior analysts who mistake data analysis for its quantitative techniques, disregarding the fact that analysis is a powerful way of thinking and a great problem-solving tool. In other words, data analysis is not just the product of its methods.

In this multi-part series, I hope to offer a data analysis primer that provides a structured approach to using analysis to answer business questions. In Part 1, I will introduce data analysis and the four types of questions that it can help answer. This can be used as guidance for identifying analysis questions correctly. In the following posts, I will propose a strategy for answering each type of question and a methodology for selecting the correct techniques. I hope you find this guide useful – let me know in the comments!

What is Data Analysis?

So what is data analysis and what is it trying to achieve? In general, analysis is a process of understanding some complex information by breaking it down into smaller and simpler pieces and understanding those pieces first. This process is used to help solve problems or answer questions. As in the general case, data analysis is a process of understanding something about complex data by trying to understand more manageable information about it.

Analysts can apply an array of techniques to do data analysis. For instance, if we are working with a medical facility manager and they ask us to describe their typical patients, we can use statistical methods like taking the mean or calculating the range to describe the patient population. Thus, we can understand all of the clinic's patients with just a few simple statistics that describe them in aggregate. The question requires us to understand data that is complex in size, and we can do so by understanding something less complex about it.
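To make the patient example concrete, here is a minimal sketch of summarizing a population with its mean and range. The ages are invented for illustration, and `describe` is a hypothetical helper name, not part of any library.

```python
from statistics import mean

# Hypothetical patient ages, invented for illustration only.
patient_ages = [34, 51, 27, 68, 45, 39, 72, 58]

def describe(values):
    """Summarize a list of numbers with its mean and its range."""
    lowest, highest = min(values), max(values)
    return {
        "mean": mean(values),
        "min": lowest,
        "max": highest,
        "range": highest - lowest,
    }

summary = describe(patient_ages)
print(summary)
```

Eight individual records collapse into four numbers, which is the sense in which a few aggregate statistics stand in for the whole population.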

Data analysis is "a process and practice of analyzing data to answer questions, extract insights, and identify trends" [1]. However, although data analysis requires techniques borrowed from statistics, machine learning, mathematics, and other disciplines, data analysts are not statisticians, data scientists, or mathematicians. While data analysts should understand a lot about the topic they are working with, they don't have to be professional experts in that topic. The goal for data analysts is to be familiar enough with various techniques, and expert enough in applying them properly, to generate insights and recommendations that enable business partners to make better, data-informed decisions. But you don't have to be a Data Analyst to do data analysis: anyone familiar with the quantitative techniques and data analysis strategies can use them to help deliver data-informed decisions.

Nearly all of the questions that require data analysis fall into four main categories: descriptive, diagnostic, predictive, and prescriptive. Some questions pertain to known values and variables (like descriptive and diagnostic questions); some questions are more hypothetical than concrete (like predictive and prescriptive questions). Answering these questions requires critical thinking, creative problem solving, and logical reasoning. However, if we are able to categorize a question that requires data analysis, we can develop a strategy for answering that question based on its category. Therefore, it is necessary to be familiar with the types of questions and with the strategies for approaching them.

The rest of this article introduces each of the four question types, describes them and provides examples to help us identify each type.

Descriptive Questions

Descriptive questions aim to acquire an understanding of something concrete. This can include a description of a population, the relationship between different variables, or various trends. These types of questions are generally the easiest to identify: they typically refer to the present state or the past and generally start with a keyword like "what" or "is/does/did". Since not all descriptive questions begin with those keywords, another way to identify them is to check whether the question can be rephrased to start with "what" or "is". Some examples of these questions include:

  • What were our sales during the second quarter of this year? [2]
  • Did our revenue increase since last quarter?
  • How did our revenue change this year?
  • How frequently do clients cancel their subscriptions?
  • Do the trains run late?
  • Is there any gender bias in our clinical patient care?
  • Tourists from which city tend to stay at our hotel longer?
  • How did the temperature vary last month?
  • Are temperature of air and temperature of sea water related?
  • Was there any change in hold time after we hired more call center representatives?

The questions above all pertain to some known variables which are accessible for the analysis: records of gender in the clinic, temperature records, or yearly revenue. As mentioned before, all of these questions can be restated to start with "what" or "is": "are temperature of air and temperature of sea water related?" is the same question as "is there a relationship between air and sea water temperature?" and "how frequently do clients cancel their subscriptions?" is the same as "what is the frequency of client subscription cancellations?".
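As an illustration of a descriptive question about a relationship, the sketch below computes the Pearson correlation between air and sea water temperature. The readings are invented, and `pearson_correlation` is a hypothetical helper written out by hand so the example depends only on the standard library.

```python
from math import sqrt

# Hypothetical paired daily readings in degrees Celsius, invented for illustration.
air_temps = [12, 15, 18, 21, 24]
sea_temps = [10, 11, 13, 14, 17]

def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)

r = pearson_correlation(air_temps, sea_temps)
print(f"Correlation between air and sea temperature: {r:.2f}")
```

A value close to 1 answers "is there a relationship?" for this sample. It describes what the recorded data shows and says nothing yet about why the two move together.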

Diagnostic Questions

Diagnostic questions aim to understand why something has happened or how something came about, and attempt to assess dependence between variables. These questions lead with "why" and its synonym keywords ("how come", "what caused", etc.) and refer to an event that already took place or is taking place currently.

The key to diagnostic questions is that they require analysts to come up with potential reasons and verify whether those reasons are correct. This is quite intuitive and is how most people try to diagnose the root cause of something. Typically, the dependent variable in question has changed and we want to know why. We can also think about diagnostic questions as "cause and effect" questions where the "cause" is unknown. Some examples of diagnostic questions are:

  • Why does one customer segment engage with us more than other customer segments?
  • Why did our sales dip this quarter?
  • What caused the heat wave?
  • Why are our clients canceling their subscriptions?
  • Why are the trains running late?
  • How come some patients end up in ICU?

In diagnostic questions, the unknown is the cause of the effect. If we are able to identify a known effect and an unknown cause, we are probably working with a diagnostic question.
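The "propose a reason, then verify it" loop can be sketched for the question "why are the trains running late?". Here the candidate cause is rain, which is purely an assumption for illustration, as are the trip log and the `late_rate` helper.

```python
# Hypothetical trip log: (was_raining, was_late). Invented for illustration.
trips = [
    (True, True), (True, True), (True, False), (True, True),
    (False, False), (False, True), (False, False), (False, False),
]

def late_rate(records, raining):
    """Share of late trips among trips recorded under the given weather."""
    outcomes = [late for rain, late in records if rain == raining]
    if not outcomes:
        raise ValueError("no trips recorded for this condition")
    return sum(outcomes) / len(outcomes)

rate_rain = late_rate(trips, raining=True)
rate_dry = late_rate(trips, raining=False)
difference = rate_rain - rate_dry

print(f"Late when raining: {rate_rain:.0%}, late when dry: {rate_dry:.0%}")
```

A large gap between the two rates supports rain as a candidate cause, but it does not prove it. Other candidate reasons would need the same check, and a real analysis would also test whether the gap could be due to chance.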

Predictive Questions

Predictive questions aim to identify unknown values in a known or unknown variable. The values we want to predict may pertain to variables that are partially known or entirely unknown. For example, in predicting future sales, the "sales" variable is partially known (we have values for current or past sales); in client segmentation, "client segments" is a variable that is entirely unknown and we must rely on other features or information to infer values for a new variable.

Decision makers often ask predictive questions in order to make strategic bets and decisions or to assess their preparedness for a future state. Predictive questions are typically posed to look for unknown information but, unlike with descriptive questions, the answer is always uncertain. Some examples of predictive questions are:

  • What will our sales be next quarter?
  • How many guests is our hotel expecting over the next 90 days?
  • How many likes will our Instagram post get?
  • How likely is our client to give us a five star rating on Yelp?
  • Will there be a lot of snow this winter?
  • How can we group household plants according to their physical characteristics?
  • How will the population of humpback whales change in the future?
  • Will the trains continue to run late?

As noted, predictive questions are not just trying to foresee the future. They deal with something partially or entirely unknown. The question "how can we group household plants according to their physical characteristics?" has nothing to do with the future tense but wants to solve for an unknown parameter of household plants. The question "how many likes will our Instagram post get?" most likely pertains to a partially unknown variable: we probably have information regarding the number of likes our other Instagram posts received, but the number of likes this specific post will receive is unknown.
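For a partially known variable like Instagram likes, one deliberately naive sketch is to use the average of past posts as the estimate and the sample standard deviation as a rough indication of uncertainty. The like counts and the `estimate_likes` helper are assumptions for illustration, not a recommended forecasting method.

```python
from statistics import mean, stdev

# Hypothetical like counts for our previous posts, invented for illustration.
past_likes = [120, 95, 140, 110, 135]

def estimate_likes(history):
    """Naive estimate: the historical mean, with a one-standard-deviation band."""
    center = mean(history)
    spread = stdev(history)  # sample standard deviation of past posts
    return center, (center - spread, center + spread)

estimate, (low, high) = estimate_likes(past_likes)
print(f"Expected likes: about {estimate}, roughly between {low:.0f} and {high:.0f}")
```

The point of returning a band along with the estimate is that the answer to a predictive question is never a single certain number.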

Prescriptive Questions

Prescriptive questions aim to predict what will happen given that a specific decision is made [3]. In that sense, the decision maker asking the question wants to acquire a recommendation based on a set of predictive outcomes. Generally, these questions are phrased in one of two ways: "what will happen if…" or "what should be done so that…".

Prescriptive questions take predictive questions one step further by assessing how a change in the current situation will lead to a specific outcome, or by identifying an optimal change to the current situation that will lead to the best outcome. As with predictive questions, our results will always carry some uncertainty. However, the answers can help with data-informed decision making or can lead to studies that will verify the predicted results.

Some examples of prescriptive questions include:

  • Will we increase sales if we lower our prices? [2]
  • How do I maximize employee productivity? [2]
  • How can we decrease our carbon emissions?
  • How long should our store stay open each day?
  • Will the graduation rates increase if we make higher education admissions tests mandatory?
  • How can we decrease the patient waiting time in the emergency department?
  • What should the price of our product be?

Prescriptive questions may or may not suggest a potential action that the decision-maker plans to take. For example, "will we increase sales if we lower our prices?" includes a potential action that we will analyze: a reduction of prices. But a different question like "how can we decrease our carbon emissions?" does not include any actions and is asking for a list of candidate actions that will most likely decrease the carbon emissions. This means that an additional step must be taken in our strategy in order to develop a list of candidate actions.
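A question like "what should the price of our product be?" can be sketched as: list candidate actions, predict the outcome of each, and recommend the best one. Everything specific below is an assumption for illustration: the linear demand curve, its coefficients, the candidate prices, and the helper names.

```python
# Hypothetical candidate actions: prices we are willing to consider.
candidate_prices = [5, 10, 12, 15, 20]

def predicted_units_sold(price):
    """Assumed linear demand model: demand falls by 40 units per unit of price."""
    return max(0, 1000 - 40 * price)

def predicted_revenue(price):
    """Predicted outcome of choosing this price, under the assumed demand model."""
    return price * predicted_units_sold(price)

# Predict the outcome of every candidate action, then pick the best one.
outcomes = {price: predicted_revenue(price) for price in candidate_prices}
best_price = max(outcomes, key=outcomes.get)

print(f"Recommended price: {best_price} (predicted revenue {outcomes[best_price]})")
```

The recommendation is only as good as the predictive model behind it, which is why the results of a prescriptive analysis are often followed by a study or experiment that verifies them.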


I hope you enjoyed this data analysis primer. Stay tuned for the next part, where I will share strategies for selecting the right technique to answer descriptive questions.

Sources:

[1] https://online.hbs.edu/blog/post/diagnostic-analytics
[2] https://www.pragmaticinstitute.com/resources/articles/data/32-business-questions-for-data-analysis/
[3] https://www.pragmaticinstitute.com/resources/articles/data/32-business-questions-for-data-analysis/

Photo by Gia Oris on Unsplash

