Feature Engineering that Makes Business Sense

When you build a machine learning model for a business application, you will most likely implement it in collaboration with business stakeholders: you are building them a tool that will improve their process or their targeting.
Unless the business area you are helping is analytically advanced and already has a good understanding of machine learning (this is rare), convincing the stakeholders that your model makes sense is paramount.
However, deep learning models and tree-based models like XGBoost or Random Forest are black boxes, so giving your stakeholders insight into what your model does – or rather, what influences the model – is key. Tools like SHAP plots [1] are very useful for understanding which features are predictive and their directionality (does a lower or higher value of a feature push the prediction up or down?). But how do you decide what sort of feature to focus your engineering efforts on?
You could add and multiply every possible combination of features and plug the results into your model, but how likely is it that you'll create a predictive feature that way? And how will you ensure your features represent a behavior or value that makes sense?
In this article, I will outline three different ways you can create meaningful features:
- Ratios
- Trends
- Granularity

1 – Ratios
Features representing raw values (counts, dollar amounts…) are the best place to start and are often what you find in the data by default. However, they tend to hide or downplay some information. Complementing these features with their respective percentages can significantly improve the predictive power of your model.
Examples:
- Number of Online transactions.
This is an example of a feature used to assess online engagement: a higher value means higher online engagement. But used as an absolute value, it can be misleading. Consider Customer A and Customer B, who have both made 5 online transactions in the last 3 years. Measured as an absolute value, they have the same level of online engagement. But Customer A could have made 100 transactions in total, while Customer B made only those 5. The proportions of online transactions would then be 5% and 100%, which tells a very different story: Customer A is not very engaged online, while Customer B is an online-exclusive customer.
- Spend (or item count) in a specific Category.
Similarly, if you are trying to assess a customer's engagement with a specific category of products, the absolute value doesn't put the information in context. Expressing the spend in that category as a percentage of total spend tells a more detailed story.
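As a sketch of the idea, assuming a simple transaction table with hypothetical column names (`customer_id`, `channel`, `amount`), the raw count and its ratio counterpart can sit side by side as model features:

```python
import pandas as pd

# Hypothetical transaction-level data; column names are illustrative.
df = pd.DataFrame({
    "customer_id": ["A", "A", "B", "C", "C", "C"],
    "channel":     ["online", "store", "online", "store", "store", "online"],
    "amount":      [20.0, 380.0, 55.0, 10.0, 25.0, 15.0],
})

per_customer = df.groupby("customer_id").agg(
    total_txns=("amount", "size"),
    online_txns=("channel", lambda s: (s == "online").sum()),
    total_spend=("amount", "sum"),
)

# The raw count (online_txns) and its ratio counterpart, side by side.
per_customer["online_txn_pct"] = per_customer["online_txns"] / per_customer["total_txns"]
print(per_customer)
```

The same pattern works for category spend: divide spend in the category by `total_spend` to get the share feature.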

2 – Trends
The values present in the data are often a snapshot of a metric at one point in time, or maybe the summary of that metric over a specific period of time. This removes a lot of information about how the metric changes. By trying to create features that capture trends of those metrics over time, we might be able to offer patterns to the model that improve predictive power.
Examples:
- Total items purchased in the last 3 years
Looking at values aggregated over a long period can be helpful because it smooths out noise, especially for industries and customers with infrequent purchases. However, the loss of information is considerable: have we observed any change in this metric over time? Has the customer been purchasing the same number of items consistently, increasing over time, or decreasing over time? New features that capture change over time can be as simple as "change in # items this year vs last year", "last 6 months vs the previous 6 months", "last month vs the previous month", "last month vs the same month last year"… You can even get a little fancier and capture the trend by fitting a regression line of your choice, taking the parameters of the fitted line as your trend features: for a linear regression, the new features would be the slope and intercept of the line fitted for each customer.
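The regression-line idea can be sketched as follows; the yearly item counts and column names here are made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical yearly item counts per customer (column names are illustrative).
items = pd.DataFrame({
    "customer_id": ["A", "A", "A", "B", "B", "B"],
    "year":        [2021, 2022, 2023, 2021, 2022, 2023],
    "n_items":     [10, 20, 30, 30, 20, 10],
})

def trend_slope(group: pd.DataFrame) -> float:
    """Slope of a least-squares line fitted to n_items over time."""
    x = group["year"] - group["year"].min()  # years since first observation
    slope, _intercept = np.polyfit(x, group["n_items"], deg=1)
    return slope

# One slope per customer: positive = buying more over time, negative = less.
slopes = {cid: trend_slope(g) for cid, g in items.groupby("customer_id")}
```

The intercept from the same fit can be kept as a second feature; for noisier data you might fit over months instead of years, or use a robust regression.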
- Total Spend in the last 3 years
Similarly, we could add information – and likely predictive power – by adding features representing trends in spend. How much more or less has a customer spent last month compared to the average of the previous 6 months? And since you already have ratio features, you can look at how these trend over time: "change in % spent in category X this year vs last year" tells you whether the customer is stepping away from that category or becoming more exclusive to it.
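One way to sketch the "change in % spent in category X" feature, again with made-up data and illustrative column names:

```python
import pandas as pd

# Hypothetical yearly spend per customer and category (names are illustrative).
spend = pd.DataFrame({
    "customer_id": ["A", "A", "A", "A"],
    "year":        [2023, 2023, 2024, 2024],
    "category":    ["X", "other", "X", "other"],
    "amount":      [20.0, 80.0, 60.0, 40.0],
})

total = spend.groupby(["customer_id", "year"])["amount"].sum()
in_x = (spend[spend["category"] == "X"]
        .groupby(["customer_id", "year"])["amount"].sum())
pct_in_x = (in_x / total).fillna(0.0)  # share of spend in category X, per year
# Year-over-year change in that share, computed within each customer.
yoy_change = pct_in_x.groupby(level="customer_id").diff()
print(yoy_change)
```

Here customer A's share of spend in category X goes from 20% to 60%, so the feature reads +0.4: a shift toward the category.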

3 – Granularity
When a feature shows high predictive power, it's worth investing time to explore variations of this feature that could be even more predictive.
Can we be more precise about what this feature is trying to measure?
Can we be less granular about what this feature is trying to measure?
Examples:
- Age
Age is an obvious strong predictor of the likelihood to get a Credit Card – when you turn 19 you become legally allowed to use one. Age is usually represented in years in the data. When we observed that Age was the top feature in the Credit Card model we built, we suspected we could get a better feature by expressing Age in days – and it turned out to be true! Customers are even more likely to get a Credit Card the closer they are to their 19th birthday, and by taking the age in years we were missing out on that level of detail.
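A minimal sketch of that zoom-in, with hypothetical birth dates: two customers who look identical when age is measured in years are clearly distinguishable in days.

```python
from datetime import date

def age_in_days(birth_date: date, as_of: date) -> int:
    """Age expressed in days rather than whole years."""
    return (as_of - birth_date).days

def age_in_years(birth_date: date, as_of: date) -> int:
    """Conventional integer age: full years completed."""
    years = as_of.year - birth_date.year
    if (as_of.month, as_of.day) < (birth_date.month, birth_date.day):
        years -= 1  # birthday hasn't happened yet this year
    return years

as_of = date(2024, 6, 1)
almost_19 = date(2005, 6, 15)  # turns 19 in two weeks
just_18 = date(2006, 5, 20)    # turned 18 two weeks ago

# Both are "18" in years, but very different in days.
print(age_in_years(almost_19, as_of), age_in_days(almost_19, as_of))
print(age_in_years(just_18, as_of), age_in_days(just_18, as_of))
```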
- Account Balance
When looking at historical banking data, the balance of an account is often a snapshot (e.g. the balance at the end of each month, week, or day). This can hide the volume of money movement happening in the account over that period. Creating features that capture this volume could be critical: "total credit" and "total debit" tell a story of activity in the account that the balance alone can't.
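A sketch of the idea with a made-up table of signed transaction amounts (positive = credit, negative = debit): two accounts can show the same net balance change over a month while having very different levels of activity.

```python
import pandas as pd

# Hypothetical monthly transactions per account (names are illustrative).
txns = pd.DataFrame({
    "account_id": ["A", "A", "A", "A", "B"],
    "amount":     [1000.0, -950.0, 500.0, -500.0, 50.0],
})

volume = txns.groupby("account_id")["amount"].agg(
    net_change="sum",                       # what a balance snapshot reflects
    total_credit=lambda s: s[s > 0].sum(),  # money flowing in
    total_debit=lambda s: -s[s < 0].sum(),  # money flowing out
)
print(volume)
```

Both accounts ended the month +50, but account A moved 1,500 in credits and 1,450 in debits while account B barely moved at all – information a balance snapshot feature would never surface.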
Conclusion
There you have it – by expressing your features as meaningful ratios, measuring their trends over time, and zooming in or out of your features, you can significantly improve the predictive power of your model while keeping your features explainable to your business stakeholders.
If you're interested, don't hesitate to check out my other articles about Data Science in Business:
Customer Attrition: How to Define Churn When Customers Do Not Tell They're Leaving
Machine Learning in Business: 5 things a Data Science course won't teach you
Customer Segmentation: More Than Clustering
Elevate Your Data Science Career: How to become a Senior Data Scientist
References
[1] I. Choudhary, All You Need to Know About SHAP for Explainable AI (2023), https://medium.com/@shahooda637/all-you-need-to-know-about-shap-for-explainable-ai-8ad35a05e6ec