4 Questions to Ask Yourself Before Working on a Machine Learning Model
Machine Learning (ML) is not a magic wand that you can wave at every problem and expect a reliable solution.
Advancements in ML, in terms of both accuracy and speed, have led us to approach almost any problem with an ML-based solution already in mind.
This is a dangerous mindset that is likely to produce unexpected results. I can assure you that you do not want unexpected results in production.
In this article, we will go through 4 questions that need to be addressed before considering ML as a solution to your problem.
1. Do you have training data with characteristics or patterns similar to the data for which you want to make predictions?
ML is not magic. A model cannot produce predictions without being trained first. Hence, the first and foremost requirement in any ML system is data. Before considering whether we can apply ML to a problem, we need to make sure we have access to data.
Having data is not enough, though. The data we use for training a model needs to have patterns similar to the data we want to make predictions on.
Let's say we want to train a model to make movie recommendations on a platform. If we train it only with data from before 2000, it is highly likely to fail to make good recommendations today because people's tastes change over time.
In some cases, the data does not have any pattern at all because it follows a purely random process. There is nothing for a model to learn from such data, so ML will not help.
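One rough way to check whether training data still resembles the data you want to predict on is to compare the distribution of a feature in both sets. The sketch below computes the two-sample Kolmogorov-Smirnov statistic in plain Python. The rating values and the 0.2 threshold are made-up assumptions for illustration, not recommended settings.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest gap between the empirical CDFs of two samples.

    Returns a value in [0, 1]: 0 means the empirical distributions are
    identical, 1 means the samples do not overlap at all.
    """
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(sample_a), sorted(sample_b)
    max_gap = 0.0
    # The gap between two step functions can only peak at an observed value.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


# Hypothetical feature values, e.g. ratings users gave to one genre.
training_ratings = [3, 4, 4, 5, 2, 3, 4]
recent_ratings = [1, 2, 2, 1, 3, 2, 1]

DRIFT_THRESHOLD = 0.2  # assumed cut-off; tune it for your own data

gap = ks_statistic(training_ratings, recent_ratings)
if gap > DRIFT_THRESHOLD:
    print(f"Distributions differ (KS = {gap:.2f}); training data may be stale.")
```

A large gap does not prove the model will fail, but it is a cheap warning sign that the training data may no longer represent what the model will see.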
2. Is there a simpler solution?
Some problems are so simple that you don't need an ML system to solve them. In such cases, simpler solutions should be preferred because an ML-based solution takes more time and money to implement.
Consider a sales forecasting task with two candidate solutions. One is a moving average computed over the preceding days and weeks. The other is an ML model with dozens of features computed using a large amount of data.
If the former provides a good enough solution or is outperformed by the ML model by only a very small margin, you should probably choose the simpler moving average model. Spending extra time and money on the ML model may not be worth the small improvement. Moreover, once you decide to scale, the cost of deploying the ML model may increase significantly.
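To make this comparison concrete, here is a minimal sketch of a moving average baseline and a check on whether the ML model beats it by enough to matter. The function names, the sales numbers, and the 10% minimum gain are assumptions chosen for illustration; your own threshold depends on what an error costs your business.

```python
def moving_average_forecast(history, window):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or len(history) < window:
        raise ValueError("need at least `window` observations")
    return sum(history[-window:]) / window


def backtest_mae(series, window):
    """Mean absolute error of one-step-ahead moving average forecasts."""
    if len(series) <= window:
        raise ValueError("series must be longer than the window")
    errors = []
    for t in range(window, len(series)):
        forecast = moving_average_forecast(series[:t], window)
        errors.append(abs(series[t] - forecast))
    return sum(errors) / len(errors)


def worth_the_complexity(baseline_mae, model_mae, min_relative_gain=0.10):
    """True if the model reduces error by at least `min_relative_gain`."""
    if baseline_mae == 0:
        return False  # the baseline is already perfect
    return (baseline_mae - model_mae) / baseline_mae >= min_relative_gain


# Hypothetical daily sales figures.
daily_sales = [120, 130, 125, 140, 135, 150, 145, 155]
baseline_mae = backtest_mae(daily_sales, window=3)

# `model_mae` would come from evaluating your ML model on the same days.
model_mae = 0.95 * baseline_mae  # assumed: the model is only 5% better
print(worth_the_complexity(baseline_mae, model_mae))
```

Evaluating both candidates on the same days with the same metric keeps the comparison fair.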
3. Is it cost-effective?
This is related to the previous point of choosing between a simpler solution and a complex ML model. However, in this case, an ML model is the only solution to your problem.
Even if ML is the only solution, it may not be something you want to embrace. You need to pay close attention to the monetary return.
Creating an ML system and deploying it into production costs money. If you work with large amounts of data, which is typically the case, costs increase dramatically.
Collecting, storing, and processing data, as well as training models on the cloud, can be a huge expense.
It comes down to comparing the value ML provides for your business with the cloud bill you receive. If you spend thousands of dollars to operate an ML system on the cloud and your business benefits very little from it, then you should probably look for a better solution.
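The comparison can be written down as simple arithmetic. The sketch below is a back-of-the-envelope calculation, not a financial model. All the dollar amounts are hypothetical, and it assumes the benefit and the running cost stay constant every month.

```python
import math


def net_monthly_value(monthly_benefit, monthly_running_cost):
    """Value the system adds per month after paying to operate it."""
    return monthly_benefit - monthly_running_cost


def months_to_break_even(upfront_cost, monthly_benefit, monthly_running_cost):
    """Months needed to recover the upfront cost, or None if it never pays back."""
    net = net_monthly_value(monthly_benefit, monthly_running_cost)
    if net <= 0:
        return None  # running costs eat all of the benefit
    return math.ceil(upfront_cost / net)


# Hypothetical numbers: building the system costs $50,000,
# the cloud bill is $5,000 a month, and it brings in $3,000 a month.
print(months_to_break_even(50_000, 3_000, 5_000))
```

If the function returns None, the system costs more to run than it earns, which is exactly the situation described above.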
4. Can you afford mistakes?
Even if you create a state-of-the-art model that performs very well, there will be errors. No ML model will be 100% accurate.
So the question is whether you can afford to make mistakes. Think about cancer detection on X-rays, where a mistake can literally be a matter of life and death. Would you trust an ML model alone with this task?
In such cases, an ML model can be used as a supporting tool, but it cannot be trusted to make the final call.
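Not all mistakes cost the same, so accuracy alone can hide how expensive a model's errors are. As a rough sketch, you can count the two kinds of errors on a labeled test set and weight each by what it costs you. The labels and the cost values below are hypothetical assumptions. In practice, estimating those costs is the hard part.

```python
def error_cost(y_true, y_pred, cost_false_negative, cost_false_positive):
    """Total cost of a binary classifier's mistakes (1 = positive, 0 = negative)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    # Missed positives: the model said 0 but the truth was 1.
    false_negatives = sum(1 for t, p in pairs if t == 1 and p == 0)
    # False alarms: the model said 1 but the truth was 0.
    false_positives = sum(1 for t, p in pairs if t == 0 and p == 1)
    return (false_negatives * cost_false_negative
            + false_positives * cost_false_positive)


# Hypothetical labels and predictions on a small test set.
actual = [1, 0, 1, 0, 0, 1]
predicted = [0, 0, 1, 1, 0, 1]

# Assumed costs: a missed positive is 100 times worse than a false alarm.
print(error_cost(actual, predicted, cost_false_negative=100, cost_false_positive=1))
```

If a single missed positive is unaffordable, that is a strong signal to keep a human in charge of the final decision.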
Final Thoughts
Machine learning is a highly capable tool that helps solve numerous problems in a variety of businesses. However, it is not the go-to solution for every task.
The questions mentioned in this article need to be addressed before investing time and money in an ML-based solution.
Machine learning is great but not always your best friend.
Thank you for reading. Please let me know if you have any feedback.