Find Hidden Laws Within Your Data with Symbolic Regression

Author: Murphy | View: 22443 | Time: 2025-03-22 22:54:56

As machine learning practitioners, we usually have a dataset (X, y), and we want to find a function M, also known as a model, such that M(X) ≈ y. Typically, we do not care about the functional form of M. As far as we are concerned, our model can be a neural network, a tree-based algorithm, or something completely different. As long as the performance on the test set is good, we are happy.

However, if we use complex models like these, we might miss out on interesting patterns, maybe even fundamental physical or economic laws within our data. To do better, I will show you how to build models using symbolic regression. These models consist of only a few terms and can be (re-)implemented easily wherever you want. Let us see what I mean by that.

A Physics Experiment

Let us assume that we are experimental physicists who want to find out how long it takes for an object to reach the ground when we drop it from some height h. For example, if you drop an object (one heavy enough that air resistance is negligible) from a height of h = 1.5 m, it will take about t = 0.55 s to reach the ground. Try it out!

However, this is only true on Earth, or on other celestial bodies with a gravitational acceleration of about g = 9.8067 m/s². The Moon, for example, has a gravitational acceleration of 1.625 m/s², so dropping the same object there from 1.5 m would take longer, about 1.36 s, which should align with what you know from movies.

Now, our task is to find a general formula t(h, g) that tells us how long the object needs to reach the ground. This is nothing more than building a model that takes the height h and the gravitational acceleration g as inputs and predicts the time t. Dear physicists, please bear with me.
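To make the modelling task concrete, here is a minimal sketch of what a dataset (X, y) for this experiment could look like. It assumes the textbook free-fall relation t = sqrt(2h / g) as a stand-in for real measurements, which is exactly the law a model would have to rediscover from the data. The helper names fall_time and make_dataset, as well as the sampling ranges for h and g, are assumptions made for illustration only.

```python
import math
import random


def fall_time(h, g):
    """Free-fall time in seconds for a drop from height h (m)
    under gravitational acceleration g (m/s^2), ignoring air resistance."""
    return math.sqrt(2.0 * h / g)


def make_dataset(n_samples=100, seed=0):
    """Hypothetical helper: simulate n_samples drop experiments.

    Returns X as a list of (h, g) pairs and y as the matching fall times.
    The sampling ranges below are illustrative assumptions.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        h = rng.uniform(0.5, 10.0)  # drop height in metres (assumed range)
        g = rng.uniform(1.0, 25.0)  # gravitational acceleration in m/s^2 (assumed range)
        X.append((h, g))
        y.append(fall_time(h, g))
    return X, y


X, y = make_dataset()

# Sanity check against the two examples from the text.
print(round(fall_time(1.5, 9.8067), 2))  # Earth → 0.55
print(round(fall_time(1.5, 1.625), 2))   # Moon  → 1.36
```

In a real experiment, y would come from a stopwatch rather than from a formula, and it would contain measurement noise. The point of the sketch is only the shape of the problem: two input columns, h and g, and one target, t.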
