Linear Programming Optimization: The Simplex Method
Up until now, this series has covered the basics of linear programming. In this article, we are going to move from basic concepts into the details under the hood! This article will cover the simplex method, which is the algorithm that is often used to solve linear programming problems. While we will solve a simple linear programming example by hand with the simplex method, our focus will be on the intuition of the algorithm rather than memorizing the algorithmic steps (we have computers for that kind of stuff!).
Here is what we will cover:
- Why the simplex method is needed
- Moving from graphical solutions to algebraic
- Demonstrating how the simplex method works with a simple example
Here is the link to the list that contains all of the articles I've written so far for this series:
Why the simplex method is needed
In the first article of this series, we went over how the attributes of linear programming allow us to consider only the corner points of the feasible region as potential optimal solutions. This is a very powerful feature that narrows an infinite solution space to a finite solution space. In the examples we reviewed, we only had a few constraints and a few variables – we even solved some of them by hand! After looking at some simple examples, one might assume that an exhaustive, brute-force approach (checking every single corner point) is sufficient for finding the optimal solution.
While brute force is fine for smaller LP problems, the number of corner points can explode as problems get more complicated. The total number of corner points (feasible and infeasible) is a function of (1) the number of variables and (2) the number of constraints in the problem.

Below are some examples of the growth of the number of corner points for multiple variables and constraint combinations:

The rapid growth of the corner points necessitates a smarter way of finding the optimal solution – enter the simplex method! The simplex method is a clever way to look at only a small subset of all of the possible solutions while still guaranteeing a global optimal result – we'll get into the details a little further down.
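To get a feel for this growth, here is a small sketch using the standard combinatorial bound (the article itself doesn't give the formula, so this is background knowledge): with n total variables (decision plus slack) and m constraints, each corner point corresponds to a choice of m basic variables, so there are at most C(n, m) of them.

```python
from math import comb

# Upper bound on the number of corner points (basic solutions) of an LP:
# with n total variables (decision + slack) and m constraints, each corner
# corresponds to choosing which m variables are basic, so at most C(n, m).
def corner_point_bound(n_total_vars: int, m_constraints: int) -> int:
    return comb(n_total_vars, m_constraints)

for n, m in [(4, 2), (10, 5), (20, 10), (40, 20)]:
    print(f"n={n:>2}, m={m:>2}: at most {corner_point_bound(n, m):,} corner points")
```

Even at 40 variables and 20 constraints, the bound is already over 137 billion, which is why brute force stops being an option very quickly.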
Moving from graphical solutions to algebraic
When learning the basics of linear programming, it is really nice to look at a graphical representation of an optimization problem with two decision variables. We covered this in part 1. While this works wonders for developing an understanding of the basics, it doesn't extrapolate well to more complicated problems. Its usefulness is essentially limited to instructional purposes.
To further our understanding of LPs and get to a point where we are ready to discuss the details of the simplex method, we need to understand how to translate an LP problem into an algebraic format.
One of the main components of the algebraic setup is transforming the constraint inequalities into equalities. In general, an inequality cannot simply be swapped for an equality. But remember how, for LPs, we are really only interested in corner points? And every corner point lies on constraint lines? Since we only care about the border of the solution space, we can essentially ignore the inequality: the points away from the border lines never matter for optimality anyway.
Nice, so let's just switch all of the inequality signs to equality signs and we are good to go, right? Not so fast! When we consider a corner point, we aren't usually on all of the constraint lines at the same time. Because of this, we need a way to ‘hop' on and off of specific constraint lines. This is a little hard to explain in words – ironically, I'm going to explain this concept… graphically.

Okay, so now we see why we can swap the inequality for an equality – but we have a big problem. Let's explore this problem by looking at the math of a specific corner point:

Let's plug these X₁ and X₂ values into all of the linear constraints:

We see here that the stricter equality makes these values of X₁ and X₂ infeasible for the third constraint. We need a way to sit at the corner while having the variable values satisfy all of the constraint equations.
The solution to the problem above is to create slack variables. A slack variable is added to each constraint to turn it into an equality, giving us the flexibility to consider only the constraints involved in one corner at a time.
We can ‘hop' to a corner point by setting the slack variables to 0 for the constraints involved. For the remaining constraints, we set the slack variable to the difference (the "slack") between the decision-variable side of the equation and the constant side.

If a slack variable is set to 0, its constraint is one of the constraints defining the corner point we are considering. If it is anything other than zero, that constraint is not involved in the corner point under consideration.
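The slack bookkeeping described above is easy to express in code. Here is a minimal sketch using illustrative constraint data (not the article's exact numbers): for a system of "≤" constraints Ax ≤ b, each slack is just the gap between the right-hand side and the left-hand side.

```python
import numpy as np

# Illustrative system of two "<=" constraints A x <= b (made-up numbers):
#   x1 + 2*x2 <= 8
#   3*x1 + x2 <= 12
A = np.array([[1.0, 2.0],
              [3.0, 1.0]])
b = np.array([8.0, 12.0])

x = np.array([2.0, 3.0])   # a candidate corner point
slack = b - A @ x          # slack turns each inequality into an equality

for i, s in enumerate(slack):
    status = "involved in this corner" if np.isclose(s, 0) else "not involved"
    print(f"constraint {i + 1}: slack = {s:g} -> {status}")
```

At this candidate point, the first slack is 0 (we are sitting on that constraint line), while the second slack is positive (that constraint is inactive here), which is exactly the ‘hop' mechanism described above.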
Okay, with our inequalities converted to equalities and our slack variables created, we are ready to solve linear programming problems algebraically. Let's now get into the details of how the simplex method uses this setup to efficiently optimize!
Note: When we were discussing the number of possible corner points in the previous section, n includes slack variables, not just the original decision variables.
Demonstrating how the simplex method works with a simple example
First and foremost, the simplex method is an algorithm that efficiently moves from corner point to corner point, calculating the objective values of the corner points until it finds the globally optimal solution. The simplex method is a greedy algorithm that moves to the corner point that increases (for maximization) or decreases (for minimization) the objective function the most. Greedy algorithms are typically infamous for finding sub-optimal solutions – but, because of the characteristics of linear programming, the simplex method is guaranteed to find the optimal solution.
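Before we work through the example by hand, here is what that greedy corner-to-corner loop looks like in code. This is a minimal dense-tableau sketch for problems of the form maximize c·x subject to Ax ≤ b, x ≥ 0 (with b ≥ 0, so the origin is feasible); the sample data at the bottom is illustrative, not the article's exact example.

```python
import numpy as np

def simplex_max(c, A, b):
    """Minimal tableau simplex: maximize c @ x s.t. A x <= b, x >= 0 (b >= 0)."""
    m, n = A.shape
    # Tableau columns: n decision vars | m slack vars | 'solution' column.
    T = np.zeros((m + 1, n + m + 1))
    T[:m, :n] = A
    T[:m, n:n + m] = np.eye(m)     # slack variables
    T[:m, -1] = b
    T[-1, :n] = -c                 # z-row: Z - c.x = 0
    basis = list(range(n, n + m))  # start at the origin: slacks are basic

    while True:
        j = int(np.argmin(T[-1, :-1]))    # entering: most negative z-row coeff
        if T[-1, j] >= -1e-12:
            break                         # no better adjacent corner: optimal
        col = T[:m, j]
        ratios = np.full(m, np.inf)       # minimum-ratio test for the
        pos = col > 1e-12                 # leaving variable; non-positive
        ratios[pos] = T[:m, -1][pos] / col[pos]  # entries impose no bound
        i = int(np.argmin(ratios))
        if not np.isfinite(ratios[i]):
            raise ValueError("LP is unbounded")
        T[i] /= T[i, j]                   # pivot: enter j, leave basis[i]
        for r in range(m + 1):
            if r != i:
                T[r] -= T[r, j] * T[i]
        basis[i] = j

    x = np.zeros(n)
    for i, j in enumerate(basis):
        if j < n:
            x[j] = T[i, -1]
    return T[-1, -1], x

# Illustrative data: maximize 5*x1 + 3*x2  s.t.  x1 + x2 <= 7,  2*x1 + x2 <= 10
value, x = simplex_max(np.array([5.0, 3.0]),
                       np.array([[1.0, 1.0], [2.0, 1.0]]),
                       np.array([7.0, 10.0]))
print(value, x)   # 27.0 at x = (3, 4)
```

Each pass through the loop is one "hop" to an adjacent, better corner; the loop exits as soon as no z-row coefficient is negative, i.e., no neighboring corner improves the objective.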
I'd like to take some time to build intuition on why a greedy algorithm like the simplex method can find a global optimum for linear programming. The defining attribute of a greedy algorithm is that it makes locally optimal decisions using just the immediate information. Often, a series of locally optimal decisions does not lead to a globally optimal solution – but, as I mentioned, this is not the case for linear programming. The reason it holds is the linearity of the objective and constraints, which makes the feasible region convex. Let's look at an example of making optimal decisions with linear functions.
Imagine we are at the intersection of two functions – we want to select the function that will give us the highest y-value for the next x-value.

The simplex method uses a very similar approach to move between corner points in a greedy fashion. One of the keys to the simplex method is that it terminates when no better adjacent corner can be found. That corner is both the local and the global optimum because of the convexity of the solution space.
Intuitive Example
With a general understanding of the set up of the simplex method and a basic understanding of what it does (greedily iterate through corner points to find the globally optimal solution) – I think we are ready to start working through an example.
So far, we've discussed three constraints, but we have not established an objective function. To do that, we'll come up with some coefficients for X₁ and X₂, and we will add all of the slack variables with coefficients of 0 (this makes the optimization tableau complete).
So here is the problem we will be solving:

Let's get this set of linear equations simplex-ready. We'll start with the constraints, which we've already discussed above. This is what they will look like:

We have not yet discussed how to modify the objective function to fit the simplex approach. It is very simple: we set the objective value equal to Z and then rearrange the equation algebraically so that 0 is on the right-hand side. It looks like this:

Note that we added 0s for the slack variables; they add no value to the objective, but this formulaic setup will be helpful as we start to build out the simplex tableau.
The simplex tableau is a data table used in the simplex method to traverse corners and calculate their objective function values. It also has mechanisms to detect when the optimal value has been found and stop exploring.
The tableau has one row for the objective function and one row for each constraint. At any point in the algorithm's execution, there are basic and non-basic variables, and variables switch between these categories as the algorithm progresses. Basic variables are variables that are not set to 0; you can think of them as active. Non-basic variables have a value of 0; you can think of them as deactivated. The simplex method always starts at the origin. At the origin, all decision variables are set to 0, which means that only the slack variables are basic, i.e., non-zero or "active." The starting tableau will therefore show the slack variables as basic.
Below is our example's starting tableau:

Okay, now that we have our initial set up, we are ready to start our first iteration of the simplex method. We need to start moving away from the origin – we do this by moving one of the slack variables to non-basic and pulling one of the decision variables into the basic solution set. But which slack variable do we take out, and which of the X's do we bring in?
Let's first figure out which variable we want to make basic. Remember how we said that the simplex algorithm is greedy? This is where that comes into play! We look at the coefficients in the z-row and select the variable that has the most negative coefficient. This is a maximization problem, but remember that we switched the signs of the coefficients to get the 0 on the right-hand side, so negative = good, positive = bad. The most negative coefficient belongs to the variable that will increase the objective function the most right now. Our options are -5 for X₁ and -3 for X₂, so we select X₁. This is called the entering variable – thankfully, that is a pretty intuitive name!
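The entering-variable rule is essentially a one-liner. Using the z-row coefficients from our example (-5 and -3 for X₁ and X₂, with zeros for the slacks):

```python
import numpy as np

# z-row of the starting tableau: Z - 5*X1 - 3*X2 + 0*S1 + 0*S2 + 0*S3 = 0
z_row = np.array([-5.0, -3.0, 0.0, 0.0, 0.0])
columns = ["X1", "X2", "S1", "S2", "S3"]

entering_idx = int(np.argmin(z_row))   # greedy choice: most negative coefficient
if z_row[entering_idx] >= 0:
    print("no negative coefficients: the current corner is already optimal")
else:
    print(f"entering variable: {columns[entering_idx]}")
```

With -5 being the most negative coefficient, X₁ enters, matching the hand calculation above.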

Now, let's talk about finding the slack variable to make non-basic, i.e., to deactivate and remove from the basis. In simplex terms, the variable being removed is (also intuitively) called the leaving variable. We want to find the maximum value we can give the entering variable without violating any constraints. This means we need to find the slack variable associated with the tightest (most restrictive) constraint on X₁. We can do this by taking the ratio of the ‘solution' column to each constraint's X₁ coefficient. X₁'s constraint coefficients are -2, 1 and 1, and the corresponding ‘solution' column values are 5, 7 and 10.
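Here is the minimum-ratio test with our example's numbers. Rows with non-positive coefficients in the pivot column place no upper bound on the entering variable, so they are skipped (given an infinite ratio):

```python
import numpy as np

pivot_col = np.array([-2.0, 1.0, 1.0])   # X1 coefficients of the three constraints
solution  = np.array([5.0, 7.0, 10.0])   # 'solution' (right-hand-side) column

# Only positive coefficients restrict how far X1 can grow; the rest get inf.
ratios = np.full_like(solution, np.inf)
positive = pivot_col > 0
ratios[positive] = solution[positive] / pivot_col[positive]

leaving_row = int(np.argmin(ratios))     # the tightest constraint wins
print(f"ratios = {ratios} -> the slack of constraint {leaving_row + 1} leaves the basis")
```

The ratios come out to ∞, 7 and 10, so the second constraint is the most restrictive: X₁ can grow to at most 7, and that constraint's slack variable is the leaving variable.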

Here's the intuition behind this selection. We are trying to find the maximum value that X₁ can take while being consistent with the system of linear equations it is a part of. For the first constraint, really anything goes because the X₁ coefficient is negative – the other variables can compensate at any level to make the overall equation equal to 5. Obviously ‘anything goes' is not a very restrictive constraint