I Built a Human-Level AI Game Player


A while ago, I built the decision-making part of a human-level automated player for a board game.

The player turned out to be so strong that nobody I knew (myself included) could consistently beat it, so much so that I had to make it dumber in a controlled way to keep it enjoyable for casual players.

In this post, I explain how I programmed the "brain" of this little game using standard techniques from the AI playbook that predates Deep Learning. You'll learn (in case you are not already familiar with them) how to build Minimax trees and how to develop heuristics for board evaluation. The Minimax technique applies to any adversarial game, regardless of the specifics of the one I describe here.

Enter Kromate

When mobile devices started to become popular (around 2008, after the launch of the iPhone), there were few board games available, and some of them were not really playable on the tiny screens of the time. That was, of course, the case with chess: you can't really display a chessboard on a 3.5-inch screen with a resolution of 320 x 480 pixels. Well, you can, but it's going to strain your eyes.

One of my brothers came up with the idea of a very small board game, with fewer than 20 positions (instead of 64 in chess) for placing only 6 colored "tokens" (instead of 16 in chess). He devised rules similar to those of chess, including "check" and "mate."

The game was called "Kromate" because it used colors to identify the tokens.

I'm afraid I have to explain in some detail how to play Kromate, because otherwise you won't understand how the automated player handles it.

Consider the board in this figure:

Screenshot of the Kromate app for iPhone (edited)

The board has 19 places for 6 colored tokens. Each of the two players has 3 tokens: one player has the "primary" colors (red, blue, yellow), and the other one has the "secondary" colors (orange, green, purple). Secondary colors are formed by mixing two primary colors (this perhaps took you back to your years in primary school).

It may seem odd that the board has 19 places instead of, say, 20, but if you look at the arrangement of places in the figure, it's like a beehive, with rows and diagonal paths.

Look at the board in the figure above. That board can be correctly displayed on a tiny screen, don't you agree? Even the original iPhone could show it in all its glory.

Initially, the tokens are placed randomly on the board. This is one reason why Kromate is not practical as a physical board game (you'd need, for instance, to "throw" the tokens and then see which is the closest place to them – it's just not practical).

But what are the red arrows in the picture? Well, that's the chess-like part of Kromate: primary color tokens can "attack" secondary color ones and vice versa when they are "aligned" with them. For instance, in the figure above, yellow and blue tokens are "attacking" the green one because they are aligned with it, and yellow with blue gives green.

It takes two primary-color tokens to attack one secondary, and with two secondary-color tokens, the other player can attack one of the primary colors: for instance, purple and orange tokens can attack the red one because both purple and orange contain red.
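
To make the rule concrete, here is a minimal sketch of the color part of the attack check in Swift (the actual game was written in Objective-C, and all the names below are illustrative, not the shipped code). It ignores the geometric part: checking that the two attackers are actually aligned with the target and that the path is clear is a separate matter.

```swift
enum Color: Hashable { case red, blue, yellow, orange, green, purple }

// Which two primary colors make up each secondary color.
let componentsOf: [Color: Set<Color>] = [
    .orange: [.red, .yellow],
    .green:  [.blue, .yellow],
    .purple: [.red, .blue],
]

// True when two aligned tokens of colors `a` and `b` attack a token of color
// `target`: two primaries attack the secondary they mix into, and two
// secondaries attack the primary they both contain.
func attacks(_ a: Color, _ b: Color, target: Color) -> Bool {
    if let parts = componentsOf[target] {               // target is a secondary color
        return parts == Set([a, b])                      // a and b are its two primaries
    }
    guard let pa = componentsOf[a], let pb = componentsOf[b] else { return false }
    return pa.contains(target) && pb.contains(target)   // both secondaries contain the primary
}
```

For the position in the figure, `attacks(.blue, .yellow, target: .green)` returns `true`, and so does `attacks(.purple, .orange, target: .red)`.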

Tokens can move in a straight line, either horizontally or diagonally, at any distance, as long as they don't encounter a token obstructing the way. There is one extra condition: the place where the token lands must not be attacked, as defined above.

For instance, in the figure, the green token has to move away because it is attacked (it is in "check"). It could move to the upper left, or one or two places to the upper right. At first glance, it could also move horizontally to the right, but that move is not allowed because the blue and yellow tokens also attack that place.

There you have it. Now you can play Kromate.

Wait, but when does the game end? This is where "mate" is defined. Consider the position here:

Screenshot of the Kromate app for iPhone (edited)

The secondary colors are in "mate" because the green token is attacked, but it can't go to any place where it's not attacked. The player with the primary colors wins.

The Kromate iPhone app

Building the complete app for Kromate was a bit too much for me alone (I had a full-time job as an AI researcher), so I contacted a company specializing in iPhone development. It was a very peculiar company.

The developers I found were Russians based in Austin. They looked legit, and they had already released many titles on the App Store. The arrangement, which we negotiated over the phone, was that I'd program the decision-making part (deciding which move to play next), and they'd do everything else, including the user interface and everything related to the operating system, like launching the app, giving it an icon, and so on.

We signed a contract without ever meeting in real life.

This arrangement meant that I'd just program a procedure that took a Kromate position (encoded in a way we agreed on) and returned the next move to make.

The Minimax technique

I followed a technique called the "Minimax tree," developed a long time ago by an AI pioneer named Arthur Samuel. Samuel was an IBM employee who built on ideas proposed by John von Neumann. He developed a very good automated player for checkers and introduced the notion of a "heuristic board evaluation" function. Let's look at this one first.

The heuristic board evaluation function gives, for each valid board position, a real number indicating how good the position is (the higher the number, the better). One problem is that there are no general recipes for defining board evaluation functions, and you have to get inventive with whatever knowledge you have of the game in question.

For instance, in the case of chess, the function would give points for having pieces, and not every piece counts the same (a rook is worth more than a bishop, and so on). The word "heuristic" is shorthand for "rule of thumb," but it sounds a lot better. In simple terms, for a heuristic board evaluation function, you have to find a way of giving a high value to good game positions and a low value to bad ones.
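
To make the idea concrete, here is a toy Swift sketch of a material-based evaluation in the spirit of the chess example (the piece values are the conventional textbook ones; a real chess engine combines many more terms):

```swift
enum Piece: Hashable { case pawn, knight, bishop, rook, queen }

// Conventional material values: pawn 1, knight 3, bishop 3, rook 5, queen 9.
let pieceValue: [Piece: Double] = [.pawn: 1, .knight: 3, .bishop: 3, .rook: 5, .queen: 9]

// Heuristic evaluation: my material minus the opponent's material.
// Higher is better for me, which is what Minimax expects.
func materialScore(mine: [Piece], theirs: [Piece]) -> Double {
    func total(_ pieces: [Piece]) -> Double {
        pieces.reduce(0.0) { $0 + pieceValue[$1]! }
    }
    return total(mine) - total(theirs)
}
```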

Then, Samuel used a game tree to organize the search for different alternatives.

First, you have to consider the legal moves you can make. Each move is a "branch" from the "root." Then, for each of these branches, the other player can make a number of moves (unless it is a "mate" position), which also become branches. This branching process continues until a "stop" condition is met.

There could be different stop conditions; the simplest one is to consider a fixed number of tree "levels," that is, a number of player turns. This is the one I used for Kromate, with just two turns for the automated player (the AI's turn, the opponent's turn, then the AI's turn again).

The game tree organizes the different possibilities in one place, but we still need a way to use it to choose a move at the top level.

The last part to consider is how to obtain the best move at the root. That's where the Minimax procedure comes in handy. Take the game tree in this figure:

Image by the author

Initially, at the root, my position is P1 (I say "my position" from the point of view of the AI player), and I have two possible moves, leading to P2 and P3; then it's the other player's turn. From position P2, the other player can choose a move leading to positions P4 or P5, and so on.

The leaves at the bottom of the tree are not expanded further. We have to evaluate those board positions using the heuristic function mentioned above.

Below is the same figure, now with an evaluation from the heuristic function attached to each leaf:

Image by the author

Now comes the interesting part of Minimax: propagating the evaluations from the bottom to the top. We start by taking the maximum of the evaluations below each position at the lowest level, because there it's my turn to play:

Image by the author

Continuing up the tree, the other player, assuming it is rational, will choose whichever option gives me the worst evaluation (yes, the other player wishes me the worst), like so:

Image by the author

We could propagate the evaluations all the way to the top (the tree's root), but they are not needed there (except to see how good our best prospects are). At this point, we can simply choose the best move, which is from P1 to P2.

In short, the Minimax technique is like "looking ahead" several moves, which is something good chess players do all the time. Instead of just evaluating the positions resulting from your first move (which means statically evaluating P2 and P3) and choosing the best of them, the Minimax player explores what comes next – to a certain extent – and only then chooses the move leading to the best propagated evaluation.
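
Here is a minimal, generic sketch of depth-limited Minimax in Swift, just to show the shape of the recursion. My actual implementation was in Objective-C, and the types and names below are illustrative assumptions, not the shipped code:

```swift
// A game is described by three closures: the legal moves for the side to
// play, the position after a move, and the heuristic evaluation from the
// AI's point of view (higher = better for the AI).
struct MinimaxPlayer<State, Move> {
    let legalMoves: (State) -> [Move]
    let apply: (State, Move) -> State
    let evaluate: (State) -> Double

    // Propagated value of `state`, looking `depth` plies ahead.
    func value(of state: State, depth: Int, aiToMove: Bool) -> Double {
        let moves = legalMoves(state)
        if depth == 0 || moves.isEmpty {      // stop condition, or no moves left ("mate")
            return evaluate(state)
        }
        let childValues = moves.map {
            value(of: apply(state, $0), depth: depth - 1, aiToMove: !aiToMove)
        }
        // Take the maximum on my turns and the minimum on the opponent's turns.
        return aiToMove ? childValues.max()! : childValues.min()!
    }

    // At the root, pick the move whose subtree has the best propagated value.
    func bestMove(from state: State, depth: Int) -> Move? {
        legalMoves(state).max { a, b in
            value(of: apply(state, a), depth: depth - 1, aiToMove: false) <
            value(of: apply(state, b), depth: depth - 1, aiToMove: false)
        }
    }
}
```

For Kromate, `depth` would be 3 plies (my move, the opponent's move, my move again), exactly as in the figures. Note that `bestMove` re-evaluates each subtree inside the comparison; that is wasteful but keeps the sketch short.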

Don't think that Minimax is all there is to game tree search. Many refinements were proposed along the way, such as alpha-beta pruning, which cuts entire branches from the search tree without changing the result.
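
For completeness, here is the same sketch extended with alpha-beta pruning (again an illustration, not something a tree as tiny as Kromate's really needs): once one side already has a guaranteed outcome, subtrees that cannot improve on it are skipped.

```swift
extension MinimaxPlayer {
    // Minimax with alpha-beta pruning: same result as `value(of:depth:aiToMove:)`,
    // but whole subtrees are skipped once they cannot change the outcome.
    func prunedValue(of state: State, depth: Int, aiToMove: Bool,
                     alpha: Double = -Double.infinity,
                     beta: Double = Double.infinity) -> Double {
        let moves = legalMoves(state)
        if depth == 0 || moves.isEmpty { return evaluate(state) }
        var alpha = alpha, beta = beta
        var best = aiToMove ? -Double.infinity : Double.infinity
        for move in moves {
            let v = prunedValue(of: apply(state, move), depth: depth - 1,
                                aiToMove: !aiToMove, alpha: alpha, beta: beta)
            if aiToMove {
                best = max(best, v); alpha = max(alpha, best)
            } else {
                best = min(best, v); beta = min(beta, best)
            }
            if beta <= alpha { break }   // the other side would never allow this line
        }
        return best
    }
}
```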

The Minimax on Kromate

When IBM's Deep Blue beat Kasparov in the famous chess match of May 1997, the computer examined millions of options before making each move. It wasn't pure brute force, though, as it included techniques like alpha-beta pruning to reduce the search tree.

Kromate, at the other end of the spectrum, examined fewer than 100 positions at the leaves of the search tree. The Minimax tree for Kromate contained only two turns for the AI player and one for the human player, exactly as in the figures above.

The Kromate search tree looks like a radical simplification, but it was surprisingly effective. The idea of using the number of possible moves to evaluate a position simply nailed it. It made sense: if you have only a few options, you are closer to a "mate" position – where you have exactly zero options. The more options you have, the farther you are from losing.
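
In code, the spirit of that evaluation is something like the sketch below. The exact formula the shipped game used is not reproduced here; the point is simply that more legal moves is better, and zero legal moves is a mate:

```swift
// Mobility-based heuristic: score a position by how many legal moves each
// side has. Running out of moves is "mate", so it gets an extreme score.
func mobilityScore(aiMoveCount: Int, opponentMoveCount: Int) -> Double {
    if opponentMoveCount == 0 { return Double.infinity }      // opponent is mated: best case
    if aiMoveCount == 0 { return -Double.infinity }           // I am mated: worst case
    return Double(aiMoveCount - opponentMoveCount)            // otherwise: mobility difference
}
```

Plugged in as the `evaluate` closure of the Minimax sketch above, something like this is essentially the whole "brain."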

Initially, I thought of combining many factors to evaluate the position, which is a common practice. However, as I was refining the details of each factor, I realized that almost all the other factors were way less important than the number of possible moves.

At this point, you, the reader, can see that designing a static evaluation function isn't easy. It can be an intricate process, and there's no guarantee that it will end well. That's why only when I saw the terrific performance of the AI player did I realize that the static evaluation function was good.

A human-level AI

I programmed the whole Kromate AI player as a function in Objective-C that returned a Kromate move as a pair of numbers: the token to move and the destination place.

To be frank, when I completed the automated player based on nothing more than the good ol' Minimax tree, I didn't expect it to perform above a basic level. But then I played a number of games against it, and it seemed to me that the AI player was strong. Of course, my subjective impression of the AI Kromate player could have been influenced by the fact that it was my creation, so I might have had the "pet effect," softening the rigor of my testing.

But nobody I gave the game to for testing could beat it consistently: some people couldn't beat it at all, while others managed a few wins but lost most of their matches.

Honestly, I didn't have the resources to conduct a systematic evaluation of how the AI Kromate player compared to humans, but I can tell you that most people found it so hard to beat that they stopped enjoying the playing experience. You don't want to be beaten 10 times in a row!

The AI player was so capable that I had to "dumb it down" to make the game more engaging.

I could have dumbed down the Kromate AI player by reducing the depth of the Minimax tree, for instance, but I didn't want to make a substantial modification to the code, and I wanted to offer several levels – including the hardest one – so I came up with an easier solution.

The "baseline" Kromate player randomly selected one of the valid moves. Actually, I programmed this one before the AI player. Needless to say, the baseline player was easily beaten by a human player.

So, I "mixed" moves from the AI player with moves from the random player. It was very easy to adjust the proportion of each by setting the probability of taking the random move or the AI move at each individual turn.

I ended up offering 5 different levels: the random baseline at the bottom, the AI player at the top, and 3 intermediate levels with probabilities of 0.25, 0.5, and 0.75 of choosing the AI player's move.
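
The mixing logic itself is tiny; here is a sketch with hypothetical names (the real code lived inside the Objective-C app):

```swift
// Difficulty mixing: with probability `aiProbability` play the Minimax move,
// otherwise play a uniformly random legal move. 0.0 is the baseline level,
// 1.0 is the full-strength AI, and 0.25 / 0.5 / 0.75 are the middle levels.
func pickMove<Move>(aiMove: Move, legalMoves: [Move], aiProbability: Double) -> Move {
    if Double.random(in: 0..<1) < aiProbability {
        return aiMove                                  // take the AI player's choice
    }
    return legalMoves.randomElement() ?? aiMove        // take the random player's choice
}
```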

Then what happened?

Very little. My Russian associates published Kromate on the Apple App Store in late 2009, with little success. It seems that few people wanted to learn a new chess-like game from scratch (even with the online help that was included). Then they packaged Kromate in one game bundle, then another, and later they dropped it altogether.

With my demanding university full-time job, I didn't pursue the Kromate adventure any further.

But I learned a few things.

One is that until you get your hands dirty with the minutiae of the code, you don't fully understand what you are talking about. In this case, the Minimax method had to be translated into a series of Objective-C function calls, starting with the generation of the possible legal moves for each token in each position.

Another lesson was that the most crucial part was the definition of a good static position evaluation function, something that doesn't appear in any textbook because it's specific to the game at hand.

A final lesson was that you don't know whether your clever position evaluation function is any good until you put it to the test. In actual games.

I hope you find my modest journey into building a human-level AI player interesting. Perhaps it can inspire you to build something else now that we have much more powerful tools!

Get my personally curated AI news analysis and tech explainers with my free newsletter, "The Skeptic AI Enthusiast," the current tech landscape seen critically by an AI veteran, at https://rafebrena.substack.com/
