The Minimax Algorithm - Tic Tac Toe Game



 

A player's move is only as strong as the opponent following weak move

I built a Tic Tac Toe game (...locally know as "X and O")  once with a random algorithm which I devised from my experience as a player but takes a lot of lines of code and yet it was unable to defeat most players because it was only as smart as the developer. Then I came across the minimax theory and it gave me a clear understanding of how I can make my A.I smarter than me. I knew about the theory way before I became a programmer, in high school to be precise; The New Further Mathematics Project Part 3, But due to the circumstances then I could not figure out how useful it was. I tried to understand the basis and implement it in my previous game and it worked, though a risky assumption was made which is that the other player also plays optimally. A smart player can use that to his advantage but it still take a lot of mind to win over the Algorithm.

The minimax principle is a decision theory suggested by an Hungarian - American mathematician; John Von Neumann which implies that in a competitive game, each player will act to maximize his minimum gain or minimize his maximum loss, perfect decision on either will result in a win for the winner. It is a powerful theory used to find the best move for a player in a two-person zero sum game where the gain of a player equals the loss of the other, and can get very complex in implementation if proper care is not taken. It has several ways of implementation, but the method that I think is generally more understandable is the binary tree method.

The principle is pretty simple if you take it step by step. Take an instance of you playing the tic tac toe game with someone, from your perspective, you will act to get the maximum gain possible for each of your move i.e you will tend to go for a move that guarantee a win for you. And still from your perspective, for the other player to win, he will act to make sure you get the minimum gain possible for each of his move i.e he will tend to go for a move that guarantee a loss for you, because a loss for you is a win for him. The algorithm will play the game for every possible move available for every state of the game like the other player would play when it gets to his turn until the last state is reached, and finds out which move will result in a win ending for the player. Still from your perspective, if you gain $10 from the other player every time you win and lose $10 to him every time you lose, that is the utility value of the game meaning if the game ends with +$10, it is a win way for you, 


...and -$10 if it is a loss for you...


 ...and $0 for both of you if neither of you win.

Pseudo code:

for each row in the board:

    if all element in row == player:

        return 10

    elif: ell element in row == opponent:

        return -10

    elif all element in column == player:

        return 10

    elif: ell element in column == opponent:

        return -10

    else:

        return 0


 

At clearer view, the principle tells you that whenever you have a tic tac toe game board and either of the two player made the first move, if it is your turn to make a move, step aside a bit away from the game board and try each of the move somewhere else. Once a move is selected, a certain number possible moves will be open for your opponent to take, at this point select each move for the player also and after that, possible moves will be open for you again to take, and go on with the tree until you arrive at several end-point, where the game board is full. Note that each of the end-point will either be a win for you, a loss for you or a draw for both. Then choose the end-point that gives you a win and trace it back to your original point where you are to choose a move in the main board. On getting there, you will know which move will guarantee you a win. Note that the moves that guarantee you a win may be more than one, in that case pick one of the moves and go on with the same process again whenever it is your turn to play.


Take for example the image above, it shows a tic tac toe game in a certain stage. You as the player has three possible moves available for you which are move 2, 3 & 4, and you are to choose a move that best will give best assurance of wining. Using the minimax principle, if you follow route 2, you will have a sure chance of wining immediately, but if you choose route 3, there will be other two possible moves left for your opponent (5 & 6) Route 5 will result to a loss of -$10 and route 6 will result to a gain of +$10. Since it is your opponents turn and we already assume that he plays optimally i.e he plays with all his might to see that you loose he will choose the route that give you the minumum value at that state or in another form he will try to minimize your maximum gain, therefore route 3 acquires a value of -$10. Same with route 4, so the only wining route left is route 2.

Let's take one more example:


According to the minimax principle, Route 1 acquires a value of -$10 making it unfavorable for you. Route 2 acquires a value of 0 which is a draw and partially favorable for us. Route 3 acquires +$10 which is best favorable for us, so taking the route three is the best move for you at that state of the game. 
Another case to consider is if there are two possible ways for your to win like the diagram above, but it takes longer to win in Route 2 and it gives the opponent more chance to disrupt our wining strategy than Route 1.

Route 3: You can win 1 move
Route 2: You can win 2 move
So, we can subtract the depth length from the value returned from the endpoint to get actually game value of a route.

Route 3: +10 - 1 = 9
Route 2: +10 -2 = 8
MAXIMUM BETWEEN 1 & 2  = 9 (Route 1)
Pseudocode:
function minimax(board, depth, isMaximizingPlayer):

    if current board state is a terminal state :
        return value of the board - depth #if value is not equal to 0
    
    if isMaximizingPlayer :
        bestVal = -INFINITY 
        for each move in board :
            value = minimax(board, depth+1, false)
            bestVal = ( bestVal, value) 
        return bestVal

    else :
        bestVal = +INFINITY 
        for each move in board :
            value = minimax(board, depth+1, true)
            bestVal = min( bestVal, value) 
        return bestVal 

 You may have different implementation for my above explaination, if your code is perfectly implemented, your Algorithm will never be defeated. It will always win over any player.

The minimax function only analyses a move after it has been taken. We need a function to temporarily make a move and call the Minimax function for analysis on that move if it will be favourable based on the returned game value.

Pseudocode:

function findBestMove(board):

    bestMove = NULL

    for each move in board :

        if current move is better than bestMove

            bestMove = current move

    return bestMove




An additional features that you may add is Alpha-Beta Pruning , which reduce the runtime of the Algorithm by a large factor . It is based on the the process of getting ride of unnecessary route that cannot make any changes to the final point, thereby reducing the amount of route to cover during runtime.

I hope I made an easy explaination of the Algorithm enough for you. Feel free to drop comments, ideas, complains by the comment section .

Source code @ https://github.com/Mustorpha/Tic-Tac-Toe-Game-Series


Adiós 🤪🤪🤪

Next Post Previous Post
4 Comments
  • Anonymous
    Anonymous October 9, 2022 at 1:35 AM

    Cool and nice 👍☺️

    • Mustorpha O' Jamiu
      Mustorpha O' Jamiu October 9, 2022 at 7:14 AM

      Thanks 🤝

  • Anonymous
    Anonymous October 10, 2022 at 6:34 AM

    Thank you for this one, Mustorpha!

    • Mustorpha O' Jamiu
      Mustorpha O' Jamiu October 10, 2022 at 6:39 AM

      Glad you like it 🤝

Add Comment
comment url