Why ChatGPT Can Code in Python but Can’t Play Tic-Tac-Toe

It sounds like a joke: an artificial intelligence that can pass the bar exam, write a functional Python script for a web scraper in seconds, and compose a sonnet in the style of Shakespeare… is routinely beaten by five-year-olds at Tic-Tac-Toe.

But if you’ve ever tried to play a simple game of Xs and Os with a Large Language Model (LLM) like ChatGPT, you’ve likely witnessed this bizarre phenomenon firsthand. Within three or four moves, the supposedly brilliant AI will confidently place an ‘O’ on top of your ‘X’, invent a completely new corner of the board, or fail to notice that you won two turns ago.

Why does an entity with the sum of human knowledge at its fingertips struggle with a game that has only 255,168 possible sequences of moves? The answer lies in the fascinating, fundamental difference between how human brains and neural networks perceive reality.
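
That 255,168 figure counts every distinct way a game can be played out, move by move, until someone wins or the board fills up. Here is a rough brute-force sketch that reproduces it; the function names are purely illustrative, and it treats rotated or mirrored games as different games.

```python
# Every row, column, and diagonal, as indices into a flat 9-cell board.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if either has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None


def count_games(board=None, player="X"):
    """Count every complete game reachable from the given position."""
    if board is None:
        board = [None] * 9
    # A game ends as soon as someone wins or no empty squares remain.
    if winner(board) is not None or all(cell is not None for cell in board):
        return 1
    total = 0
    for i in range(9):
        if board[i] is None:
            board[i] = player
            total += count_games(board, "O" if player == "X" else "X")
            board[i] = None  # undo the move before trying the next square
    return total


print(count_games())
```

An ordinary laptop finishes this exhaustive search in about a second, which is what makes the AI's struggles so striking.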


The Illusion of the “Smart” Player

To understand the glitch, we first need to look at what happens when you challenge an AI to a game. If you type, “Let’s play Tic-Tac-Toe. I’ll go first. I place an X in the center square,” the AI will usually respond with a perfectly formatted, polite response, perhaps even drawing an ASCII art board for you.

But as the game progresses, the wheels fall off. The AI exhibits what is often described as a spatial reasoning deficit. It might say:

“Great move! I will place my ‘O’ in the top-right corner.”

…while simultaneously rendering a board where its ‘O’ is inexplicably placed in the bottom-left. Or, worse, it will try to take a square you already claimed. It’s not cheating; it’s profoundly confused.


Why Tic-Tac-Toe Breaks the AI Brain

To understand why this happens, we have to look under the hood of how Large Language Models operate.

1. The 1D vs. 2D Problem (The World is a String)

When you and I look at a Tic-Tac-Toe board, we see a 2D grid. We instantly understand spatial relationships: up, down, left, right, diagonal.

An LLM does not have eyes, and it does not have a “mental canvas.” It only understands the world as a one-dimensional, sequential string of text.

When the AI draws a board, it’s not drawing a picture. It is generating a sequence of characters, line breaks, and spaces. To the AI, the board looks like this:
X | - | - \n - | O | - \n - | - | X

To figure out if it has won diagonally, it has to magically infer spatial geometry purely by counting characters across different lines of text. For a text-prediction engine, this is computationally unnatural.
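
To make the difference concrete, here is a small sketch comparing the two views of the same board. The exact text layout (nine visible characters per row, then a newline) is an assumption chosen for illustration, and the helper names are hypothetical. In the flat string, the diagonal only exists as a set of character offsets that depend on the formatting; in a real 2D grid, it is simply grid[i][i].

```python
# Assumed layout: each row is "a | b | c" (9 characters) followed by "\n".
flat = "X | - | -\n- | O | -\n- | - | X"

ROW_WIDTH = 10    # 9 visible characters plus the newline
CELL_STRIDE = 4   # a cell character, then " | ", then the next cell


def cell_offset(row, col):
    """Character position of a cell inside the flat string."""
    return row * ROW_WIDTH + col * CELL_STRIDE


def parse_grid(text):
    """Turn the flat string back into a list of rows."""
    return [[cell.strip() for cell in line.split("|")]
            for line in text.split("\n")]


# 1D view: the diagonal lives at offsets that change if the spacing changes.
diag_offsets = [cell_offset(i, i) for i in range(3)]
diag_from_string = [flat[offset] for offset in diag_offsets]

# 2D view: the diagonal is just "same row and column index".
grid = parse_grid(flat)
diag_from_grid = [grid[i][i] for i in range(3)]

print(diag_offsets)
print(diag_from_string, diag_from_grid)
```

Add one extra space to the layout and every offset in the 1D view shifts, while the 2D view is unaffected.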

2. The Tokenization Trap

As we covered in our previous article on The AI Counting Paradox, LLMs don’t read words or letters; they read “tokens.”

When the board state changes, the sequence of tokens the model sees changes with it. The AI isn’t “looking” at a board and moving a piece; it is estimating which tokens are most likely to follow the text of the previous board state. Because half-finished Tic-Tac-Toe boards are comparatively rare in its training text, those estimates are less reliable, and the model begins to hallucinate.

3. The Goldfish Memory (Lack of True State)

When you play a game, you keep a running tally of the board in your head. An LLM doesn’t have a persistent “memory” of the game state.

Every time you send a new prompt, the AI has to re-read the entire chat history, re-process the text string of the board, and try to deduce what is happening from scratch. It’s like playing a game where, before every single move, you suffer total amnesia and have to deduce the rules and the current state by reading a transcript of the last five minutes.
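
As a loose analogy (not a description of how any particular model is implemented), the sketch below keeps no board between turns. The only thing that persists is the transcript of moves, and the board is rebuilt from that transcript every single time. The names rebuild_board and transcript are hypothetical.

```python
def rebuild_board(transcript):
    """Reconstruct the board from scratch by replaying every move so far."""
    board = [["-"] * 3 for _ in range(3)]
    for player, row, col in transcript:
        if board[row][col] != "-":
            raise ValueError(f"square ({row}, {col}) is already taken")
        board[row][col] = player
    return board


transcript = []  # the only thing carried from one turn to the next
for move in [("X", 1, 1), ("O", 0, 2), ("X", 2, 2)]:
    transcript.append(move)
    # No stored board: replay the whole history on every turn.
    board = rebuild_board(transcript)

for row in board:
    print(" | ".join(row))
```

The replay here is exact because it is ordinary code. A language model has to do the equivalent by predicting text, so any slip while re-reading the history carries into the next move.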


So, Why is Python Easier?

If Tic-Tac-Toe is so hard, why is writing a Python script so easy?

It comes down to syntax versus state. Programming languages are entirely linear and text-based—exactly what LLMs are built to master.

  • Logic in Text: Code is heavily structured by syntax rules that translate perfectly into one-dimensional tokens.
  • Training Data: The internet contains billions of lines of code. There are far more examples of working Python functions in the AI’s training data than there are ASCII-art text transcripts of half-finished Tic-Tac-Toe games.
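
To see how naturally code flattens into a one-dimensional sequence, here is a short sketch using Python's standard tokenize module. One caveat: these are the Python lexer's tokens, not the subword tokens an LLM uses, so treat this as an illustration of the idea rather than a view inside ChatGPT.

```python
import io
import tokenize

source = "def add(a, b):\n    return a + b\n"

# Keep only names, operators, and numbers; skip newline and indent markers.
kept_types = (tokenize.NAME, tokenize.OP, tokenize.NUMBER)
tokens = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type in kept_types
]

print(tokens)
```

The whole function reads left to right as a single stream, and no part of its meaning depends on what sits directly above or below a given character on the screen.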

When ChatGPT writes Python, it is predicting the most plausible next line of text based on massive amounts of data. When it plays Tic-Tac-Toe, it is being forced to use a text-prediction engine to keep track of a 2D game board, which is a job the tool was never designed to do.


The Tic-Tac-Toe paradox is a brilliant reminder of what AI actually is. It is not an artificial human intelligence; it is an alien intelligence. It operates on entirely different underlying principles than biological brains do.

The next time you’re amazed by an AI writing a complex essay, ask it to play a game of Connect Four. It’s a wonderfully grounding reminder that while AI is incredibly powerful, there are still a few things the human brain can do that a trillion parameters cannot.