ChatGPT and Current AIs Are Dumb

Ahh, free-tier stuff is problematic for evaluations IMO, especially without thinking enabled. It’s hard to know what’s going on with ChatGPT behind the scenes due to model routing and the automatic cost-reduction measures they take. Chat providers also have different prompts and tools available in their chat interfaces, which can significantly alter things (Gemini 3 has a big section in its prompt to make web stuff look way better, similar to Claude’s frontend-design skill).

Also, I note you have GPT-5.3 selected, not 5.3-codex or 5.4. Theo (t3) said recently he thinks OpenAI are moving away from the -codex fine-tuned models going forward, which hopefully will make this distinction clearer.

I’ll try 5.3-codex and 5.4 later today and see what they come up with.