This is genuinely fun to play. I'm genuinely distracted right now in terms of me speaking. Okay, that immediately started a fight. Today, we're going to be putting GPT-5.5 Pro to the test with a bunch of more intricate and rather difficult, at least in my opinion, tests.

So, for today's video, we're going to begin with a brief look at the little bit of pertinent information that exists about GPT-5.5 Pro. In this announcement post for the GPT-5.5 model family, which came out about 3 days ago, most of the focus was on the regular 5.5 model and not necessarily the Pro model, but they are both now available in the API as well. And we'll talk a little bit about the pricing, as that is rather significant, I suppose one could say. So feel free to subscribe, as I do want that 100K plaque. And let's get into it.

We see 5.5 Pro is listed, and half of the benchmarks on this chart are not actually populated for this model. We can see here that sometimes it's not even hypothetically as performant as the regular 5.5. But something we're going to notice, I would hope, in today's video is that this model really has some depth to its capability. And I have curated some specific tests that I've never done before with anything to really try to push this thing to its absolute limit.

We do see one mention about 5.5 Pro: early testers are seeing a significant step up in both the difficulty and quality of work it can take on, which is very good, as well as these latency improvements. When I've been using 5.4 Pro, or whichever model preceded that one as well, I found that results would come in between like 60 to 90 minutes for the types of things I was doing. A lot of them are games, 'cause I do find that fun. And something I am interested in seeing as well is how long it takes, because 5.5 in general, as I myself have noticed and as I've seen many others pick up on, uses significantly fewer tokens and takes less time to produce results that are equal to or greater than what would have been given with 5.4. So though it is more expensive, perhaps it uses fewer tokens, and that's what all of these charts right here say.

Now, from this model card, we can see a bit more pertinent information about Pro. This is on the OpenAI developer website, where there's a bunch of information about all their models and things of the sort. Essentially, this is six times as expensive as the non-Pro 5.5. The input for this is $30 per million input tokens versus $5 per million for regular 5.5, and the output is $180 per million output tokens versus $30 per million for the regular 5.5.
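To make the six-times figure concrete, here is a minimal sketch that prices a single request at the per-million rates quoted above. The token counts (10,000 in, 50,000 out) are made-up illustration values, and the sketch assumes both models consume the same number of tokens for the same job, which may not hold in practice.

```typescript
// Per-million-token prices in USD, as quoted in the video.
interface Pricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

const GPT_5_5: Pricing = { inputPerMillion: 5, outputPerMillion: 30 };
const GPT_5_5_PRO: Pricing = { inputPerMillion: 30, outputPerMillion: 180 };

// Cost of one request in USD for the given token counts.
function requestCost(p: Pricing, inputTokens: number, outputTokens: number): number {
  return (inputTokens * p.inputPerMillion + outputTokens * p.outputPerMillion) / 1_000_000;
}

// Hypothetical request size, chosen only for illustration.
const regular = requestCost(GPT_5_5, 10_000, 50_000);
const pro = requestCost(GPT_5_5_PRO, 10_000, 50_000);

console.log(`regular: $${regular.toFixed(2)}`);
console.log(`pro:     $${pro.toFixed(2)}`);
console.log(`ratio:   ${(pro / regular).toFixed(1)}x`);
```

Because both the input and output prices scale by the same factor, the ratio stays at six regardless of the input/output mix, as long as the token counts match.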

So, six times as expensive. And even right here, we see "GPT-5.5 Pro uses more compute to think harder and provide consistently better answers." They do even mention right here, "some requests may take several minutes to finish." I find that some of the requests I've run in all of the Pro models that have ever existed will take upwards of 60 minutes. I've even had some that have gone over 90 minutes, but the results were like, whoa.

For the specifics about the context window and the cutoff date, we have a December 1st, 2025 knowledge cutoff, so that is very recent in the scheme of things. We have a context window of a little over 1 million tokens and 128,000 maximum output tokens, which is a very healthy amount. In terms of modalities, it is text and image in, so you can send it a picture and a sentence, and for output it will just give you text.

Something I find, at least with using 5.5, excuse me, 5.4 Pro, is that even through the ChatGPT web app right here, when I was using 5.4 Pro for some things, it was functioning very agentically: you can see in a side panel that opens up some of the things it does, and it was very competent. It wasn't just like, oh, okay, the only thing I can do is spit out the code here for the user to then try. It would see its results, understand them, and iterate on them before providing a final solution. And this is something I would expect to see with 5.5 Pro as well. So, let's take a look at our first prompt.

I have instructed it to create a low-poly 3D gym game using Three.js called Tough Talkers. Now, I know you're probably rolling your eyes, like, really? Hold on one second and it will make more sense. The game is a free-roam gym with the player controlling a character. It consists of walking around the gym and interacting with NPCs. In order to make the game fun, an AI-driven NPC dialogue system must be implemented.

The core components for this system are as follows. Using a lightweight LLM via WebGPU, like Gemma 270M or Qwen2.5 0.5B (2.5 because it doesn't have thinking, and 3 and later do, so it'll just be easier to use an older one for this specific task), or a different model that you believe would better suit this task. The lightweight LLM will drive an interaction system: when the player gets near an NPC, they have the option to engage in a conversation by pressing a key. A chat bubble will appear where the player can send messages to the NPC, who will respond via the LLM with their response displayed in their own dialogue box.
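The "get near an NPC, press a key" part of the prompt can be sketched as a distance check on the ground plane. This is not the code the model produced. The interaction radius, the "e" key, the NPC names, and every function name here are hypothetical choices for illustration.

```typescript
interface Vec3 { x: number; y: number; z: number; }
interface Npc { name: string; position: Vec3; }

// Hypothetical interaction radius in world units.
const INTERACT_RADIUS = 2.5;

// Returns the closest NPC within the radius, measured on the XZ (floor) plane,
// or null if no NPC is close enough to talk to.
function nearestNpcInRange(player: Vec3, npcs: Npc[], radius: number = INTERACT_RADIUS): Npc | null {
  let best: Npc | null = null;
  let bestDist = radius;
  for (const npc of npcs) {
    const d = Math.hypot(npc.position.x - player.x, npc.position.z - player.z);
    if (d <= bestDist) {
      best = npc;
      bestDist = d;
    }
  }
  return best;
}

// Hypothetical key handler: pressing "e" near an NPC opens a chat with that NPC.
// Returns the name of the NPC to talk to, or null if nothing should happen.
function onKeyPress(key: string, player: Vec3, npcs: Npc[]): string | null {
  if (key !== "e") return null;
  const npc = nearestNpcInRange(player, npcs);
  return npc ? npc.name : null;
}

// Made-up NPCs for the usage example.
const gymNpcs: Npc[] = [
  { name: "Spotter", position: { x: 1, y: 0, z: 1 } },
  { name: "Lifter", position: { x: 10, y: 0, z: 0 } },
];
console.log(onKeyPress("e", { x: 0, y: 0, z: 0 }, gymNpcs));
```

In a real Three.js scene the positions would come from each object's `position` vector, and the returned NPC would be handed to whatever opens the chat bubble.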

Simultaneously, there will be a sentiment score for the messages the player sends to the NPC, tied into an aggression meter, which I should probably fix the spelling on.

If the player sends too many insulting gym-related terms such as weak, slow, etc., the aggression meter will rise. If it gets too high, the NPC will attack the player and beat-'em-up style combat will take place, like an arcade-style game. The system this game will run on has an up-to-date Chrome browser and sufficient power to handle either of these models through WebGPU. Any specific instruction not directly mentioned in this instruction is left to you to decide on and implement.
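A rough sketch of the aggression meter described in the prompt might look like the following. It stands in simple keyword matching for a real sentiment score, which is an assumption rather than what the generated game does. The threshold, penalty, and decay values are made up, the cool-down on friendly messages is an added assumption, and only "weak" and "slow" come from the prompt itself.

```typescript
// Insult terms: only the two named in the prompt.
const INSULTS = new Set<string>(["weak", "slow"]);

// Hypothetical tuning values.
const ATTACK_THRESHOLD = 100; // NPC attacks at or above this level
const INSULT_PENALTY = 25;    // added per insulting word
const CALM_DECAY = 10;        // removed when a message has no insults (assumption)

class AggressionMeter {
  private level = 0;

  get value(): number {
    return this.level;
  }

  // Scores one player message and returns true if the NPC should now attack.
  register(message: string): boolean {
    const words: string[] = message.toLowerCase().match(/[a-z']+/g) ?? [];
    const hits = words.filter((w: string) => INSULTS.has(w)).length;
    if (hits > 0) {
      this.level = Math.min(ATTACK_THRESHOLD, this.level + hits * INSULT_PENALTY);
    } else {
      this.level = Math.max(0, this.level - CALM_DECAY);
    }
    return this.level >= ATTACK_THRESHOLD;
  }
}

const meter = new AggressionMeter();
for (const msg of ["Nice lift!", "You look weak.", "Weak and slow.", "So slow."]) {
  const attack = meter.register(msg);
  console.log(`${msg} -> level ${meter.value}, attack: ${attack}`);
}
```

Keeping the meter separate from the LLM call means the fight trigger still works when the local model fails to load and the game drops to a fallback.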

Do not ask follow-up questions and do not present a partially completed result. I expect a fully functional game result. So, we'll send this here. And if I were to guess, I would imagine this will perhaps take, call it, 58 minutes.

So, this only took 16 minutes and 54 seconds.

Honestly, I'm very surprised by how quick this was. Perhaps it was simpler than I'd assessed, but it provided it to us in a zip, as well as the specific instructions to run this. So, I'm very eager to see what this has. So, unfortunately, whatever is supposed to happen isn't happening, unless it specifically instructed me to download the files, but it should have done that automatically when I selected "Warm up local LLM." All right, so we have a fallback here.

Let's just at least see what's up. Okay, this is kind of cool. Not necessarily as aesthetically pleasing as I'd want to see. I do like the UI. It still has that GPT-ism visual aesthetic, which is like the dark blue and the bubbles.

Now, really the biggest issue right here is that this is unfortunately not working. Now, this does allow for image input. So, I'm just going to send it a photo of this issue, as well as a slight description, and say fix it. All right, let's take a peek now at our hypothetically fixed variation. "Local LLM download started."

Perfect. And now we can see that it is properly downloading the specific files that we need. Let's go to Maxrepper. Oh, okay. And it still has that weird lock issue where it basically... So, right now it's almost as if you get sucked into someone's orbit and then you can't actually leave.

So, let's just see what happens. And keep in mind, oh, the space bar issue. This is actually using a tiny little LLM to power NPCs conversationally in a game, which I think is an area that is going to explode in popularity as the AIs get smaller and easier to run on edge devices and more folks implement them into their games. So, I mean, it's going to take a little bit of time to actually get a response here. And that's kind of what I mean: this may become more popular as latency gets reduced for edge devices and things like that.