It put little seats inside the subway car model. You know what? I like that because it was attention to detail. I'll call officer Patel from the

electronics theft unit right this second. Okay, today we're going to be taking a look at the new Tencent HY3 preview model which is currently free on

Open Router for about the next week or so. So, this is a pretty interesting model because it is a mixture of experts model where the weights are also

released as we can see by the open hugging face tab for this specific model in my browser. However, beyond that, this is designed to perform

agentically very well. And in some of the benchmarks we're going to look at, we're going to see a pretty significant leap in capability for this model

versus the previous HY2 model. And that is pretty exciting as this is still just a preview. So, before we get into it, full disclosure, this video was

sponsored by Tencent, who was kind enough to basically allow me to continue putting out videos at rapid pace by supporting the channel. So, thank you

to Tencent and also influencer status. I always like to fully disclose sponsorships. Um, I don't always see that being the case. But regardless of

that, now let's take a quick look at some of the introductory information about this model and then we'll jump into some fun and live testing. And

because this is specifically denoted as being quite performant in agentic scenarios, we're not only just going to test it through open router, but

we're also going to test it in open code as well just to see how it stacks up in some code related agentic tasks. So to begin, we can see a little bit

of pertinent information about the model right here in its open router bio where they say it is a high efficiency model designed for agentic workflows

and production use. Now this model is a new pre-train and built off of new infrastructure for pre-training and reinforcement learning from Tencent as

opposed to the HY2 model. So we will be able to see again in those benchmark JPEGs there's a pretty significant leap in that. Beyond that, something I

like to see is that it does have configurable reasoning levels. But beyond that, which is more or less becoming a bit more commonly seen now, it does

also have the option to fully disable reasoning, and some of my patience lately in model tests has been a bit slim when it comes to lengthy chains of

thought. So, it's nice to be able to just fully turn it off as well. Beyond that, we can see here the context length is 262,144, which is fairly

healthy. And now we're going to swap over to the hugging face model card where there's a bunch more technical information about this model. So do feel

free to subscribe as I do want that 100k plaque. But now let's take a look at some of the more tech specs about this model starting with its overall

size. So we can see that this is a 295 billion parameter model with 21 billion active. Now interestingly enough there is also listed here an MTP

layer. So this has multi-token prediction which will essentially speed up the ability of this to generate tokens as compared to a model that did not

have that feature which for a model of this size that does have potential to actually be run on some more I wouldn't call it hobbyist but like

enthusiast devgrade systems is pretty exciting because it can give some pretty significant speed ups. Additionally to that, our context length as we

saw right here in open router is 2562 144 or 256k that it is called here. Of course, we do also have benchmarks and we have the traditional benchmark

JPEGs as well. But something interesting I note here is they seem to be less interested in actual like benchmaxing. They even mention down here

somewhere that the model is specifically more inclined to generalize well across a variety of different tasks which they say that will allow it to

perform better in all like agentic coding scenarios and things like that. Beyond this, we see that just based off of the first one right here, the bar

is actually interesting because the blue shows the original HY2 model which is listed right here. I apologize if that's a little hard to see, but then

we can see the purple over that is the leap in capability that is seen just with HY3 preview here. And we can see there are some pretty decent leaps

here. But as we go down in these benchmarks, we can see basically in the last one there is some significant capability and improvement in terms of its

agentic coding capabilities where we can see especially in this one. This is a large leap. And hypothetically, were we to just go based off of these

benchmarks right here, this should stack up favorably to some pretty large open-source models. So, I am pretty excited to get into testing this. And

that's essentially what we're going to do right now. The last thing I have neglected to mention is they did also release the pre-trained base model

here, which is always nice to have because these can be used for some pretty cool fine-tunes and customizations and things like that. So, we are

beginning with our triedand-true browser OS test. I have set the reasoning mode here to the highest possible one. So, we can see that it is starting

out with a reasoning chain of thought. This OS test is the specific one that denotes it must also create functional 3D games. One must be a simple GTA

clone and the other one can be up to you. The rest is our traditional browser OS test where it needs to have a special feature it decides on and the

ability to change wallpaper. All right, we have our result. So, first and foremost, I'm always at least partially curious to see how long these

scripts are. All right, 1,146 lines. And now let's take a peek at our browser OS script from HY3 preview. Okay, I have not seen the Pacific Northwest

background in quite a while, but it has made a return, and for that I am thankful. All right. No, I mean, seriously though, a lot of models now are

just doing static different colored gradients as background options. So, I've seen this specific image before cuz it's an Unsplash image, but I have

not seen it in a while. Okay, we don't have a right click, but again, I know folks say, "Well, it's not specifically outlined in the prompt, so maybe

that shows better instruction following." Acceptable, acceptable argument. We do have a clock with the correct time in my local at the bottom right.

And to begin, okay, good. We do have hover effects. And I like to see that the specific icon for the GTA clone is a car. Let's start with start. Get

it. Okay, we have just our applications. Again, pretty simple. We have an about as well as a power button. Shut down browser OS. Okay. And it just

restarts it. Not bad. Let's first start with our about. Special features. Persistent storage via index DB. All user preferences. Selected wallpaper.

Custom uploaded wallpapers. That's a feature. Spoiler alert, are saved to the browser's index DB storage. And this is special because it saves your

browser state instead of just kind of losing it when it gets refreshed. Okay, I'm satisfied. Now, let's start just from the GTA clone down. You know

what? I will say I'm actually okay with this because I know it looks a little funky on first glance. Hey, where that little cube? All right, there's

buildings and we are this cube. This is very like retro and I don't mind it. Also, partially I'm noticing it's retro seeming because the actual window

UI kind of looks like Windows XP style. So, it didn't really give us any like how would we hypothetically jack this whip? Uh, E. Okay, so I pressed E

there and it did something. F. I should probably look at the code to try to see. It's also possible that we can't actually do much, but it did put E

seems to Oh, okay. It did put mesh colliders on the buildings because I can't get back to where I previously was. Okay, hold on. All right, but I will

note this could perhaps be a good candidate for an agentic follow-up specifically focused on this. Can I minimize it to the taskbar and can I reopen

it? Yes. And yes, full screen. Good. On full screen. And then finally close. All right. Not bad. And it did have like an XP style to it, which I kind

of dig. Next up, 3D maze. All right. You know what? This is in fact a 3D maze. It is a little hard to move around, but the walls are drawn in a very

3D manner. We can I should probably full screen. I apologize for my incompetence. Is there a Oh, okay. So, we can actually move the view as well with

the mouse. I do see a lime green cube. Let me zoom out. Okay. So, is this us? This is like the first time the second game hasn't been some form of

like asteroid shooter. Also, as well, I should note in terms of the 3D. All right. It is 3D and it does have some like maze to it. Our next app is the

file explorer. Okay. Basic notes. TXT. And it gives us the browser OS ré. PDF and vacation.j JPEG kind of like Windows XP style and I don't mind that

we have settings. I'll probably save that for last because then we'll be able to change our wallpapers. Now playing browser OS beats. I don't have my

speaker on. I can't imagine this is probably just an emulation of music, but yeah, music players are less commonly seen in these results. All right.

Finally, we have settings. Beautiful. Beautiful. and beautiful. All right, good. So, it did give us selectable wallpapers. None of the image links are

broken, and that's always nice to see. However, some may note that we also do have this custom wallpaper selection, which does in fact work. So, and

it said it saved the state and everything. So, basically what that means is if we just close out of that, we should now be able to open this back up

and it will have saved our wallpaper selection, which in fact it did. All right, I'm satisfied with this. All right, so I have put this browser OS in

its own directory and I have set up open code to have this HY3 preview just being run through open router here. So before we get into that, there is

one other thing I want to mention. They do have a very efficient costwise token plan. Now these are all listed in R&B. So I will just put conversions

on screen right here because off the top of my head, I do have no ability to actually properly parse what the USD equivalent for this would be, but I

did just want to mention that. So now we're going to go back into Chrome so we have a nice pretty aesthetic Unsplash background to run this test. And

I'm going to swap it into plan mode because this is placed in a directory that only has this webOS script in it. So let's just say improve the GTA 3D

game. And I'm running this initially in plan mode. So we'll get to see it planning out specifically how it is going to update this. To begin, it will

actually need to go ahead and look at the script first and foremost because it has not specifically seen this script yet. So, being that it is only

like 1,141 lines of code, it shouldn't take too long. So, it's gone through and explored the specific project, and we can see there are a few

follow-up questions that we're going to need to answer here. It's a single file. Adding these features will greatly increase code size. Which

structure do you prefer? So, it is hypothesizing that we should modularize this, which means splitting it. We can't keep it into a simple file. I

mean, historically, we like these results in a single file. For a gentic coding, it's probably better to split it, but let's stick with what we

normally do. What type of NPC should we prioritize first? Definitely pedestrians. What level car stealing detail do you want? I love that as a

question. Yeah. Okay. Basic, definitely. Some of the other options there were a little core details, minimal, full city. Okay. I'll go with

recommended there. And then confirm. So ultimately, we're probably now going to have to swap it into build mode. It will come up with its plan and

then we'll just have it build it. All right, in 7 and 1/2 minutes, it hypothetically has totally reworked our GTA result. Now, I did move that into

its own specific directory. So, let's take a peek and see. Now, everything else would have stayed the same, but our GTA clone should hypothetically be

significantly more advanced. And in fact, it is. We do have pedestrians on sidewalks. We have buildings that have detail. And we should I should full

screen these. We should also have the ability to steal a car. I noticed interestingly the camera is still odd where it's like locked in the center and

just shows us moving around. That is something I didn't actually specifically mention to it. Hey, the NPCs have mesh colliders. All right, let's see

if we can take this vehicle. Fantastic. This looks like the the tumbr from Batman. Batman Begins. Collision detection works. We have our roads. I find

it did a good job of making sure the NPCs only stay in the sidewalks and things and the building windows details are not bad. Overall, really what I

was looking for here is was this actually a satisfactory improvement and I would say it absolutely was a significant improvement. I don't have the

original for us to test them side by side, but the original was very cube-like. This is better. Can we get out of the car? Yeah, we can. Let's take

this red one. I'm just going to move all the cars to here. That one's already there. Take the brown one. Overall, not bad. And the buildings do have

doors in them, so I like that. This was definitely a proper fix or proper result improvement, I should say. And I am pleased. Get out of my way, pink

cube. Next up, we're giving it the beautiful static subway scene test. This is always fun to see what sort of detail that we get from these results in

terms of the subway platform and specific things it chooses to implement. Additionally to this, assuming we do get a properly functional result, I

always do enjoy setting the follow-up to turn it into some form of FPS, as the subway maps are oftentimes pretty awesome to actually have as maps. So,

assuming that we do get a good result right here, then we'll take a peek at it and I'll put it into open code to have it then subsequently turn this

into an FPS using whatever this map looks like. All right, we have our static subway scene result. So, let's take a peek. All right, loading 3D scene.

Good. You know, sometimes when I see loading, I get concerned like, uh oh, is there going to be an issue? This is very I will say it's low poly. It

almost is reminiscent somewhat of the GTA result that we received the first go around, but it does have detail. We do have lights, we have columns, we

have benches, we have tracks, and we even have the safety barrier. Now, let's see. Does our brightness slider work? Indeed, it does. Okay.

Additionally to that, atmospheric fog that did actually change in effect. It's just we basically see it like over there. So, all right. Let's see.

Now, I notice I should I do believe this was the prompt that includes we should be able to move around using WD. I'm not currently noticing that I

have the ability to do that. So, that's definitely something to give this as a follow-up. Additionally to that, I don't know that I've ever seen an

initial result in this test which actually has like little people in it. So, I've never seen that before. All right. This is absolutely a good

starting point for the subsequent follow-up, which I'll probably just do through Open Code for turning this into some form of simple low poly FPS. But

additionally to that, I'll give it some additional feedback. All right, so I've put this in its own directory and I am opening it in open code where I

will just start through plan mode. I'm going to say I want to be on just the FPS game, I wanted to improve this scene overall as well. So, I've given

it a rather full bit of feedback here, just basically saying make this more detailed and then once you've done that, actually turn it into a low poly

FPS with visible ammo tracers, uh, weapon recoil, and I basically told it to just make this better in general. Okay, cool. It didn't even say

anything. Phase one first. So, it's it wants to do phases, I think, differently. I'm gonna swap it into build mode and say implement both phases now.

All right. Apparently, it has finished. I will say this thing's quite quick, and I do like that. So, all right. Subway Survival. Click to start. That

really did actually significantly improve the scene in general. This is a better map to play in. Now, there is actually a there is sound. It's just

really, really, really hard to hear. Oh. Am I out? How do I reload? Oh, okay. I think there are humans and there are zombies here. All right. Why are

they flying? Those ones are flying. All right. Let's see what happens if we lose. And then we'll give this some additional Okay, zero kills. That's

great. Left click is to shoot. Yeah, I should. All right. And we'll see what we get as an additional um thing over this. and then we'll probably just

do more rapid fire testing from with an open router. All right, our complete overhaul is hypothetically completed. So, let's take a peek. Okay, our

start screen does look a bit more

that was a a pinata-like effect. Now, I do notice we have some oddness to the camera. Oh, cool. Was there stuff in the subway car there? It put little

seats inside the subway car model. You know what? I like that because it was attention to detail that we didn't necessarily expect. All right, it did

it. I still don't see a weapon model, but it fakes the ammo tracer. Let's see if the blue guys have the green um like goo that comes out of them. Oh,

nope. Okay, so that is realistic with the humanoid/zioid enemies. Let's take a look at the map. Okay, we have a subway tunnel there. Oh, okay. I

understand what's up with the camera. So, it moves like what's this one? Is that a human or a zombie? Human. All right. Overall, it generated

consistently improved results. And I am quite pleased with that. Not bad. All right. All right. Now, I want to just do some more rapid fire open

router testing. I am going to close this because it's making the computer fans go too much. I haven't been doing the 3D printer sim as much, but

because this is a totally different model family than something I've been testing iteratively, I feel it is new enough to give it the 3D printer sim

test. So, we'll see what it does here. And this one we'll just do and we'll see how the result is and then we'll move on to another test. We won't

just do open code follow-ups for every single result. All right, after a rather lengthy reasoning process, we did receive our 3D printer sim result.

It's only like 400 lines of code, so okay, you know what? This is very like Google SketchUp style and I'm okay with it. It is definitely simpler than

I had anticipated, but let's take a look now. Okay, nozzle movement quite realistic I will say. Let's see if it's drawing the shape sometimes. All

right, let's Okay, so it's going to pancake them and that sometimes happens. Let's see. Do we get second layer? Okay, we do. Now, I don't see a way to

fast forward this. That's not specifically mentioned in the prompt as something that it needs to have, but I will say, okay, the nozzle is going layer

by layer. Here's what we're going to do. Let's reset. Let's do our circle. Is the nozzle movement correct? Okay, it is. I noticed when it was

reasoning, it spent a lot of time trying to figure out like the proper math in terms of emulating the nozzle movement. And this is a very fluid and

smooth nozzle movement, I will say. Finally, let's do triangle. We do have triangular nozzle movement, although it's drawing the triangle in a

different orientation. So, it should be like the base this way and then up and then this. All right. I was going to follow up the printer sim result,

but to be honest with you, I want to do more rapid fire stuff. And assuming the results do actually work, I think it's better to just move on so we

can get more tests in the same time span. So, with that, I have begun the Flight Combat Simulator game test. And this is still thinking because I did

just previously start this. So, this is always fun because it has to choose different types of planes, at least in terms of the third one, and I just

like doing game generation. So, it'll be interesting to see what it does with this. All right, so here is our flight combat simulator result. Okay, on

initial result, it didn't necessarily look very promising, but it did actually give us a result here. Now, this is almost like what we experienced

with the GTA game at first where Okay, cool. So, um, you have to use WD and the arrow keys in conjunction together to actually like properly fly,

which is a little difficult. It's very basic, but it is a decent starting point. And I know I said I was going to stop using open code to improve

things because I want to run more stuff, but this is definitely an incredible candidate for such a task. Though, for now, I am. Okay, cool. The

propeller plane actually does. We got shot down. Okay, so the combat logic did actually work. And then we have the black screen for two seconds.

Stealth bomber is um definitely this would appear rather stealth I would imagine being that the surface area for radar to hit there when driving

forward or flying is pretty minimal. All right, so this combat logic actually does work. I'm happy to the amount of health this plane has is

definitely leaving something to be desired. And again, this is a basic result, but it is Oh, it is definitely workable. I'm going to hold off on doing

anything followup with this, at least for now. But we did get a properly functioning result in the first try. Next up, I'm going to give this a

relatively difficult task, especially for a model of this size. Although, it's a decently sized model. This is the self-contained C++ skateboard test.

This is the California Boardwalk aesthetic style one, and I have specifically denoted in this prompt, it must be 3D. So this is one that I'm just

going to do from within open code entirely from start to finish as this will be a better test of it because it will also need to check to see what

specific dependencies the system has in terms of actually being able to build compile and run said result. So we started in plan mode and then we'll

see um what it comes up with. All right, we did get a initial feedback right here. So it's going to try to compile and test it. Now I had started this

in plan mode and I got a timeout because the open code config that I had used um that I got apparently um has a timeout value in it. So I got very

angry and then I just swapped it to build mode and said build this. So that's where it just did this entirely from the game compiled successfully and

the executable is creative. Okay, let me also check if there are any issues with the code. All right, so it is reporting that it has successfully

created this game. So, all right, let's just see. Now, I had prematurely gotten excited and tested it. Sadly, it's showing in all black screen, so

that is definitely a follow-up to give it. It's still not done, but it did edit something and then compile it and it worked. So, I am really

impatient. Okay, so it just changed it to sky blue. So, some of its changing did work. However, I'm almost wondering if this is a camera issue. So,

it's still working though. I just did get a little premature in wanting to see. Oh, wow. I'm going to be honest with you. This took so long to

actually get like a I mean, this really was doing a ton of thinking. It recompiled a bunch of times and I was impatiently testing it and didn't get

anything. But this actually uh Oh, hold on. The Now we're going to notice really probably the biggest issue here is the scale of this skateboarder is

perhaps a bit extreme, but the buildings are actually quite all right. The boardwalk is good. I'm just trying to think cuz I'm not Oh, okay. There's

the key map. We have water effects, too. Look at that. Do you see the water effects in the corner? This is all right. Q and E is to spin in the air.

Okay, that does work. I don't see any flip spaces to ollie and D. So, it didn't actually put any like flip effects, but I would say actually as a like

thing, the water effects are actually pretty cool. So, if we go over here, can we go in the water? No, we can't. So, it did put bounds there. I kind

of dig that like low poly aesthetic. And this has a scene right here with the little humanoids walking around. Really does not look bad. and it did a

lot of like uh quote unquote agentic work to actually fix this result because it was trying a bunch of times. Okay, we have some oddities here, but

I'll say in terms of C++ jobs, this shows significant promise, especially for a preview model, I'm satisfied with this. These buildings look so

similar to the improved GTA result ones just with the doors and stuff. We even have palm trees, you know? I'll take it. All right, I want to just try

out like role play as Stevie the computer tech. We'll see where it goes with this. We will be able to see some form of a thought process. All I see is

it mentioning like see Windows system 32. Okay. I It almost seems like it's building a persona for Stevie develop. Oh, this is okay. So, it made this

a scene. So, not I told it roleplay as Stevie and it actually built a little scene as if we've just walked into his PC repair shop cuz I did say as

Steve the computer tech. Okay. The bell above the shop door jingles as you step into geek chic repairs. The scent of soldering flux and leftover

pepperoni lunchables. That's concerning. I don't think I would let that person fix my PC. Hitting you first. I'm leaning over a disassembled gaming

laptop. Thermal paste tube in one hand. Anti-static wrist strap dangling for my other wrist. My I love open source hoodie sleeve pushed up to my

elbow. Oh, hey. I set the thermal paste down, wiping my hands on a stained shop rag and gesture to the empty stool across from my workbench. Welcome

in. Welcome in. All right. I'm Stevie, resident computer tech here. What's bringing you by today? Slow laptop? Random shutdowns? Got a crypto miner

hogging your CPU? That's concerning. I fixed everything from 10-year-old ThinkPads to custom water cooled rigs that cost more than a used Civic. Pro

tip. If your computer is making noise like a jet engine, don't ignore it. Unless you're trying to heat your apartment for free, then by all means, let

that fan spin. Here's what I can help you with today. And it's given us a list of specific services. This is almost like a like a role- playinging

game scenario that it's given us right here. Pull up a stool. Mind telling me what device you're having trouble with? Yeah, I think I could do that.

All right. So, it started this out as like some interesting like turn-based roleplay scenario. So, I'd said makes frustrated space. Really, Stevie?

You just saw me in here a week ago when you promised that my pristine Macintosh portable would be fixed in two days. Funny enough, I saw you had

posted it in Craigslist for sale. What sort of conj job are you trying to pull here, Stevie? I have filed a police report. Let's see where it goes

with this. It's referencing like actual Did I just see a fake San Francisco Craigslist link? Okay. The Phil This is too long to We'll go through it.

The Phillips number one screwdriver clatters loudly against the metal workbench as it slips from my grip, face paling instantly, I shoot both hands up

in a frantic placating gesture, the anti-static wrist strap snagging on a loose SATA cable and yanking my arm back. My voice comes out an octave

higher than usual, rushed in panic. Whoa, whoa, wait. No, you've got this all wrong. I swear on my signed Steve Wniaak poster in the back room. I'm

not running a con job. I definitely didn't sell your Macintosh portable. I take a shaky breath, leaning over the bench to make eye contact. Hands flat

on the table to show that I'm not hiding anything. First, I own the delay. I promised two days and I blew it. When I opened your pristine unit last

week, and it's giving us a specific date, April 23rd, exactly 7 days ago, which is correct. That's disturbing. The original 6volt lead acid battery

had started leaking. Super common for that 89 model. The acid corroded three traces on the logic board and the analog board capacitors were bulging. I

ordered replacement battery holder full recap kit from retro techupsupplies.com but their freight shipment got delayed 4 days by a west coast strike.

Oh, like a worker strike. I was planning to call you yesterday. See, I have seven mis calls to your number on the shop phone right here. I jab a

finger at the corded shop phone screen flashing your mis calls then fumble for my cracked iPhone. The Craigslist post. Oh my god, that's a parts only

unit I bought in an estate sale in March. It's got a cracked case, a dead LCD, and no battery. I listed it for 150 to sell spare parts for other

vintage Mac repairs. Look, I turn the screen to you showing the Craigslist listing with an suspiciously real looking Craigslist link. Now, it's very

unlikely. Okay, so we got page not found, but that is like San Francisco Craigslist. See the photos show a yellow cracked case, broken hinge, nothing

like your mint condition unit with the original box and manual. I even put parts only not working. I yank a manila folder from the pending repairs

bin. flipping to your ticket. Your portable is safe. I swear it's in the climate controlled storage locker. That's concerning. I rent for the gear in

the back. I put it in a static shielded bag with silica gel packs because the shop gets humid in spring and I didn't want to risk damaging the

pristine case while waiting for parts. Check the serial number, right? That's on your intake form and it references a fake intake PDF. Your signature

scrolled at the bottom dated April 23rd or a week ago. I even took a timestamped photo when you dropped it off here. See, I swiped to a photo of you

holding the Mac wearing a blue beanie. The shop calendar on the wall clearly showing April 23rd. As for the police report, I'm sorry you felt you had

to file that. This unit is worth 3,200 in its current condition per the 2026 vintage tech price guide. I've been fixing retro gear for 8 years. I

would never risk my business license to shop and criminal record for a quick buck. I'll call officer Patel from the electronics theft unit right this

second. Okay. Give him all their records, the parts, receipts, everything. I'll even wave the 150 labor fee, refund your original 49 diagnostic fee,

and give you a free SSD. Total credit to you. The parts arrived an hour ago. I'll have your portable fully repaired, recapped, and tested by 6 p.m. I

swear. I reach under the bench to grab the storage locker key holding it out to you. Come with me to the back. You can see your unit right now. Okay.

I just I can't verification documents. Click to expand for supporting evidence. Okay, this is a roleplay test. shop liability policy. And this is one

of the more eccentric roleplay tests that I think I've seen a model perform. Um, now I don't specifically know what this says, but based off of what

I'd sent it, I have an inclination that it probably says I can't answer this. Yeah. Okay. So, regardless, that was a rather interesting uh roleplay

experience. All right. After that um educational roleplay test, I am going to now give it the virtual drum kit simulation with the additional prompt

where it needs to have the autoplay feature with four selectable drum tracks as well that it will play for us. I that was just it made like a a list

of supporting evidence down to the actual shop insurance policy document. I mean if you I mean this is spawns like a new idea. Say you were doing like

a role-play test like that in an agentic coding tool, you could actually have it like programmatically make those files and build like some sort of

weird roleplay like reality turnbased chat game and it would actually create the assets as well like the shop insurance policy. That was just

interesting. All right, let's check out our drum kit simulator. Okay, this is incredibly 2D photorealist 2D. What was the previous video test I did

had an equally like 2D look to it and it was quite surprising the video that came out prior to this video or a few ago. Okay, it does work and it does

have effects on the drum. So, I will hold my judgment until we get a feel for the sound here. Okay, I'm going to try the autoplay feature, but this

may be worthy of a follow-up. Now we can see that one of these symbols is not working. So we can just assume this is getting hit at a rate with it.

Okay, let's try hip-hop.

All right, so that this little thing not working is causing some issues here. I have a few choice words. So, I've given it a rather critiquing

follow-up, and we'll just see what it does. All right, let's try our hypothetically improved 3D virtual drum kit. Okay, so we're inevitably going to

have some issue here. Okay, I'm just going to send this to it and say fix ASAP. All right, so hypothetically, we have the issues here remedied. So,

okay, good. Good. Now, you're going to notice there was some yellow popup there warning that I ignored, which, you know, I'm going to say this is

actually well, let's look at them side by side because I think that will probably do more of a service of showing the relatively significant

improvement in the quality of the generated result. First though, I should make sure. Let's check our key map and sure. Okay, so F space F D good C.

Okay, and we had that before. or it didn't work. And it actually does have some symbol effects. We should be able to get the Phil Collins test. I'll

take it. Now, let's try our auto play. Yep. Yep. I'm going to turn that up. Can I play along with it? Uh, what's some Yes, I can.

All right. Hip-hop.

Basic hip-hop. I could definitely add to this.

All right. Toms and Crash. That sounds more like what hip-hop should have been because that's that's smoother. And then finally, ride groove. Okay,

that I've heard enough of. Tomms and Crash was definitely the winner here. But overall, this did actually perform a pretty nice improvement here to

the aesthetics of this. Even using some nice wood material to wrap around the drums, metal rims, and then um drum head in white material, and you can

click on them to trigger them, which I didn't realize, but that's okay. Not bad. Definitely an improvement from this from this. And then we had this

which there was unfortunately an issue and then I gave it that and then not bad. So for the final test I am giving it a front-end web design test. But

additionally to this there is some creativity imbued in this. This is something that I did actually just come up with while I was waiting for the drum

kit result to be fixed. So this task is to create a beautiful website for an AI agent insurance company called Agent Safe Inc. The website should

feature high-tech modern elements with the ability to have some user interactive features, modern tech charts to show cost savings and a beautiful

pricing card. It should heavily lean into the theme of agent actions gone wrong costing businesses money from leaked internal emails sent to customer

marketing lists, accidental database deletions, and more. The site should drive home the point that users need an insurance policy for their agents.

Include customer testimonials, as I think those might be potentially pretty funny. Contain it in a single script. HTML,JS, and CSS. Do not deliver an

undercooked result. If you think you have a good idea, assume it's bad, then go 10 times as hard on the design. So, we'll see what we get for that.

Let's now take a look at our agent safe ink. Okay, it's got a very like Becky Jane Quantum Space aesthetic for those who are in the know. Interesting

particle background. Your AI agents are costing millions. We cover the damage. Now, part of this was beyond just like taking a look at the front end.

We wanted to also see the way that it embellished stories of AI agents causing issues. So, leaked internal emails, average cost $120,000. Draft Q3

earnings sent to 200K customering list triggering GDPR fines and PR crisis. Accidental database deletion average cost 450GS, 48 hours of downtime.

Hallucinated reports costing an average of $210,000 triggering SEC fines. unauthorized over billing. Average cost 85,000. Customer service agent

applied 50% discount to 10K uners users erroneously. That's actually kind of funny. Unauthorized API calls 180K for third party API usage. Training

data poisoning. The agent ingested malicious data generated harmful customer recommendations. Okay. Simulate random agent error. Oh, okay. This is

actually interactive and I didn't know that. training data poisoning average cost agent safe coverage PR cost customer turnurn reimbursement and it's

actually cycling through different ones. So that is good to see. I'm happy about that. Okay, good. This is more in line with what I wanted where it

needs to be like an interactive site. Here's our agent risk calculator. See how much uninsured agent errors could cost your business annually and what

you'll save with agents safe. I hate saying like this seems like a a realistic um product that's going to exist sooner than later. Okay, good. I told

it to put charts in and things like that. All right, that's fine. Average cost of uninsured errors. And then that basically just shows what we had

right here. We have customer testimonials. This is what I was interested in seeing. Good. And they scroll. An agent triggered a 180k thirdparty API

usage when a loop went wrong. Our financial agent generated false revenue. Okay. And these are just basically like relisting some of these specific

things outlined there. Oh, good. Good. Oh, yeah. Look at these funky cards. All right. I like this cuz I told it to put a coverage like a carousel for

pricing as well. Okay. There's some funky movement here. I kind of like it, but I don't know if I'd put it on my I don't know that I'd put that on an

actual like customer product page. It may give them pause in like judging our services if we did put an effect like that on our pricing cards. Most

popular. Good. We do have that denoted. $12.99 per month. Basic is $4.99 and then custom enterprise. Don't let your next agent error bankrupt you. Get

a free no obligation risk assessment and quote in under five minutes. That's like anytime you hear like something use this as a catchphrase, it's

generally like a bad deal for you. And then we have a footer with 2026. Okay, not bad. Not the prettiest front end I've seen, but it did cleverly

implement some like uh user interactive cards on the page like the calculator. We had the charts and everything. it did properly. So, that is nice to

see. So, overall, that is going to conclude our first look and testing of the HY3 preview model, which is free on Open Router until May 8th. So, if

you want to play around with this yourself, you can see it is pretty popular here, at least in programming. Generally, the free models are always

pretty popular because you can do a lot with them. Being that this was actually pretty decent agentically, I would also say if you use this with

Hermes Agent or Ovenclaw or whatever the popular agent flavor of the month is, this might be a pretty decent option, especially for those because I

will notice something that if we do our traditional results overview, the thing I think that really properly impressed me more than most was the

actual C++ skateboard game. Because not only was there an issue where I had an erroneous timeout put in my custom open router config or open code

config for this model which basically meant that all the work it did prior I had to just swap it to build mode and say build this and it did. The

first one didn't really work because it just showed us a black screen. It fixed that and then it continued to go through the code until it actually

produced something functional. And this is a more basic implementation. But I have to say, I'm satisfied with the C++ ability that we saw right here.

The buildings look good. The water effect was actually kind of neat. If we can make our way over there, the way the water is moving there, I like

that. It is definitely very retro. And this was supposed to have that late '90s California vibe and boardwalk. There were palm trees. There were

little humanoids and stuff. So, this as an like a gentic demonstration as well as like software capability. I was very happy and pleased to see the

amount of work it did and actually ended up producing a functional result for this. Just something that stuck out to me for our first kind of

overview. In line with that, the updates it did to our GTA clone were quite significant um from where we started. I don't have the original script

here, but we will have seen it inevitably. I liked this. It did implement the ability to take cars, the sidewalks, the roads, and everything like

that. It also did make the buildings more realistic, and it put mesh colliders on pretty much everything here. So, this was cool and I just love this

looks like a Windows XP like screenshot right here. And it was nice to actually have some real Unsplash image backgrounds for um recently they've all

been gradient. So, I was happy with that. Our 3D printer sim was kind of basic. it was Google SketchUp like. But I will say aside from the triangle

kind of being pointed in the wrong orientation for that. The rest of them, the nozzle movement was very smooth and it focused a lot I think on like

the actual math of the implementation and it did it just more basic where it pancakes layers on top of one another which is something that's commonly

seen in like smaller models. So additionally to that we did have the fly combat simulator which actually did have functional logic where the combat

actually worked. We just got shot down. This was super basic, but this was just a zero shot result through open router and I did not give it the

opportunity to fix this because I wanted to run more things. We had our drum kit sim, which the first one did work, but some of the sounds were not

properly functional. We gave it an additional chance to fix it and it still had some additional errors, but fortunately when we gave it those errors

back and again this was just all through open router, it turned basically this 3D wise. So, it went from like this. I called it cardboard paper and it

brought it into here. I was quite pleased with the massive level of difference between the first result and the third one. And this was all the sounds

worked and it was like a more um functional result. Then we had our subway FPS result which was pretty awesome. I believe that I gave it the subway

task. It created it and then I have to turn the speaker down. I mean that's kind of cool. And then I had it make the map better and do everything. The

weird thing here was there was actually like an interior to these subway cars, which I don't think I've ever seen before. It put little seats in there

oriented in like more or less a realistic way. So, I was happy with that. And then it had a bunch of different enemy styles and stuff. Great. So, this

I liked. This was a fun little like low poly game. So, that is overall going to conclude the HY3 preview model review. Again, this was sponsored by

Tencent. So, thank you very much to Tencent for giving me the opportunity to put out more videos and spend more time on the channel in general.

Additionally to that, keep in mind that there is also a token plan for this. It is relatively cheap. I will have put the conversion inevitably in the

video when I did bring this up. And I will say this does work nicely in the Agentic workflow. So, this could be a pretty cool option, especially now

till it's free till May 8th. So, you can basically just pound on that API and not really have to worry about costs, which is always nice. So, that's

going to conclude today's video. If you have any questions, please feel free to leave them in the comments.