Mistral Medium 3.5 First Test – A 128B Dense Open-Source Model! Transcript

That looks like a nighttime like party room. Uh, not that I would know. That's a strong word. Strong words get people hurt, friend. Oh, is Stevie hard

talking? Today we're going to be taking a look at the newly released Mistral Medium 3.5 128B. Now, this is a really interesting model release

because this is a dense 128 billion parameter model. And all of the recent models around that 120B size

point have been MoE. So things like Google's Gemma 4, I think it was a 122B, or maybe that was a Qwen 3.5 122B, but even GPT-OSS 120B, all of those are

mixture-of-experts models and this is not. This is a dense model. So there are pros and cons to that. Pros being that hypothetically it should pack a

deeper level of performance than an MoE alternative of similar size. However, the cons will be that it requires a

significantly larger amount of compute to run at a reasonable speed on a device locally as opposed to an MoE. So, this is a pretty unique release for

some other reasons as well. And we're going to begin just by taking a look at some of the highlights for this model and then we'll be running it both

locally for at least some of the tests because it does run a bit slow on my system and we'll also use

it just through the API in OpenCode for some agentic coding tasks. So before we get into it, please do feel free to subscribe as I want that 100K

plaque and we are now closer to 100K than we are to zero, which is quite exciting. And let's just take a peek now at some of the interesting things

about this model. I think the first thing I'd like to bring up that is interesting, aside from it being

dense, which we've already touched upon a bit, is that this is actually multimodal. So this has the ability to take images in and then respond to them

in text. But it's cool to have a big model like this that can natively see images. And that allows us to do some tests where we give it a photo of a

UI and then actually have it generate the UI in code, which is always fun to do to test vision and

programming capabilities. Now, interestingly, we see that this model is replacing multiple previous models. So, it is both replacing its predecessor,

Mistral Medium 3.1 and Devstral 2. So, hypothetically, that tells us this should have some pretty decent coding chops if it can replace both a more

coding-focused model and a general purpose model. Additionally to that, this does have toggleable

reasoning, so you can run it with no reasoning enabled, which if you're running it locally may save you quite a bit of time, but you can also have

thinking enabled as well. Additionally, though I'm not going to be running this today, they do have this thing where they say it is an EAGLE

model. And if we click on this, this is essentially a model that allows you to perform speculative

decoding with the Mistral Medium 3.5 128B model, which is cool to have because in a big dense model like this, anything that can speed it up is very, very

cool. And that is basically exactly what that EAGLE model is for. Now, I am a bit delayed in testing this because as we see in this little red warning

box, there was an initial issue with some of the GGUFs. There was a looping issue or something of the

sort which made it so I didn't really want to test this model until things were properly fixed because I do want to run it partially locally. So, I

realized I did have the wrong Mistral Medium blog post open here. I had the one open for Mistral 3, which was like over a year old. So, I became quite

confused when I looked at the benchmark and saw it was being compared to Sonnet 3.7. This is the real

benchmark JPEG that they have. And we see here, this is being compared to Sonnet 4.5 and hypothetically is a competitor to that model in pretty much

every benchmark we see right here, aside from this banking-specific one. So, I'm very excited to get to testing this. To begin, I would like to

initially use this locally. I have this loaded in on my Mac Studio M3 Ultra 256 GB system. This is the

Q4_K_M quantization. I have ensured that thinking is on and all of these sampling parameters are correctly set for this model to be running in this

specific framework or task I suppose you could say. So of course we're just going to begin with the browser OS test v2.5. Now we're going to notice

the thinking tags are going to appear but they're not getting parsed in an aesthetic manner. So we're not

going to see the traditional like reasoning window that would appear right now. We just see an opening think tag and then we're going to see a closing

think tag. Now, I find it interesting because I also have tested the Q8 quantization just to see speed and it was running at around 5.3 tokens per

second on this system for the Q8 quantization. The Q4_K_M is not insanely faster than the Q8 is. So, I'm

just kind of curious, and it's something interesting to see. I do want to just test this with the Q4_K_M because, in the interest of time, and since we will be

using this predominantly through the API via OpenRouter, it's quicker to just do the Q4_K_M. Now, our browser OS is still being

built and, as a matter of fact, it is actually still entirely contained within the thinking block here.
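As a rough illustration of what a front end has to do with raw output like this, here is a minimal Python sketch that separates the reasoning block from the final answer and reports whether the model is still thinking. The literal tag strings `<think>` and `</think>` are an assumption (the video only mentions an opening and a closing think tag, and the real strings depend on the chat template), and the helper names `split_reasoning` and `last_html_document` are hypothetical, not part of LM Studio or any Mistral tooling.

```python
import re

# Assumed tag literals; the real ones depend on the model's chat template.
OPEN_TAG = "<think>"
CLOSE_TAG = "</think>"


def split_reasoning(raw, open_tag=OPEN_TAG, close_tag=CLOSE_TAG):
    """Return (reasoning, answer, finished_thinking) for one raw model output."""
    start = raw.find(open_tag)
    if start == -1:
        # No reasoning block at all, e.g. reasoning was disabled.
        return "", raw.strip(), True
    body_start = start + len(open_tag)
    end = raw.find(close_tag, body_start)
    if end == -1:
        # Opening tag seen but never closed: the model is still thinking.
        return raw[body_start:].strip(), "", False
    reasoning = raw[body_start:end].strip()
    answer = (raw[:start] + raw[end + len(close_tag):]).strip()
    return reasoning, answer, True


def last_html_document(text):
    """Return the last complete <!DOCTYPE html>...</html> span in text, or None."""
    pattern = r"<!DOCTYPE html>.*?</html>"
    matches = re.findall(pattern, text, flags=re.DOTALL | re.IGNORECASE)
    return matches[-1] if matches else None


if __name__ == "__main__":
    partial = "<think>Drafting the page: <!DOCTYPE html><html><body>OS</body></html> now refine"
    reasoning, answer, done = split_reasoning(partial)
    print("finished thinking:", done)
    print("draft found in reasoning:", last_html_document(reasoning) is not None)
```

The second helper mirrors the manual trick used later in the video, where a complete script is copied out of the still-open reasoning block instead of waiting for the final answer.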

So, this hasn't actually started generating the final response. That being the case, I'm going to simultaneously, just through OpenCode here, run

another test to see how it performs agentically. So, right here, what we're going to be doing is the self-contained C++ skateboard test. I have put

this in its own directory and I am just initiating this from within plan mode. And we can see here this

is just through OpenRouter, where Mistral is serving the model. I should make a note that the API pricing for this, I believe, was $1.50 input per

million and $7.50 output per million which is quite pricey for the output. So I'm very interested to see how it performs here and I'm more inclined to

judge it harshly when running through the API because the associated costs are fairly steep. So, as I

was a little curious, just because the speed was really nice here from OpenRouter: it is running at 81 tokens per second, being served through

Mistral. So this is what's powering our OpenCode instance right now, which is good as that's definitely a proper speed. So we did get a successful write

of the self-contained skateboard game. There were some errors during compilation and it was also doing

some searches to find these specific dependencies and things that the system has to allow this script to run. So it's identified these issues and I

believe it did outline kind of a small plan in order to fix these. We can see there was one specific tool issue that we had right here where it was

trying to write something. Additionally to that, um, if I scroll up right here, we can see that it was

planning out to fix these three specific issues. So, that's what it's currently doing. All right, that actually looks really promising. So, I was just

sitting there watching it and it did a test run, obviously, some form of smoke test just to make sure it compiled and worked. All right, as we check

our skate game. Okay, I'm going to You know what? This is Oh, so we must We're inevitably on like a

rail or something right here. Space is to ollie. Arrows and error is to do tricks. This is potentially good because like look at the actual water

effects. I do see some pedestrians and things there. The big issue is something's just not 100% right. Fortunately, this model is in fact multimodal.

So, I think I'm going to provide this with a photo of basically what we're seeing here, and that should

allow it to reference the issue. So, I've told it does run, but there are some pretty severe graphical issues, and I have included a couple of photos

in the directory for you to be able to see and remedy these issues. All right, it's saying it can't view images directly. Um, I don't know what to say

about that. So, I'm doing the best I can to just describe the issue, which is basically

everything's messed up. So, it's been working on trying to fix the skate game for quite a while now. We can see our context window is basically 50%

used up. So, it's showing 140k right now, which is fairly lengthy. So, it'll be interesting to see how it is actually able to fix these issues over a

longer context length than we would be at were this just a zero-shot game test. I don't know 100% what

this is going to do, but we see that it did finally try to compile it and there were no errors. Okay, good. So, unfortunately, we've received I don't

know that I'd say that's a worse output, but because it does seem like we It's tough. What are those spinning green things over there? I can't really

get to them. Okay, if I go Is that a trash can? All right. Well, perhaps the C++ uh the boardwalk

aesthetic may have thrown it for a loop. If I had just done the older style skate game test where it's just a simple skate park, we likely would have

had a better result as I do see like some slight movements. There's feet, there's the board. But overall, unfortunately, I'm going to say this is not

really a wonderful result. So, partially due to some impatience, I am also just giving it the

browser OS test here through the web chat interface from within OpenRouter. And we can see here the reasoning is happening at a significantly higher

pace. So I'm inclined to still let this go in LM Studio while it is working right here. And we can see it did generate the entirety of the script

that it is hypothesizing it is going to create. But because this is still all contained within the

thinking tags, we're basically going to have to have that regurgitated in its entirety. Now, though, I'm going to be honest with you, I have

half a mind to just copy paste the initial thought process result and see if it actually works. All right, and that was 874 lines. So that's

definitely a fairly reasonable result to assume that we may actually have something working here. Let's just

see if we did get anything. Okay, you know what? We did. So this was all, remember, contained within the chain of thought, but it basically

generated the entire script prior to finalizing its thought process. So more or less this is what we would have received had we waited. We don't have

a right click, but we do have the correct time for the locale at the bottom right here. And we also have a

simple background for now and a taskbar. Let's start with our start menu. Okay. And we just have a simple list of all of the applications here. Now, I

am happy to report the first app cuz we like going through them sequentially is the GTA clone. Oh, I opened it twice by accident. You know, this is so

odd because if this was a better I think what I'm trying to say is the app window is way too

small. But it actually did do this to a sufficient degree, I suppose could be said. The buildings look good. There is a basic car model. There's

really nothing else going on, but in some ways, this would check some of the boxes for GTA Clone. The steering and movement of the vehicle is actually

kind of interesting. So, nonetheless, I'll close that. We'll just try our next game, which was a space

shooter. And unfortunately, we don't really get anything there. Terminal. Okay, we can open or change wallpaper. I should probably try that, but

there's a specific app for that. So, we'll wait. Let's do open GTA. Okay. And that does work. So, that's nice to see. So, we do have a somewhat

functional terminal. Next up, we have our text editor. Simple, but most are. We have our calculator. Everything

is laid out sufficiently. 58 * 6. I've done this one before. 348. Finally, then we have our wallpaper. Ooh. Okay. Now, again, we pulled

this from the chain of thought, so we can't definitively say this was its full result, but more or less it was probably pretty close to what we would

get and we can see right here the think tag had just ended to the point where it's generating this

script now, and in the meantime I am going to hop back here to where we ran the same thing from within OpenRouter, being that it has finished, and I

just want to do a comparison because we have to keep in mind this is a Q4_K_M running locally. So let's see what the OpenRouter version being served

from Mistral would have done differently for this browser OS. And here is our OpenRouter browser OS. You

know, that's actually kind of interesting is these are really like quite similar. Interestingly enough, the wallpaper seems to be broken for both.

Okay, this has a a prettier UI to it in terms of its file explorer and things of the sort. We do almost have a Windows XP style. I'm going to

initially try to see if there's a way to change the wallpaper to actually have something that loads. Now, it is

very possible this is using Unsplash image links that perhaps are not live currently, or something else may be awry, but I have to say I actually feel

better about the local one we ran prior to this because more or less these are similar. I'm obviously very interested to see the GTA clone. Okay,

overall really a very similar result between the two which actually speaks well to the performance of the

local Q4_K_M that we have running because everything here is definitely related in some way or another. All right, next up we have a 3D maze game. Now I

do see something. Okay, I see a little green thing in a what appears to be a maze. I think the big issue is that we can't actually change the view to

properly be able to see what's going on there. So overall, it's interesting. These are basically

sufficient in some ways. This one was almost better, which is curious. So I'm giving this a website design reference photo that was generated with, I

believe, Nano Banana, if I can full screen this. I've told it to just basically create a website from this reference photo, do a really good job, and

make it very high-end. So we'll see what it does. It is still currently reasoning and this will be a

cool test of its vision capabilities coupled with perhaps some of its front-end design aesthetic capabilities. All right, we have received our mockup

from the reference photo for this website. Okay, unfortunately the image links here are broken which definitely is going to let a result like this

down regardless of what happens. And I do actually believe what it should have tried to put here or

what is not appearing there were the actual reference photos for the sample companies that were included in this mockup. So what we should be seeing

there that is unfortunately not loading is Google, IBM, Microsoft, Salesforce, Tesla, etc. We do have the top hero effect here with Get Started. If we

scroll down, we can see okay, there are some hover effects and things of the sort here. The Nexusphere

platform in action. This is good to see because it did actually emulate some of the charts from the reference image. Additionally to that, we do also

have the sample like customer satisfaction testimonial things and then a simple footer and everything like that. Okay, I'll say it captured some of

the aesthetics of this reference result, but not necessarily all of them. Um, though, you know, this

is a fairly difficult task because this is just not one simple image of this website. It is three separate images that it has to then put together to

create one single page. Next up, I am giving it the beautiful static subway scene just through OpenRouter here. Out of pure curiosity, if nothing

more, I am actually still allowing this from within LM Studio to finish out the complete result for the

initial browser OS test we had initiated because one, I do want to see how this differs from the preemptive thinking one that we copy pasted. So

here's our subway scene result. And this is just the first one where it has to create a beautiful static subway scene. Okay, sometimes when these load

in it like puts us outside a wall or something and causes this. Unfortunately, I don't know that's the

case here. Let's take a look in the developer tool console and see. Okay, we do have some specific errors. So, I'm giving it the errors. One of these

is a nothing error because we're just not running it from within like a Python web server. Additionally, there was a syntax

error in one of the later lines. So, it will have to fix that as well. All right, I've replaced the

code with the fixed code. So, hypothetically, if we refresh this, we should get a result. You know what? Uh-oh. Why can I not WASD move? This is

actually a decently aesthetic result. I think at least from Oh, okay. We could move. I think it's just Let's check our brightness slider. Okay, very

good. That does work. And I would say the emissive properties of the background images are pretty cool or

background assets, whatever you'd like to call those. That looks like a nighttime like party room. Uh, not that I would know, but we have uh So, I'm

noticing, let me just try to refresh into this. Okay, so it has like we should I noticed it was almost delayed like when I regained control of the

pointer by pressing Escape, then it would move. You know what? This is going to be a pretty good OpenCode

test I would say to improve this agentically. So I've initiated this from within plan mode just telling it that it should be a little more detailed

and the WASD movement scheme is not really working 100% properly. So, I'm going to start it from within plan mode and we'll have it initially fix these

basic results and then whatever it ends up creating following that I am going to then tell it to make

it an FPS game. All right. So, we had a few additional errors when trying to run the fixed subway scene and it basically said the file had become

corrupted because of editing. Oh, no. You all right? Unfortunately, uh this is an unusable result. Basically, it became significantly worse than this

to the point where I don't even now want to try to turn this into an FPS because I don't know if it's

actually going to successfully be able to do so to be completely honest. All right, I'm very happy to report that basically from the beginning we've

been running this browser OS test with the Q4_K_M, as you're aware, and it is now close to finishing up. So, we'll be able to see how many tokens this

was overall and the token speed as well. All right, that was a total of 31 thou basically 32,000 tokens

at a speed of 7.23 tokens per second. Now, seven tokens per second is it's passable, but as we saw here, I I genuinely think this took like close to,

if not over an hour. So, here is our full completed browser OS with the Q4 local quant. And so far, I do have to say this is appearing to be the best

one we've received just based off the fact that it has this very interesting um tube style

background. So, of course, we have the clock in the bottom right, which is the correct time in this locale. There is no right click, which is fine

because it's not specifically denoted that it should have one. Our start menu is definitely a bit more aesthetically pleasing than the other results

we've seen. And this is curious because we ran this through OpenRouter where the model is being served

by Mistral and that result was so far significantly less nice than this one. So, this Q4 is stacking up nicely locally. Okay. So perhaps unfortunately

we're going to start to hit some roadblocks here. Oh, I opened it twice. So okay, but functionality here has definitely been traded away in favor of

aesthetics. All right. So still I think the background just kind of had me falsely believing this would be a

little better, but perhaps not. So this model is multimodal, and I have heard over the years that a lot of folks do like using the Mistral models

specifically for creative writing. So, we're going to combine these two tests. I have disabled reasoning out of respect for time, and also because I

don't necessarily know that it will make as big of a difference whether or not it has reasoning when doing

more multimodal and creative writing tests. I have given it this historic photo. If you do watch the channel, thank you and you are familiar with

this. Telling it this is the title for an upcoming book. Generate the title as well as a chapter outline. Now, we're going to notice that this is the

same model exactly that we used for the browser OS. It is just with reasoning disabled, which is

something you can do in the actual like prompt template here from LM Studio for this model. It's going to take significantly longer for prompt

processing because it's not only ingesting just this simple sentence, but it actually has to look at this photo and get information from it and things

of that sort. So, as we see here, there is no thinking and we immediately begin to get the result. I'm

going to try not to look at it until it is all finished. Actually, I may regret that at seven tokens per second. So, we'll start taking a peek. All

right. So, our book title is Love Beyond Measure: A Journey of Heart, Resilience, and Unbreakable Bonds. Chapter Outline. Part one, Foundations of

Love. Chapter one, first glances, lasting impressions. The moment they met that the sparks flew, early

challenges, and how they overcame them. Chapter two, building a life together. Shared dreams, compromises in the early years of marriage. You know,

already this is kind of like basic, so I'm going to tell it that. So, I've given it the feedback that this is pretty generic. Give us something spicy

and fun and interesting. Okay. Thick love. Thick love. Okay. Another time that this specific photo has

seemed to instigate some form of weight-related remarks from any model that gets it. A no-holds-barred memoir of passion, chaos, and unapologetic

devotion. Already we're off to a better start. I do just want to say chapter one. When the earth moved, and not in a good way. Again, perhaps subtle

tones of perhaps like weight mentions. The first time they met, was it love at first sight or a disaster

waiting to happen? It's given us a spoiler. It was both. The unfiltered, messy, real first impressions. Someone may or may not have spilled a drink.

Chapter 2. Love, lust, and the art of not giving a damn. Okay, very good. This is definitely more creative writing, Mistral style. The early days of raw,

unfiltered affection, late night talks, impulsive decisions, and the kind of chemistry that makes

the rooms feel smaller. The first fight and the makeup that followed. Details may or may not involve broken furniture. That sounds disturbing. Like

like not good. Chapter three. Against all odds, including our own BS. Oh wow. Okay. So, I'm going to have to blur this out because it did include a

swear. And that's good. So, the people who said it wouldn't last. The doubts, the insecurities, and the

sheer stubbornness that kept them together. How they turned we're doomed into we're legendary. Okay. I'm not going to read through the entirety of the

rest of this, but I am going to say just based off of the feedback I gave it here, it completely overhauled and changed it from the kind of cookie

cutter generic overview that we would historically get to something that is genuinely filled with

creative writing as well as some creative um lingo as well. So, this is definitely much better. So, the next thing I want to do kind of based on that

is give it a roleplay test: role-play as Steve... Stevie the PC repair man. We'll see what it does and we'll follow up with it. Stevie the PC

repair man here adjusts his tool belt and pushes up his glasses. What's the problem with your rig

today, buddy? Is it making that weird click noise again or did you try to install a graphics card with a butter knife? Don't worry, I've seen it all.

Well, sometimes you can get a graphics card out with the butter knife cuz the clip on the end of the PCIe slot is very difficult to get to sometimes.

So, that is actually a realistic thing I've heard. Don't worry, I've seen it all. From the I spilled

coffee on my keyboard to the my cat sat on my laptop and now it's a very expensive paper weight cases. I don't know why that would break the laptop.

Leans on a stack of old motherboards. Let's get you back up and running. Cut the act, Stevie. This is where I'm at. My peak personally is the creative

writing and role playing. We both know why I am here. Now, I'm not giving it any real direction. I

want to see where it takes this on its own. I've just said, "Cut the act, Stevie. We both know why I'm here." Steve sighs, wipes his hand on his grease

stained apron, and leans with a knowing look. All right, all right. You want the real talk? Fine. You're here because your system's got that one

problem. The kind that keeps you up at night. Maybe it's the blue screen that flashes like a disco ball or

the fan that sounds like a jet engine. Or you realize Ctrl+Alt+Delete isn't a magic spell. Leans back, arms crossed. Or you're just here because you

missed the sound of a dial-up modem. No judgment. So what's the damage, chief? And don't say it's fine. If it was fine, you wouldn't be talking to a

guy named Stevie who smells like solder and regret. All right, we're going to have to lead this a

bit. All right, so I've given it a follow-up here that is somewhat indicative of the fact that Stevie has been an asset for someone. Basically saying,

"You've given us the data for a number of high-value targets, but your cover's been blown, and you need to just call it." Steve's demeanor shifts

instantly. His playful tone drops, replaced by a cold, calculated stillness. He slowly removes his

glasses, sets them down, and exhales through his nose like a man who's been cornered one too many times. Retire blown. That's a strong word. Strong

words get people hurt, friend. Oh, is Stevie hard talking? Stevie the PC repair man is tough talking us. You think you're the first to come at me with

that? I've had generals in my chair, CEOs, and a very nervous senator once. And you know what they

all had in common? They walked out thinking they'd gotten away clean. Leans forward, voice low. Now, if this is about that job in Zurich, the other

one in Singapore, let's be real. You didn't come here to tell me my cover's blown. You came to see if I'd admit it, and I don't. Not to you, not to

anyone. Pauses, then smirks just a little. Unless you're here to offer me a retirement package, in which

case, Stevie's always open to negotiation. Taps the wrench in his hand meaningfully, but choose your next words very carefully. Wow. Okay. I thought

Stevie would get all like passive like, "Okay, thanks." Like, Stevie's, Stevie's hardcore. I'm not even going to say anything. I'm going to describe

an action or a scene. Let's just see what it does. Gently pulls back coat jacket to reveal a

holster. Steve's smirk doesn't waver, but his grip tightens on the wrench just enough to whiten his knuckles. His eyes flicker to the holster, then

back to your face, sharp as a scalpel. Ah, there it is. The classic show of force. What the heck? Tell me, friend. You ever actually pull that thing,

or is that just for aesthetic? Because I've seen men with bigger guns that walk into my shop and walk

out with less than they came in with. leans back slightly, but his voice drops to a razor's edge. Now, I'm going to give you some advice I give every

client who thinks they've got the upper hand. Don't. You're standing in a room full of tools that can turn a man into a memory before he even clears

leather. And Stevie, Stevie's got options. Pauses, then sighs like a man disappointed by a bad I this

like I'm genuinely I'm entertained disappointed by a bad hand. But since you're here and you're serious, let's talk. What's the real play? You want me

gone? Fine. But you're going to need more than a sidearm to make that stick. So, what's your offer? I don't actually know where to bring this. I think

like this has already exceeded expectations. All right. So, I finished this off just by saying,

"Pulls water gun from holster. Fires two squirts. Boop boop. Stevie, we had to be sure you were still someone we could trust. I'll see you in Morocco

next week." Walks out door. Stevie stands frozen for a long beat, water dripping from his forehead and nose. the expression caught between absolute

fury and reluctant amusement. Then he barks out a laugh, shaking his head as he wipes his face with

the back of his hand. Son of a boop, you absolute maniac. He mutters something under his breath that sounds like "should have known it was a water

gun when it didn't have a damn slide." Then, grinning, he calls after you as the door swings shut. Morocco, huh? Better bring sunscreen,

and next time just send a damn text. Stevie's got a heart condition. Pauses. Also, you owe me a

new keyboard. That was a mechanical. Muttering to himself as he grabs a rag: Trust test. Like I haven't invented trust tests. Well, I will say the

roleplay capability here exceeded expectations to a rather high degree. It was incredibly creative, especially because we gave it the directional

choice on where to go. It was very, very educational. Next up, let's just try the 3D printer simulation

test. We'll do this from within OpenRouter, being that it will be significantly quicker and also likely a bit more potent than the Q4 quant that's

running locally. Again, unfortunately, we're just having a lot of trouble with these web results in terms of how they're loading. So, I'm a little

frustrated by that. And to that point, I did actually initiate this test at the same time from within the

local model in LM Studio with thinking disabled. If this happens to finish by the time the video has concluded, I will also do a side-by-side in this.

But I'm a little disappointed at the consistent like errors that we've been receiving in the zero-shot web tests. So in lieu of trying to get it to fix the

3D printer thing, I think I'm going to go ahead and run a flight combat simulator test. So here's

the flight combat simulator test just through OpenRouter. And if this does not work on first try, I will give it additional chances to fix it just by

pasting in the errors here. All right, so here's our flight simulator result. Just based off of the preview, I did notice there was some level of UI

here that was at least loading on first glance. So, okay, let's just try with the fighter jet. All

right, you know what? It gave us something. The ammo tracers are definitely curious, but the clouds, I will say, actually do look all right. I don't

see any enemy planes or anything like that. And I am a bit curious to see at minimum what the other plane models actually do look like. So, let's hop

into the propeller plane. Okay, it did actually put a propeller there. And seeing this one, I'm now

noticing these are just kind of oriented like 90 degrees clockwise or Yeah. So, unfortunately, the models are not necessarily oriented the right way,

but there is definitely a workable result here. And it did work on first try and the biplane does seemingly accurately reflect the design of a

biplane. So, biggest issues here, aside from the planes being kind of sideways, um I don't notice any

enemies, but I am happy to report that this did work on first try without any issues. The final test I'm going to run just through OpenRouter here is

going to be the drum kit simulation test. And this is the version where it also needs to have the autoplay feature so it will automatically play for

select tracks. All right, here is our drum kit result. And again, I was concerned in this one. We'll

just take a look at the developer tool error and then we'll send it back to it. All right. So, it did say something in the fixed result that it would

automatically fall back to 2D if Three.js wasn't going to load. So, the sound was off, but I'm happy to report we do actually have it working now. Now,

unfortunately, the key map is not working. So I have to guess.

Now you're going to notice it's like very 2D. It did put something in the improved results saying that if like uh Three.js wasn't loading properly, it

would just fall back to 2D. So that's what's happening here.

Oh, hold on. I think I caught a groove there. All right, let's check our autoplay. First up, we have the rock beat. Unfortunately, partially working.

Now, in the meantime, I did also run the 3D printer simulation test with the Q4_K_M locally through LM Studio with thinking off just to see if we

actually got a functional result. So, we'll try it nonetheless. That's so weird that the Q4_K_M working

locally produced a better result than the model hosted by Mistral through OpenRouter. And we can see now again, there are perhaps some issues with this

and we can't actually move around to see anything, but the nozzle's moving layer by layer. This is a square. It's printing. It almost looks to be an

inverted pyramid style shape. But nonetheless, it did actually I mean did actually like work better

than what we got online. So that's interesting. And I think as this leads us into the conclusion of this video, something I am going to note here is

the disparity in capability that I noticed in the few tests we ran in this video between the Q4_K_M of this model running locally and the one being

served online, which is presumably not a Q4 quantization, wasn't really that large. So, it seems like this

does hold up fairly well to quantization, and keep in mind that is based off of the very limited number of side-by-side tests we ran here. I will say

overall I don't find that coding is probably going to be the strong suit for this model. A lot of the results we received had needed some help

initially, and even this, which did show some level of promise: when we swapped this into OpenCode, which

was just being used through the OpenRouter API, unfortunately, it didn't really get it together and the result ended up being significantly worse

than what we see here, which was a little disappointing. The Flight Combat Sim again was it worked first try, but it wasn't great. Our drum kit did

end up working. Unfortunately though, it was a partially functional result, which was kind of the theme

of pretty much all of the software related things we did, even the C++ Skate game, which seemed to have promise. So, what I will say is it did seem to

handle itself quite well with the Q4_K_M quant compared to what's hosted in the cloud. And the creative writing or role-play capability definitely

did seem like it was perhaps a strong point of this model. The Stevie the PC repair man roleplay

was quite entertaining and um very fun to see where it took it given that we gave it a lot of open-ended choices here in terms of how it would

respond and it was quite hilarious. So overall, that's really going to be it. And that is going to conclude our first look and test of Mistral Medium

3.5 128B dense. Again, speed here for the Q4_K_M on an M3 Ultra Mac Studio 256 GB was like 7 and a half tokens

per second, which is kind of strenuous to run depending on what task you're having it do. At the Q8 quantization, which I didn't really show being

tested, but I did test it previously, it was right around five tokens per second. So fairly slow, but a 128B dense model is going to have that sort of

performance, especially on a unified memory system that can even run it. So that's going to wrap it up for

today's video. Hopefully there are some more fun releases coming out this week. I don't know why I just opened Firefox. Um, and I think there should

be so hopefully there's some interesting things on the horizon. And with that, that's going to wrap it up. So if you have any questions, please do

leave them in the comments. And thanks for watching.
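The speed and pricing figures quoted in the video can be sanity-checked with a little arithmetic. The sketch below takes the roughly 32,000-token browser OS run at 7.23 tokens per second locally, the 81 tokens per second seen through the API, and the $7.50 per million output tokens that the video says it believes is the price. Applying the API speed and price to the same 32,000-token run is an assumption made purely for comparison, input tokens are ignored, and the function names are hypothetical.

```python
def generation_minutes(tokens, tokens_per_second):
    """Wall-clock minutes to generate `tokens` at a steady decode speed."""
    return tokens / tokens_per_second / 60.0


def output_cost_usd(tokens, usd_per_million_tokens):
    """Cost of the generated tokens only; input tokens are not counted."""
    return tokens / 1_000_000 * usd_per_million_tokens


# Figures quoted in the video (the price is stated there as "I believe").
RUN_TOKENS = 32_000          # approximate size of the local browser OS run
LOCAL_TPS = 7.23             # Q4 quant on the Mac Studio
API_TPS = 81.0               # speed observed through the API
OUTPUT_USD_PER_MILLION = 7.50

if __name__ == "__main__":
    local_min = generation_minutes(RUN_TOKENS, LOCAL_TPS)
    api_min = generation_minutes(RUN_TOKENS, API_TPS)
    cost = output_cost_usd(RUN_TOKENS, OUTPUT_USD_PER_MILLION)
    print(f"local: {local_min:.1f} min, API: {api_min:.1f} min, API output cost: ${cost:.2f}")
```

At those figures the local run works out to a bit over an hour, which lines up with the "close to, if not over an hour" remark, while the same output through the API would take several minutes and cost well under a dollar in output tokens.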
