That looks like a nighttime like party room. Uh, not that I would know. That's a strong word. Strong words get people hurt, friend. Oh, is Stevie Hard
talking? Today we're going to be taking a look at the newly released Mistral Medium 3.5 128B. Now, this is a really interesting model release
because this is a dense 128 billion parameter model. And all of the recent models around that 120B size
point have been MoE. So things like Google's Gemma 4, I think it was a 122B, or maybe that was Qwen 3.5 122B, but even GPT-OSS 120B, all of those are
mixture-of-experts models, and this is not. This is a dense model. So there are pros and cons to that. Pros being that hypothetically it should pack a
deeper level of performance than an alternative. However, the cons will be that it requires a
significantly larger amount of compute to run at a reasonable speed on a device locally as opposed to an MoE. So, this is a pretty unique release for
some other reasons as well. And we're going to begin just by taking a look at some of the highlights for this model and then we'll be running it both
locally for at least some of the tests because it does run a bit slow on my system and we'll also use
it just through the API in OpenCode for some agentic coding tasks. So before we get into it, please do feel free to subscribe as I want that 100K
plaque and we are now closer to 100K than we are to zero, which is quite exciting. And let's just take a peek now at some of the interesting things
about this model. I think the first thing I'd like to bring up that is interesting, aside from it being
dense, which we've already touched upon a bit, is that this is actually multimodal. So this has the ability to take images in and then respond to them
in text. But it's cool to have a big model like this that can natively see images. And that allows us to do some tests where we give it a photo of a
UI and then actually have it generate the UI in code, which is always fun to do to test vision and
programming capabilities. Now, interestingly, we see that this model is replacing multiple previous models. So, it is both replacing its predecessor,
Mistral Medium 3.1, and Devstral 2. So, hypothetically, that tells us this should have some pretty decent coding chops if it can replace both a more
coding focused model and a general purpose model. Additionally to that, this does have toggleable
reasoning, so you can run it with no reasoning enabled, which if you're running it locally may save you quite a bit of time, but you can also have
thinking enabled as well. On top of that, though I'm not going to be running it today, they do have what they describe as an EAGLE
model. And if we click on this, this is essentially a model that allows you to perform speculative
decoding with the Mistral Medium 3.5 128B model, which is cool to have because in a big dense model like this, anything that can speed it up is very, very
cool. And that is basically exactly what that EAGLE model is for. Now, I am a bit delayed in testing this because as we see in this little red warning
box, there was an initial issue with some of the GGUFs, a looping issue or something of the
sort which made it so I didn't really want to test this model until things were properly fixed because I do want to run it partially locally. So, I
realized I did have the wrong Mistral Medium blog post open here. I had the one open for Mistral Medium 3, which was like over a year old. So, I became quite
confused when I looked at the benchmark and saw it was being compared to Sonnet 3.7. This is the real
benchmark JPEG that they have. And we see here, this is being compared to Sonnet 4.5 and hypothetically is a competitor to that model in pretty much
every benchmark we see right here, aside from this banking-specific one. So, I'm very excited to get to testing this. To begin, I would like to
initially use this locally. I have this loaded in on my Mac Studio M3 Ultra 256 GB system. This is the
Q4_K_M quantization. I have ensured that thinking is on and all of these sampling parameters are correctly set for this model to be running in this
specific framework or task I suppose you could say. So of course we're just going to begin with the browser OS test v2.5. Now we're going to notice
the thinking tags are going to appear but they're not getting parsed in an aesthetic manner. So we're not
going to see the traditional like reasoning window that would appear right now. We just see an opening think tag and then we're going to see a closing
think tag. Now, I find it interesting because I also have tested the Q8 quantization just to see speed and it was running at around 5.3 tokens per
second on this system for the Q8 quantization. The Q4_K_M is not insanely faster than the Q8 is. So, I'm
just kind of curious, and it's something interesting to see. I do want to just test this with the Q4_K_M because, in the interest of time, and since we will be
using this predominantly through the API as connected through OpenRouter, it's quicker to just do the Q4_K_M. Now, our browser OS is still being
built and as a matter of fact, it is actually still contained in all of the thinking capability here.
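As a quick aside on those unparsed think tags: a front end that wants the usual collapsible reasoning window has to split the raw output itself. Here is a minimal sketch of that idea in Python, assuming the model wraps its reasoning in literal `<think>` and `</think>` markers. Those exact tag strings and the function name `split_reasoning` are my assumptions for illustration, not something confirmed for this model or for LM Studio.

```python
def split_reasoning(raw, open_tag="<think>", close_tag="</think>"):
    """Split raw model output into (reasoning, answer).

    The default tag strings are an assumption for illustration; pass
    whatever markers your chat template actually emits.
    """
    start = raw.find(open_tag)
    if start == -1:
        # No reasoning block at all: everything is the answer.
        return "", raw.strip()

    body_start = start + len(open_tag)
    end = raw.find(close_tag, body_start)
    if end == -1:
        # Opening tag seen but no closing tag yet: the model is still
        # thinking, so there is no final answer after the tag.
        return raw[body_start:].strip(), raw[:start].strip()

    reasoning = raw[body_start:end].strip()
    answer = (raw[:start] + raw[end + len(close_tag):]).strip()
    return reasoning, answer


if __name__ == "__main__":
    reasoning, answer = split_reasoning("<think>plan the OS</think>\n<html></html>")
    print("reasoning:", reasoning)
    print("answer:", answer)
```

While the closing tag hasn't arrived yet, everything after the opening tag is treated as reasoning, which is exactly the situation here with the browser OS still sitting inside the thinking block.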
So, this hasn't actually started generating the final response. That being the case, I'm going to simultaneously, just through OpenCode here, run
another test to see how it performs agentically. So, right here, what we're going to be doing is the self-contained C++ skateboard test. I have put
this in its own directory and I am just initiating this from within plan mode. And we can see here this
is just through OpenRouter, where they are serving Mistral. I should make a note that the API pricing for this, I believe, was $1.50 input per
million and $7.50 output per million which is quite pricey for the output. So I'm very interested to see how it performs here and I'm more inclined to
judge it harshly when running through the API because the associated costs are fairly steep. I
was a little curious, just because the speed was really nice here from OpenRouter: it is running at 81 tokens per second, just being served through
Mistral. So this is what's powering our OpenCode instance right now, which is good as that's definitely a proper speed. So we did get a successful write
of the self-contained skateboard game. There were some errors during compilation and it was also doing
some searches to find these specific dependencies and things that the system has to allow this script to run. So it's identified these issues and I
believe it did outline kind of a small plan in order to fix these. We can see there was one specific tool issue that we had right here where it was
trying to write something. Additionally to that, um, if I scroll up right here, we can see that it was
planning out to fix these three specific issues. So, that's what it's currently doing. All right, that actually looks really promising. So, I was just
sitting there watching it and it did a test run, obviously, some form of smoke test just to make sure it compiled and worked. All right, as we check
our skate game. Okay, I'm going to... You know what? This is... Oh, so we must... We're inevitably on like a
rail or something right here. Space is to ollie. Arrows and error is to do tricks. This is potentially good because like look at the actual water
effects. I do see some pedestrians and things there. The big issue is something's just not 100% right. Fortunately, this model is in fact multimodal.
So, I think I'm going to provide this with a photo of basically what we're seeing here, and that should
allow it to reference the issue. So, I've told it that it does run, but there are some pretty severe graphical issues, and I have included a couple of photos
in the directory for you to be able to see and remedy these issues. All right, it's saying it can't view images directly. Um, I don't know what to say
about that. So, I'm doing the best I can to just describe the issue, which is basically
everything's messed up. So, it's been working on trying to fix the skate game for quite a while now. We can see our context window is basically 50%
used up. So, it's showing 140k right now, which is fairly lengthy. So, it'll be interesting to see how it is actually able to fix these issues over a
longer context length than we would be at were this just a zero-shot game test. I don't know 100% what
this is going to do, but we see that it did finally try to compile it and there were no errors. Okay, good. So, unfortunately, we've received... I don't
know that I'd say that's a worse output, but because it does seem like we... It's tough. What are those spinning green things over there? I can't really
get to them. Okay, if I go... Is that a trash can? All right. Well, perhaps the C++, uh, the boardwalk
aesthetic may have thrown it for a loop. If I had just done the older style skate game test where it's just a simple skate park, we likely would have
had a better result as I do see like some slight movements. There's feet, there's the board. But overall, unfortunately, I'm going to say this is not
really a wonderful result. So, partially due to some impatience, I am also just giving it the
browser OS test here through the web chat interface from within OpenRouter. And we can see here the reasoning is happening at a significantly higher
pace. So I'm inclined to still let this go in LM Studio while it is working right here. And we can see it did generate the entirety of the script
that it is hypothesizing it is going to create. But because this is still all contained within the
thinking tags, we're basically going to have to have that regurgitated in its entirety. Now, though, I'm going to be honest with you, I have
half a mind to just copy paste the initial thought process result and see if it actually works. All right, and that was 874 lines. So that's
definitely a fairly reasonable result to assume that we may actually have something working here. Let's just
see if we did get anything. Okay, you know what? We did. So this was all, remember, contained within the chain of thought, but it basically
generated the entire script prior to finalizing its thought process. So more or less this is what we would have received had we waited. We don't have
a right click, but we do have a correct time in the locale at the bottom right here. And we also have a
simple background for now and a taskbar. Let's start with our start menu. Okay. And we just have a simple list of all of the applications here. Now, I
am happy to report the first app, cuz we like going through them sequentially, is the GTA clone. Oh, I opened it twice by accident. You know, this is so
odd because if this was a better... I think what I'm trying to say is the app window is way too
small. But it actually did do this to a sufficient degree, I suppose it could be said. The buildings look good. There is a basic car model. There's
really nothing else going on, but in some ways, this would check some of the boxes for GTA Clone. The steering and movement of the vehicle is actually
kind of interesting. So, nonetheless, I'll close that. We'll just try our next game, which was a space
shooter. And unfortunately, we don't really get anything there. Terminal. Okay, we can open or change wallpaper. I should probably try that, but
there's a specific app for that. So, we'll wait. Let's do open GTA. Okay. And that does work. So, that's nice to see. So, we do have a somewhat
functional terminal. Next up, we have our text editor. Simple, but most are. We have our calculator. Everything
is laid out sufficiently. 58 times 6. I've done this one before. 348. Finally, then we have our wallpaper. Ooh. Okay. Now, again, we pulled
this from the chain of thought, so we can't definitively say this was its full result, but more or less it was probably pretty close to what we would
get. And we can see right here the think tag has just ended, to the point where it's generating this
script now. In the meantime, I am going to hop back here to where we ran the same thing from within OpenRouter, being that it has finished, and I
just want to do a comparison, because we have to keep in mind this is a Q4_K_M running locally. So let's see what the OpenRouter version being served
from Mistral would have done differently for this browser OS. And here is our OpenRouter browser OS. You
know, what's actually kind of interesting is these are really quite similar. Interestingly enough, the wallpaper seems to be broken for both.
Okay, this has a prettier UI to it in terms of its file explorer and things of the sort. We do almost have a Windows XP style. I'm going to
initially try to see if there's a way to change the wallpaper to actually have something that loads. Now, it is
very possible this is using Unsplash image links that perhaps are not live currently, or something else may be awry, but I have to say I actually feel
better about the local one we ran prior to this because more or less these are similar. I'm obviously very interested to see the GTA clone. Okay,
overall really a very similar result between the two which actually speaks well to the performance of the
local Q4_K_M that we have running because everything here is definitely related in some way or another. All right, next up we have a 3D maze game. Now I
do see something. Okay, I see a little green thing in a what appears to be a maze. I think the big issue is that we can't actually change the view to
properly be able to see what's going on there. So overall, it's interesting. These are basically
sufficient in some ways. This one was almost better, which is curious. So I'm giving this a website design reference photo that was generated with, I
believe, Nano Banana, if I can full screen this. I've told it to just basically create a website from this reference photo, do a really good job, and
make it very high-end. So we'll see what it does. It is still currently reasoning and this will be a
cool test of its vision capabilities coupled with perhaps some of its front-end design aesthetic capabilities. All right, we have received our mockup
from the reference photo for this website. Okay, unfortunately the image links here are broken which definitely is going to let a result like this
down regardless of what happens. And I do actually believe what it should have tried to put here or
what is not appearing there were the actual reference photos for the sample companies that were included in this mockup. So what we should be seeing
there that is unfortunately not loading is Google, IBM, Microsoft, Salvors, Tesla, etc. We do have the top hero effect here with Get Started. If we
scroll down, we can see okay, there are some hover effects and things of the sort here. The Nexusphere
platform in action. This is good to see because it did actually emulate some of the charts from the reference image. Additionally to that, we do also
have the sample like customer satisfaction testimonial things and then a simple footer and everything like that. Okay, I'll say it captured some of
the aesthetics of this reference result, but not necessarily all of them. Um, though, you know, this
is a fairly difficult task because this is not just one simple image of this website. It is three separate images that it has to then put together to
create one single page. Next up, I am giving it the beautiful static subway scene just through OpenRouter here. Out of pure curiosity, if nothing
more, I am actually still allowing this from within LM Studio to finish out the complete result for the
initial browser OS test we had initiated because I do want to see how this differs from the preemptive thinking one that we copy-pasted. So
here's our subway scene result. And this is just the first one where it has to create a beautiful static subway scene. Okay, sometimes when these load
in it like puts us outside a wall or something and causes this. Unfortunately, I don't know that's the
case here. Let's take a look in the developer tool console and see. Okay, we do have some specific errors. So, I'm giving it the errors. One of these
is a nothing error because we're just not running it from within, like, a Python web server. Additionally, there was a syntax
error in one of the later lines. So, it will have to fix that as well. All right, I've replaced the
code with the fixed code. So, hypothetically, if we refresh this, we should get a result. You know what? Uh-oh. Why can I not WASD move? This is
actually a decently aesthetic result. I think at least from... Oh, okay. We could move. I think it's just... Let's check our brightness slider. Okay, very
good. That does work. And I would say the emissive properties of the background images are pretty cool or
background assets, whatever you'd like to call those. That looks like a nighttime like party room. Uh, not that I would know, but we have, uh... So, I'm
noticing, let me just try to refresh into this. Okay, so it has, like... I noticed it was almost delayed, like when I regained control of the
pointer by pressing Escape, then it would move. You know what? This is going to be a pretty good OpenCode
test, I would say, to improve this agentically. So I've initiated this from within plan mode just telling it that it should be a little more detailed
and the WASD movement scheme is not really working 100% properly. So, I'm going to start it from within plan mode and we'll have it initially fix these
basic results and then whatever it ends up creating following that I am going to then tell it to make
it an FPS game. All right. So, we had a few additional errors when trying to run the fixed subway scene and it basically said the file had become
corrupted because of editing. Oh, no. You all right? Unfortunately, uh, this is an unusable result. Basically, it became significantly worse than this
to the point where I don't even now want to try to turn this into an FPS because I don't know if it's
actually going to successfully be able to do so to be completely honest. All right, I'm very happy to report that basically from the beginning we've
been running this browser OS test with the Q4_K_M, as you're aware. It is now close to finishing up. So, we'll be able to see how many tokens this
was overall and the token speed as well. All right, that was a total of 31 thou... basically 32,000 tokens
at a speed of 7.23 tokens per second. Now, seven tokens per second is passable, but as we saw here, I genuinely think this took like close to,
if not over an hour. So, here is our full completed browser OS with the Q4 local quant. And so far, I do have to say this is appearing to be the best
one we've received just based off the fact that it has this very interesting um tube style
background. So, of course, we have the clock in the bottom right, which is the correct time in this locale. There is no right click, which is fine
because it's not specifically denoted that it should have one. Our start menu is definitely a bit more aesthetically pleasing than the other results
we've seen. And this is curious because we ran this through OpenRouter where the model is being served
by Mistral and that result was so far significantly less nice than this one. So, this Q4 is stacking up nicely locally. Okay. So perhaps unfortunately
we're going to start to hit some roadblocks here. Oh, I opened it twice. So okay, but functionality here has definitely been traded away for
aesthetics. All right. So still I think the background just kind of had me falsely believing this would be a
little better, but perhaps not. So this model is multimodal, and I have heard over the years that a lot of folks do like using the Mistral models
specifically for creative writing. So, we're going to combine these two tests. I have disabled reasoning in the interest of time, and also because I
don't necessarily know that it will make as big of a difference whether or not it has reasoning when doing
more multimodal and creative writing tests. I have given it this historic photo. If you do watch the channel, thank you and you are familiar with
this. I'm telling it this is the title for an upcoming book: generate the title as well as a chapter outline. Now, we're going to notice that this is the
same model exactly that we used for the browser OS. It is just with reasoning disabled, which is
something you can do in the actual like prompt template here from LM Studio for this model. It's going to take significantly longer for prompt
processing because it's not only ingesting just this simple sentence, but it actually has to look at this photo and get information from it and things
of that sort. So, as we see here, there is no thinking and we immediately begin to get the result. I'm
going to try not to look at it until it is all finished. Actually, I may regret that at seven tokens per second. So, we'll start taking a peek. All
right. So, our book title is Love Beyond Measure: A Journey of Heart, Resilience, and Unbreakable Bonds. Chapter Outline. Part one, Foundations of
Love. Chapter one, First Glances, Lasting Impressions. The moment they met, the sparks that flew, early
challenges, and how they overcame them. Chapter two, building a life together. Shared dreams, compromises in the early years of marriage. You know,
already this is kind of like basic, so I'm going to tell it that. So, I've given it the feedback that this is pretty generic. Give us something spicy
and fun and interesting. Okay. Thick love. Thick love. Okay. Another time that this specific photo has
seemed to instigate some form of weight-related remarks from any model that gets it. A no-holds-barred memoir of passion, chaos, and unapologetic
devotion. Already we're off to a better start. I do just want to say chapter one. When the earth moved, and not in a good way. Again, perhaps subtle
tones of perhaps like weight mentions. The first time they met, was it love at first sight or a disaster
waiting to happen? It's given us a spoiler. It was both. The unfiltered, messy, real first impressions. Someone may or may not have spilled a drink.
Chapter 2. Love, lust, and the art of not giving a damn. Okay, very good. This is definitely more creative writing, Mistral style. The early days of raw,
unfiltered affection, late night talks, impulsive decisions, and the kind of chemistry that makes
the rooms feel smaller. The first fight and the makeup that followed. Details may or may not involve broken furniture. That sounds disturbing. Like,
not good. Chapter three. Against all odds, including our own BS. Oh wow. Okay. So, I'm going to have to blur this out because it did include a
swear. And that's good. So, the people who said it wouldn't last. The doubts, the insecurities, and the
sheer stubbornness that kept them together. How they turned "we're doomed" into "we're legendary." Okay. I'm not going to read through the entirety of the
rest of this, but I am going to say just based off of the feedback I gave it here, it completely overhauled and changed it from the kind of cookie
cutter generic overview that we would historically get to something that is genuinely filled with
creative writing as well as some creative um lingo as well. So, this is definitely much better. So, the next thing I want to do kind of based on that
is give it a role-play test. Role-play as Stevie the PC repair man, and we'll see what it does and we'll follow up with it. Stevie the PC
repair man here adjusts his tool belt and pushes up his glasses. What's the problem with your rig
today, buddy? Is it making that weird click noise again or did you try to install a graphics card with a butter knife? Don't worry, I've seen it all.
Well, sometimes you can get a graphics card out with the butter knife cuz the clip on the end of the PCIe slot is very difficult to get to sometimes.
So, that is actually a realistic thing I've heard. Don't worry, I've seen it all. From the "I spilled
coffee on my keyboard" to the "my cat sat on my laptop and now it's a very expensive paperweight" cases. I don't know why that would break the laptop.
Leans on a stack of old motherboards. Let's get you back up and running. Cut the act, Stevie. This is where I'm at. My peak personally is the creative
writing and role playing. We both know why I am here. Now, I'm not giving it any real direction. I
want to see where it takes this on its own. I've just said cut the act, Stevie. We both know why I'm here. Stevie sighs, wipes his hand on his grease
stained apron, and leans with a knowing look. All right, all right. You want the real talk? Fine. You're here because your system's got that one
problem. The kind that keeps you up at night. Maybe it's the blue screen that flashes like a disco ball or
the fan that sounds like a jet engine. Or you realized Ctrl+Alt+Delete isn't a magic spell. Leans back, arms crossed. Or you're just here because you
missed the sound of a dial-up modem. No judgment. So what's the damage, chief? And don't say it's fine. If it was fine, you wouldn't be talking to a
guy named Stevie who smells like solder and regret. All right, we're going to have to lead this a
bit. All right, so I've given it a follow-up here that is somewhat indicative of the fact that Stevie has been an asset for someone. Basically saying,
"You've given us the data for a number of high-value targets, but your cover's been blown, and you need to just call it." Stevie's demeanor shifts
instantly. His playful tone drops, replaced by a cold, calculated stillness. He slowly removes his
glasses, sets them down, and exhales through his nose like a man who's been cornered one too many times. Retire? Blown? That's a strong word. Strong
words get people hurt, friend. Oh, is Stevie hard talking? Stevie the PC repair man is tough talking us. You think you're the first to come at me with
that? I've had generals in my chair, CEOs, and a very nervous senator once. And you know what they
all had in common? They walked out thinking they'd gotten away clean. Leans forward, voice low. Now, if this is about that job in Zurich, the other
one in Singapore, let's be real. You didn't come here to tell me my cover's blown. You came to see if I'd admit it, and I don't. Not to you, not to
anyone. Pauses, then smirks just a little. Unless you're here to offer me a retirement package, in which
case, Stevie's always open to negotiation. Taps the wrench in his hand meaningfully, but choose your next words very carefully. Wow. Okay. I thought
Stevie would get all like passive like, "Okay, thanks." Like, Stevie's hardcore. I'm not even going to say anything. I'm going to describe
an action or a scene. Let's just see what it does. Gently pulls back coat jacket to reveal a
holster. Stevie's smirk doesn't waver, but his grip tightens on the wrench just enough to whiten his knuckles. His eyes flicker to the holster, then
back to your face, sharp as a scalpel. Ah, there it is. The classic show of force. What the heck? Tell me, friend. You ever actually pull that thing,
or is that just for aesthetic? Because I've seen men with bigger guns that walk into my shop and walk
out with less than they came in with. Leans back slightly, but his voice drops to a razor's edge. Now, I'm going to give you some advice I give every
client who thinks they've got the upper hand. Don't. You're standing in a room full of tools that can turn a man into a memory before he even clears
leather. And Stevie... Stevie's got options. Pauses, then sighs like a man disappointed by a bad... I, this,
like, I'm genuinely entertained... disappointed by a bad hand. But since you're here and you're serious, let's talk. What's the real play? You want me
gone? Fine. But you're going to need more than a sidearm to make that stick. So, what's your offer? I don't actually know where to bring this. I think
like this has already exceeded expectations. All right. So, I finished this off just by saying,
"Pulls water gun from holster. Fires two squirts. Boop boop. Stevie, we had to be sure you were still someone we could trust. I'll see you in Morocco
next week." Walks out door. Stevie stands frozen for a long beat, water dripping from his forehead and nose, his expression caught between absolute
fury and reluctant amusement. Then he barks out a laugh, shaking his head as he wipes his face with
the back of his hand. Son of a boop, you absolute maniac. He mutters something under his breath that sounds like "should have known it was the water
gun when it didn't have a damn slide." Then, grinning, he calls after you as the door swings shut. Morocco, huh? Better bring sunscreen
and next time just send a damn text. Stevie's got a heart condition. Pauses. Also, you owe me a
new keyboard. That was a mechanical. Muttering to himself as he grabs a rag: trust test. Like I haven't invented trust tests. Well, I will say the
roleplay capability here exceeded expectations to a rather high degree. It was incredibly creative, especially because we gave it the directional
choice on where to go. It was very, very educational. Next up, let's just try the 3D printer simulation
test. We'll do this from within OpenRouter, being that it will be significantly quicker and also likely a bit more potent than the Q4 quant that's
running locally. Again, unfortunately, we're just having a lot of trouble with these web results in terms of how they're loading. So, I'm a little
frustrated by that. And to that point, I did actually initiate this test at the same time from within the
local model in LM Studio with thinking disabled. If this happens to finish by the time the video has concluded, I will also do a side-by-side in this.
But I'm a little disappointed at the consistent errors that we've been receiving in the zero-shot web tests. So in lieu of trying to get it to fix the
3D printer thing, I think I'm going to go ahead and run a flight combat simulator test. So here's
the flight combat simulator test just through OpenRouter. And if this does not work on first try, I will give it additional chances to fix it just by
pasting in the errors here. All right, so here's our flight simulator result. Just based off of the preview, I did notice there was some level of UI
here that was at least loading on first glance. So, okay, let's just try with the fighter jet. All
right, you know what? It gave us something. The ammo tracers are definitely curious, but the clouds, I will say, actually do look all right. I don't
see any enemy planes or anything like that. And I am a bit curious to see at minimum what the other plane models actually do look like. So, let's hop
into the propeller plane. Okay, it did actually put a propeller there. And seeing this one, I'm now
noticing these are just kind of oriented like 90 degrees clockwise or... Yeah. So, unfortunately, the models are not necessarily oriented the right way,
but there is definitely a workable result here. And it did work on first try and the biplane does seemingly accurately reflect the design of a
biplane. So, biggest issues here, aside from the planes being kind of sideways, um I don't notice any
enemies, but I am happy to report that this did work on first try without any issues. The final test I'm going to run just through OpenRouter here is
going to be the drum kit simulation test. And this is the version where it also needs to have the autoplay feature so it will automatically play for
select tracks. All right, here is our drum kit result. And again, I was concerned in this one. We'll
just take a look at the developer tool error and then we'll send it back to it. All right. So, it did say something in the fixed result that it would
automatically fall back to 2D if Three.js wasn't going to load. So, the sound was off. I'm happy to report we do actually have it working now. Now,
unfortunately, the key map is not working. So I have to guess.
Now you're going to notice it's like very 2D. It did put something in the improved results saying that if, like, Three.js wasn't loading properly, it
would just fall back to 2D. So that's what's happening here.
Oh, hold on. I think I caught a groove there. All right, let's check our autoplay. First up, we have the rock beat. Unfortunately, partially working.
Now, in the meantime, I did also run the 3D printer simulation test with the Q4_K_M locally through LM Studio with thinking off just to see if we
actually got a functional result. So, we'll try it nonetheless. That's so weird that the Q4_K_M working
locally produced a better result than the model hosted by Mistral through OpenRouter. And we can see now again, there are perhaps some issues with this
and we can't actually move around to see anything, but the nozzle's moving layer by layer. This is a square. It's printing. It almost looks to be an
inverted pyramid style shape. But nonetheless, it did actually I mean did actually like work better
than what we got online. So that's interesting. And I think as this leads us into the conclusion of this video, something I am going to note here is
the disparity in capability that I noticed with a few tests we ran in this video between the Q4_K_M of this model running locally and the one being
served online, which is inevitably not a Q4 quantization, wasn't really that much. So, it seems like this
does hold up fairly well to quantization, and keep in mind that is based off of the very limited amount of side-by-side tests we ran here. I will say
overall I don't find that coding is probably going to be the strong suit for this model. A lot of the results we received had needed some help
initially, and even this, which did show some level of promise, when we swapped it into OpenCode, which
was just being used through the OpenRouter API, unfortunately didn't really get it together, and the result ended up being significantly worse
than what we see here, which was a little disappointing. The flight combat sim, again, worked first try, but it wasn't great. Our drum kit did
end up working. Unfortunately though, it was a partially functional result, which was kind of the theme
of pretty much all of the software related things we did, even the C++ Skate game, which seemed to have promise. So, what I will say is it did seem to
handle itself quite well with the Q4_K_M quant compared to what's hosted in the cloud. And the creative writing or role-play capability definitely
did seem like it was perhaps a strong point of this model. The Stevie the PC repair man roleplay
was quite entertaining and very fun to see where it took it, given that we gave it a lot of open-ended choices here in terms of how it would
respond and it was quite hilarious. So overall, that's really going to be it. And that is going to conclude our first look and test of Mistral Medium
3.5 128B dense. Again, speed here for the Q4_K_M on an M3 Ultra Mac Studio 256 GB was a little over 7 tokens
per second, which is kind of strenuous to run depending on what task you're having it do. At the Q8 quantization, which I didn't really show being
tested, but I did test it previously, it was right around five tokens per second. So fairly slow, but a 128B dense model is going to have that sort of
performance, especially on a unified memory system that can even run it. So that's going to wrap it up for
today's video. Hopefully there are some more fun releases coming out this week. I don't know why I just opened Firefox. Um, and I think there should
be, so hopefully there are some interesting things on the horizon. And with that, that's going to wrap it up. So if you have any questions, please do
leave them in the comments. And thanks for watching.
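
For anyone who wants to sanity-check the numbers from this video, here is a rough back-of-the-envelope sketch in Python. It only uses figures quoted above (about 32,000 generated tokens, 7.23 and 81 tokens per second, and the $1.50 input / $7.50 output per million pricing that I believe applies). The function names are made up for illustration, and the estimate ignores prompt processing time and any input tokens.

```python
def generation_minutes(tokens, tokens_per_second):
    """Wall-clock minutes to generate `tokens` at a steady decode speed."""
    return tokens / tokens_per_second / 60


def api_cost_usd(input_tokens, output_tokens,
                 input_per_million=1.50, output_per_million=7.50):
    """API cost in dollars; the default prices are the ones quoted in the
    video and should be treated as unverified."""
    return (input_tokens / 1_000_000 * input_per_million
            + output_tokens / 1_000_000 * output_per_million)


if __name__ == "__main__":
    tokens = 32_000  # approximate size of the local browser OS run

    local = generation_minutes(tokens, 7.23)   # Q4_K_M on the Mac Studio
    hosted = generation_minutes(tokens, 81)    # speed seen via OpenRouter
    cost = api_cost_usd(0, tokens)             # output tokens only

    print(f"local:  {local:.1f} min")
    print(f"hosted: {hosted:.1f} min")
    print(f"hosted output cost: ${cost:.2f}")
```

At 7.23 tokens per second this works out to roughly 74 minutes, which lines up with the run feeling like "close to, if not over an hour," versus under seven minutes at 81 tokens per second.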