Should You Buy nVidia RTX 4080 Super 16gb GPU for Local AI? Qwen 3.6 Agents? Transcript

In the age of piles of Mac minis being used to run local agents and 3090s becoming more valuable than gold, the 4080 super is an interesting GPU, but

should you actually buy it for your local AI in 2026? That's the question I want to get into today, so let's get into it. This GPU is really

interesting. For the longest time, I didn't actually know this GPU existed. Of course, we've all heard of the RTX

4080, but when this GPU came out, the RTX 3090 was such a great value. There was a big question as to why you'd buy this unless it was just for

gaming. The 3090 at the time was under $600 on used markets, and the 4090 was clearly the GPU to buy if you could even find them, especially if you

were listing them to rent on platforms like Vast. And a lot of it came down to memory. The biggest letdown

of the 4000 series was again being stuck at 24 gigs of GDDR6X VRAM, and the 4080 itself was an okay GPU, but what was interesting was the release of

the super series. And there are a number of reasons why I think this is both good and bad for local AI. If we look at the specs here, what's

interesting about this GPU is there's almost no difference between the 4080 and the 4080 super. There are some

advancements under the hood that enable some very interesting things in China, but if you're just buying this at Best Buy or on Amazon, you're getting

roughly the same count of AI tensor cores and just a slightly higher boost clock, along with basically exactly the same memory and

memory bus, which is also interesting for what the super enables later. Altogether, basically a gaming

GPU. This is faster than the 4070 Ti, which is very fast, but what's curious is the 4080 super in any form of VRAM, which is a bit of a hint, has

become one of the most popular options to rent on Vast AI and TensorDock, which is kind of cool. So, why is that? So, there are a few people that have

actually been benchmarking this already with Qwen 3.6, and if you're willing to do a little bit of

offloading with system RAM, again, this is actually a pretty good option. So, if you have this and you bought it for gaming and you just got into

local AI, it's actually even more capable than the 4070 Ti, which I have a separate video about. What's interesting about this is that in its

16 GB form, I would honestly say you shouldn't buy this. But, what's cool is the initial results of

people using this for local coding appear to be quite good, at least very usable both with a 256K and 128K context in comparison to someone that was

basically talking about using this latest Qwen model on their M5 Max 128 GB system, which is much more expensive even if you're buying the 4080 super

at MSRP. What was tough about this video is that it's actually really hard to find benchmarks on local AI for

this GPU, which is curious given what I'm about to tell you. This GPU in its initial form is actually not that great for local AI. And you'll see this

here from Puget Systems. So, these models are a little bit older, obviously, but the relative performance still rings true. When you're able to find

benchmarks for this, this roughly applies across MiniMax, Gemma 4, and a number of other models

meant to be on GPUs that have less than 24 gigs of VRAM. So, you'll see that the 4070 Ti super with 16 gigs of VRAM is performing maybe 10% slower, but

it's significantly less expensive. And when you compare the 4080 to the 4080 super, there's functionally no difference. This is within a rounding

error of a lot of these benchmarks. And as cool as this is to see, it kind of makes sense because I

mean, this is like a rounding error on the number of tensor cores here, and the performance is roughly the same. Nvidia barely bins these

better. It's just not that much more impressive. And in certain cases, the 3080 Ti actually outperforms the 4080 super, which is, you know, probably

why we saw so many modified versions of this just because at that time they were so cheap. So, in

benchmarks, not very impressive. But, the thing that I really want to tell you guys about, and we'll get to whether or not I would buy this version of

the GPU in just a bit, but of course, China happened to pick this GPU and think, why don't we try to get 32 gigs of VRAM onto it? And this mostly has

to do with the type of memory Nvidia is trying to get rid of from their inventory with the 4080

super. What's kind of cool is that the way they arranged and binned these GPUs, even though they appear to

have the same memory bus width, means you can fit compatible memory chips with twice the capacity or more per chip. And in this

case, the 2x win that we saw with some of the modded 3070s was also possible here with basically no

impact on driver capability or performance. So, this is someone who bought one. This is showing up in nvidia-smi with their 550.142 drivers, and it

appears to be working. Now, these are also blower style cards, which, you know, huge win if you can find them. This is the same blower style card you

can buy 5090s in, and it's a pretty common OEM part in China now. So, yeah, he said he got a few of

these, basically benchmarked them against his 3090, and it appeared to function quite well. I mean, what's cool here is you're getting a pretty fast

GPU with the same amount of VRAM as a 5090, which is pretty cool. Now, the pricing of these is less exciting, and this is an article from

videocards.net of another person that bought one of these GPUs. A lot of these are really commonly showing up in

Russia prior because again, you didn't have to deal with all of the tariffs involved, which is really the only reason I wouldn't recommend buying

this. But yeah, so a very interesting GPU here, and this variant of the GPU is why it's so popular on Vast. There were a lot of people that before

tariffs went up were able to import a bunch of these along with kind of spare parts and the same

electronics repair shop that was doing modded 2080 Tis managed to get some of these. So, they're not on their website anymore, but these were parts

that were pretty available in the US if you knew the right people. So, in terms of pricing, do these even make sense to buy used? Largely, I would

say no, especially compared to the 4070 Ti and the 3090. Basically, these cards will always cost

more than a 3090 even used. And it's mostly because eBay sellers who are selling these don't really know what they have, and they think all GPUs are

equally useful for local AI relative to their price point. With NVFP4 and FP4 models being so popular, this is another place where these GPUs

fall short because only the Blackwell series of GPUs supports NVFP4, and these are really expensive.
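
To make the NVFP4 point concrete, here is a minimal sketch that flags which cards would have native FP4 tensor-core support, based on CUDA compute capability. The helper name `supports_native_fp4`, the `CARDS` table, and the "major version 10 or higher" rule are assumptions for illustration, not something shown in the video; check Nvidia's documentation for your exact card before relying on it.

```python
# Hypothetical helper: guess native FP4 support from CUDA compute capability.
# Assumption: Blackwell GPUs report a major version of 10 (data center) or
# 12 (consumer RTX 50 series); Ampere and Ada cards report 8.x.

def supports_native_fp4(capability):
    """Return True if the (major, minor) capability looks like Blackwell or newer."""
    major, _minor = capability
    return major >= 10


# Assumed compute capabilities for the cards discussed in the video.
CARDS = {
    "RTX 3090": (8, 6),
    "RTX 4080 Super": (8, 9),
    "RTX 5090": (12, 0),
}


def fp4_report(cards=CARDS):
    """Map each card name to whether it is expected to run FP4 natively."""
    return {name: supports_native_fp4(cap) for name, cap in cards.items()}


if __name__ == "__main__":
    for name, ok in fp4_report().items():
        print(f"{name}: native FP4 = {ok}")
```

On a live system with PyTorch installed, the tuple could come from `torch.cuda.get_device_capability()` instead of the hard-coded table.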

They're also just not very numerous on the platform, and you're generally always going to be paying $1000 after fees to pick up one of these. And

even if you're just looking at an RTX 4080, they're still basically $900, which is just not a very good deal. I would not recommend it. What's

interesting though is these numbers on eBay for the modded GPUs have tariffs included. So, for the price of

like two 3090s, you could get one of these with 32 gigs of VRAM, and assuming you can find an NVLink bridge, that's not the price of a 3090. I just

can't recommend buying these. This is $2000 before shipping. I mean, they say free shipping, but that's assuming that they're not kind of lying

about what they're doing here. I mean, the deal of the century really is this right here assuming tariffs

are included. This is a blower style modded GPU from China. So, it's a two-slot blower style card that is effectively brand new. What's really cool is

we're finding that there are like piles of 3090s that are being found in China that were never actually used, and this is actually the deal of the

century. So, I will link to this below. I don't even have an affiliate on this, but yeah, I would say

100% buy two modded 3090s from China before you ever consider buying a 4080 super with 32 gigs of VRAM. The one thing to double-check is, yeah, they

do have the NVLink bridge there. Let me show you what I mean by the price of the NVLink bridge. Oh good, they've finally come down now. So, yeah, 100%

every single time, always just buy two 3090s instead of the 4080 super. These really fancy four-slot

bridges are still incredibly expensive. These are $2000 apiece. So, this is an even greater reason to only purchase two-slot pairs, and I

think this bridge may... No, this bridge still only has two connectors. So, get slim 3090s, buy those instead of the 4080 super. And this is actually

one rare case where modded GPUs from China make more sense than buying new GPUs from Nvidia or on

eBay. So, I'm curious what you guys think. Do any of you happen to use the 4080 or 4080 super for local AI? Is there another GPU that you think is a

much better option? I've gotten lots of comments about the V100 32 gig SXM2 modules, so that's maybe a little bit of a teaser for what's coming this

week. And as always, I hope you learned something from this video. Please like, subscribe, and

share, and I'll see you in the next one.
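
As a rough companion to the pricing discussion in the video, the comparison can be sketched as a cost-per-gigabyte-of-VRAM calculation. The prices are the approximate figures quoted in the video; the total for two 3090s is an assumption taken from the remark that one modded card costs about as much as two 3090s, and the NVLink bridge is not included. The names `OPTIONS`, `usd_per_gb`, and `rank_by_value` are illustrative.

```python
# Rough prices quoted in the video, in USD, with total VRAM in GB.
# The two-3090 total is an assumption ("the price of like two 3090s").
OPTIONS = [
    ("RTX 4080 (used)", 900, 16),
    ("RTX 4080 Super (used, after fees)", 1000, 16),
    ("Modded RTX 4080 Super 32 GB", 2000, 32),
    ("Two RTX 3090s (assumed total)", 2000, 48),
]


def usd_per_gb(price, vram_gb):
    """Dollars paid per gigabyte of VRAM."""
    return price / vram_gb


def rank_by_value(options=OPTIONS):
    """Return (label, usd_per_gb) pairs sorted from cheapest to priciest per GB."""
    rows = [(label, usd_per_gb(price, vram)) for label, price, vram in options]
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    for label, cost in rank_by_value():
        print(f"{label}: ${cost:.2f} per GB of VRAM")
```

This only measures memory capacity per dollar. It says nothing about speed, power draw, or whether a model splits cleanly across two cards.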
