NVIDIA's New Free AI - A Gift To All of Us: Transcript

This AI is not Nemotron 3 Super. No, this is Nemotron 3 Ultra, Nvidia's newest free and open AI model, and I've been delighted, disappointed, and

confused by it. But I think I got it now. You see, you can look at the benchmarks all you want, but we are fellow scholars here. We don't just believe

stuff. We test it for ourselves. That is the way of the scholar. So, I had an early look at it and ran

some of my experiments day and night. First impression is that it is incredibly fast. Blazing fast. Love that. But then my coding experiments did not

go that well. When I ask it to write a light simulation program (this is my original area of research), I get a black screen. Nothing. When I ask it

to fix it, it does a bunch of things, and the result is the same. And then I said, "Okay, let's debug this by hand."

It had some mistakes. After fixing that, well, we get something. But maybe it's a scene that does not work at all. Other even smaller systems can do

this task with relative ease. And the other thing is, goodness, it wrote up more than a thousand lines of code. You don't need that much. My

handwritten solution from my research is about 250 lines and renders this scene. Fully open source, free for

everyone, forever. Now, let's write a real-time strategy game. Yes. Oh, no. Black screen again. Almost. We got a square. But if you ask DeepSeek 4

Flash with the same prompt, you get something really cool. But not here. So, what is going on here? Well, I went back and forth with Nvidia and

reported some of the issues and later there were some improvements. But still, this kind of coding is not

something I would personally use this for. So I said, you know, maybe let's not use this AI. But then I thought, wait, it is super fast and probably

good at other things. So I gave it agentic things. Fixing broken installations on my machine from the terminal, excellent. Whipping up quick experiments,

organizing files, excellent, super fast. And over time, I found myself reaching out to it more and

more. And I found it to be useful basically for everything other than challenging coding tasks. Now that is excellent because this might be the

openest AI model ever. Weights are open. The research paper on how it was made is open. Training data and recipes are being released at least for the

redistributable parts. Now that is pretty crazy. Now hold on to your papers fellow scholars because it

gets even better. Licensing. Super important question, very overlooked. We are always hoping for Apache 2.0. This is the do whatever you want license.

For me, this is 10 out of 10. Now, Nvidia started publishing their models under their own proprietary license, which I would rate 7 out of 10.

Derivative works and commercial use are fine. On the other hand, it needs a bit of attribution and is a little

stricter on patent grants. Now, this one has the OpenMDW license. This is basically Apache 2.0 tailored for machine learning weights. This is absolutely

fantastic news. Glorious. I think this might be a 9 out of 10, maybe as close to 10 out of 10 as you can get from a big company like Nvidia. Allows

basically everything, but less battle tested. And my understanding is that if you sue claiming this

model infringes your rights, you lose the license. Huge improvement. Double thumbs up. Thank you. Now, can you run it yourself? Hm. Um, yes and no.

Yes, because completely open. Download it. It is yours forever. No limits, no funny business. However, no, because I would love to run it locally,

too. But it's huge. 550 billion parameters. You need hundreds of gigabytes of GPU memory for that. This

is why I will probably use it on Lambda. Also, 1 million token long context window. Great. Have a larger code base with a bug hiding somewhere. No

worries. Massive box. Easy. Okay. How about images and videos? Well, it does not have vision capabilities. Not multimodal, text only. Oh man, how much

I would love a multimodal version of this. Goodness, please. Okay, and I also had a realization. You

don't need one model to do everything. You need a roster of models that cover your use cases. For instance, I can't add vision capabilities to Nemotron

3 Ultra, but I can bolt Gemma 4 to it with a screwdriver. It's like a seeing eye dog guiding a smarter blind man along. It is hilarious and it kind of

works. Kind of. So, we finally have more competition in the open AI model space and that is

glorious. So, how does it work? Well, one trick is that it is huge, but not all of it runs at once. 550 billion parameters total, but only about 10%

of that is active per token. These are specialist mini brains, and only a few of them are activated at a time. We call that mixture of experts. But you wise fellow

scholars know that already. So what else? Now they also use Mamba layers. Why Mamba? Is this like

a snake or like the fruity chew? I don't know. I don't even know why I brought this up. So what do these do? Well, traditional AI systems have a bit

of a memory problem. They work like a student who constantly rereads the textbook over and over again when they are given a question. But memory is

precious. So instead read the book only once and take highly compressed notes. So this kind of memory

remembers important details about the conversation. However, it is also smart enough to throw away the filler words. Thus, this system can process

massive amounts of data efficiently. It also uses low precision numbers, so you have to do less number crunching when running this. They call it

NVFP4. And this doesn't rely on predicting tokens one by one. No, it has multiple heads that draft multiple

future tokens at the same time. Once again, many things that make it blazing fast. And we get all of this for free forever. What a time to be alive.
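For fellow scholars who like to check the numbers themselves, here is a minimal back-of-the-envelope sketch of what the figures above imply. It only uses the 550 billion parameter count and the "about 10% active" figure from the video. Everything else is an assumption for illustration: the helper names `weight_memory_gb` and `active_params` are hypothetical, 4-bit weights are counted as exactly half a byte each (the scale factors a format like NVFP4 stores alongside the values are ignored), and activations and any cache are left out entirely.

```python
# Back-of-the-envelope numbers for the figures quoted in the video.
# Assumptions (not from the video): only the weights are counted, a 4-bit
# weight costs exactly 0.5 bytes, and scale factors, activations and
# caches are ignored. Real deployments need more memory than this.

TOTAL_PARAMS = 550e9      # "550 billion parameters" (from the video)
ACTIVE_FRACTION = 0.10    # "only about 10% of that is active per token"


def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def active_params(total_params: float, active_fraction: float) -> float:
    """Parameters that take part in computing a single token."""
    return total_params * active_fraction


if __name__ == "__main__":
    active = active_params(TOTAL_PARAMS, ACTIVE_FRACTION)
    print(f"active per token: {active / 1e9:.0f}B parameters")
    for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
        gb = weight_memory_gb(TOTAL_PARAMS, bits)
        print(f"{label} weights: about {gb:.0f} GB")
```

Under these assumptions, even the 4-bit case needs roughly 275 GB for the weights alone, which is why "hundreds of gigabytes of GPU memory" still applies. Note that mixture of experts reduces the computation per token, not the storage: all the experts still have to sit in memory.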

Thank you to everyone who worked on this and absolutely everyone everywhere who is working on open-source projects and open models. You are all

heroes. And look, this system is great, but it could be tiny. It could be bad, ugly. I don't care. As long

as it is open science and open models, it pushes humanity forward. Thank you. What a time to be alive. Here you see me running the full DeepSeek AI

model through Lambda GPU cloud. 671 billion parameters running super fast and super reliably. This is insane. I love it and I use it on a regular

basis. Lambda provides you with powerful Nvidia GPUs to run your own chatbots and experiments. Seriously,

try it out now at lambda.ai/papers or click the link in the description.
