LLMjacking: How hackers steal your AI API keys and stick you with the bill (transcript)

Threat actors are after your AI API keys. How worried should we be? >> I think if you have API keys, you should be somewhat worried. >> I think it's

now an opportunity for threat actors to use API keys to basically exploit and also to create resources. Therefore, you have to be worried and protect them.

>> I think depending on how you're using these APIs and what you're doing with them, you could be up to very worried.

Hello and welcome to Security Intelligence, IBM's weekly cybersecurity podcast, where our expert panelists turn the biggest industry news stories

into practical takeaways you can use. I'm your host, Matt Kosinski, and joining me this week, we've got Michelle Alvarez, manager, X-Force strategic

threat analysis, and two newbies making their debut on the podcast. It's Urban Marina, X-Force principal

incident response consultant, and Patrick Fussell, X-Force Red Adversary Services lead. Thanks for joining me today, folks. We're going to be talking

about how AI is changing the world of adversary simulation and whether we can patch fast enough to keep up with Mythos and GPT 5.4 Cyber. But first, I

want to keep talking about LLMjacking. Now, this is a relatively new attack where threat actors

steal AI API keys and other credentials to gain access to a user's or an organization's AI tools. Unlike other attacks, however, they're not really

after sensitive data. They just want to use your AI for their purposes because AI costs money and they'd rather stick you with the bill. And here's

the thing, those bills can get pretty huge depending on what hackers are doing. Uh, a developer from a

small startup in Mexico shared a story on Reddit about how thieves racked up $82,000 in 48 hours using their stolen Gemini key. To put that in

perspective, the dev said that their normal monthly spend was $180, right? So, from $180 a month to $82,000 in 48 hours. Pretty significant. Urban, I

want to start with you because you brought this to my attention. I was hoping you could give

us your initial reactions to LLMjacking. Why did this catch your eye? What's it got you thinking about? How do you feel? >> It's definitely something

very interesting because I think we have to set the stage and analyze the evolution of hijacking. So, for example, in the early 2000s you saw

more hacked resources, usually for sabotage or just for command and control. Then, with the evolution of

cryptocurrencies, you saw that it was profitable for threat actors to mine on your devices, especially in the cloud with high-end GPUs, which basically let

them profit and use the money for their operations. But now we are seeing it evolve, because now you have the opportunity to use the cloud for mining

because it's free. It's on top of someone else's cloud account, on top of their bill. And now it's

an opportunity to make money but at the same time to use the API to do R&D, to do research, to build weapons. So I would say

it's an evolution, and you see the threat actors are actually winning, because they are building and they are meeting the objectives they are

planning. On the other side, there is the bill from the cloud providers, and the problem

is the end user. Now you basically have a struggle. Imagine, for example, a 100K bill: it's sometimes equal to bankruptcy. And it's not just about the

money itself. It's also about the leakage of data. It's about the exposure. But the blast radius is catastrophic in this case. Therefore, it's something

that I think we have to be more worried about, and we have to make sure that we are moving in the direction of

raising this topic in the community and starting to talk about it more. Even though there is some discussion and there are some minor techniques, it's really a

type of attack with a catastrophic blast radius, which at the end of the day has to be considered more and taken more seriously. >> Absolutely. And I

like that you point out that this is not just kind of a way for them to get in there and stick you

with a big bill. It's also a way for people with less than good intentions to get access to some of these tools that they wouldn't otherwise have,

right? Like we've seen recently that the Frontier Labs are really trying to lock down who can get access to their tools. But if I steal someone's

legitimate API key, I can get in there and use it for my own malicious purposes and I never had to get my

own credentials. So I think that's an aspect of this that I personally was not even thinking about. I was mostly just shocked by that massive price

tag, but that's significant. So Michelle, I want to pivot to you here and I want to ask you about something that KPMG's David Nidz said when he did a

write-up about this. One of his top recommendations was to treat AI API keys like the crown

jewels. But I want to know, what exactly does that mean? Could I get your take on securing AI API keys? What do we need to be doing here? What are you

thinking about? >> Yes. So I mean it's just like your password, right? It's something that should be treated as super secret. Uh there's no difference

there. And I think also something else that I honed in on uh is something in the Reddit post about

the $80,000-something that was charged. The question was, you know, there are no guardrails. What's going on here? How come I didn't know

about this anomaly until it was too late? And I think, you know, unfortunately in cybersecurity it takes a major security event like this for

individuals and organizations to start recognizing that something needs to be done. We saw this with credit

card companies, right? Now I get notified if there's something strange going on, and I can say, no, that wasn't me, you know, that charged, you know,

several hundred to purchase something in a state or country I'm not currently in. So now there are guardrails in place. And so I think that's what is

going to happen here with this particular incident. >> I think it's a very good point that you bring up,

right, that sometimes it kind of takes events like this for us to see these blind spots, right? And and you also point out one of the things that's

very interesting about these attacks is that people report that even when they have like usage limits or spending uh caps on their accounts, sometimes

the hackers move so fast that they blow through those limits before the services can even detect

that there's been something wrong. Right? So in a way, I don't want to say positive, but the silver lining here is, okay, we've

identified a gap in our guardrails. Now let's work on that. Patrick, I'd love to get your take, especially as somebody who does adversary simulation,

who looks at things from an attacker's perspective. What are your thoughts on these kinds of attacks

and what we should be doing to protect ourselves? >> It's certainly one of those things that you could think a lot about, but anytime you think about any

authentication material, and an API key is obviously a great example of that, I always like to put it in context. So I like to ask the

question, well what does this get me as the attacker? What can I achieve with that? And obviously

we've touched on that a little bit, talking about, you know, cost overruns and abusing these models to maybe achieve some sort of

offensive tooling or something along those lines, which is very interesting. But, you know, often API keys are intended to be built in or be a part of some

larger development process. So, you know, is this API key being used for something else, or is it just

being able to call directly to a model and get some output? In which case it's not the most interesting thing. But if it's part of a larger

application flow, you may be able to disclose other data or gain access to other underlying systems. So I think it's all about understanding what

the key is for, what it does, and what it gets me access to that could further my goals as an attacker. >> I

think it's a really good point, right? Like you said, these things are baked into these complex software supply chains. You don't know what this API is

connecting to and as an attacker, you know, you can get in there and like you said, if it's connected to the right kinds of things, you can get some

access to some really juicy stuff. Urban, I want to circle back around to you to kind of close out our

discussion here today about LLMjacking. You know, what is the one piece of security advice you would give any user, organization, whatever, when it

comes to protecting their AI API keys right now, today? What should they be doing? >> I think, also from what we see in incident response cases, I think

there is a gap in knowledge about the cloud, first of all, and there's an even bigger gap between cloud

security and DevOps pipelines. I think the most important part is to have proper knowledge, education, security testing, and multiple layers of defense

in this case. And I want to highlight that you need to have proper controls across all parts of the cloud, for example in the

control plane and in the data plane. But the most important part, which I want to highlight

especially with this type of attack, is that by default things in the cloud are usually insecure. So you have to have proper testing, have

proper preparation and incident response readiness, and make sure that everything that is off by default is turned on, and at the same time

have more and more automation. As defenders, we also have to be closer to DevOps, to

have proper DevSecOps operations, and it's all about automation. So you have a good pipeline, you have good automated testing, and at the same

time we just want to make sure that secrets are kept in the right place. So you have secrets management, and you don't have them in a public GitHub repo, which is

very common, or just in variables. It's all about hygiene. So just to sum up: good practice,

don't leave things at their defaults because they are usually insecure, and good practice and good hygiene with cloud overall, especially with DevOps pipelines. >>

Absolutely. You can't take security for granted, especially not in those cloud environments. I really like that extremely comprehensive

breakdown. Patrick, anything you would add there in terms of like security steps people should be taking right

now? >> I think that he covered most of the really big things. I would only add, you know, emphasis particularly on the testing side. We often

make assumptions about what we've exposed when we accomplish a particular task, and developers are always in a position

to do a lot of damage because they tend to have lots of access and they're the ones who are actually

building these applications. They need lots of support to make sure that they can accomplish that in a security-conscious way that still gets their

jobs done. I think that, you know, things like secrets management are really important, but so is, every time you implement some changes, having an

ability to assess: have we exposed ourselves to something new without realizing it? >> Absolutely. That

visibility is so key. I mean, you really don't know what kinds of ramifications your changes can have unless you are looking at them and looking

for them. Michelle, close us out here. Anything you would add in terms of security steps? >> Yeah, I mean, I think both Urban and Patrick really

provided a lot of different steps, especially from the enterprise perspective, the defender side, but

also the end user. We also have our responsibility in terms of controlling the exposure surface, right? How are our keys stored? I think I heard,

you know, secrets managers and password managers mentioned. You know, how are we handling our keys? Treating them as highly sensitive, as I mentioned in the

beginning. So both sides have responsibility. On the side of the providers, we also need to ensure that,

you know, a leaked key doesn't end up in bankruptcy, as it seems like potentially there are some cases out there where this is happening, and that's pretty

serious. >> Absolutely. And I'm glad all three of you hit that secrets management angle because we know that, you know, credentials, attackers are

after them. When they get in, they try to get more of them. There's a potentially massive blast radius, so

those things have to be locked down. I'm going to move us along to our next story for today. But before I do, just opening up the conversation to the

listeners out there watching this on YouTube, tell us what you're doing to uh secure your AI credentials right now. What steps are you taking? What

are you thinking about uh to make sure that nobody gets in there and misuses your tools? Moving along

though, we're talking about adversary simulation in the age of AI.

Now, Patrick Fussell is on the panel today and he's also the author of this piece we are discussing, published on IBM Think, entitled The Adversary

Didn't Wait, Neither Should You. Now, I don't want to put words in your mouth, Patrick, but the gist here that I took away was that threat actors are

using AI to amplify the speed and intensity of their attacks, which means our adversary simulations

and offensive security research should reflect this new reality. At the same time, the human is still a crucial part of that loop. So, tell me, in

your words, what were you arguing here? What's the message of this piece? Tell us what you're thinking about. >> Sure. And Matt, I think

you nailed the big point, but what really inspired me to start thinking about this a little bit more

was, I don't think that Mythos is new news to anybody. Everybody was talking about Project Glasswing over the past couple of months. It's

obviously a very big thing in the security world. But what was so interesting to me was that a lot of this conversation was around things

that I felt like were missing the larger point. And so, you know, the initial Glasswing posts from

Anthropic focused very heavily on this idea of discovering vulnerabilities. So, you know, analyzing code and understanding that there might

be a problem with it, which is super important and like I'm glad that it's one of the things that we're looking at, but that's not exactly or not the

only thing that an attacker is going to do, because they probably don't have access to all of

your source code. You know, hopefully they don't. They're thinking about how could I use a model to make me more

effective as an attacker, with things like getting access. And so what are all the phases of things that they're going to do where they bring AI to bear?

And while Mythos is an important conversation point, and I'm actually glad that it's shined a

light on this concept, it's something that, you know, my team and I have actually been thinking about for six or eight months, not just since

Mythos. This isn't brand new, right? Models have been around for a while now. >> Absolutely. And can you elaborate a bit on, you know, the importance of

having a framework, right? One of the things that comes up in your piece is this X-frame idea. Could you

tell us a little bit about what that is, why it's important to have those kinds of guidelines when you're doing this work? What can you tell us? >>

Sure. You know, it's obviously one of those things that has a ton of moving parts, but the thing I would always start with when we talk

about this is, if you're running a company and, let's say, you're the CTO, would you let AI run

rampant on your network and do whatever it wanted? I think almost everyone's going to say no to that. And if you say yes, then I would love to have a

conversation with you and get some more of the why behind that. I think maybe most of us are aware of a recent news story about an AI that deleted some

very important parts of a company and had some serious impacts. So I think, you know, the first

part is sort of the safety and security. AI is not at a point yet where we can let it just go off and do whatever it wants. But more importantly, it's

about leveraging AI in a way that makes sense. So I think when generative AI came to the forefront, we thought it was going

to solve all of our problems, but very quickly we talked to some of our data scientist friends and

we found out there are things that AI is going to be good at and there are things that it's not going to be good at. Framing the problems in a way

that really lets it do its job well is important, while also leveraging the expertise of the people behind the keyboard, making sure that they're as

effective as they can be by using these systems to improve themselves, basically. >> Absolutely.

That makes a lot of sense to me and it reminds me, you know, you mentioned like letting an AI run rampant and it reminds me of the story we covered a

little while back where folks at Sophos experimented with OpenClaw as an AI red teamer, basically. And one of the really interesting things to

me there was that they ran into this problem where they struggled with getting the guardrails just

right so that it didn't do any damage but was still useful. Like it could still actually find things but not break things when finding them. And I

feel like that's a perfect encapsulation of like why there's got to be a human still in the loop, right? Like there's got to be that human oversight

to make sure things are useful and not breaking things. Um Michelle, I want to bring you in here now

looking at, you know, Patrick's piece, thinking about what we've discussed so far. Uh what are your thoughts on keeping humans in that adversary

simulation offensive research loop? What are you thinking about? >> Yeah, of course. And when I read the article, which great blog by the way,

Patrick, um, of course, one puts themselves in their own shoes, right? You're in your own shoes like what's my

role? How does this apply to me? Um, and similarly, you know, I think threat intelligence analysts also need to be in the loop when um, looking at

threat intelligence because of course AI is great for a lot of things in this space. You know, I can give it a report, it can scrape the IoCs, it

can, you know, aggregate a large amount of data, and we can kind of look at some of the dots and connect

those that way. But, you know, for actual threat intelligence, which requires interpreting all of those signals, what they mean, assessing

adversary intent, just sort of adding all of that contextual data, that is really where you also need the human in the loop for threat intelligence. So,

for that reason, I think there are a lot of correlations, and I would imagine Urban feels the same way. Or

feel free to play devil's advocate and say, "No, I don't need to be in the loop." >> You kind of read my mind there, Michelle, because as you were

talking, I was like, "Oh, it's really interesting to get, you know, we've heard Patrick's take on how it's affecting his field in adversary

simulation. We've heard your take on how it's affecting your field in, you know, threat intelligence and why the

human is still important in the loop there." Urban, any thoughts on how it's affecting you guys over in, you know, incident response, and where the

human still fits into the loop there? What are your thoughts? >> I think incident response is still a very, I would say, sensitive topic,

especially with things like chain of custody and forensic artifacts. And the idea is that it really

depends. But something that I love about AI is that for some automation, for some low-hanging fruit, it's extremely good, because it brings some

automation and it can do a lot of tooling for you. But the issue is that I think the human is still very important. Is it helping? Definitely yes. But I heard

a very good quote from one of my friends a few days ago, and it was: AI is as good as you are.

So it really depends on who is behind the AI, who is building the detection, who is responding with AI. So I think if it's not properly in the

right framework, if it's not guided by the proper incident responder in this case, or SOC team or incident manager, I think it's very

difficult to have the right steps of incident response in this case, but at the same time there has to be

a human in the loop who knows what they are looking for. So just to summarize, I would say that for low-hanging fruit, for automation, for giving you

direction, or for example for decoding, decryption, or just helping you, it's extremely good, but for a whole chain of attack, I think

the human is still the most important factor here. And again, I see a lot of potential, it's helping a lot,

but again, incident response is also about the chain of custody, and there are a lot of other points where I think we need to be

very careful when implementing, especially in the solitude cases. >> Absolutely. There are two things that I want to highlight. You know, the

first is this quote about, you know, AI is only as good as you are. I really like that and it reminds me

of what is kind of becoming the show's, like, unofficial motto, which is something that Dave McInness said months and months and months ago and now I

bring it up like once an episode which is you know AI agents are the most helpful insider threats we've ever had right and I feel like it is such a

perfect encapsulation of where we're at with these things. Um, so that was the first thing. And the

second thing I wanted to kind of, you know, call out there was I like how you highlight the accountability issue, especially in something like incident

response, right? Where, like you said, things like chain of custody are still really important. And you can't really make an AI accountable for

something like that, right? Like, we're just not there yet. I don't know if we ever will be, but we

need to think about these accountability issues, especially as we start integrating these things into our workflows and they start taking more and

more autonomous actions. Uh to close out this segment, Patrick, I just wanted to circle back around to you. Is there anything we have not discussed

here or any major takeaways you want to get out there to the audience before we move on? >> I don't know

if it's necessarily, you know, as Michelle said, playing devil's advocate. I'm not sure if it's necessarily the opposite side, but, you know, I

will say I think that AI brings an entirely new platform to the way that we think about and approach problems. And one of the things that we haven't done yet

as, you know, teams and companies is think about the fact that we are structurally changing the way that we do

business. And, you know, thinking specifically about adversary simulation, if I have an AI that can do, you know, steps A through C, I now need to stop

having my teams do A through C and focus on the other things that we do, which is great because now they can focus

on these really novel attack chains and developing the things that really give us the power. But we

have to be willing to let go of things the way they were and focus on where does the human bring value and make sure that's what we're spending our

time on. >> I like that. That's a really positive framing because I think it's it's very easy to kind of be a doomer about some of this stuff. You

know what I mean? But I really like that reframing of like look things are going to be different but we

can embrace that and we can do something really cool with it. Um so on that note, let's move along to our final story of the day folks. I'm talking about speeding up our patch timelines.

Reuters reports that CISA, the US Cybersecurity and Infrastructure Security Agency, is mulling over a change to federal patching standards in the

wake of Mythos and GPT 5.4 Cyber. So, current guidelines give federal agencies 2 weeks to fix a flaw under active exploitation, and people are

talking about cutting that down to 3 days. You know, it's understandable why people would want to cut

that down, right? AI tools let attackers find and exploit vulnerabilities faster than ever. A couple weeks ago on the show, we talked about the zero

day clock and how they were tracking that, you know, in 2018 it took an average of 2 years to exploit a flaw and now it takes less than a day. So, I

understand why we want to move faster with patching, but the question is, can we actually move that

fast? A lot of experts are a little skeptical of that timeline or just our ability to cut down patching in general. And Michelle, I'll start with you.

I'll give you this nice difficult question. Uh what do you think about shortening our patching window? Does it seem like it's something within our

grasp or no? >> Sure. Like, I think we should talk about that, but also, first, I'm wondering if we're

asking the right question, because patching is just one thing, right? There are other necessary mitigations to both mitigate exploitation and,

if there is exploitation, reduce damage. So I get it. We should talk about reducing the patch window and what are some things we can do to get there.

I don't know how we went from two weeks to, arbitrarily, three days. You know, maybe there are some

ways to sort of incrementally bring that down to a shorter patch window. But again, I'll respond to a question with a question. Should we be

asking about, you know, the patch window when there are so many other things that maybe are even more detrimental, potentially leading to an incident

occurring? >> No, I think that's a really good question, right? Like, is that the right question to be

asking? Because again, as you point out, patching, that's one thing. That's one part of this whole defensive posture we can strike. And Patrick, I saw

you nodding along there. So, I'm going to ask you, you know, thinking about Michelle's response there, you know, whether patching is the right

question to ask. What are your thoughts? You know, whether you build on that, agree, disagree, wherever you

take it. >> Yeah, I think Michelle nailed it. And I think that the

really key element we have to start with is everything's a matter of balancing resources. The way that I always frame this when we're working with

clients is, our goal as people stopping hackers is to raise the cost of what it takes to effect a

breach, so they spend so much money getting the, you know, the breach done that they've got a negative return on their investment. If we can

raise that cost high enough, they'll go away and do something else, right? And so managing the resources that any company has is really key.

And if we think about how complicated patch rollout could be on production-level systems,

if it's breaking things, that becomes an untenable stance. So I think Michelle's right. You know, certainly patches can be an important part of the

process, but we should also think about this from the attacker standpoint of, you know, finding a vulnerability and exploiting it. Well, we're

making an assumption on the patch thing that vendors are writing the patches fast enough to roll them

out that 3 days even matters in the first place, or that they're even going to know about these things, because they're going to happen so fast. So

I think that, you know, if you have a patch system that allows for that, great. Good for you. I'm glad to hear that you can roll those out. But

there may be other things that you need to focus on before you're thinking about trying to achieve

this what might be a big target for some organizations. >> I'm glad you bring that up and you kind of unpack what is involved in patching, right?

Like you said, there are factors outside of our control here, like did the vendor write a patch fast enough? You know what I mean? And the

downtime that's required. Can you shut down production for as long as you need to for this patch? Like

there's a lot of calculations that go into it. So when you're like one guy sitting at a laptop, you think ah patching that's easy, click update, not

so much when you're talking about an enterprise network, you know. Urban, I want to bring you in here, you know, thinking about what's come up so far,

thinking about these ideas about shortening the patch window. What's it got you thinking about? What are

your concerns or hopes here? >> We are in a moment in which you are basically fighting against machines, because it's cloud, it's with us, etc. So I think

we should also start thinking about machine versus machine, because it's very hard when you have them using completely autonomous AI and you

are patching from a spreadsheet and need five or six approvals to patch something. So I think

we have to be agile. We need to use more microservices, and the most important part, as Michelle said, is defense in depth. So we

shouldn't be on only one line. We should have all the phases of detection and preparation. So I think it's a very hard question, to

be honest, and it's extremely complex, and it's a very hard day to be a CISO, but we should go in

that direction of agility, to be faster, and also to have machines on the team. So what can we do in this direction to be faster? At the end

of the day, as I said, machines versus machines, and fewer spreadsheets, more agility, more microservices, and multiple layers of detection and automation, which

I think is the answer here. But again, it's very difficult, especially for 3 days, which is very

arbitrary. And it's not just about one application. At whatever level, you have libraries, you have applications, you have the system, the OS in this case. So

it gets more and more complex. >> Absolutely. And I recognize it is a difficult question, and that's why I unfairly asked you three to tackle it

in 10 minutes. Uh that's what I do on this show. Um but no, so you know, thinking about these kind of

complexities that we've all addressed so far and I really like how you tied it back to our previous conversation about where you bring AI into these

things and how maybe we can automate some of these defense-in-depth steps. So to end the episode today, I always like to end on a very practical note

because that's the promise of our show. So Michelle, swinging back around to you, we talked about

defense in depth. We talked about the things we can do aside from patching. What's the one piece of advice you'd give organizations right now when it

comes to dealing with the sped-up time frame of vulnerability finding and exploitation? What are you thinking about? >> Well, I think we've

mentioned this quite a bit on this podcast. It's knowing and having visibility across your network. So,

knowing what your critical assets are and being able to respond quickly. So, I hope I didn't take any thunder from Urban, because that's incident

response. You have a great panel here today. A lot of disciplines represented. But I think that's first and foremost: you have to know what

you're protecting in order to protect it. >> Absolutely. And I think that's especially, you know,

relevant for a patching conversation because so many of the difficulties of patching are around that visibility. Patrick, how about you? What's

the kind of one piece of advice you'd be telling organizations to follow walking away from this? >> Yeah, I think, you know, the most effective

organizations tend to have this concept of assumed breach built into the security fabric of their

organization. So, you know, I think a lot of people want to think, we have whatever XYZ control, EDR or whatever it is, there's no way that

an attacker is going to get past that. But I think, you know, these groups are sitting around all day, every day, researching how do we apply these things

that we're talking about to effect the breach. And so organizations that respond the most quickly

and have the best controls are always the ones who think: yes, we are going to be breached. Are we ready? If we look across the places where that

can happen, what do we have there that's going to help us respond to that and be ready and isolate and move quickly? And they practice and

exercise those things. >> Absolutely. And Patrick, one of the things I appreciate about, you know, your

appearance on the show today is a lot of your advice has come down to like changing your mindset about some of this stuff and approaching it from a

new perspective and I like that because it's like all right like you said assume breach you know don't assume that like oh no one's going to get past

us assume we got all these protections in place someone's going to break them anyway what do we do

then? I really like that. Urban, to close us out today, what's the one piece of advice that you would give organizations around this topic? >> I would

end with a quote. It's by JFK, which I love: the best time to fix your roof is when the sun is shining. Which basically means: do your

preparation, do your homework, do your incident response plan, and know your exposure, plus knowing your

environment, I think, is key. So it's more about proactive security now than just waiting and being reactive, like, I have an IR, now I have to fix my

roof. No. Preparation, preparation. >> Absolutely. I love that. Preparation, preparation. That does it for this episode. I want to thank our panelists, Michelle

and Patrick and Urban. Thank you to the viewers and the listeners. Thank you to our producers.

Subscribe to Security Intelligence wherever podcasts are found so that you never miss an episode. Stay safe out there and lock up your API keys.
