SE Radio 722: Dwayne McDaniel on the Engineering Challenges of Secrets Management

Dwayne McDaniel, developer advocate at GitGuardian.com, joins host Priyanka Raghavan to talk about the engineering challenges of secrets management. They explore what “secrets” really are in modern systems—far beyond passwords—including API keys, tokens, certificates, and machine identities, and how “secret sprawl” emerges across the SDLC. Drawing on reports from GitGuardian and Verizon, they discuss the growing scale of secret leaks and why credential abuse and phishing remain dominant attack vectors.

They examine common leak points (from code repos and logs to CI/CD pipelines, containers, and SaaS integrations) and how cloud, DevOps, and AI tooling are amplifying risks. Priyanka quizzes Dwayne about recent supply chain attacks on the PyPI and Trivy ecosystems, highlighting recurring root causes like poor access control, long-lived credentials, and weak security hygiene. Finally, they consider detection, response, and modern solutions (short-lived credentials, secret scanning, and identity-based approaches like OWASP NHIR and SPIFFE/SPIRE), ending with practical advice for engineers to reduce blast radius and design for secure secret lifecycle management.

Brought to you by IEEE Computer Society and IEEE Software magazine.

Show Notes

Related Episodes

Other References

Transcript

Transcript brought to you by IEEE Software magazine.
This transcript was automatically generated. To suggest improvements in the text, please contact [email protected] and include the episode number and URL.

Priyanka Raghavan 00:00:19 Hi, this is Priyanka Raghavan for Software Engineering Radio, and my guest today is Dwayne McDaniel, a developer advocate, hands-on practitioner, and also host of the Security Repo Podcast. We are here to talk about the engineering challenges of secrets management. Dwayne regularly presents on secret sprawl and secure engineering at events like RSA, BSides, OWASP, KubeCon, et cetera. So welcome to the show, Dwayne.

Dwayne McDaniel 00:00:47 Thank you very much. I’m glad to be here.

Priyanka Raghavan 00:00:50 At SE Radio we have done Episode 578 on Secrets Management with Distributed Cryptography. We did Episode 311 on Secrets Management, and recently we did Episode 680 on Security and Privacy of AI Coding Assistants. However, it’s 2026 and we’ve already had two big supply chain attacks. One with the PyPI ecosystem last week and then recently, was it just yesterday that we saw the Codecov getting leaked, the repo? And it appears that both of these were because of secrets leakages, and therefore this show is well timed.

Dwayne McDaniel 00:01:26 Yes, there’ve been many, many things going on. I’d even hesitate to say as few as two; it’s ongoing campaigns. There was a time when you could refer to a specific product like SolarWinds because that’s what it affected. Codecov, yes, that was one thing, but now we’re seeing the proliferation through the ecosystem at a rate where calling it the Trivy attack or the Aqua attack or the LiteLLM attack or the Axios attack is underselling what’s going on here. It’s a massive campaign that’s literally moving at breakneck speed. I have never seen anything propagate this fast through the ecosystem. So, it feels all very interconnected. Cisco, that was news yesterday as of the time we’re recording this. And that all feels very, very interconnected. The idea is: let’s steal credentials from people who have the keys to, well, all of the infrastructure that makes the internet, that makes all of these ecosystems, and therefore we can propagate at our leisure. There’s been a massive increase in the number of crypto mining schemes directly related to this. We’re just watching machine-speed attacks using old-fashioned API keys that never expire.

Priyanka Raghavan 00:02:39 Wow. So, I think we then have to probably spend some time on some of these case studies, which we’ll do later on in the show. But I actually wanted to ask you for a couple of definitions for all our first-time listeners who are not aware of secrets management. So, the first thing I wanted to ask you is what do you mean by secrets and secret sprawl?

Dwayne McDaniel 00:03:02 When I say secrets, we’re talking about the large bucket of credentials and access mechanisms. There’s a really good definition I like from Ev, the CEO of a company called Teleport. He says it’s any piece of data that by itself can be used to gain access or to grant access. That’s a very succinct definition. So, anything that you can put into plain text that someone else could simply pick up and use immediately. My favorite example of this is a Postgres database connection URL, because those tend to bake in the credentials right there in the URL. Same thing with GitLab links, if you do it wrong or you’re baking your own username and API key directly into the string. So, it takes many forms. We’re also talking about certificates themselves that contain a secret: if I find a valid private key, then there’s a lot of things I can do with that. So, these take many, many forms, but anything that can be represented as plain text that, when used, grants access, that’s what we mean by secret.
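To make the connection-URL example concrete, here is a minimal Python sketch that pulls the embedded username and password out of a URL and produces a redacted form that is safer to log. The function names and the example URL are hypothetical and exist only for illustration; nothing here reflects any particular scanner's implementation.

```python
from urllib.parse import urlsplit, urlunsplit


def find_embedded_credentials(url):
    """Return (username, password) if the URL carries a password, else None."""
    parts = urlsplit(url)
    if parts.password is None:
        return None
    return parts.username, parts.password


def redact_url(url):
    """Replace any password in the URL with a placeholder before logging it."""
    parts = urlsplit(url)
    if parts.password is None:
        return url
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    netloc = f"{parts.username}:***@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# Hypothetical example value, not a real credential.
EXAMPLE_URL = "postgres://app_user:s3cr3t@db.example.internal:5432/orders"
print(find_embedded_credentials(EXAMPLE_URL))
print(redact_url(EXAMPLE_URL))
```

Anyone who copies the unredacted string has everything needed to connect, which is what makes this kind of URL a secret in the sense described above.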

Priyanka Raghavan 00:04:03 And the “secret sprawl”?

Dwayne McDaniel 00:04:05 Oh, “secret sprawl” is the phenomenon where, well, they get into plain text, they get away from you. We put out this report we’ll talk about a bit later, the State of Secrets Sprawl report, and that’s specifically what we mean: these have been leaked somewhere that is accessible to someone aside from the application itself, like in memory.

Priyanka Raghavan 00:04:26 So we talked right now about different forms of secrets. You talked about passwords, API keys, tokens, certificates. Are there any other flavors of secrets?

Dwayne McDaniel 00:04:37 Oh, many. There are many ways to grant this access, as many as there are systems on earth, and there are a lot of systems on earth, but it’s just easier to categorize them all together. It’s semantic differences at some point. And there are systems out there that do make a distinction between an authentication key and an authorization mechanism, an authorization token. That’s actually kind of an important distinction in my opinion. We have married together this world of authentication and authorization in one form, and we call that the API key. If I have this key, then I have standing privileges to do a thing. That’s not great, because now that’s a very messy world, and when I revoke the access, I’m also revoking the authorization. But also, now I’ve tied them together in a way that makes it tricky to say, look, you’re allowed in, but you’re only allowed to do these things, because again, that key is just the key. Other problems are that they don’t expire by default. We’re moving to a world of certificates, X.509 certificates, and JWTs, or JSON Web Tokens, that do have set expiration dates or time periods they’re good for, but we’re seeing that adoption lag, because most systems just traditionally used, hey, this key gets you in and lets you do a bunch of stuff.
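The contrast between a standing API key and a token that names its scopes and expiry can be sketched in a few lines of Python. This is an illustrative, JWT-like format built only from the standard library, not a real JWT implementation; `SIGNING_KEY`, `issue_token`, `check_token`, and the scope names are all assumptions made for the example.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical demo key; a real signing key would live in a KMS or vault.
SIGNING_KEY = b"demo-only-signing-key"


def _b64(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _unb64(text):
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(payload):
    return _b64(hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).digest())


def issue_token(subject, scopes, ttl_seconds, now=None):
    """Issue a signed token that states who it is for, what it allows, and when it expires."""
    now = time.time() if now is None else now
    claims = {"sub": subject, "scopes": sorted(scopes), "exp": now + ttl_seconds}
    payload = _b64(json.dumps(claims, sort_keys=True).encode())
    return f"{payload}.{_sign(payload)}"


def check_token(token, required_scope, now=None):
    """Accept the token only if it is untampered, unexpired, and carries the scope."""
    now = time.time() if now is None else now
    try:
        payload, signature = token.split(".")
    except ValueError:
        return False
    if not hmac.compare_digest(signature, _sign(payload)):
        return False
    claims = json.loads(_unb64(payload))
    if now >= claims["exp"]:
        return False
    return required_scope in claims["scopes"]
```

Unlike a bare API key, a stolen token of this shape stops working once `exp` passes and never grants more than the scopes it lists, which is the property the expiring-credential formats mentioned above are meant to provide.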

Priyanka Raghavan 00:05:50 Okay, so that brings me in our next question, which is where do you store these secrets?

Dwayne McDaniel 00:05:55 Ideally you don’t have to, because you’ve gotten rid of them. There are many other ways you can approach access and authorization that do not require a long-lived secret, like just-in-time access that simply makes a token right when you need one. That’s the ideal situation in our opinion. But the messy reality of most companies is they have a ton of these, and the next-best answer is vaulting. They store them encrypted at rest and only pull them into the application when necessary and for just long enough to actually be used, then immediately erase or dump them from memory. There are plenty of providers out there. I don’t want to name all of them, but just ask your audience; you know who I’m talking about. HashiCorp Vault; there’s OpenBao, which is a complete open-source project now, based on that; CyberArk has Conjur. Like I say, there are many companies that provide this: Delinea, Doppler. Vaulting technologies aren’t unique. There are a lot of them. But that leads to its own problem: we have a lot of vaults, and now there’s this phenomenon called vault sprawl where, wow, where do I put the key, which vault’s the right vault? Are we sure we rotated it correctly across all the vaults? It leads to its own world of problems. That’s why I say it’s second best to a world where you’re accounting for identity and issuing just-in-time access that’s only available for, like, one time only.

Priyanka Raghavan 00:07:15 The last question I want to ask in the definitions part of our podcast is who produces these secrets in the typical software development lifecycle?

Dwayne McDaniel 00:07:25 That is going to depend on the company and their particular governance. Governance, I don’t think, can be productized. Governance is the end result of a set of conversations, decisions, what your hierarchy looks like inside of your organization. But the vast majority of these are simply created by developers who have the level of access to make API keys, which is pretty much every developer on earth. If you think of systems like GitHub, everyone with a GitHub account can go and make a personal access token. If you think of things like Salesforce and HubSpot, you do need a certain level of permission, but developers tend to have that level of permission because, well, they’re developing; they need access to those endpoints in order to get their job done. So, the easiest way to think of it is: if I need access to an endpoint or to make something happen on a platform, be it a simple SaaS platform that does one thing like Canva or a very broad platform like AWS EC2 instances, where you can do anything on those things, that is where these secrets come from. Infrastructure, and the world of non-human identities that need to connect together. It’s a term that has a lot of controversy around it because you’re defining a negative. But if you think of workloads like in the Kubernetes world, or running pieces of software that are executing within larger contexts, that’s where these keys really are. That’s what the secrets really go to.

Priyanka Raghavan 00:08:42 So let’s talk about how big the problem is, and here I’m going to use the latest GitGuardian report. I’m quoting from the report: the number of hardcoded secrets such as API keys, passwords, and certificates pushed into Git repos amounted to about 28.65 million in 2025, which is about a 34% increase since 2024. So, can you talk a little bit about these numbers and why they’ve increased so much as per your analysis?

Dwayne McDaniel 00:09:10 Absolutely. Well, let me start by explaining what we do and how we got to these numbers. And that’s, by the way, just public repositories, the numbers you just listed. So, since 2018, we have looked at every new commit that hits GitHub in public. You can as well. It’s a public feed, api.github.com/events. It’s a fire hose, almost 2 billion events last year. We’ve never seen this much code being pushed onto GitHub in public, and we scan every new commit and every new thing that becomes public that, you know, used to be private, and we have over 600 detectors on our platform. We do contextual analysis as well, meaning: is the string being used in a way that grants access, or is it used in a way that if it didn’t grant access then the app wouldn’t work? That’s the easiest way to think of it. It’s a little more detailed or nuanced than that.

Dwayne McDaniel 00:09:55 We send an email to the committer right then and there, automatically; we call it a Good Samaritan program. Your listeners probably have gotten an email if they’ve ever pushed code in their life. I know that’s how I discovered this company existed years ago, when I had shared my private SSH key into a public repo by accident. But yeah, just in the year 2025, and that’s not a cumulative number, just in the year 2025 we saw 28.65 million hard-coded credentials or secrets added to public GitHub repos. That’s a 34% increase over the previous year. Again, not cumulatively. We went back and applied the exact same methodology to the commits that we had seen back then. That’s a 34% jump. It’s just a lot.
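The idea of pattern detectors plus some context can be illustrated with a small Python sketch. This is not GitGuardian's detection logic; it is an assumed, simplified approach that combines two widely documented token formats (the `ghp_` GitHub token prefix and the `AKIA` AWS access key ID prefix) with a generic check that only fires when a high-entropy string is assigned to a suspiciously named variable. The threshold of 3.5 bits per character is an arbitrary value chosen for the example.

```python
import math
import re
from collections import Counter

# Format-specific detectors for two well-known token shapes.
SPECIFIC_DETECTORS = {
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}

# Generic detector: a quoted value assigned to a name that suggests a credential.
GENERIC_ASSIGNMENT = re.compile(
    r"(?:api[_-]?key|secret|token|password)\w*\s*[:=]\s*[\"']([^\"']{16,})[\"']",
    re.IGNORECASE,
)

ENTROPY_THRESHOLD = 3.5  # bits per character; illustrative value only


def shannon_entropy(text):
    """Average bits of information per character in the string."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def scan_line(line):
    """Return the names of every detector that fires on this line."""
    hits = [name for name, rx in SPECIFIC_DETECTORS.items() if rx.search(line)]
    match = GENERIC_ASSIGNMENT.search(line)
    if match and shannon_entropy(match.group(1)) >= ENTROPY_THRESHOLD:
        hits.append("generic_high_entropy")
    return hits
```

The variable-name requirement is a crude stand-in for contextual analysis: a random-looking string on its own is ignored, while the same string assigned to something named like a credential is flagged.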

Priyanka Raghavan 00:10:36 And do you think cloud automations and DevOps has made the secrets problem worse?

Dwayne McDaniel 00:10:42 Well again, that’s where these secrets come from. More infrastructure, more code, the more chances you’re going to hard-code a secret. The more confusing or difficult you make the process for the developer, the more likely they are to say, well look, a hard-coded secret works, uptime is more important than any other consideration, we’re in a hurry here. That is part of it. More code, more problems. That’s what I like to say. The types of things we’re building have also shifted, and I think we’re going to get into that a little bit later around AI coding assistants and that. But if you look at the actual infrastructure that we’re building, it is changing. It is MCP servers now; it is things like OpenRouter, which is a way to access many of the LLMs. We’re looking at infrastructure that is not only just new, but if you look at the config templates, a lot of them just have the hard-coded credential baked in because that’s the easiest way to communicate.

Dwayne McDaniel 00:11:38 This is where the access needs to occur, this is where it needs to be accounted for. Unfortunately, a lot of people that are coding now are new to coding. This isn’t just a new person problem, by the way. I don’t want to blame like this is just new users. It’s people using different tools, different platforms faster than we’ve ever adopted anything else in history. So that is leading to all of this set of mistakes. At the end of the day, these are human mistakes that we could have reviewed the code, we could have used tools to prevent ourselves from committing secrets, but we’re moving too fast for that and unfortunately,

Priyanka Raghavan 00:12:13 Yeah, I think that’s very interesting that you bring up that we have a lot more people touching a lot of these surfaces than before. So, the pace is, yeah, unprecedented. And if I look at the Verizon data breach report from 2025 itself, it said that credential abuse and phishing contributed to about 38% of the attacks from last year. And one of the things that a lot of people on Reddit forums et cetera talk about is also these new risks of the AI coding assistants, which you talked about, which are almost fueling a lot more of these leaks. A few days back, in fact, I had a particular case where I was using Claude with this multi-agent orchestration, and I was pretty much surprised: I had this .env file, which I had specifically told it was in my .gitignore. But when I was doing this multi-agent orchestration, somehow it still figured that the .env file probably had my credentials, and it spat it out to me.

Priyanka Raghavan 00:13:05 And of course then I went through this whole process of rotating my keys, et cetera. But based on this, what really came to my mind (when you get attacked personally is when you feel it the most, right, more than actually talking about it) is that we have these new non-human identity risks, which we hadn’t really thought about before. We also have these non-human identities, which are going to be using a lot of our secrets. So, I wanted to find out from you if you could talk a little bit about AI coding assistants, but also talk a bit about these non-human identities which are using these secrets, and OWASP basically now having a top 10 for this risk.

Dwayne McDaniel 00:13:45 Sure. These are highly related things. At the end of the day, all AI are non-human entities. They’re non-human identities, so it’s hard to talk about them separately. In all reality, I do think it’s important to draw a distinction between deterministic and non-deterministic, the agentic versus non-agentic, when we talk about those things. An AI coding assistant that just helps you complete a line of text within your coding is different than an AI like the one you’ve experienced, that spins up another agent and somewhere along the line says, you know what, this .gitignore file is in my way. We’ll just ignore it. Probably that’s what you wanted, because you wanted this end result and that’s the easiest way to get there based on the vectors that I can glue together and see what happens next. And there’s some kind of hallucination happening along the path to say, you know, .gitignore is optional, which is what you’re describing there.

Dwayne McDaniel 00:14:32 But getting back to the numbers from what we’ve seen, we specifically looked at Claude Code co-signed commits. There’s an ability to annotate your Git commits with additional signatures. Sometime in very early 2025 Claude Code added this ability, and we watched it get rapidly adopted. At the same time, we watched the number of secrets being committed in those particular commits spike terribly; by August it was almost 4x the baseline. Just to put it in perspective, the baseline is 1.5 secrets per thousand commits. Last year, I looked it up really quick, it was 1.94 billion commits on GitHub. And 5.6% of all public repos contain at least one hard-coded secret. When we look at Claude Code across the entire year, it was 2.4x the baseline, but there was a massive spike in August, and then they released a newer model, they updated the model, and we watched it ebb back down; not go to zero, and not cross back by the end of the year.

Dwayne McDaniel 00:15:32 It did not cross back underneath the threshold for sure, but it started to converge with the human baseline. Unfortunately, the human baseline was also ticking up. So, I think it’s a little inconclusive to say Claude Code makes you more likely to commit a secret. What we think is happening with Claude Code is that you are allowing Claude Code to go ahead and make the commit. If you are co-signing with Claude Code, you’re saying, alright, you’ve made the code, Claude, go ahead and just commit it for me. Which means you’re probably not spending as much time looking at what’s actually gone on, you’re probably not doing the local testing, and you’re probably not putting it through the same rigor that you would human-produced code: putting the right static checks in place to see, like, did I go to commit a secret? Pre-commit hooks? And again, as you pointed out, there are times when Claude could say, yeah, we’re just going to skip that. The Git hook might get in my way, let’s do an end run around that. Probabilistically that could happen, and you gave it agency to go ahead and affect the repo. That’s kind of a perfect storm of, yeah, eventually you’re going to do something where you’re just not going to run the tests and you’re just going to push it and it’s going to YOLO.
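A pre-commit check of the kind mentioned here can be sketched in Python. This is a minimal illustration under stated assumptions, not a replacement for a maintained scanner: the three patterns are a tiny illustrative sample, and `added_lines`, `find_violations`, and `main` are hypothetical names. It inspects only the lines a staged diff adds.

```python
import re
import subprocess

# A deliberately tiny, illustrative pattern set.
SECRET_PATTERNS = [
    re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
]


def added_lines(diff_text):
    """Yield (path, line) for each line that a unified diff adds."""
    path = None
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            path = line[4:].removeprefix("b/")
        elif line.startswith("+") and not line.startswith("+++"):
            yield path, line[1:]


def find_violations(diff_text):
    """Return (path, line) pairs for added lines that match a secret pattern."""
    return [
        (path, line)
        for path, line in added_lines(diff_text)
        if any(pattern.search(line) for pattern in SECRET_PATTERNS)
    ]


def main():
    """Scan the staged diff; a non-zero return value tells Git to abort the commit."""
    diff = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout
    violations = find_violations(diff)
    for path, _ in violations:
        print(f"possible secret staged in {path}; commit blocked")
    return 1 if violations else 0
```

A hook script at `.git/hooks/pre-commit` could end with `sys.exit(main())`. Because local hooks can be skipped with `git commit --no-verify`, whether by a person or by an agent acting on their behalf, a check like this is best paired with scanning on the server or in CI.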

Priyanka Raghavan 00:16:38 Well, I like the way you talked about it. At the end of the day, you also have to think of it as a tool and have the very same set of checks that you had with other tools that you used before.

Priyanka Raghavan 00:17:15 We talked a lot in this previous question about secrets leaking through code. What about logs, telemetry and even say debugging output, which I think can be part of logs. What do you think about that? That’s also a good vector, right? To get the secrets.

Dwayne McDaniel 00:17:33 A lot of people assume that because Git is in our name, that is our entire focus, and it was back in 2018. We’re here in 2026 now having this conversation, and we have found secrets basically in anything that can contain text. Logs are a great example, especially when we talk about the world of agentic AI. Any kind of AI LLM that you talk to is basically a black box. You know what you put in and you know what you got out. Now, I’m not saying the entire system is; if you are building a multi-agent system, there’s an ability to put some checks in between those steps, and there are places where you can logically do it. But the actual model, I jokingly refer to it as a random number generator. We don’t know. We know what we put in, we know what we got out.

Dwayne McDaniel 00:18:15 The only way to tune it is to keep track of what we put in and what we got out. That’s logs. That’s one way to get logs. I mean, we’re talking above and beyond your Jenkins logs, your CI/CD runner logs, all the other logs; we’re drowning in logs. And yes, sometimes secrets do end up there, but they also end up in places deliberately: Jira tickets, Slack messages, someone copy-pastes the thing into a secrets.txt file locally, screenshots of things like seed phrases. And you would think, okay, images, that’s got to be safe, right? No, we’re actually seeing recent attacks looking specifically for any images with any kind of telling name, and then they OCR them. I think processing-wise, since this stuff is going to be running on developer machines sooner rather than later, these attacks are not going to care about the processing cost: let’s just look at all the images and see if there’s any valuable information on there. It’s a bit future-looking, but we’re already starting to see that happen. So, secrets end up in so many more places than code. It’s very important to start scanning basically anything that has text within your organization and realize that that too is an exposure surface.
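One way to keep credentials out of application logs in the first place is to redact them before a record is written. The following Python sketch uses the standard `logging.Filter` hook; the class name `RedactSecrets` and the small pattern list are assumptions for illustration, and a real deployment would need a far broader pattern set.

```python
import logging
import re

# Illustrative patterns only: two well-known token shapes and user:password in URLs.
REDACTIONS = [
    re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    re.compile(r"(?<=://)[^/\s:@]+:[^/\s@]+(?=@)"),
]


class RedactSecrets(logging.Filter):
    """Rewrite each log record so matching secrets never reach a handler's output."""

    def filter(self, record):
        message = record.getMessage()  # apply %-style arguments first
        for pattern in REDACTIONS:
            message = pattern.sub("[REDACTED]", message)
        record.msg, record.args = message, None
        return True  # keep the record, now redacted
```

Attaching the filter to a handler, for example `handler.addFilter(RedactSecrets())`, applies it to every record that handler emits. This only covers logs the application itself writes; tickets, chat messages, and screenshots still need separate scanning.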

Priyanka Raghavan 00:19:24 And while you’re talking about this, I also wanted to ask you about the Kubernetes environments, right? The secrets there when you mount a secret point, I always find that a bit tough like it’s clunky, feel like that could be a potential case for secrets leakage. And if it’s possible for you to talk about some examples where you have secrets leakages from Kubernetes, just as an example, not quoting company names, but just to give us an example of what is a bad case of a Kubernetes leaked secrets?

Dwayne McDaniel 00:19:57 Yeah, without getting into the specifics, there are basically multiple ways to approach Kubernetes secrets. The best is to pull them in only when needed, when running, and then make sure they’re flushed from memory. That takes architecture; that takes some retooling of how you think about building your pods. Because the classic original way that seemed safe enough at the time was, let’s make a secrets folder and store them in there. Or let’s pull them in when the pod is built, and then they’re just stored there and they’re just going to live in that memory literally forever, or until the pod dies, which is enough time to exploit it. So yeah, there are many cases out there. I don’t want to pick on Kubernetes too much specifically, because I’m a big cloud native fanboy out here, quite honestly. But yeah, it’s just one of the examples of how we can build infrastructure insecurely by not thinking about these access mechanisms and who else might access them. I think personally with Kubernetes there’s still a bigger problem than even credentials: what is allowed to run as root. It’s confusing to get that security setting just right, and if someone gets access to that box, then my goodness, what can’t they do if they’re running as root?

Priyanka Raghavan 00:21:07 Yeah, came over. So, I think one of the things I took away from your answer was irrespective of the infrastructure, whether it’s Kubernetes or anything else, I think it’s a question of doing just in time to prevent secret leakage. So, let’s talk a little bit more when we come onto mitigations about that. And then the other thing I wanted to ask you is also secret risks from leakage, risks from say third party integrations and SaaS tools. We’ve seen a lot of examples. Can you talk a little bit about that?

Dwayne McDaniel 00:21:36 Yeah, the elephant in the room right now, the big one from last year, was the Salesforce breach. I actually saw a briefing on this from Cloudflare back at RSA Conference this year, where they’re pretty sure it started with one fairly non-technical person finding one SalesLoft credential that was overprivileged, throwing it against AI, and saying, what can I do with this? And then they found a bunch of Salesforce passwords, which ended up doing what that did. I’m not going to get into the full details of that breach here, but yes, third parties present this huge challenge in a few different ways. The biggest one is, well, if they can get into that and you’re storing your information in there, then wow, they’re going to get your information and they’re going to get access to your keys. There are plenty of examples of this. The Okta breach leading to the Cloudflare breach a couple years ago; it feels like yesterday, but just a couple years ago. Which is what I think we’re seeing now with Cisco, like this week. That’s why Cisco’s top of mind, where it’s literally a third party that gets compromised, but they had to have access into your system, and that access into the system was overprivileged and gave whoever held that piece of data, that connection string, a way to get in and realize, hey, I can laterally move around. I can live off the land in here and start stealing more credentials and just keep the cycle going.

Priyanka Raghavan 00:22:51 So I was just wondering, Dwayne, if you could build on that about the Cisco leak for listeners who are not aware. I can also put a link to one of the news articles in the show notes, but can you talk a little bit about this? So, this is from the Aqua Security Trivy leak, right?

Dwayne McDaniel 00:23:06 To be honest with you, this is an ongoing attack right now. Okay. And anything I would say, I would be repeating just what’s in the news out there. But yeah, Cisco’s source code, and there’s a group called ShinyHunters, and their TTPs, their common tactics out there, are to simply steal secrets somehow and then do bad stuff with them. And that’s exactly what we’ve seen. Cisco had over 3 million Salesforce records containing personally identifiable information, GitHub repositories, AWS S3 buckets, and other internal corporate data compromised and open sourced, or released onto the internet where basically anybody else can pick through it. And wow, there’s so much in there that can be done. If you’re listening to this right now and you are a Cisco customer, I would highly recommend rotating everything you have. And I wish I was joking; I wish there was a simpler explanation, a simpler remediation path than that. The people that are going to do well out of that are the people that can say, alright, run the automation script and it just rotates everything, because everything’s accounted for in a vault. Or the people that have already moved away from secret-based authentication and moved to, I shouldn’t say away from, because there’s always going to be, at the end of the day, there’s always going to be announced, there’s always going to be some secret keys somewhere. But they’ve moved to just-in-time, very short-lived things that are based on the identities, not the key existing.
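The "everything's accounted for" part of that advice can be sketched as a small inventory check in Python: given a list of known secrets and when each was last rotated, report which ones still predate an incident. The `SecretRecord` type, its fields, and `stale_secrets` are hypothetical, and a real inventory would come from a vault or secrets manager rather than an in-memory list.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class SecretRecord:
    """One inventoried credential (hypothetical schema for illustration)."""
    name: str
    owner: str
    last_rotated: datetime


def stale_secrets(inventory, incident_time):
    """Return names of secrets not rotated since the incident, oldest first."""
    stale = [s for s in inventory if s.last_rotated < incident_time]
    return [s.name for s in sorted(stale, key=lambda s: s.last_rotated)]


# Hypothetical data: one key rotated after the incident, two that were missed.
incident = datetime(2026, 3, 1, tzinfo=timezone.utc)
inventory = [
    SecretRecord("ci-deploy-key", "platform", datetime(2026, 3, 2, tzinfo=timezone.utc)),
    SecretRecord("legacy-runner-token", "build", datetime(2025, 11, 5, tzinfo=timezone.utc)),
    SecretRecord("db-password", "data", datetime(2026, 1, 10, tzinfo=timezone.utc)),
]
print(stale_secrets(inventory, incident))
```

A check like this only helps for secrets that are in the inventory at all; a credential nobody recorded cannot show up as unrotated.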

Priyanka Raghavan 00:24:26 Okay. And I think it would be good to maybe go over a little bit of what we saw last week, which is the supply chain attack on Aqua Security’s GitHub Action and Python’s LiteLLM package that essentially used malicious infostealer malware to steal a bunch of SSH keys, cloud credentials, Docker configs, and even crypto wallets. So, it affected a range of companies. Can you talk a little bit about that for listeners who don’t know about this? I’m obviously going to refer to your Snowball Analysis report. I’ll add that to the show notes.

Dwayne McDaniel 00:25:00 Thank you very much. So, I have a great team; shout out to Guillaume Valadon and Gaëtan Ferry, who are on my team. They’re amazing researchers, and they jumped on this immediately and did a deep dive, and it culminates in the piece you just mentioned. But we have a couple of others, like the one specifically about LiteLLM, where we released a tool that’s free to use for any developer on earth to discover all of the secrets on your machine and then put them in one dashboard. Our dashboard holds not the secret itself, but a reference to it, and then gives you a risk score. You know, like, hey, what order should I probably deal with these in? But this is unfortunately not a novel attack. It’s novel in the fact that it’s moving so fast, in the way that it’s AI-augmented, and in that it’s built on what worked well last year for things like Shai-Hulud and the s1ngularity attack, where the MO of the attack is, let’s just steal more credentials.

Dwayne McDaniel 00:25:49 Then we can turn around and use those to automatically infect things around it and automatically steal more credentials. It started, as far as we can tell, with Aqua, about a month prior, so we’re talking about six weeks ago now. There was a stolen credential from a GitHub task runner, I believe it was a task runner, but a GitHub CI process. A GitHub Action, sorry; a GitHub Action credential was stolen. I do believe in my heart of hearts that Aqua thought they had rotated everything. This is very common out there. This is exactly the story of Cloudflare, when they thought they had rotated everything, and they simply missed one, and then the attackers got in with an old credential. That’s exactly what happened. They were able to compromise Trivy based on the fact they could still affect the CI/CD. Then it just started spreading from there, and it has escalated. It started with Trivy, then went to the rest of AquaSec. Then KICS, I forget what KICS stands for, but it’s from Checkmarx; it’s an IaC tool.

Dwayne McDaniel 00:26:50 Infrastructure-as-code checking tool. It’s a great tool, except for that one particular version that got compromised. Those are well-used tools, but I don’t think they’re universal. I think they’re loved by the security community. But are they on every developer machine? No. Then we get to the world of LiteLLM, which is a step up in the attack because that is used by so many things. LiteLLM is the open-source framework for connecting together all of these LLM pieces, like, call multiple LLMs. It’s like your local version of OpenRouter. I’m oversimplifying; forgive me for oversimplifying a bit for your audience. But that is massive. And then that led to, two days ago, Axios; not the news site. It’s interesting that there is a news site called Axios. But Axios is the HTTP library that runs basically under everything.

Dwayne McDaniel 00:27:39 So if you got a package and you implemented and you invoked the Python environment whatsoever, it just ran and now you have a credential stealer on your machine. Everyone that’s listening unfortunately should feel like they are compromised right now. That’s a little tinfoil hat paranoia, but it’s the actual reality on the ground. Assume if you have credentials for a production environment, you should go ahead and rotate those. Now if it’s for your local stuff, it’s for your local like test database. Do whatever you want to do. But if you have anything related to your organization on your work machine right now that could by itself let someone into that environment a secret. Go ahead and consider it compromised and start rotating and again, move away from it if you can. I’m trying to figure out the better way to have that conversation with folks. To be honest with you, like we were good at helping you understand, prioritize the scope of this, but the actual what do you do next is going to depend a lot on how your organization has already dealt with this.

Dwayne McDaniel 00:28:40 If you’re already in the process of rolling out things like Aembit, or SPIFFE/SPIRE for internal stuff, then absolutely I would move faster in that direction. If you have vaulting systems that you can use, use those. If you are completely without anything, there are free open-source alternatives like KeePass; download it. There’s even a great open-source project called SOPS, S-O-P-S, I forget what it stands for, but it lets you encrypt files at the line level in place. So, it’s a great way to say, hey, I can’t get rid of the .env file, but what I can do is, while it’s on my machine not being used, keep it completely locked down and garbage. So, if anybody found it or it got exfiltrated, it wouldn’t do anybody any good out there. So, I don’t mean to be alarmist, but that’s literally the situation, how crazy it’s gotten out there right now.

Priyanka Raghavan 00:29:28 Yeah, in fact we use Axios even in our front end, for making all the HTTP GET calls. So, I just upgraded Axios but have not rotated the secret. So, I’ll do that after this. So yeah, thanks for that. I also wanted to ask you, in light of these incidents, what should organizations do when they analyze these incidents? Are there some common root causes? How can you go about fixing that? One of the things you talked about before was having these just-in-time secrets and almost changing things architecturally, but are there some, I would say, top three items that teams should be looking at when they want to do a root cause analysis?

Dwayne McDaniel 00:30:09 The first is these are well-known indicators of compromise out there. There’s plenty of research being done. There are some great resources I can point you to. One that I’ve started checking every day of my life is opensourcemalware.com. It sounds like a scary place but it’s really not. It’s a bot-driven thing put together by some brilliant security researchers, Paul McCarty and Jen Gilles. And it’s pretty real time on keeping up with supply chain attacks, keeping up with malware that we’re seeing out there in the universe, and we’re watching it unfold in real time. So, take a quick look. This is where SBOMs come in really handy. In fact, you could probably do some automation to start getting email alerts if something on your SBOMs got popped, and that’s a good way to know. So, I say be tinfoil-hat-y, and if you’re not sure, just assume that you’re compromised.
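The SBOM automation mentioned here could look roughly like the following. This is a minimal sketch, assuming a CycloneDX-style JSON SBOM with a top-level `components` list of `name`/`version` entries. The advisory feed, package names, and versions are hypothetical placeholders, not real indicators of compromise, and the alerting step is reduced to a print.

```python
import json

# Hypothetical advisory feed: package name -> versions reported as compromised.
# These entries are illustrative placeholders, not real indicators of compromise.
COMPROMISED = {
    "example-http-lib": {"9.9.1"},
    "example-llm-proxy": {"0.0.7", "0.0.8"},
}


def find_compromised(sbom: dict, advisories: dict) -> list:
    """Return (name, version) pairs from the SBOM that match an advisory."""
    hits = []
    for component in sbom.get("components", []):
        name = component.get("name")
        version = component.get("version")
        # Exact name and version match only; near-miss names are ignored.
        if version in advisories.get(name, set()):
            hits.append((name, version))
    return hits


# Assumed SBOM shape (CycloneDX-style JSON), inlined so the sketch is self-contained.
SAMPLE_SBOM = json.loads("""
{
  "bomFormat": "CycloneDX",
  "components": [
    {"name": "example-http-lib", "version": "9.9.1"},
    {"name": "example-http-lib-utils", "version": "1.0.0"},
    {"name": "example-llm-proxy", "version": "0.0.6"}
  ]
}
""")

if __name__ == "__main__":
    for name, version in find_compromised(SAMPLE_SBOM, COMPROMISED):
        print(f"ALERT: {name}=={version} is on the advisory list; rotate related credentials")
```

In practice the advisory dictionary would be refreshed from whatever feed the team trusts, and a hit would trigger the rotation process rather than a console message.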

Dwayne McDaniel 00:30:56 But if you know we don’t use any of this stuff, like nothing that’s been compromised is in my environment, okay, I would feel a little safer. But still we need to move toward: get it in a vault, get it encrypted as quick as possible, and start with the most critical for you. Now, what most critical means is going to depend on you and your governance and what’s critical for your business. So, if you have never mapped that out, like the priority of “if this gets popped, we’re all doomed,” then that’s a good bigger step you should be taking from a posture standpoint. But yeah, ultimately, I personally believe that we need to move toward verifiable identity as the thing that we permanently hold. This works for humans fairly well because we live in a world of passkeys. Not to say that, you know, humans are flawless at all.

Dwayne McDaniel 00:31:41 We make a lot of mistakes, but it’s easy to prove you are you at a certain level, be it your thumbprint, be it your retina scan, something that’s uniquely you, that verifies you. Now, based on that, we can start building trust from there. Transferable and transportable trust. With machines it’s actually a little bit easier at the end of the day. It just takes a little bit of rethinking what we’re doing. So instead of standing privilege, you have standing identity, and that identity doesn’t do anything other than prove you are you through cryptographic proof. So, you are running on this Unix socket, you’re communicating over this Unix socket, you are on this stack, you are this user agent, you were born at this time. We have proof of all of this. So, if I can carry that with me, or the entity can carry it with it, then there’s all sorts of neat things you can do.

Dwayne McDaniel 00:32:26 SPIFFE/SPIRE is, again, a Cloud Native Computing Foundation project. So that’s where I would send everybody to start. There’s a great free book everybody in the world should read. It’s called Solving the Bottom Turtle. And the name comes from the fact that if you put things in a vault, well, you have to have a key to the vault. And then what do you do with that key? Well, you put it in a better vault and then you need a key for that. So, it’s keys all the way down. It’s only when we move to, hey, what’s the thing we know is absolutely true? Oh, you’re you, you can prove you’re you. Then we can start thinking about different ways to think about authentication and authorization. So, I could do things like AWS Security Token Service, which is one of my favorite things on earth that exists, because, well, now it’s federated.

Dwayne McDaniel 00:33:06 I’ll get to that in a second. But you can take your cryptographically provable identity to this service, and it can do all the checks and say, yes, here’s a JWT, or JSON Web Token, that will last for five minutes. That will let you prove you’re you to another entity. You take it to the other entity, and whatever service or platform it is does the verification step and says, hey, is this real? Can we prove all of this? And it’s only at that point that we start talking about what’s the intention here. And that’s the other piece. It has to be intent-based thinking around identity. What is this identity doing? Why is it here? What’s it supposed to be able to do? It doesn’t get standing privilege. So, we will broker this out. It gets this role if it crosses this threshold from AWS to Azure to specifically do this one thing on this one service.

Dwayne McDaniel 00:33:57 And anyone that tries to hijack that and tries to do anything else is an automatic no. You’re not allowed. Even if you could get your hands on that token and work fast enough, which I do believe we’re starting to see with just machine speeds and how fast AI can do this stuff, even if it does that, it’s like, okay, you get to a wall. You’re only supposed to read out of this database, you can’t overwrite it. You’re only supposed to read these tables, you cannot read the entire thing. You can only affect this area, and that limits the blast radius. At the end of the day, attackers are going to innovate and get around things. So, it’s all about limiting the blast radius as much as possible, and the easiest way to do that is simply get rid of standing privilege, get rid of long-lived credentials, and move to, I think right now, anything else.
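The five-minute, single-purpose token described above can be sketched as follows. This is a minimal illustration, not how AWS STS or any real token service works internally: it assumes a shared HMAC key stands in for the issuer’s signing key, and the identity string, scope names, and function names are hypothetical. Real deployments would typically use asymmetric signatures and a standard JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Assumption: a shared HMAC key stands in for the token service's signing key.
SIGNING_KEY = b"demo-only-key"


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def issue_token(identity: str, scope: str, ttl_seconds: int = 300,
                now: Optional[float] = None) -> str:
    """Issue a signed token naming one identity, one scope, and an expiry."""
    issued_at = time.time() if now is None else now
    claims = {"sub": identity, "scope": scope, "exp": issued_at + ttl_seconds}
    payload = _b64(json.dumps(claims, sort_keys=True).encode("utf-8"))
    signature = hmac.new(SIGNING_KEY, payload.encode("ascii"), hashlib.sha256).digest()
    return payload + "." + _b64(signature)


def authorize(token: str, requested_scope: str, now: Optional[float] = None) -> bool:
    """Allow the request only if signature, expiry, and scope all check out."""
    current = time.time() if now is None else now
    try:
        payload, signature = token.split(".")
        expected = hmac.new(SIGNING_KEY, payload.encode("ascii"), hashlib.sha256).digest()
        # Constant-time comparison; the payload is only parsed after it verifies.
        if not hmac.compare_digest(_b64(expected).encode("ascii"), signature.encode("utf-8")):
            return False
        claims = json.loads(base64.urlsafe_b64decode(payload))
    except ValueError:
        return False
    if current >= claims["exp"]:
        return False  # short-lived: a stolen token stops working after the TTL
    return claims["scope"] == requested_scope  # anything else is an automatic no
```

A token issued for reading one table fails for any other action and for any request after five minutes, which is the blast-radius limit being described: a stolen token is worth very little.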

Priyanka Raghavan 00:34:40 So use machine identities instead of these long-lived credentials, and for that, what you’re talking about is SPIFFE/SPIRE. I just looked it up online. SPIFFE stands for Secure Production Identity Framework For Everyone, and SPIRE is the SPIFFE Runtime Environment, right?

Dwayne McDaniel 00:34:58 Yeah, SPIFFE Runtime Environment. It’s an open-source implementation, and this isn’t the future. This project’s eight years old and it’s based off stuff that Google’s been doing for over a decade. You think Google handles API keys between their servers? They haven’t done that in over a decade. SPIFFE came out of basically open-source people realizing that and saying, wait a minute, what if we just built this? What if we just built this standard out? And we’re actually seeing that get interpreted into some really neat stuff in the real world. Like I say, all the platforms now, all the big ones, are supporting this idea of federated identity, where my token service can talk to your token service and then we’ll figure out, through configuration and allow listing, what you’re allowed to do and what you’re not allowed to do.
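To make the “configuration and allow listing” idea concrete, here is a small sketch. SPIFFE IDs are URIs of the form `spiffe://<trust-domain>/<path>`; the parsing below is a simplified check rather than a full implementation of the SPIFFE specification, and the trust domains, workload paths, and action names in the allow list are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical allow list: (trust domain, workload path) -> permitted actions.
ALLOWED = {
    ("prod.example.org", "/billing/api"): {"invoices:read"},
}


def parse_spiffe_id(spiffe_id: str):
    """Split a SPIFFE ID of the form spiffe://<trust-domain>/<path>.

    Simplified validation only; the SPIFFE specification defines the full rules.
    """
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe" or not parts.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    if parts.query or parts.fragment:
        raise ValueError("a SPIFFE ID carries no query or fragment")
    return parts.netloc, parts.path


def is_allowed(spiffe_id: str, action: str) -> bool:
    """Default deny: unknown identities and unlisted actions are refused."""
    try:
        key = parse_spiffe_id(spiffe_id)
    except ValueError:
        return False
    return action in ALLOWED.get(key, set())
```

Note that the ID itself is not a secret. It only names the workload; proof of that identity comes from the signed document (an X.509 certificate or JWT) that carries it, which this sketch does not verify.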

Dwayne McDaniel 00:35:39 This actually goes back to the problem of SaaS and what we’re seeing emerge next. So right now, the problem with SaaS and third-party providers is, well, I need a way for us to authenticate together, and the API key is the simplest way to do that. And that’s why there are so many and there’s so much to clean up. But there are standards. SPIFFE works internally very, very well. It works really well for Kubernetes. This idea of federating that outside of your platform, outside of your trust boundary, that’s fairly new. I’ve only seen that really emerge in the last year, where we talk about this platform federation. And why STS from AWS specifically is top of mind for me is, it was November. November is when I learned about it. But I think they did it in August. That’s when they added federation, that, hey, there’s a way to verify this even if you’re on another platform.

Dwayne McDaniel 00:36:26 There’s a standard emerging from the IETF, the Internet Engineering Task Force, called WIMSE, Workload Identity in Multi-System Environments. It’s being drafted and it’s the conversation, but the rest of the world is not waiting for the IETF to officially say this is the protocol. They’re just building it, and thank goodness. But it’s recursive, in that the people that are building this in the real world, in some very large organizations, are also saying this is how the standard should work for everyone. And it’s like the way HTTPS didn’t use to be common. You had an HTTP site, right? HTTPS was like, why am I going this extra step that feels unnecessary? I remember living through that. I’m old enough; the audience can’t see me, but I earned my gray beard. Soon we will say, are you WIMSE compliant? Like, how do I work with you across the standard? And we’re already seeing it. It’s just that the standardization of it is lagging a little behind the reality of future-looking enterprises. The early adopters are already early adopting, and the latecomers and the people that are reluctant, they’re going to be dragged into the future here very shortly. And I personally believe the attacks like we’ve seen, like Trivy, like LiteLLM, are only going to accelerate that. We need to do something, we need to do something now.

Priyanka Raghavan 00:37:41 So one of the ways you see that engineering teams should future-proof their secrets management strategies is probably adopting something like WIMSE?

Dwayne McDaniel 00:37:52 I mean, WIMSE would just be the standard underneath it. It’s like saying adopt TCP/IP. Like, no, it’s just what we’re going to build everything against. You’re compliant with the standard, therefore I know how to talk to you. That’s already going to be set up. So that’s coming. What engineering teams should be doing right now is building a governance plan. I know governance never sounds like the right answer, but again, you don’t buy governance. There’s no product that can 100% guarantee you governance. It’s like SOC 2 compliance. You can’t buy SOC 2 compliance. I mean, people try to sell it, but man, you cannot buy it. It’s something you have to work toward. You have to prove this is how this works. And so that’s what governance plans really are: what state should these things be in?

Dwayne McDaniel 00:38:28 And that starts really with understanding what you have. If you’re sitting there and you’re like, I don’t even know what secrets exist within my org, that’s step one. Have some form of common inventory, and that’s across your repos, that’s across your vaults, that’s across your other SaaS platforms. That’s across your providers themselves. Like, if you don’t know how many service accounts you have in Entra ID, that’s a good place to start. Let’s just start there. What do these things have access to? If you think about it from, why do you need a governance plan? Because eventually auditors are going to knock at your door. If you’re a sizable organization, they’re going to ask, what do you have? What state is it in? Was it compromised? What remediation steps have you taken to fix that? And if you have answers at the ready for all of that, you’re going to look pretty good no matter what happens. There might still be some bad days here and there because, you know, it’s tech, reality happens, but you’re going to be in a much, much better position in the longer run.
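A common inventory can start as something very small. The sketch below is one hypothetical shape for it: each record answers the auditor-style questions above (what is it, where does it live, who owns it, when was it last rotated, what can it reach). The field names, sample entries, and the 90-day rotation threshold are illustrative assumptions, not a recommended policy.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class SecretRecord:
    name: str
    location: str        # e.g. repo, vault, SaaS platform, cloud provider
    owner: str           # empty string when nobody has claimed it
    last_rotated: date
    grants: tuple        # what the credential has access to


def audit(records, today: date, max_age_days: int = 90):
    """Return (name, finding) pairs: credentials with no owner or overdue rotation."""
    findings = []
    for record in records:
        if not record.owner:
            findings.append((record.name, "no owner"))
        age = (today - record.last_rotated).days
        if age > max_age_days:
            findings.append((record.name, f"not rotated in {age} days"))
    return findings


# Hypothetical inventory entries for illustration.
INVENTORY = [
    SecretRecord("ci-deploy-token", "ci/cd", "platform-team", date(2024, 1, 10), ("deploy:prod",)),
    SecretRecord("legacy-db-password", "repo:.env", "", date(2021, 6, 1), ("db:admin",)),
]

if __name__ == "__main__":
    for name, finding in audit(INVENTORY, today=date(2024, 3, 1)):
        print(f"{name}: {finding}")
```

Even a flat list like this gives a team something to sort by criticality when deciding what to rotate or migrate first.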

Dwayne McDaniel 00:39:26 If you just start there, then you can start building out, all right, well, where should secrets live? It’s a fun exercise actually. Maybe it’s not fun for everybody, but to sit there and think, okay, what are my most mission-critical systems? How are we storing secrets for that? How should we be storing secrets for that? What’s the process of moving toward something better than that? And if you do that system by system, it’s a big process, but just on paper, you and your team could do that in, you know, one meeting and just say, okay, here’s the general plan. Are we all agreeing to this? Now, how do we get to that governance plan? That’s going to depend on where you are with your organization, how mature you are with your tooling. And if you’re a small organization and you don’t think you’re going to be regulated for years, you have the opportunity to say, look, why are we doing the API key thing? Why are we making the same mistakes we made for the last 40 years in tech? We’re building brand-new stuff with brand-new platforms. Let’s figure out a better way. Not even a brand-new way, but a better way to handle this authentication and authorization game.

Priyanka Raghavan 00:40:29 That’s great actually. That really got me thinking. Even for small teams, if you’re building everything from scratch, why not use the latest and greatest, like maybe SPIFFE/SPIRE as you say, and build it without all the old problems.

Dwayne McDaniel 00:40:43 My favorite personal story from RSA was, I was literally having a conversation like this with someone who was building a brand-new platform. It’s a good idea for a SaaS. I’m not going to share his name or what specifically, but he was talking to me about this and he’s like, how do I need your services? And I’m like, if you do it right, you never will. You’ll never need GitGuardian because you’re not going to have any secrets to leak. And he is like, what do you mean? And I sent him the SPIFFE/SPIRE book. Literally right then and there, I looked him up on LinkedIn. Two days later he is like, this is exactly what I needed to know. We’re doing this, not what I had planned to do, which was all API keys and vaults. I felt that was my biggest win. Again, I see my entire job as helping people figure stuff out. I didn’t say it at the beginning of the show, but that’s really what drives me. Just, hey, here’s what my product does and here’s how you use it, is very boring to me. What’s the bigger picture? What are we moving to next and how do we get there? That keeps me going every day.

Priyanka Raghavan 00:41:44 Okay. So, I guess if I had to ask you for one practical piece of advice to software engineers about secrets and reducing “secret sprawl”, what would that be?

Dwayne McDaniel 00:41:56 The number one thing is know what you have. If you don’t have a full inventory, if you don’t know for sure what is on your machine, if you don’t know for sure what is in your systems, there’s no way to defend it. That’s rule number one of threat modeling. Know what you have. That’s never changed, that’s never going to change. We didn’t really talk about the top 10 lists; we didn’t talk about the OWASP Top 10 for NHI, which is an awesome list. There’s one for LLMs, there’s one for anything under the sun. They’re all great. But OWASP’s top 10 lists are not prescriptive. They are reflections of reality. What are the actual problems we’re running into? Here it is. Can they be solved? Maybe. Will there always be a top 10 list? Yes, that’s how they work. Don’t think of it as, if I do these 10 things then I’m safe.

Dwayne McDaniel 00:42:42 No, those are the 10 most common mistakes you can make. See it that way and deal with those and you’re going to have a much, much better time. If you look at the Top 10 for NHI from OWASP, I personally believe that it breaks down into three buckets. Ownership: who sunsets this thing? What does long-lived mean? Then the actual long-lived secrets themselves, the technical piece of data. Is it over-permissioned? What state is it in? Is it leaked? And then the technical complexity of all this. The craziest part of everything I’m talking about, moving to these other methodologies, these workload-identity, identity-based approaches, is that they sound complicated at first, until you step back and realize the reality of living in a world where we have to account for a secret through, essentially, obfuscation through encryption. Like, if I get into your code and I see that you’re calling a vault and I’m determined to get in, I’m going to start looking for that vault key, because there’s a vault key, and I’m going to start working to get in. If I get in and see SPIFFE IDs everywhere, or calls to AWS STS, I’m going to get discouraged, because wow, there’s no root key I can just get into now. I’m going to actually have to own the platform in order to do what I wanted to do. A much harder, much more difficult path: taking over an entire CA versus just finding a key. That’s the difference in technical complexity. Maybe that doesn’t matter to an AI in the future, but for right now it does. That’s my great hope.

Priyanka Raghavan 00:44:01 I think that’s a great piece of advice. Actually, stepping back and doing a threat modeling exercise, or maybe even looking at your architecture, no matter how big your team or product is, is a great way of actually finding out what your true assets are and then trying to see a good way to protect them. So, I think that’s a good piece of advice and I think that’s something that will stay with us. So, I’ll definitely add that.

Dwayne McDaniel 00:44:24 There’s actually a free resource I recommend to everyone. Yeah, I think you can buy paper copies of it, but OWASP has a threat modeling exercise called Cornucopia. They sell it as a game. It is not a game. The video makes it extremely clear: this is a threat modeling exercise you’re going to do with your entire team. Now, they do gamify it a bit. I shouldn’t say it’s not fun, it’s actually enjoyable to experience, but it’s tabletop. It’s literally talking through, if this happens, what do we do? If these two things happen, what’s the higher priority? It’s having that level of conversation. Now, that’s across all AppSec, but it’s a really great resource to get your hands on. I built a specific secrets version of that, and if people reach out to me, we can make those available. It’s called Spot the Secrets, and it’s just, this is what hard-coded secrets look like inside of little pieces of code or snippets.

Dwayne McDaniel 00:45:09 Then we use a UV light to tell you if you’re right or not. I did that for Defcon, for AppSec Village, a couple of years ago, but it doesn’t really matter if you use mine or you use anybody’s. Just print something out. Backdoors & Breaches is another fun one. But tabletop: the more conversations you have, the more you can refine the plan, and the conversations shouldn’t be in isolation. They should always be in service of the larger governance plan that you’re trying to put in place. And if you don’t have a governance plan, if you’re not talking about governance plans, maybe it’s time to start.

Priyanka Raghavan 00:45:35 That’s really good. And I want to actually ask you a couple more questions in terms of, you know, detection. When organizations discover that a secret is leaked, one of the things you do is of course rotate the secrets. What are the other tools that one could use? Like, how do you find out a secret is leaked? Is that the place where I go to opensourcemalware.com?

Dwayne McDaniel 00:45:56 Well, Open Source Malware is about the ongoing supply chain attacks and what’s actually going on in these packages. GitGuardian, shameless plug here, we make a tool that literally gives you that insight; secrets visibility is how I always like to think about it. But if you’re starting from scratch, you’re starting from nowhere. How I started actually working at GitGuardian was I was talking about building with open source, pre-commit hooks, like, stop yourself from committing known secrets. I started with AWS Secrets; it’s still a good project but it’s AWS specific. There are open-source products out there: Betterleaks, from Aikido now, that used to be Gitleaks. There are other open-source vendors out there, and if you have nothing and you have no budget whatsoever, you can use us for free if you’re an individual developer, or try one of these open-source tools and just run it locally.
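The “stop yourself from committing known secrets” idea can be sketched in a few lines. This is a toy illustration of pattern-based scanning, not how GitGuardian or any of the named tools works. The first pattern reflects the commonly documented shape of long-term AWS access key IDs (`AKIA` followed by 16 uppercase letters or digits); the second is a generic heuristic invented for this example. The sample input uses the placeholder key ID from AWS documentation.

```python
import re

PATTERNS = {
    # Long-term AWS access key IDs: "AKIA" plus 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # Hypothetical heuristic: a quoted value of 8+ characters assigned to a
    # variable whose name mentions password, secret, api_key, or token.
    "hardcoded_assignment": re.compile(
        r"(?i)(?:password|secret|api_key|token)\w*\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
}


def scan_text(text: str):
    """Return (line_number, pattern_name) for each line that matches a pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


# Placeholder values only; the key ID is the example used in AWS documentation.
SAMPLE = (
    'aws_key = "AKIAIOSFODNN7EXAMPLE"\n'
    'name = "demo"\n'
    'db_password = "correct-horse-battery"\n'
)

if __name__ == "__main__":
    for lineno, name in scan_text(SAMPLE):
        print(f"line {lineno}: possible {name}")
```

Wired into a pre-commit hook, a non-empty result would block the commit. Dedicated scanners cover far more credential types and, as discussed next, go beyond pattern matching.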

Dwayne McDaniel 00:46:40 If you start with just your local machine, and you start with just what they were looking for in the Trivy attack or the LiteLLM attack, what specifically they were trying to dump, and just start there, it’s a pretty good starting point. But there are many ways you can do this. At the enterprise level there are a lot fewer options, because you can use grep to scan for patterns, but unless you’re doing the contextual analysis of, does this weird-looking string grant access, there are so many ways around that. There are so many ways to fool the trackers and, shockingly enough, developers just want their code to work. Doing it the right way means it works. Doing it the safe way might not be the right way in their opinion. It should be the same way. The easiest path should be the safest path. And that’s what security teams should do. If you’re a security person out there listening to this, please work with your teams to understand what they’re doing and why they’re doing it, help them adjust, give them better tools, give them better options than just, hey, don’t hard-code your secret, and that’s all the advice you give them. That’s good advice. But wow, you need to give them better paths to not do that.
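One cheap step beyond grep-style patterns is to ask whether a string looks random at all. The sketch below computes Shannon entropy per character; the length and entropy thresholds are illustrative assumptions. This is only a heuristic for surfacing “weird-looking strings”. It is not the contextual analysis described above, which also has to establish whether the string actually grants access.

```python
import math
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Shannon entropy of the string, in bits per character."""
    if not s:
        return 0.0
    counts = Counter(s)
    total = len(s)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def looks_random(s: str, min_length: int = 20, threshold: float = 3.5) -> bool:
    """Flag long strings whose character distribution resembles generated keys.

    Both thresholds are assumptions chosen for illustration; tune them on real data.
    """
    return len(s) >= min_length and shannon_entropy(s) > threshold
```

Entropy checks produce false positives (hashes, UUIDs, minified assets) and false negatives (human-chosen passwords), which is part of why pattern matching and entropy alone are easy to get around.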

Priyanka Raghavan 00:48:13 I was also thinking about these sophisticated solutions like EDRs, which is Endpoint Detection and Response. Would that also help?

Dwayne McDaniel 00:48:20 It depends. It honestly hasn’t, and there are multiple reasons why that’s failing us. One, these are known packages that don’t have vulnerabilities until they do, like Trivy. Well-trusted, very good system. It wasn’t until 47 of the 48 tags, or 48 of the 49 tags, got corrupted, and by the time we knew that, the damage was done. So, if you’re looking for, hey, the CVE is bad, we all agree, don’t run anything with a CVE. But what happens when it didn’t have the CVE yesterday, and that’s when it updated? And by the time we made the CVE, there were already a million people affected. That’s why EDR ultimately fails in this. The unfortunate, sad part here is, if you have credentials on your machine and you are using any kind of build system right now, any kind of package manager, you are susceptible to this kind of attack.

Dwayne McDaniel 00:49:09 If you’re using Open VSX, the extensions for VS Code or all the things that use VS Code, that ecosystem is also being actively attacked right now, so make sure you’re trusting your sources. Make sure you’re pulling from very reputable places. Pull them directly from vendors if and when possible. But as we just saw with Aqua, maybe that’s not good enough anymore. That’s where players like Chainguard really stand out, in my opinion. They do the work of rebuilding from scratch, and you’re pulling their version, not the Internet’s version. They might pull the corrupted thing into their sandbox, but by the time you see it, that malware is gone. I mentioned them by name; there are other people that do that out there. But that’s the reality we’re facing now: there’s no good way to trust a binary anymore.

Dwayne McDaniel 00:49:50 There’s no good way to say we trust you and never be compromised. And that’s why these recent attacks have really hit me as hard as they have. Like, personally, this freaks me out a bit. I’m going to be complete open kimono, like complete open chest here. I’m a little bothered by how fast this is all happening, and specifically what we’re seeing and what this means for the developer life, where I used to just have environment variables on my local machine and that was how it worked, and that was safe. You had to get remote access into my machine. You had to look over my shoulder and hope that I exposed one of those. And now it is, you trusted a security tool that was keeping you safe up until it didn’t, and now everything on your machine is on the internet, is public.

Priyanka Raghavan 00:50:31 I think that’s a good place to end the show, Dwayne. Thank you for coming on SE Radio. I think you’ve given us a lot of good tips. And before I let you go, I have to ask you, where can people find you in cyberspace?

Dwayne McDaniel 00:50:44 I have a website. I actually vibe coded it. I’m very proud of it. DwayneMcDaniel.com, and that’s got all my links. GitGuardian.com is my place of employment, and I write a lot on our blog, so if you go to blog.GitGuardian.com, you’ll see my face, or maybe not immediately my face; you’ll see my name associated with a good number of articles. I write a lot of recaps for events. I go to a lot of events. So, if you’re in North America, primarily I go in North America, maybe I’ll see you at a BSides or a DevOps Days or RSA or Identiverse or Defcon. I get around.

Priyanka Raghavan 00:51:16 Thank you once again. This is Priyanka Raghavan for Software Engineering Radio. Thanks for listening.

[End of Audio]
