Episode 76

Discovering and Stealing Secrets with Mackenzie Jackson

EPISODE SUMMARY

Mackenzie Jackson, developer advocate at GitGuardian, joins Delinea's Joseph Carson to discuss how breaches can be avoided by securing source code. GitGuardian is the code security platform for the DevOps generation, helping to prevent secrets from hiding in source code. In this episode, Jackson explains how attackers exploit these vulnerabilities and how developers can proactively prevent attacks.

Watch the video or scroll down to listen to the podcast:

Subscribe or listen now:  Apple Podcasts   Spotify   iHeartRadio

Hello from Cybrary and Delinea, and welcome to the show. If you've been enjoying the Cybrary Podcast or 401 Access Denied, make sure to like, follow, and subscribe so that you don't miss any future episodes. We'd love to hear from you. Join the discussion by leaving us a comment or a review on your platform of choice or emailing us at Podcast@Cybrary.it. From all of us at Cybrary and Delinea, thank you and enjoy the show.

Joseph Carson:

Hello everyone, welcome back to another episode of the 401 Access Denied Podcast. I'm the host, Joe Carson. It's a pleasure to be here with you. And it's so exciting, I really enjoy having amazing guests on the show, it really makes a difference, and I'm again joined by another amazing guest, somebody I met quite a number of years ago. So welcome to the show today. Welcome, Mackenzie. Tell us a little bit about yourself, what you do, and even some things that maybe people don't know about you.

Mackenzie Jackson:

All right, yeah, great to be here. Thanks for inviting me, Joe. Yeah, so Mackenzie is my name, I'm the Security Developer Advocate at GitGuardian, which is a security vendor. Before that, I was the co-founder and CTO of a company in Australia called Conpago, which still exists today. And yeah, something that most people don't know about me: I grew up in the circus. I have extreme hippie parents, we were always traveling around, and I decided to rebel and get into IT and tech and wear suits.

Joseph Carson:

Another circus. You joined a different type of circus. We joined the technology circus because sometimes that's what it feels like.

Mackenzie Jackson:

That's for sure.

Joseph Carson:

But it's great to have you on the episode today. I'm really excited about today's conversation because it's something I think... In previous episodes we've had a few people touch on it. We've had Dustin Heywood, also known as EvilMog, and we had Carlos Polop, and we touched on it a little bit, which is all about secrets. One of the big themes of today's episode is where attackers are looking for secrets, what secrets are, how they're discovering them, and how they can abuse them. And ultimately, of course, secrets can take many forms. They can be passwords, tokens, keys, or simply hardcoded passwords and passphrases. So "secrets" is basically the umbrella term for all of it, even PINs, et cetera. So one of the things you specialize in and spend a lot of time on: where are attackers looking for and discovering secrets today? What are the places they're finding them?

Mackenzie Jackson:

Yeah, well, I mean just about everywhere. A good place to start is the OWASP Top 10. When you look at the OWASP Top 10, number one is broken access control. And within that it's leaked credentials or unprotected credentials, which is basically talking about secrets. So where we find them really depends, and it depends on what the attacker is trying to do. Because secrets are very unique. They can be used in so many different parts of an attack, so they can be your initial access. So if we talk about initial access, how do attackers find secrets initially and kind of break into an organization? Well, there are a lot of places.

At GitGuardian, we recently released our State of Secrets Sprawl report. In this we scanned public GitHub repositories, every single commit that was made throughout the year. So in 2022, we scanned over a billion commits. Of those billion commits, we found 10 million secrets. Now bearing in mind, this is just in public repositories. And these are things like cloud provider keys, database access, API keys. And we can actually verify a lot of these. So we check with the provider, "Hey, is this a valid key?" So we can be pretty sure about that number, 10 million; these aren't just random strings that look like secrets, these are pretty accurate.
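
To make the kind of validity check Jackson describes concrete, here is a minimal sketch in Python. It assumes an AWS-style key pair and the boto3 library; the key values are placeholders, and real verification engines cover many providers, not just this one.

```python
# Minimal sketch of a validity check: given a candidate AWS-style key pair
# found in a commit, ask the provider whether it is live. Key values are
# placeholders; requires boto3.
import boto3
from botocore.exceptions import ClientError

def is_valid_aws_key(access_key_id: str, secret_access_key: str) -> bool:
    sts = boto3.client(
        "sts",
        aws_access_key_id=access_key_id,
        aws_secret_access_key=secret_access_key,
    )
    try:
        identity = sts.get_caller_identity()  # only succeeds for live credentials
        print("Valid key belonging to:", identity["Arn"])
        return True
    except ClientError:
        return False
```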

So going back to initial access, let's look at public GitHub as an initial access point. Last year a company called Toyota, you may have heard of it, they have an application called T-Connect, and they had a consultant working on this application. Five years ago, the consultant accidentally open sourced the source code for it; it ended up in a public repository by accident. There was a secret inside there that gave access to the entire database of T-Connect users. So this is an example where an attacker found a key in a public space and used it to move into a system.

Another example: we also scan for secrets in places like Docker Hub and Docker images, and about 5% of Docker images contain an extractable plaintext secret. A few years ago, you may remember there was a big breach of Codecov, a supply chain attack, and all these companies were breached as a result of it, including massive ones like Rapid7 and Twilio. How did all that start? A leaked credential in one of Codecov's public Docker images.
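
As an illustration of how a plaintext secret can be pulled out of a container image, here is a rough Python sketch: export the image with `docker save`, walk anything inside that looks like a layer archive, and grep file contents for an assumed AWS-style key pattern. The image name and the single regex are assumptions; real scanners handle many more secret types and image layouts.

```python
# Rough sketch: export a Docker image and grep its layers for one assumed
# secret pattern. Layer layout varies by Docker version, so anything that
# doesn't open as a tar archive is skipped.
import io
import re
import subprocess
import tarfile

AWS_KEY_RE = re.compile(rb"AKIA[0-9A-Z]{16}")  # AWS access key ID shape

def scan_image(image: str) -> None:
    raw = subprocess.run(["docker", "save", image],
                         check=True, capture_output=True).stdout
    with tarfile.open(fileobj=io.BytesIO(raw)) as image_tar:
        for member in image_tar.getmembers():
            if not member.isfile():
                continue
            try:  # layers are themselves tar archives
                layer = tarfile.open(fileobj=image_tar.extractfile(member))
            except tarfile.TarError:
                continue
            for f in layer.getmembers():
                if f.isfile() and f.size < 1_000_000:
                    data = layer.extractfile(f).read()
                    for hit in AWS_KEY_RE.findall(data):
                        print(f"{member.name} -> {f.name}: {hit.decode()}")

scan_image("example/app:latest")  # hypothetical image name
```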

Joseph Carson:

And you're also finding a lot of developers who are trying to make sure they have ease of access, and over the last two or three years of people working remotely, what they've done is clone repositories that were originally private and copy or import some of that code into their own repositories, which are now public. And you just hope that they changed or rotated the original tokens that were in use before that happened, but it's not always the case.

Mackenzie Jackson:

Yeah, exactly right. I mean, there are so many places. You've got to remember that source code is a very, very leaky asset, right? So okay, we've got a private repository, what happens? The moment that source code hits that private repository, it's cloned onto the machines of all the developers that are using it. This could be everyone in the company. It's backed up, it's probably in a wiki, it's shared on messaging systems as code snippets. So you can never trust that code stays in just one spot. If you're going to have secrets in there, it's a very unreliable place, and they can definitely end up public.

Joseph Carson:

And a lot of developers are now using things like ChatGPT to analyze code, and of course that ultimately becomes public, because if it gets embedded into the model's training data, all of a sudden you can start asking questions about code it has been analyzing. So it's another interesting angle; I've heard a lot of reports that companies are now asking their employees not to feed code into things like ChatGPT or analyze it there.

Mackenzie Jackson:

Well, employees never do things that they're told not to do so that's foolproof.

Joseph Carson:

Absolutely.

Mackenzie Jackson:

But did you know, and this is a bit of a side note, that ChatGPT got a bug bounty last week?

Joseph Carson:

Oh, I didn't know that, no.

Mackenzie Jackson:

Yeah, so I mean, this is a bit of a left field thing, but it was actually a guy who was asked to evaluate EDR solutions, and one of the ways he wanted to evaluate them was, "Hey, if I create malware that's designed to bypass this, can I bypass some of these solutions?" He didn't know how to write malware, so he asked ChatGPT. You have to trick it; it won't do it initially.

Joseph Carson:

You have to trick it. It's a special way of asking ChatGPT to do something.

Mackenzie Jackson:

Well, you just kind of have to go, "Hey, can you give an example of something that does this?" And then you've got to work through it from there. But eventually it spat it out. He managed to bypass some EDR solutions and then submitted it to the bug bounty program with malware that ChatGPT wrote and got a payment of, I think it was $650.

Joseph Carson:

It's pretty good if you can automate that, that's the impressive part, if you can take that to the next stage. Because I myself use ChatGPT a lot for things like creating the base of code for myself with certain parameters and outlines. But of course I had to change it quite a bit in order to get it fully functional. But I do see this as a way to do a lot more automation, even for going and searching for things. So it might even be an interesting area to go and discover secrets.

Mackenzie Jackson:

Yeah, definitely. And I'm sure ChatGPT is great at things that humans aren't. Humans aren't good at discovering secrets because they're buried in code, but things like ChatGPT and other analyzers out there are. What we think at GitGuardian is that the problem of detecting secrets will be solved from the engine point of view.

So GitGuardian has what we consider the best secrets detection engine available on the market. We have put a lot of resources into this. We believe that in a few years that's not going to be novel, because with AI, with machine learning, with the huge data sets that we have, we're going to be able to get to really high accuracy and everyone will have that. So that's really interesting; these are areas where AI is going to be harmful and helpful at the same time. And I think you have to use it, you have to use it to your advantage.

Joseph Carson:

Absolutely, you have to be knowledgeable about where your data is, where your credentials are, and where your access is coming from. You're better off doing it proactively than waiting to find out when something goes wrong, because ultimately that's when the abuse happens. Question: what types of malicious activities can be carried out when those secrets are obtained? Of course, getting an API key could allow an attacker to extract data from a database, or if they got access to a repository, they could even check in their own code, since they'd have the ability to make commits. What other types of activities are you seeing malicious actors doing when they're able to discover secrets?

Mackenzie Jackson:

Definitely, there are a couple of ways to go here. An interesting experiment that I've done quite recently is leaking a honeytoken into GitHub, which is a token that will give me information about the people exploiting it. So we can basically watch what hackers do. And what they do when they find a credential is automate through a process of, "Hey, is this valid?" That's number one. And then they're going to go through, so if it's an Amazon key, "Does this have access to S3 buckets? Does this have access to that? Can I create a user?"

So regardless of the type of secret, when an attacker finds something, they're going to be doing a couple of things initially. So they're going to do reconnaissance of what does this give me access to? And then they're going to move to, how can I persist my access? So if this secret gives me access to, let's say a code repository, can I find more secrets in that code repository? Can I create a new user for myself? Can I do all these different things because at the moment, I'm relying on this key that grants me access into something.
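
The following is an illustrative sketch of the automated reconnaissance described above: once a found key validates, check what it can reach. AWS via boto3 is an assumed example provider, the key values are placeholders, and the checks shown are a tiny read-only subset of what real tooling enumerates.

```python
# Illustrative recon sketch: test what a discovered key is allowed to do by
# attempting a few read-only calls and recording which succeed.
import boto3
from botocore.exceptions import ClientError

session = boto3.Session(
    aws_access_key_id="AKIAXXXXXXXXXXXXXXXX",          # placeholder
    aws_secret_access_key="xxxxxxxxxxxxxxxxxxxxxxxxx",  # placeholder
)

checks = {
    "who am I":         lambda: session.client("sts").get_caller_identity(),
    "list S3 buckets":  lambda: session.client("s3").list_buckets(),
    "list IAM users":   lambda: session.client("iam").list_users(),
}

for label, call in checks.items():
    try:
        call()
        print(f"[+] allowed: {label}")
    except ClientError:
        print(f"[-] denied:  {label}")
```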

Joseph Carson:

And that key could open up different systems.

Mackenzie Jackson:

Yeah, exactly, how can I move into a different system? How can I ensure that my access is going to persist? How can I elevate my privileges? So that's what's going to happen. And this brings us into the other area of attacks, where let's say a secret is not the initial point of access. Once an attacker is in your network, in your code, in a backup file, in somewhere, one of the first things they're going to do is enumerate through all your data to try and find more secrets. Why? Because when you have a secret, you're correctly authenticated. If you break down a door, you're going to set off an alarm. But if you have a key for that door, then no one's going to be suspicious. It's exactly the same in IT. You are working within the expected parameters, so no one really knows that you're in there, so you can squat, you can do your reconnaissance.

Joseph Carson:

The stealthy scenario, absolutely. I will say attackers probably prefer to live off the land and reuse your credentials and access, because the moment they create anything new, they create ripples in the water, and you're always looking for that noise. The stealthiest methods are the preferred methods, and they allow attackers to typically stay around for a much longer period of time so they can do a lot more reconnaissance, look for a lot more sensitive data, learn more, and look for ways to elevate. So definitely keys, which basically allow the attacker to come in under the disguise of an application, a system, or an access pattern that defenders would typically see commonly, sometimes even disguising the path and the route into those organizations. So absolutely, this is the preferred method, I believe, for attackers, because it allows them to remain undetected for as long as they possibly can.

Mackenzie Jackson:

Yeah, exactly. And secrets are just a fantastic way of being able to remain undetected, especially if you don't have the additional elements of zero trust set up, so you haven't restricted IP scopes and there aren't other types of authentication going on in the background. When we look at security, we have all these elements like zero trust and multifactor authentication, and they're all great when you put them in. But the reality is that standard practice is so far from the bleeding edge of what we have. People always say, "Well, isn't this a solved problem?" And it's like, well, it could almost be a solved problem if we correctly utilized everything; if every organization had an unlimited security budget, then yes, we have the technology to be able to do it. But the reality just isn't there yet.

Joseph Carson:

Yeah, no, it was interesting. So recently I did a webinar with RSA that was a recap of my talk from RSA last year, which was all about ransomware. And one of the questions the audience had was, what are some of the steps and methods by which this organization, which became a victim of a major ransomware attack, could have prevented it? And ultimately, when you go through all of the forensics and the logs and everything, one of the biggest things was that initial access. The initial access itself came from an authenticated credential, and we still don't know how that credential was compromised, whether it was through password reuse, credential compromise, or phishing; there are many different ways that attackers can gain access and discover those secrets.

But ultimately, when you look at the logs for that very first initial access, for example, the accountant whose credentials were compromised was accessing this machine from a known IP address all the time. The IP address was roughly always the same when accessing the server. And then all of a sudden, the malicious actor had access to the credentials and accessed the same RDP service from a known Tor exit node, which is basically known to have a malicious reputation. A couple of months later, the same thing: they verified the credentials, double checked that they were still working, of course because they were likely an access broker looking to sell on that access later. And again, it was another IP address coming from a node that was known to be malicious.

And the question is, if the organization had been looking through those, there is technology that looks at the reputation of IP addresses for suspicious activity; even just getting a notification on it would have given them something. But to your point, all these technologies are out there, so why haven't we adopted many of them? Why are organizations still continually falling for these sometimes simple types of attacks? And what can they do to prevent them? What are some of the better practices? That doesn't mean they have to have every single technology under the sun, because if the security budget is more than the revenue or profit the organization makes, it's not going to happen.
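
As a hedged sketch of the kind of IP-reputation check Carson describes, the snippet below flags authentication events whose source IP appears on the public Tor exit node list. The login_events structure is a made-up stand-in for real RDP logs, and production tooling would draw on broader reputation feeds than this single list.

```python
# Sketch: flag logins coming from known Tor exit nodes using the public
# bulk exit list. Event data here is illustrative only.
import urllib.request

TOR_EXIT_LIST = "https://check.torproject.org/torbulkexitlist"
exit_nodes = set(urllib.request.urlopen(TOR_EXIT_LIST).read().decode().split())

login_events = [  # (source_ip, account) pairs pulled from logs, hypothetically
    ("203.0.113.7", "accountant"),
    ("198.51.100.21", "accountant"),
]

for ip, account in login_events:
    if ip in exit_nodes:
        print(f"ALERT: login for '{account}' from known Tor exit node {ip}")
```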

Mackenzie Jackson:

No, that's exactly right. And it's such a difficult equation, because on one side you're like, "Hey, there are all these tools that could prevent it." And on the other side it's like, "Yes, but we're inundated by security notifications as it is. We don't have the team to be able to investigate it."

But if we look at secrets, it's like, what are the most common ways that these secrets actually leak out? The number one place it happens is source code. And we're generally talking about machine-to-machine secrets rather than human secrets, because if we're talking about human secrets, then the answer is phishing campaigns. But when we're talking about technology, it's source code; it's the leakiest asset. And we have private source code: I've talked a lot today about public spaces, but private source code is much scarier than public. The reason is that we're comforted by this very flimsy layer of authentication when it comes to source code.

Now, the problem isn't really authentication here, because source code isn't designed to be sensitive. And there's a case to be made that it really shouldn't matter too much if your source code is open sourced. But we keep it behind this level of authentication, and so many people need to access it, so many people have access to it, that it's really hard to lock it down, and I wouldn't really recommend trying. So one thing we need to do is make sure that source code doesn't have secrets in it.
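
As a minimal sketch of the alternative to hardcoding, the snippet below resolves a credential at runtime from a secrets manager instead of embedding it in source. It assumes HashiCorp Vault with the hvac Python client and a KV v2 mount; the path and field names are illustrative.

```python
# Sketch: fetch a database password from Vault at runtime so it never
# appears in the repository. VAULT_ADDR/VAULT_TOKEN come from the environment.
import os
import hvac

client = hvac.Client(
    url=os.environ["VAULT_ADDR"],     # e.g. https://vault.internal:8200
    token=os.environ["VAULT_TOKEN"],  # short-lived token, never committed
)
response = client.secrets.kv.v2.read_secret_version(path="app/database")
db_password = response["data"]["data"]["password"]  # stays out of source code
```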

Joseph Carson:

Yeah, or static keys, anything that's persistent; getting rid of those is definitely one of the main areas.

Mackenzie Jackson:

Absolutely. And it's super easy for it to happen, because people go, "Oh no, we have code reviews, we check everything. No, our secret wouldn't end up in there." But it's much, much more common than that. Imagine you have a developer working on a development branch, and he's just trying to quickly get some database connected, to get something to work, so he hardcodes the credential. Yeah, we'll remove it later. He hardcodes it, gets it working, and yes, later they remove it. 30 commits later, that feature is ready to merge into the main branch. Code review happens; you're not going to review every draft and every mistake someone made along the way, no. You're just going to review the very latest version. It gets pulled in, no one sees a secret. But it's in the history.

You've got to remember that most security vulnerabilities exist in the final version of the code. If I have a cross-site scripting vulnerability in my code, I fix it, I update it, and it doesn't exist anymore. Secrets persist for as long as the secret is valid. If it's in the history and it's valid, it's a risk. So we need to make sure that secrets are not in there. As for how we do this, there's no magic bullet. There's education: we need to make sure that developers know never to do this, how it works, and why it's a risk. And they need to be using secrets managers.
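
The point about history is easy to demonstrate. Below is a minimal Python sketch that walks every commit reachable from HEAD and greps each diff for an assumed password-like pattern, showing that a credential "removed" in a later commit is still recoverable; run it from inside a repository. The regex is illustrative, not a real detection engine.

```python
# Sketch: show that secrets removed from the latest version still live in
# git history by grepping every commit's diff for one assumed pattern.
import re
import subprocess

SECRET_RE = re.compile(
    r"(?:password|api[_-]?key|token)\s*=\s*['\"][^'\"]{6,}['\"]", re.I
)

commits = subprocess.run(["git", "rev-list", "HEAD"],
                         check=True, capture_output=True, text=True).stdout.split()
for commit in commits:
    diff = subprocess.run(["git", "show", commit], check=True,
                          capture_output=True, text=True, errors="replace").stdout
    for m in SECRET_RE.finditer(diff):
        print(f"{commit[:10]}: {m.group(0)}")
```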

And something about secrets managers: there are some great products out there. HashiCorp Vault, in my opinion, is probably one of the best tools for managing secrets. But if you're a team of 10, it's not going to be the tool for you, because it's too heavy. So you have to try and find the correct tool for you, one that your team is actually going to use and that you have the resources in place to manage correctly. Because maybe it's not going to be something like Vault that you're self-hosting.

And then of course, scanning for these secrets yourself: you need to be notified when secrets enter your source code, because they're going to enter. You can put in something simple like a git hook on your developers' machines, so if your commit has a secret in it, it gets blocked and doesn't enter the source code. So there are these things we can do. And a cool resource that we wrote is The Secrets Management Maturity Model. It's basically just a survey that you can take to see, "Hey, whereabouts are you in this? How much risk does your organization have of leaking these secrets?" I think there are 20 questions or so, maybe less than that, and it will tell you what your maturity level is in managing secrets.
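
For the git hook idea Jackson mentions, here is a simplified Python sketch of a pre-commit hook: save it as .git/hooks/pre-commit and make it executable, and it blocks commits whose staged diff matches a few assumed secret patterns. Real scanners use far richer detection and validation than these regexes.

```python
#!/usr/bin/env python3
# Simplified pre-commit hook sketch: scan staged changes for a few assumed
# secret patterns and abort the commit if any match.
import re
import subprocess
import sys

PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                                # AWS key ID
    re.compile(r"-----BEGIN (RSA|EC|OPENSSH) PRIVATE KEY-----"),    # private keys
    re.compile(r"(password|secret|token)\s*=\s*['\"][^'\"]{8,}['\"]", re.I),
]

staged = subprocess.run(["git", "diff", "--cached"], check=True,
                        capture_output=True, text=True, errors="replace").stdout
added = [l for l in staged.splitlines()
         if l.startswith("+") and not l.startswith("+++")]

hits = [l for l in added for p in PATTERNS if p.search(l)]
if hits:
    print("Commit blocked: possible secret in staged changes:")
    for line in hits:
        print("  ", line.strip())
    sys.exit(1)  # non-zero exit aborts the commit
```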

Joseph Carson:

What's your current state in that level, what you're doing and what you're not doing and what you should be doing.

Mackenzie Jackson:

Exactly, yeah. And what areas, because it's not like you are either good or bad, right? It's like, "Okay this is a good area and this is..."

Joseph Carson:

A question from a lot of the research and work that you've been doing: in a lot of organizations, developers are not always direct employees; they're third-party sources, they're external, they're third-party suppliers. Is that a major risk you're seeing in organizations, basically just getting code from third-party sources?

Mackenzie Jackson:

Yeah, huge risk. So on the managed service provider side... Well, I talked about Toyota before; that leak came from a service provider, it wasn't theirs. And this is where it gets really interesting, for lots of reasons. Organizations often don't have visibility into their consultants, their MSPs, and what they're doing. So when it comes to leaks on GitHub and areas like that, this all happens outside of your visibility, all of it. It could be an employee of an MSP who has pushed something public that you don't know about.

So it's a big risk. We need to be careful about the providers we're choosing and make sure they actually accept that this is a risk. Because if someone says to you, "Oh no, that's not a risk for us, we wouldn't do that," then that's the biggest risk you have, definitely. So it's about making sure that we all understand this is a risk, and then getting visibility by mapping out your perimeter, your risk factors, your attack surface, including things that are outside of your control. You need to be aware of that.

Joseph Carson:

Absolutely, you've got to remember that while you might have good practices in place for your own developers, I've seen a lot of developers copy and paste, and sometimes you end up copying code with secrets in it into your own repository, and that's not always a good way to get things done. So what are some of the best places for organizations to start, and what would be the first stepping stone down the path of getting control of secrets?

Mackenzie Jackson:

Yeah, so the first thing that you need to do is understand your risk. So look at and scan your infrastructure: your source code, your networks, other areas. There are vendors that do that, GitGuardian among them, but there are also open source tools if you don't have the budget for it; where you're going to need more advanced tools is when you get really big as an organization. So understand your vulnerabilities, scan your repositories and your networks, and get visibility into it. It's going to be shocking. We typically find that the average organization of 400 developers, so still quite small, about 400 developers, maybe a thousand employees, will have around a thousand unique secrets in their repositories, and 13,000 total occurrences is typically what we find. So you need to get visibility over that.

And then there comes this "Oh crap" moment, because now that we're aware of the problem, we have to do something about it. So you need to put in place good remediation. Then you're going to need to stop the bleeding. So why did this happen? When you get visibility, you can say, "Hey, these are the teams that are leaking. Why are they leaking? Maybe we're not using the correct secrets manager. Maybe we're not using the right tools. Maybe we're sharing these secrets through Slack," whatever it is. Once you have visibility, you can plug your holes, and then you need to layer on additional safeguards. So once we've got visibility and we have the correct tooling in place to manage secrets, we need to be able to monitor in real time when a leak happens and remediate the leaks that have happened.

And remediation, I mean, it sounds like, "Hey, if you find the secrets, you can solve it," but remediation, especially at scale, is probably one of the most difficult things, because you're involving multiple parties, so you really need to look into the remediation process. But those are basically the steps that you can take to try and identify and fix the system. Identify the vulnerabilities, use the correct tools to be able to find them, and then remediate and monitor in real time when leaks happen, because they will continue to happen no matter what you do. We just need to be aware of that.

Joseph Carson:

Yeah, you absolutely need to be aware of the risk and then make sure you've got the tools in place that allow you to monitor that type of thing. One of the things that we've been doing at Delinea is making sure that you're replacing those hardcoded tokens with on-demand temporary tokens: you check out a token temporarily in order to perform an action. It gets you away from static, persistent secrets, which is the major challenge, because when those secrets are out there and persistent, you would need to change everything in order to remediate. Moving to more temporary keys that have additional security controls and monitoring in place will give organizations much better visibility and a way to reduce the risk as well. So absolutely.
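
To illustrate the general short-lived credential pattern being discussed (not any particular vendor's product), here is a hedged Python sketch using AWS STS: exchange a role for temporary keys that expire on their own instead of embedding a static secret. The role ARN is a placeholder.

```python
# Sketch: check out short-lived credentials on demand instead of storing a
# static key. The assumed role must already exist and permit this caller.
import boto3

sts = boto3.client("sts")
creds = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/app-deploy",  # placeholder ARN
    RoleSessionName="on-demand-access",
    DurationSeconds=900,  # credentials self-expire after 15 minutes
)["Credentials"]

s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
print([b["Name"] for b in s3.list_buckets()["Buckets"]])
```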

Mackenzie Jackson:

Dynamic secrets are amazing. There are a few vendors that do that now. Vault's been doing it for a while, and I also think Akeyless, which is another secrets manager, has features like this. I really think this has got to be the direction. It doesn't solve the issue completely, because we still have long-lived secrets behind the short-lived ones, but hey, before we can solve it, we need to get better. And this is a big step towards it, absolutely.

Joseph Carson:

It's very cool. Yeah, it definitely allows you to move away from having to change everything at the edge, everything in the code, and you get a little bit more centralization and visibility into how those secrets are being used, when they're being rotated, and how the machines and the code have actually been communicating with them. So it gives you much better visibility. And I think the big challenge here is not only getting secrets out of the source code, but also moving into DevSecOps and DevOps, where they end up in configuration files. That's all about doing code on demand, basically serverless and codeless infrastructure, and it all ends up in configuration files as well. So we have to make sure it's not just about the developers, but also the operational side, when developers are actually getting code into production.

Mackenzie Jackson:

Yeah, yeah, and you bring up some great points there. One thing I didn't mention before is that even if you haven't got dynamic secrets, have a plan in place to rotate secrets regularly, because this actually does a couple of things. One, it means that if secrets do get leaked in code, they will be rotated and worthless soon. But it also means that you actually know how to do it, so if one does leak publicly, you're like, "Oh, we know what this key does. We know how to rotate it." Because there's nothing worse than a key leaking and it's like, "Hey, what does this key do? We have no idea." And, "Okay, if we rotate it, are we going to break everything?" That happens.

Joseph Carson:

That's the scary part. That's the scary part for organizations. And that's sometimes why they don't rotate keys: they don't know what they're going to break, they don't know exactly what the relationships or the functionality are. Does that key need to be the same in different places? So that's one of the things we've done at Delinea as well: dependency mapping of the services that use keys to communicate with different resources, and then making sure that you have the ability to rotate them all at the right time in the right places. Because my background many years ago was in backup and recovery solutions, and when that key wasn't right in all of the right places, your backups weren't working and you couldn't recover. Restores started to fail, and people were afraid to rotate those credentials. But once you have visibility into those dependencies, you can do it much more, let's say, calmly and safely, without worrying that you're going to break the application.

Mackenzie Jackson:

Yeah, definitely, definitely.

Joseph Carson:

So it's been fantastic having you on the episode today, and I think it's really important that people get more visibility into the importance of keys and secrets, where they can be discovered, and what they can do about it. Any final words of wisdom that you want to share with the audience, and any good resources that they can go to? You mentioned the report that you released; where would be a good place for them to get the report and take a look at it?

Mackenzie Jackson:

Definitely. GitGuardian does a lot of research into this topic and writes a lot of white papers. So gitguardian.com is where we have a great resources page. blog.gitguardian.com is also great; we publish just about every day on our blog, with news and topics and trends and tutorials and everything. Again, it's quite vendor agnostic in there. So those are great resources to check out. And I think my final words of wisdom would be to try and educate yourself around this problem, because it is growing and unfortunately it's affecting everyone in the industry.

Joseph Carson:

Okay, absolutely, very wise. And we'll make sure to put them in the show notes so that everyone can get easy access to them. And the last question I've got is that, do you go back to the circus these days or not? Or are you afraid of circuses or what?

Mackenzie Jackson:

I used to do so many fun tricks, and as I've gotten older, these tricks only really come out after a few beers, and I've really started injuring myself. So I try to avoid all my old circus tricks these days. When I was 13, my bones were bendy, but now I don't think they are. But it would be cool to try and brush up on some of the tricks again; maybe I'll try it.

Joseph Carson:

I'll try and put you to the test for that, we'll see.

Mackenzie Jackson:

For sure.

Joseph Carson:

It's been fantastic having you on the show. And for the audience, hopefully you've got a better idea of the challenges that many organizations have around secrets that are kept in source code and repositories, whether private or public, or synced and backed up on employees' and developers' devices. It's really important that you get visibility and get started on taking a look at how you can manage and reduce the risk today. So Mackenzie, it's been fantastic having you on the show today. Thank you. Hopefully we'll be able to have you back again in the future. And for the audience, this is the 401 Access Denied Podcast. I'm Joe Carson, the host of the show. It's been a pleasure. Tune in every two weeks, and go back and take a look at previous episodes if you want to find out more about some of the guests that have been on the show. So thank you, stay safe, and take care. Goodbye.