Techstrong TV: GitGuardian Reports Leaked Secrets Doubled
Mackenzie and Charlene discuss the results of the GitGuardian 2022 State of Secrets Sprawl report, which shows a doubling of the number of secrets leaked on GitHub compared to 2020. A transcript of the conversation follows.
Charlene O’Hanlon: Hey, everybody. Welcome back to TechStrong TV. I’m Charlene O’Hanlon and I’m here now with Mackenzie Jackson, who is a developer advocate over at GitGuardian.
Mackenzie, thank you so much for taking a few minutes and joining me today. I really do appreciate it.
Mackenzie Jackson: Oh, no problem. Thanks for having me. It’s great to be on TechStrong TV.
O’Hanlon: Great, great, great. So I want to talk to you a little bit about a new report that you came out with, but first I wonder if you could tell us a little bit about GitGuardian.
Jackson: Yeah, sure. So GitGuardian, we’re a French cybersecurity company. We’re actually just in the process of opening up an office in Austin, in Texas, as well.
O’Hanlon: You and everybody else, right?
Jackson: Yeah, exactly, the place to be.
O’Hanlon: Exactly, yeah.
Jackson: But GitGuardian, we specialize in detecting secrets in source code, and when I’m talking about secrets, I’m talking about digital authentication credentials, so API keys, credential pairs using a password, security certificates. So anything that’s kind of made to be used programmatically, that’s what we specialize in detecting. And so we actually scan every single public commit that’s made to GitHub. So every time you push code into github.com, if it’s public, GitGuardian actually scans that.
We scanned it all last year and we alert anyone that may have leaked some secrets. We also monitor the private code repositories of companies to make sure that the secrets, which are the crown jewels of any organization and grant access into all the inner-working systems, don’t what we call sprawl, so you don’t lose track of them, and especially to make sure that they don’t get into the hands of adversaries.
O’Hanlon: So important these days, especially with we keep hearing a lot about secrets, and secret sprawl, and the vulnerabilities that have been found in the Git repositories. But I want to talk to you about the State of Secrets Sprawl report that your company just came out with. It’s the 2022 edition. So is this the first year that you guys have done this or have you done this in years past?
Jackson: So this is the second time we’ve released the report. Last year, we released it, as well, and, as I said, because we scan all of github.com’s public commits, we released basically what we found from doing that. And this year, we’ve actually expanded the report. We’ve expanded our analysis to also look into Docker images that are public and also into private repositories, as well. So not the first year, but some new information.
O’Hanlon: All right, so let’s kind of dive into some of the more important findings, or the more salient findings, I should say. What were some of kind of the highlights of the report that you guys want to talk about?
Jackson: Yeah, well, I mean the big number that everyone is curious about is the total number of secrets that we found exposed publicly throughout the year. So last year, when we monitored the entire year of 2020, we actually found two million secrets that were leaked into public GitHub. And this year, that number increased to six million.
O’Hanlon: Oh, my gosh.
Jackson: So a huge increase there. So we’re saying that this is actually a kind of 2X increase in secrets. If you’re wondering why two plus two suddenly equals six, you haven’t been doing math wrong. It’s because we’ve improved our detection engine: we detect over 350 types of secrets, whereas last year, we only detected 250. And also, the amount of code that is being pushed to github.com has increased. So when you compare apples to apples, it’s about a 2X increase, even though we’ve had a 3X jump in the number of secrets that were actually uncovered.
O’Hanlon: Got it. I’m gonna leave that up to the experts as far as the math is concerned. I’m just gonna take your word for it. Well, I hesitate to ask you because right now it doesn’t sound like the news is all that good. But what else did you find in the report that kind of made you maybe, I don’t know, think twice or hesitate about the secret sprawl and what you guys are seeing out there?
Jackson: Yeah. Well, there’s a lot in there. How about I share there’s one glimmer of hope that we see in the report.
O’Hanlon: Okay, good.
Jackson: So maybe we start with that. For the first time ever, we actually saw a decrease in Amazon AWS keys. This is one of the predominant keys that is leaked and it presents a significant risk to organizations. And so why we believe this has happened is that there has been a huge increase in awareness about this particular type of key, and campaigns from Amazon, themselves, as well. So this has kind of created that phenomenon that we can see that education is starting to filter through.
Unfortunately, as we’ve seen that, we’ve also seen lots of increases in other types of keys and we’re seeing new competitors to Amazon like PlanetScale and Supabase. We saw huge increases in the amount of keys that we found for those. So the good part that we can focus on is we have seen some improvement in areas where investment has been made, which means that there is a way to kind of get on top of that, but unfortunately, we have a long way to go.
O’Hanlon: Yeah, it sounds like that was good news tempered by some more bad news, so Mackenzie, you’re kind of killing me here, no. So then with that in mind then, do you think we’re going to continue to see a secret sprawl occur, especially as we see these alternative cloud providers kind of come more into play within organizations that are maybe looking to go cloud first or at least adopt a hybrid cloud environment?
Jackson: So I think that we have a long way to go before we start combatting this and I think what’s important is we need to make sure that detection is available and is ahead of the curve. So one of the things that GitGuardian did this year was implement generic secret detectors so we can capture the keys that might not be so popular yet.
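One common way to approach generic detection of the kind Jackson describes is to flag long, random-looking strings by their Shannon entropy. The sketch below is a minimal illustration under that assumption and is not GitGuardian's actual method; the function names `shannon_entropy` and `find_candidate_secrets` and both thresholds are hypothetical choices made for the example.

```python
import math
import re
from collections import Counter

# Hypothetical thresholds, chosen for illustration only.
MIN_LENGTH = 20          # ignore short tokens such as variable names
MIN_ENTROPY_BITS = 4.0   # bits per character; random keys tend to score high

# Candidate tokens: long runs of characters commonly seen in keys and tokens.
TOKEN_RE = re.compile(r"[A-Za-z0-9+/=_\-]{%d,}" % MIN_LENGTH)


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def find_candidate_secrets(line: str) -> list[str]:
    """Return long tokens in line whose entropy meets the threshold."""
    return [
        token
        for token in TOKEN_RE.findall(line)
        if shannon_entropy(token) >= MIN_ENTROPY_BITS
    ]
```

A detector like this does not need to know which provider issued a key, which is why it can catch credentials for services that are not yet popular. The trade-off is false positives, since hashes and other random-looking identifiers also score high.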
I think that we will see an improvement in areas of leaking secrets in Git. Well, at least this is my optimistic side, but then as is a habit of mine to kind of balance that good news with bad news, we started examining other technologies, so Docker images, and we found that nearly 5 percent, about 4.6 percent, of Docker images contained at least one secret that were on dockerhub.com. Although we’re not winning the war in Git, we have to also factor in that we have all these other technologies that we have to be aware of, too.
So I don’t like to always be doom and gloom. I do like to try and focus on the positives and the positives are that we have the tools available to start to try and combat this. We’re seeing awareness around it. We’re seeing some pockets of improvement and I think if we continue to push in this area then we’ll be able to get it down. But this year certainly didn’t show that trend, but we’re still in the battle.
O’Hanlon: Do you think that some of these numbers may also reflect the fact that we’ve got a completely different kind of workforce these days? People are working from home. They’re not going back to the office, for the most part. They’re working in brand new environments maybe to them where they maybe weren’t used to working from home and they’re kind of getting into the groove now. But do you think that because of the way there was this monumental shift in the way organizations work now that that may have contributed to the problem and now we’re kind of having to go back and kind of clean up the messes, if you will?
Jackson: I definitely can see that that has had an impact and I think we can see that in the data. And I think moving away from those secure networks, going into more cloud environments, it has that impact that now we have to handle additional secrets to be able to interact with it. So there’s certainly a lot more secrets that we have to handle.
So that is definitely a consideration and certainly I think we can all agree that the remote work, we’re getting into the groove of it, so hopefully, we’ll start to see some improvements on those areas, as well. But it is one of those things where I don’t think they created the problem, but it certainly didn’t help us solve it.
O’Hanlon: All of a sudden, I’ve got a Billy Joel song, We Didn’t Start the Fire, going through my mind. Oh, anyway. So were there other findings in the report that, well, actually maybe just didn’t change too much from the last year that either for better or for worse?
Jackson: Yeah, well, I mean, we did see kind of similarities in kind of where the leaks were coming from. There were some countries that made improvements, so Brazil was one of the countries that leaked a lot of secrets last year that actually moved down today. But this data also reflects the number of developers in these countries, as well, so it doesn’t necessarily mean that there’s significant improvement or that particular countries are bad.
But one of the areas that we really focused on this year is looking at internal repositories, those private repositories, because GitGuardian scans a lot of organizations’ code for secrets and we’re able to gather that information. And this information we didn’t have last year, and I guess that we’re probably most shocked at kind of the occurrences of secrets that we found in repositories. And then we combined that with the number of AppSec engineers that are out there and you realize that this is just a monumental task.
A typical organization of 400 developers may have 4 AppSec engineers, and within a year, they will be finding over 1,000 secrets, based on our analysis. That’s a lot of secrets that 4 people have to review, communicate where the leak is, discover what it is, revoke it, create new keys, insert it. They’re having to do this multiple, multiple times a day on top of all the other vulnerabilities they need to deal with.
So we didn’t have this information last year and I think this was kind of one of the ones that really made us stop and go wow. No wonder the problem is bad because we don’t have the security teams in place. They don’t have the bandwidth to be able to kind of deal with this correctly. So we have to start thinking about a shift in how we build software and how we work to be able to actually start coming up with some solutions for this.
O’Hanlon: You actually bring up a really good point and I think it speaks a lot to the larger issue that a lot of organizations are facing and that’s quite literally just a lack of talent. They can’t find the right people to fill these positions. And for whatever reasons, it seems like we’re always going to be one step behind when it comes to having enough people to fill the spots and to do the work that is necessary to make sure that the code is scrubbed, and it’s clean, and it’s secure, and that once we get that, then everything else after that obviously will be a lot easier, if you will, for security. ‘Cause the farther left you shift, the easier that it becomes to secure the application.
So it’s really fascinating to hear all these different results that you guys have found in this year’s survey. And before we close it out, because we are running out of time, is there one finding that you really kind of glommed onto? As soon as you found it, you’re, “Oh, yes, this makes so much sense,” or maybe not?
Jackson: Well, I think it is that correlation that we just discussed between the number of secrets and AppSec engineers, and I think that if we can do a takeaway of how to solve this, it’s what you just mentioned, shifting left. We need to empower developers early on in the process to help them solve this problem so it doesn’t even reach the AppSec engineers down the line.
So I think we need to really focus on that education part, tools, being able to scan the secrets using Git hooks before they enter repositories, and having that whole process shift left. And I think this is a number one thing that we all kind of realized in this report is that it’s absolutely fundamentally important that everyone in the software supply chain from the developers all the way down to the deployment teams is onboard with security and plays their part even if it’s just implementing the correct solutions for them.
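To make the Git hooks idea concrete, here is a minimal sketch of a pre-commit check that inspects staged changes before they enter a repository. It is an illustration only, not GitGuardian's tooling: the function names are hypothetical, and the single pattern assumes the commonly seen shape of an AWS access key ID (the prefix `AKIA` followed by 16 uppercase letters or digits). A real scanner would cover far more secret types.

```python
import re
import subprocess
import sys

# Assumed shape of an AWS access key ID; illustrative, not exhaustive.
AWS_ACCESS_KEY_ID_RE = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def scan_added_lines(diff_text: str) -> list[str]:
    """Return key-like matches found on lines added in a unified diff."""
    findings = []
    for line in diff_text.splitlines():
        # Added lines start with "+"; lines starting with "+++" are treated
        # as file headers and skipped (a simplification in this sketch).
        if line.startswith("+") and not line.startswith("+++"):
            findings.extend(AWS_ACCESS_KEY_ID_RE.findall(line))
    return findings


def main() -> int:
    """Scan the staged diff; a non-zero return value should block the commit."""
    diff = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    findings = scan_added_lines(diff)
    for match in findings:
        # Print only the prefix so the hook does not echo the secret itself.
        print(f"Possible AWS access key ID staged: {match[:4]}...", file=sys.stderr)
    return 1 if findings else 0

# When saved as an executable .git/hooks/pre-commit file, the script would
# end with: sys.exit(main())
```

Because Git aborts a commit when the pre-commit hook exits with a non-zero status, a check like this stops the secret on the developer's machine, before it reaches the shared history that AppSec engineers would otherwise have to clean up.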
O’Hanlon: Yeah. So if folks are interested in taking a look at the survey themselves, is it readily available on the GitGuardian website?
Jackson: It’s readily available on the GitGuardian website. Also on our blog there. We’re advertising it everywhere, so I’m sure you won’t be able to miss it if you go to our website. Yeah, definitely.
O’Hanlon: Excellent, excellent. Well, Mackenzie, thank you very much for walking me through some of the findings of the survey, both fascinating and slightly terrifying, so thank you very much. I appreciate your time.
Jackson: Yeah, thanks for having me. It’s been great.
O’Hanlon: You bet. All right, everybody, please stick around. We’ve got lots more TechStrong TV coming up, so stay tuned.