Open Source Security Podcast Ep. 151– The DARPA Cyber Grand Challenge with David Brumley
Open Source Security Podcast helps listeners better understand security topics of the day. Hosted by Kurt Seifried and Josh Bressers, the pair covers a wide range of topics including IoT, application security, operational security, cloud, devops, and security news of the day.
On June 23, 2019, Dr. David Brumley, CEO of ForAllSecure and CMU Professor, sat down with hosts Josh Bressers and Kurt Seifried to discuss the future of automated security. Brumley discusses the Cyber Grand Challenge and reveals how it serves as a fascinating window into the future of the application security testing industry. The 30 minute podcast is available for listening below. The full transcript is also available below.
Kurt Seifried: 00:06 Hello and welcome to the Open Source Security Podcast episode 151 with myself, Kurt Seifried, and my partner in thoughtcrime, Josh Bressers.
Josh Bressers: 00:13 Awesome, thanks Kurt. Today I’m super excited. We have David Brumley who is the CEO of ForAllSecure, and he is a professor at Carnegie Mellon University. Why don’t you say hello, David?
David Brumley: 00:24 Hi, everybody.
Josh Bressers: 00:25 Awesome. I’m so excited to have you here because your team won the DARPA Grand Challenge a couple of years back which is so cool. Before I start getting all crazy on you, why don’t you introduce yourself properly? Tell us a little bit about yourself and then we can talk about why you’re here.
David Brumley: 00:41 Absolutely. I’m CEO of ForAllSecure. What ForAllSecure’s trying to do is bring actionable security testing through the full appsec lifecycle. The reason this has been our mission goes back to what you just mentioned: the CGC. For years, we’ve been researching ways we can better identify exploitable vulnerabilities in software. Part of what the CGC challenged us to do is not only identify vulnerabilities, but repair them. Then, decide when to field fixes. It was the full application lifecycle from, “Can you find a problem? Can you fix it? And, can you make sure that that fix, when you actually deploy it, does what it should?”
Josh Bressers: 01:13 All right. Can I ask what CGC is? You’re obviously in the government space which means you love your acronyms. So, I’m going to keep interrupting you and making you define them.
David Brumley: 01:24 Awesome, yeah. In 2014, DARPA announced a Cyber Grand Challenge. So DARPA, the Defense Advanced Research Projects Agency, are the people who originally funded the internet.
Josh Bressers: 01:33 That’s right.
David Brumley: 01:34 Every so often they’ll issue a Grand Challenge. The one that comes most to mind for people are the autonomous vehicle challenges they had. Where they said, “Hey, can we build a self-driving car?” In 2014, they issued a similar question, “Hey, can we build self-driving app security?” What they meant by that is, “Can you come up with a fully autonomous system, no humans allowed?” Running on these big, high-performance computers where their job is to not only ingest applications, run them, and provide high up-time, but also find exploitable bugs, break into opponent systems to cause downtime, and defend from attacks.
David Brumley: 02:09 It was actually kind of genius, right? Because for a long time, people had been talking about security as a binary value. You’re either secure or insecure. What DARPA did is make it a lot more realistic. They said, “It’s not a binary value. What you’re trying to do is win. You’re trying to beat the attacker, not be the most secure system.” I think that’s a lot more definable.
Josh Bressers: 02:28 That’s awesome. I guess the first question that everyone wants to ask is, “Is this how The Terminator movies start?”
David Brumley: 02:37 I think it is.
Kurt Seifried: 02:41 Remember how excited we were for the DARPA autonomous vehicle challenges? Those were only 15 years ago. Now, autonomous vehicles driving themselves around? No big deal. It’s a thing, you know? It’s not ready for primetime, but another 10, 20 years, it’s no longer a question of, “Can we do it?” It’s just a question of, “Okay, we’ve got to polish the edges.” So, do you think the Cyber Grand Challenge might come to normalcy on the same timeframe?
David Brumley: 03:10 I think so. I think the timeframe you laid out, 10-15 years is probably about the right one. We’re here, two years after the contest, trying to commercialize some of the technology, but definitely not everything that was demonstrated at the CGC. It’s going to be a matter of watching it mature. Ultimately, the Cyber Grand Challenge and DARPA set the ultimate vision. What we’re going to see is, in 15 years, that’s a product you’re going to be able to buy. A self-defending computer system.
Kurt Seifried: 03:36 Because I keep coming back to app security. For the last, say 50 years, we’ve tried this strategy of “Okay, programmers, learn to program better and write secure apps.” Judging by the number of security flaws we still have every month, it doesn’t seem to have worked.
David Brumley: 03:53 I’m kind of a heretic at CMU. At CMU, we very much have the party line that we need a better programming language and you need to prove the program is safe. That’s the way you go about security. I’m just like, “No.” No one ever is going to do that. Maybe people who do nuclear weapons hopefully, but not the average programmer.
Josh Bressers: 04:09 It’s also worth pointing out that CMU is where CERT, the Computer Emergency Response Team, came from, right? That was the inception?
David Brumley: 04:20 Yeah. From the very first Robert Morris worm, CMU served as the coordination center for all major security events.
Josh Bressers: 04:26 Right. So, that’s the nexus point of security. If you’re telling people they can’t fix it, I bet that goes over real good at the Christmas party, right?
David Brumley: 04:36 Yeah. It’s kind of a weird viewpoint. Because when you’re a professor or an academic you want to prove things, because you get comfort in proof. But the real world is messy. What you really want to do is beat attackers. You just don’t want to get hacked. You don’t want to go through the effort of proving your program is correct. CGC set it up right. There was no like, “Yes, you’re secure. At the end, I’m going to bless you, put the holy water on you, and you’re a perfect program.” It was, “Did you survive?”
Kurt Seifried: 05:06 What strategies did you use to get these programs to be here? For example, one thing I’ve seen that is somewhat counterintuitive is the ability to change the program and to make rapid updates to it. In my opinion, that is actually one of the number one aspects that allows it to be secure now. To put it bluntly, I’ve seen a lot of software, where it’s not that they can’t ship security updates. It’s that they can’t ship updates at all, in any reasonable amount of time. It just doesn’t matter how secure the program is because, at some point, they’ll find a problem and then you’ve got to wait six months to a year for an update.
David Brumley: 05:37 I agree. A lot of programs aren’t even — I call it — ‘testable’. I’ve gone to companies who want to use CGC tech and they’re like, “Hey, can you protect this?” I’m like, “How do you test it?” They’re like, “Well, we put it on a vehicle and we fly the vehicle.” I’m like, “I don’t think our technology is right for you. I think you need to be able to test that program in a regression test before we even touch it. Because I don’t want to, you know, die.”
Kurt Seifried: 06:01 I take for granted the idea of unit tests and smoke tests, but the reality is a lot of people still don’t even have that.
David Brumley: 06:09 Most people don’t have it. One of the things we’re doing when we go to market is look at modern dev shops. Not just because we think they need it the most. Often, they don’t. They have the best security. But they’re the most able to absorb new technology. Because, if you go into one of these shops that don’t have even test cases, you have to solve that problem before you can get started.
Josh Bressers: 06:32 Let’s take a step back for a minute, because we dove into this one head first. Why don’t you explain what happened at the Grand Challenge, and what you’re doing today, David? Because it’s a fascinating story.
David Brumley: 06:48 Let me set the stage for the Cyber Grand Challenge. It was motivated by a DARPA PM, program manager, who had hacked a Capture the Flag contest. At DEFCON, every year, elite hackers get together and battle it out to see who’s the best hacking team. What the program manager asked himself is, “Can we teach computers to do that?” So, he got money to run this contest, build it up, and do that. What we built, as well as other competitors — there were seven in the final round — was a machine.
David Brumley: 07:17 What the machine did is, it ingested program binaries, meaning we didn’t have source code. We just had the compiled executable. Our goal is to find flaws and then prevent those programs from being exploited. There’s three key technologies. One component every team employed was fuzzing or symbolic execution. What they didn’t do is run a SAST tool. They didn’t run Checkmarx or Coverity to find new vulnerabilities. They use dynamic testing, because they wanted something that they could prove there was a problem.
David Brumley: 07:46 Then, they fixed the vulnerability. They would rewrite the binary to do that. That’s a hard problem, but we don’t need to go into depth here. The third thing that we had to do, once we fixed the problem, was automatically create a regression test and decide whether it was worth fielding that fix. This was the interesting strategy component. One of the reasons we did so well is, if we created a fix that had 500% performance overhead and we didn’t think anyone was going to exploit the vulnerability, or the machine didn’t think anyone’s going to exploit the vulnerability, we just wouldn’t field the patch.
David Brumley: 08:17 That really reflects reality. Even if you have a vulnerability and a patch, you make a business decision on whether to field it, based upon data. On, “What is the functionality? What is the performance overhead? What is a security impact?” So, it was really those three components: finding bugs, proving they’re exploitable, being able to patch them and assess the impact of that patch. Then, a decision-making engine.
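To make the decision step David describes concrete, here is a minimal sketch in Python. It is not ForAllSecure's engine: the `PatchCandidate` fields, the cost weights, and the `should_field` function are all hypothetical names chosen for illustration, and the simple expected-loss comparison is an assumption about how such a trade-off could be scored.

```python
from dataclasses import dataclass


@dataclass
class PatchCandidate:
    """Hypothetical summary of one automatically generated patch."""
    name: str
    passes_regression: bool    # do the auto-generated regression tests still pass?
    overhead: float            # fractional slowdown; 5.0 means 500% overhead
    exploit_likelihood: float  # estimated chance (0..1) that someone exploits the bug


def should_field(patch: PatchCandidate,
                 overhead_cost: float = 1.0,
                 breach_cost: float = 10.0) -> bool:
    """Field the patch only when the expected loss from staying
    unpatched is larger than the cost of running the slower binary."""
    if not patch.passes_regression:
        # A patch that breaks functionality is never fielded.
        return False
    cost_of_patching = patch.overhead * overhead_cost
    expected_breach_loss = patch.exploit_likelihood * breach_cost
    return expected_breach_loss > cost_of_patching


# Illustrative values only (assumptions, not contest data).
slow_patch = PatchCandidate("slow", True, overhead=5.0, exploit_likelihood=0.05)
cheap_patch = PatchCandidate("cheap", True, overhead=0.02, exploit_likelihood=0.5)
broken_patch = PatchCandidate("broken", False, overhead=0.0, exploit_likelihood=0.9)
```

With these made-up numbers, the 500% overhead patch for an unlikely exploit stays on the shelf, while the cheap patch for a likely exploit is fielded. That mirrors the trade-off in the conversation, even though a real system would weigh far more signals.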
Josh Bressers: 08:38 This was all done by your system, right? Humans didn’t make these decisions, right?
David Brumley: 08:46 Humans made zero decisions. We had to install our system the night before. Then, what happened? Cyber Grand Challenge was held at DEFCON 2016, so we’re in front of 30,000 people. DARPA flips on the power switch and it boots up the seven competitors. Then, there’s a couple other machines that are doing things like scoring. The way this works is, the machines are trying to attack each other. They have a round-based system, where every round, you get a chance to attack and defend.
David Brumley: 09:10 The reason they did that, is they wanted to say, “Okay, you have a chance to field the defense, let’s see if that defense worked.” You’d field your defense and then it would give people a chance to re-attack and see if they could get persistence. At the end of a round, DARPA would give everyone’s patches to every other player. There was no secrecy, as far as how you were fixing address.
Josh Bressers: 09:27 Oh, nice.
David Brumley: 09:28 So people could just steal stuff.
Kurt Seifried: 09:30 Your system also had the ability to take a binary patch? Essentially examine it and figure out, “Okay, this is what this thing does and whether or not I want to use it.” Based on, say, performance metrics? Or, “We believe that this type of flaw will be exploited, or not.”
David Brumley: 09:46 In our system, we looked at our own patches. We made a conscious decision to not look at other people’s patches. That’s because we were being a bit evil. It turned out DARPA clarified a rule and we couldn’t do that. But, when we released patches, we would actually put a QEMU zero-day in them. So, if our competitors tried to run it under QEMU, we would own their system and could turn it off. We were an un-trusted patch.
Kurt Seifried: 10:11 Well, yeah, that was my first thought.
David Brumley: 10:14 It was clarified you weren’t supposed to do that, so we removed that the last day before the contest. Because of that, we never designed a system that spent time on other people’s patches.
Kurt Seifried: 10:23 What classes of flaws were you able to automatedly find? Was it just buffer overflows or was it more like logic bugs?
David Brumley: 10:32 For most of the Cyber Grand Challenge, it was memory safety. There were invalid reads and things where you could leak secrets. “Could you leak a crypto secret?” was one set. Then, they would divide them into reading a single byte, or reading from anywhere in memory and then anything that would result in control flow hijack. They were looking at the bugs that they cared about most. Really most prevalent in C, C++ and non-safe languages.
Kurt Seifried: 10:57 Basically, the ones we read about in the news?
David Brumley: 11:00 The ones you read about in the news. The ones that are on the airplanes, the destroyers, etc.
Josh Bressers: 11:03 That’s cool. Let’s continue this story, because I’m loving it. So, the challenge happens, you guys win, obviously, right?
David Brumley: 11:13 Yep.
Josh Bressers: 11:14 So, what now?
David Brumley: 11:15 Well, that was actually a time for reflection. We were like, “Yay, we won. DARPA spent $60 million showing the world that autonomous cyber is possible.” It turns out DARPA’s mission is to simply demonstrate the art of the possible. They’re not really someone who’s going to take that forward. We had about eight months, where we were wandering in the wind as a company thinking, “What should we be doing?” Finally, another organization in the DoD, called the Defense Innovation Unit said, “You know, we really should take this technology and help use it to protect ships, planes, everything the DoD cares about.”
David Brumley: 11:49 That was the start of our commercialization. Taking the CGC tech that worked in this artificial environment DARPA created and starting to port it, so it worked on real operating systems like Linux. Real programs that weren’t artificially created by DARPA. So, we’ve been on that path since about 2017.
Josh Bressers: 12:06 That’s not very long. It’s crazy to think about. This is brand-fricking-new technology.
David Brumley: 12:14 It is. It’s a brand new paradigm, as well. There’s a set of people who get it, they see the CGC and they’re like, “That’s the vision I want to have. I want to have something that can automatically detect a flaw, fix it, field that patch and do it in under a minute.” As opposed to the current process, where it takes hours, years or even decades in the DoD. Literally. Those ships are out at sea for a long time. They don’t get software updates while they’re out at sea. That’s the vision and that’s what we’re trying to bring, but incrementally, piece by piece. But, there is quite a bit of education we have to do, because we’re creating a new market category. “Why not just use static analysis?” is a common question we have.
Kurt Seifried: 12:51 Oh dear.
Josh Bressers: 12:54 No one who asks that has ever actually used static analysis, I think.
David Brumley: 12:58 I thought that too. What I’ve seen is interesting. Often, in practice, you have the security people who are like, “We know our stuff needs to be more secure.” But, then, you have the developer, who’s like, “My job is to push out software features and I’m already checking the box on a tool to check security.” There’s this interesting question of incentives. What incentive does a developer have to do anything else?
Kurt Seifried: 13:21 Me and Josh have discussed this at length in the past. Developers get yelled at when they don’t get their features in. But, security flaws, whatever. That’s a thing that happens.
David Brumley: 13:30 It’s an interesting problem. One of the things that we hope will happen as a side-effect of our analysis — we’re using techniques like fuzzing and symbolic execution — is we’re automatically building a regression test suite. We used this in the Cyber Grand Challenge. That’s how we assessed patches. We’re hoping that we won’t just be another security tool. We’ll help them build the regression test, which is something we know they don’t like to do and can help automate. But, that’s still up in the air.
Josh Bressers: 13:57 My brain’s starting to, there’s some wheels spinning, but there’s kind of two pieces to this. There’s the offensive aspect of it, where you’re actually looking for problems, but then there’s also that defensive view of, “Let’s write tests and make sure this is doing what we expect it to do.”
David Brumley: […] into the vulnerabilities.
Kurt Seifried: 14:32 The problem I’ve run into there, back when I was doing CVE assignments, I’d literally have people send me like, “Here’s a hundred fuzzing test cases that crash the program.” I’m like, “That’s absolutely fantastic. Now, you need to go do your homework and sort that out. Figure out what the root cause is, because I’m not.”
David Brumley: 14:47 Root cause analysis is still a challenge and I think you’ve got at something more than that, which is also, you want to assess the severity. If you just fuzz and it crashes, you need to understand what that means. In the Cyber Grand Challenge, what they had us do is create what they called a proof of vulnerability.
David Brumley: 15:03 They used that term, because they didn’t want to call it an exploit. It’s bad PR to say, “The U.S. is funding exploit development with university researchers.” What they were trying to do is walk this line, of saying, “I can prove to anyone reasonable in security, that I can control the computer, while not going so far as to have to worry about weaponizing it.” In the CGC, just for the technically-minded, we had to show that we can control the CPU and get it to execute arbitrary instructions, but it didn’t matter what instructions we got it to execute.
Josh Bressers: 15:31 When you were attacking the other competitors, you weren’t literally pwning them, you’re just showing that, “I found a bug that could do something bad.” But, you didn’t have to actually do the “something bad”?
David Brumley: 15:42 Yeah, we had to show that we could gain control of their CPU, but we didn’t have to execute random code.
Josh Bressers: 15:47 Got you.
David Brumley: 15:49 Part of that is just the metric of the game. As soon as you allow arbitrary code execution, then, for example, one of the things we could do with our QEMU exploit, is we could turn off their machine. We could wipe out their machine. That would not be in the spirit of the game. The backdoor battle, full no-holds-barred, is better left for DEFCON.
Kurt Seifried: 16:07 Right. That sounds like…What is the robot? The Bot Wars show that used to be on? Where one could totally destroy all the other?
David Brumley: 16:15 [crosstalk 00:16:15] It’d be fun to do that. To totally destroy the other person. I think root cause analysis is an important part. As you said, being able to triage. Also, providing people with more information. One of the cool things in the product that we have is we also give a person a command line on how to reproduce.
Josh Bressers: 16:34 Oh, very cool.
David Brumley: 16:35 Kind of a silly thing.
Kurt Seifried: 16:36 No, it’s not a silly thing.
Josh Bressers: 16:36 It’s not silly at all.
Kurt Seifried: 16:37 No. Working at product security at Red Hat, how much time did we have people spend trying to reproduce these flaws?
Josh Bressers: 16:46 Oh, my goodness. It was a huge part of the job.
Kurt Seifried: 16:49 Right. So, no. That is such a critical component.
David Brumley: 16:53 I’m glad you’re saying so. To me, it’s huge, because I’m like, “If I’m going to debug, a test case is so much better than someone just telling me, “There’s a problem in this code.” Because then, I have to figure out what the code does.
Kurt Seifried: 17:02 Definitely.
David Brumley: 17:04 That’s what we’re trying to do for it. I’ve got to say, there’s some limitations. As we said, DARPA was about memory safety. That’s the first generation of the tool. It’s mostly C, C++ and looking for memory safety vulnerabilities. You can also find assertion violations. So, you may be aware of Java property testing. It’s heavily used in Elasticsearch. The idea there is, you’re not just looking for security bugs. You’re going to write an assert. You’re going to assert that your program should have a property like, “This should never be null.” Or, “You should never have a SQL query of this form.” Then, the goal of the fuzzer and symbolic execution, is try to violate that property.
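The idea of asserting a property and letting a tool hunt for a violation can be sketched in a few lines of Python. This is a toy under stated assumptions, not how Mayhem or any Java property-testing library works: `parse_age` and `fuzz_property` are hypothetical names, and plain random generation stands in for a real fuzzer or symbolic executor.

```python
import random


def parse_age(text: str) -> int:
    """Toy function under test (hypothetical)."""
    value = int(text)
    # The property the author believes always holds.
    assert value >= 0, "age should never be negative"
    return value


def fuzz_property(trials: int = 1000, seed: int = 0):
    """Generate random short strings and return the first one
    that violates the asserted property, or None if none is found."""
    rng = random.Random(seed)
    alphabet = "-0123456789"
    for _ in range(trials):
        length = rng.randint(1, 4)
        candidate = "".join(rng.choice(alphabet) for _ in range(length))
        try:
            parse_age(candidate)
        except AssertionError:
            return candidate  # a concrete, reproducible counterexample
        except ValueError:
            continue  # not an integer at all; not the property we are testing
    return None


counterexample = fuzz_property()
```

The useful part is the return value: instead of a warning that the property might fail, you get a specific input that makes it fail, which can be replayed as a regression test.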
Josh Bressers: 17:44 I always love the comments in C code, where you’d have the case statement and it says, “You should never get here.” But, we did.
Kurt Seifried: 17:53 Yeah. That paradigm of assert false, like, “This should never happen.” Just assert false.
Josh Bressers: 17:58 Yes. That’s exactly what assert is for. Then, you get there and it’s like, “I don’t know how we did this.” I’m familiar with that.
David Brumley: 18:06 We have to run the program. We have to prove there’s a vulnerability. We have to show how to reproduce it, so we’re asking people to do a little bit more when they want to use Mayhem, than just run a spell check over their program, like static analysis.
Josh Bressers: 18:19 Right. That’s okay though. Because I have been dealing with static analysis for longer than I can remember now, and I’ve yet to come across a single static analysis report that I looked at and said, “This is a really good report.” At best, it’s like, “Well, it doesn’t totally suck.” That’s about the best I’ve seen so far. I get it though. It’s hard, right? Static analysis is really hard and it sounds like you’re skipping over the really, really hard problem and moving towards something that’s just sort of hard, right? Which is kind of binary analysis and just beating the crap out of this stuff in, well, it’s dynamic analysis essentially, right?
David Brumley: 18:56 Yeah. It’s a type of dynamic analysis. I try to be careful, because I read the Gartner reports and then, as an academic, I’m not really sure what they’re saying. They say there’s something DAST and DAST is not what we do.
Kurt Seifried: 19:06 Sorry, what’s DAST?
David Brumley: 19:07 Dynamic application security testing.
Josh Bressers: 19:09 There’s also SAST, which is static application security testing.
David Brumley: 19:13 Then, there’s IAST, which, as far as I can tell, is an LD_PRELOAD. So yeah, what we’re doing is dynamic. We’re running the program, and we’re learning every time we execute the program. Fundamentally with both fuzzing and symbolic execution, you guess an input, you run the program, and you watch how it executes. You learn from that, to generate another input.
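The guess, run, watch, learn loop David describes can be sketched as a tiny coverage-guided fuzzer in Python. Everything here is illustrative: `target` is a made-up program that reports which branches it reached, and `mutate` and `fuzz` are hypothetical helpers. Real fuzzers such as AFL gather coverage through compiled-in instrumentation rather than a returned set.

```python
import random


def target(data: bytes) -> set:
    """Toy program under test (hypothetical). Returns the branch IDs
    it reached and raises RuntimeError when the planted bug is hit."""
    reached = {"start"}
    if len(data) > 0 and data[0] == ord("F"):
        reached.add("F")
        if len(data) > 1 and data[1] == ord("U"):
            reached.add("FU")
            if len(data) > 2 and data[2] == ord("Z"):
                raise RuntimeError("crash")
    return reached


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Either insert one random byte or overwrite one existing byte."""
    buf = bytearray(data)
    if not buf or rng.random() < 0.3:
        buf.insert(rng.randrange(len(buf) + 1), rng.randrange(256))
    else:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(max_runs: int = 200_000, seed: int = 0):
    """Guess an input, run it, and keep it if it reached new branches."""
    rng = random.Random(seed)
    corpus = [b""]
    covered = set()
    for _ in range(max_runs):
        candidate = mutate(rng.choice(corpus), rng)
        try:
            reached = target(candidate)
        except RuntimeError:
            return candidate  # crashing input found
        if not reached <= covered:
            # New coverage: remember this input so later guesses build on it.
            covered |= reached
            corpus.append(candidate)
    return None


crash_input = fuzz()
```

Because inputs that reach a new branch are kept and mutated further, the fuzzer climbs toward the crash one byte at a time instead of having to guess all three magic bytes at once.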
Kurt Seifried: 19:31 I can’t help but wonder if part of this is tied to, I guess the word I would use is intent. If you have a binary and you have a test case that allows you to control code execution, then you know that you have a security flaw. There’s no question. Whereas with static analysis, well, we have a flaw and it may or may not affect the flow of the program and control of the program. You haven’t actually tried it out. You haven’t really confirmed it. What I saw a lot with these static analysis tools is, “Okay, there’s definitely something not right here, but what kind of not right is it?”
David Brumley: 20:09 I agree with you, and I think this is just my personal, heretical opinion, but the incentive structure, when you build a static analysis tool is, the salesman has to come in and be able to run on any code base and point out at least one flaw, right?
Kurt Seifried: 20:23 Right.
David Brumley: 20:23 That is just the business thing you have to do. There’s an incentive to remove false positives, but there’s also an incentive to always find a problem. I think, when you look at our techniques, it’s quite different. Our incentive is simply, “Can we generate tests for the program?” I feel like that’s a much saner metric.
Josh Bressers: 20:44 I guess the thing coming to mind now, one that I would imagine a number of audience members are wondering, is we’ve talked about fuzzing quite a lot. How are you different from fuzzing, with what you’re doing?
David Brumley: 20:57 Well, that’s actually a good question. We are big fans of fuzzing and a lot of what we’re doing is making fuzzing faster. So, that’s one side of it. For example, if you take AFL and you run AFL.
Josh Bressers: 21:09 Which is?
David Brumley: 21:10 Advanced. Oh man, anyone who fuzzes needs to know AFL. It’s American Fuzzy Lop, which I don’t think describes it any better. It’s the standard fuzzer that people would use, that’s open source. It’s well-known, it’s well-documented, and it has a very long list of successful uses. It’s easy to get started. What you ended up doing with AFL, is you end up having to compile your program and make sure it reads from a file. Then, what AFL does, is it fills in the file and it tries to learn from executions through instrumentation, how to generate files that get better and better coverage. Then, it’s just been found to find lots of bugs and security vulnerabilities this way.
David Brumley: 21:45 That’s what AFL does and it’s a great product. By all means, if you’re looking for great open source fuzzers, go download AFL. So, what are the things that we do? So, first, AFL can be a little bit hard to use. For example, if I have Apache or Nginx, it’s not reading from a file. It’s reading from a network socket. A lot of our work has actually been trying to bring down the expertise needed to get started. We’ll take an Nginx and we’ll take an Apache, as is, and we’ll be able to fuzz that network port.
Josh Bressers: 22:15 Nice.
David Brumley: 22:15 These are doing nothing, with just the normal build. I think another thing to consider when you look at AFL is, it’s forking and exec’ing a process, right? It’s launching a new process, every time it reads an input. If you look at it, it can only do that up to 16 cores, before you start getting a performance degradation. On the one hand, we know the more we run it, the better results we get. But, on the other hand, you have this bottleneck. We’ve spent a lot of time trying to remove that bottleneck and got it so things are much, much faster.
Josh Bressers: 22:43 Nice.
David Brumley: 22:44 That’s kind of the side of it. The second part of it, though, is we don’t just use fuzzing. In the Cyber Grand Challenge, we used a portfolio of techniques. What I meant by that is, we use a couple of different things all running at once, that are feeding off each other. One of the big ones where we got patents was called symbolic execution.
David Brumley: 22:59 Symbolic execution, what it does is, as you watch the program execute on an input, it builds up a mathematical formula of what’s necessary to take that path, like model checking. Then, it tries to come up with an input that would violate that mathematical model. So, we’ve turned reasoning about a program into reasoning about mathematics and we take advantage of things like SMT and SAT solvers. This is what all my work as a professor at CMU had been, is, “How do we make this efficient? How do we make this good?”
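A toy version of that process, in Python, looks like the sketch below. It follows the concolic style (run on a concrete input, record the branch conditions, negate one, solve for a new input). It is an illustration under heavy assumptions: `program`, `solve`, and `explore` are hypothetical names, and a brute-force search over a small integer domain stands in for the SMT and SAT solvers David mentions.

```python
from itertools import product


def program(x, y, trace):
    """Toy program under test (hypothetical). Every branch records its
    condition and outcome in `trace`, which acts as the path formula."""
    def branch(cond):
        taken = cond(x, y)
        trace.append((cond, taken))
        return taken

    if branch(lambda x, y: x * 3 == y + 7):
        if branch(lambda x, y: y > 200):
            return "BUG"
    return "ok"


def solve(constraints, domain=range(256)):
    """Brute-force stand-in for an SMT solver: return the first (x, y)
    that gives every condition its wanted outcome, or None."""
    for x, y in product(domain, repeat=2):
        if all(cond(x, y) == wanted for cond, wanted in constraints):
            return x, y
    return None


def explore(start=(0, 0)):
    """Run, record the path, flip one branch at a time, and solve for
    an input that takes the flipped path. Each path is explored once."""
    worklist = [start]
    seen_paths = set()
    while worklist:
        x, y = worklist.pop()
        trace = []
        if program(x, y, trace) == "BUG":
            return x, y
        path = tuple(taken for _, taken in trace)
        if path in seen_paths:
            continue  # never duplicate already explored state space
        seen_paths.add(path)
        for i, (cond, taken) in enumerate(trace):
            # Keep the path prefix, negate branch i, ask for a matching input.
            new_input = solve(trace[:i] + [(cond, not taken)])
            if new_input is not None:
                worklist.append(new_input)
    return None


bug_input = explore()
```

A random fuzzer would rarely satisfy `x * 3 == y + 7` by chance, while solving the recorded conditions gets there in a handful of runs. The price is the solver call, which is why the two techniques are described as complementary.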
Josh Bressers: 23:25 Nice.
David Brumley: 23:26 We ran both together. So, you kind of have, on the one hand, fuzzers, which just quickly execute a program again and again, with heuristics. And then symbolic executors, which are more like the turtle in this race, where they’re slow, but they’re very methodical. They make sure they never duplicate state space. We use both together and then, lots of tricky optimizations. Everyone kind of went in with these two techniques and then, the question is, “How well can you execute with these two techniques?”
Kurt Seifried: 23:51 Do you get a hundred percent coverage of all the code paths?
David Brumley: 23:54 No. No, we don’t. This is a great question. No. One of the things that was really frustrating for me is, I ran symbolic execution and fuzzing on the program called Clear. Clear is the dumbest program you’ll ever see. It simply clears your screen. We were getting 33% coverage and I was like, “Why are we only getting 33% coverage?” Then, I’d run it on Perl and I was getting 67% coverage. This is just running at five minutes, right? I was completely mystified. “What’s going on here?” The reason is, we’re testing a particular input source and Clear has different code paths that are only executed when you vary the environment variables.
Kurt Seifried: 24:28 Yeah, I was just checking and it doesn’t take any command line options. So I was like, “Yeah, wait. What? Why not a hundred?”
David Brumley: 24:35 Yep. So we would vary one environment variable, not another. Because, it’s more of a testing paradigm, right? You’re taking one thing and you’re iterating and trying to explore all the combinations. Going to the next thing, trying to explore all the combinations. What you’re trying to do is maximize code coverage and if someone hands us code coverage, we can, every case I’ve seen, improve it. But, saying 100%? I mean, you have to realize there may be dead code, unreachable code, code that is outside your configuration. Just like Clear.
David Brumley: 25:02 This has also been one of the hard parts when you go to market. Static analysis “looks at all the code and therefore, it can find all security vulnerabilities”. I just don’t believe it. I would rather have something that looks at less code and is actionable, has zero false positives, than something that purports to look at all the code, but has a huge amount.
Kurt Seifried: 25:20 What you just said about the zero false positives is huge, because me and Josh being on the receiving end of these reports and the false positive rate was anywhere from very low to 99.9%. I literally received reports where people were like, “This product doesn’t store credit card information correctly.” I’m like, “Wait, what? We take credit cards to this product now? Wait, that doesn’t sound right. What?” You know? They literally were just cutting and pasting reports from e-commerce sites or something? That false positive rate... I mean, Josh, how many times did we spend hours or days chasing phantoms?
Josh Bressers: 25:57 I do it every day, where I get a phone book of a report and then I have to refute some or all of it. Hearing ‘no false positives’? That is the dream.
David Brumley: 26:12 That’s what these techniques delivered. Zero false positives, it’s always reproducible. But, the very first hacker was this guy named Turing, right? He was trying to break the German code.
Josh Bressers: 26:21 I’ve heard of that guy before.
David Brumley: 26:23 Yeah. Isn’t it interesting how the first computer scientist is a hacker?
Josh Bressers: 26:27 Right, that’s true.
David Brumley: 26:31 We go into computer science courses and we all hear about undecidability and how you can’t even decide whether a program terminates or not, right? If we remember this from CS-101?
Josh Bressers: 26:40 Yeah, the stopping, the halting problem, right?
David Brumley: […] this in our world of programs and trying to check them for vulnerabilities is, you can either have false positives and check everything, or you can have no false positives and not check everything. I think the world was fooled by these initial products. They came out actually as bug detectors and code quality detectors like Coverity and then got relabeled as security tools when the security budget suddenly materialized.
Josh Bressers: 27:07 Yeah, that’s true.
David Brumley: 27:07 Those tools, their entire life was trying to prove the absence of something, when, what we’re trying to do, in security, is prove the presence of something. We want to prove, “Is it a problem?” Again, very philosophical, but you’ve got to think of it. Are you trying to prove the absence or the presence of something?
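[Editor's note: the following sketch is not from the conversation. It is a minimal illustration, in Python, of what "proving the presence" of a bug can look like: a random tester that reports an input only if it crashes the target and crashes it again on replay. The target `parse_record` and the helpers `crashes` and `fuzz` are hypothetical names invented for this note; this is not AFL or ForAllSecure's product, only a toy under those assumptions.]

```python
import random


def parse_record(data: bytes) -> bytes:
    """Hypothetical target: the first byte is a length, the rest is the payload."""
    if not data:
        return b""
    length = data[0]
    payload = data[1:]
    out = bytearray()
    # Deliberate bug: the length field is trusted without checking the payload size.
    for i in range(length):
        out.append(payload[i])
    return bytes(out)


def crashes(data: bytes) -> bool:
    """Run the target once and report whether it raised an exception."""
    try:
        parse_record(data)
    except Exception:
        return True
    return False


def fuzz(trials: int = 10_000, seed: int = 0):
    """Try random inputs; return one reproducibly crashing input, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        size = rng.randrange(8)
        candidate = bytes(rng.randrange(256) for _ in range(size))
        # Only report inputs that crash again on replay, so every finding
        # comes with evidence anyone can re-run.
        if crashes(candidate) and crashes(candidate):
            return candidate
    # None means "nothing found in this budget", not "the code is safe".
    return None


witness = fuzz()
print("reproducible crashing input:", witness)
```

[A returned input is a concrete witness that the bug exists, so it cannot be a false positive for this target. A `None` result only means nothing was found within the trial budget; it says nothing about the absence of bugs, which is the trade-off described above.]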
Josh Bressers: 27:22 Right. Well, no, I don’t think it’s philosophical at all. I think that’s exactly the right way to look at this. You can’t prove a negative, as they say. You’re right. That’s exactly what they’re doing. Cool. So, let’s do this, David, we’re approaching the end of our time together. I’ll give you the floor, for a few minutes.
David Brumley: 27:39 Yeah. Now I’m a tenured computer science professor at CMU. That’s a pretty nice gig, right?
Josh Bressers: 27:45 Right.
David Brumley: 27:46 I have a lifetime appointment. I’m moving out to do a startup, giving up on that, because I think this idea that we can check the world’s software for exploitable bugs, that we can make it actionable, is something that the world needs.
Josh Bressers: 27:58 Totally.
David Brumley: 27:59 So, we’re building a product for this. As you said, there’s tools like AFL, that are open source, that are also along these lines. It’s good when there’s a community. If I was the only one saying it, I’d be the crazy nut. But, I think we’re seeing enough people out there who want tools that provide actionable results that can test programs, check the code that’s actually going to execute, the thing that’s actually going to run on your machine. I think the full promise of these techniques, in the long-term, is you can check compiled code, which means I no longer need developer participation to check it.
David Brumley: 28:26 I think today, in the products that are shipping, including ours, they tend to do best when you have developer participation. Just from a usability perspective. But, I do want to get to that world, where we can be fully autonomous, but also, where I can check other people’s code for security vulnerabilities. I’ve been able to download code from various vendors, never had access to the developer and find new zero days. I think that’s the world we want, not because it’s offensive, but now, I can hold those people responsible for their code quality.
David Brumley: 28:52 That’s what our company is trying to get to, is that world where it’s fully autonomous. We can find problems, fix them, test them, give you an accurate view of the performance impact, but also get to this world where deep software testing is the norm. Everyone can check everyone else and that’s how we know everyone’s playing it safe.
Josh Bressers: 29:09 That’s awesome. I cannot wait for that future, because I think, today, if you look at the state of things, it doesn’t look so bright. I hope it works out, man. I guess if people want to get a hold of you guys, where should we send them?
David Brumley: 29:25 Go to our website, forallsecure.com. F-O-R-A-L-L-S-E-C-U-R-E.com.
Josh Bressers: 29:31 Awesome.
David Brumley: 29:31 We chose that name because it reflects what I just said. We think security is for everyone and we’re kind of math geeks, so the ‘for all’ symbol, the universal quantifier is our logo.
Josh Bressers: 29:40 Nice. That’s very cool. All right. I’ll put links to all this in the show notes. Yeah, I guess. Awesome. Thank you so much, David. Thank you, Kurt. This has been a fantastic episode. You can go to opensourcesecuritypodcast.com, hit up the show notes. You can use the #osspodcast hashtag to find us on social media. Yeah, Kurt and David, have a fabulous rest of your days.
Kurt Seifried: 29:59 Thanks everybody.
David Brumley: 30:00 Thank you.
Kurt Seifried: 30:01 Awesome. Thanks everyone. Bye-Bye.
*** This is a Security Bloggers Network syndicated blog from ForAllSecure Blog authored by Tamulyn Takakura. Read the original post at: https://blog.forallsecure.com/open-source-security-podcast-ep.-151-the-darpa-cyber-grand-challenge-with-david-brumley