August 25, 2021

SE Radio 474: Paul Butcher on Fuzz Testing

Paul Butcher of AdaCore discusses Fuzz Testing, an automated testing technique used to find security vulnerabilities and other software flaws. Host Philip Winston spoke with Butcher about positive and negative testing, how fuzz testing fits into the life-cycle of software development, brute-force and blunt-force fuzz testing, the popular open-source American Fuzzy Lop fuzzer from Google, and how fuzz testing works particularly well with the Ada programming language.

This episode sponsored by Conf42 and NetApp.

Show Notes

Transcript

Transcript brought to you by IEEE Software magazine.
This transcript was automatically generated. To suggest improvements in the text, please contact [email protected] and include the episode number and URL.

Philip Winston 00:00:52 This is Philip Winston for Software Engineering Radio. My guest today is Paul Butcher. Paul is a Senior Software Engineer at AdaCore and their Lead Engineer in the UK for their HICLASS initiative. Prior to joining AdaCore, Paul was a Consultant Software Engineer for 10 years, working for aerospace companies, such as Leonardo Helicopters, BAE Systems, and Thales UK. Before becoming a consultant, Paul worked on the Typhoon platform and safety-critical software for military UAVs. Welcome Paul.

Paul Butcher 00:01:23 Hi Philip, welcome to you and thanks for inviting me on the show.

Philip Winston 00:01:27 So I think our listeners are familiar with a lot of different types of testing, but perhaps not fuzz testing. Can you tell me what fuzz testing is?

Paul Butcher 00:01:38 Absolutely. So, one nice way to think about what fuzz testing is, is to consider how it differs to standard verification testing. So, if we consider that we have a software application, it has its inputs, and it’s going to generate an output. Then we can test that by having a test case that has an actual input and a correlating expected output. And then we execute the test and we check that the actual output matched the expected output. Fuzz testing differs in that we’re not so interested in the output of the application. We’re more interested in the behavior of the software application as we subject it to a number of different inputs that it may not even be used to dealing with.
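
The contrast described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `parse_age` is a hypothetical function under test, not something from the episode, and the random generator stands in for a real fuzzing engine.

```python
import random

def parse_age(text: str) -> int:
    """Hypothetical function under test: parse a non-negative age."""
    value = int(text)  # raises ValueError on non-numeric text
    if value < 0:
        raise ValueError("age must be non-negative")
    return value

def verification_test() -> bool:
    """Verification style: one known input, one expected output."""
    return parse_age("42") == 42

def fuzz_once(rng: random.Random) -> str:
    """Fuzz style: feed arbitrary input and classify the behavior only."""
    length = rng.randrange(0, 8)
    data = "".join(chr(rng.randrange(32, 127)) for _ in range(length))
    try:
        parse_age(data)
        return "ok"
    except ValueError:
        return "rejected"  # a controlled, documented failure
    except Exception:
        return "anomaly"   # anything else is a finding worth keeping
```

The verification test compares an output against an expectation; the fuzz step never looks at the returned value and only records how the function behaved.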

Philip Winston 00:02:28 So is that considered negative testing? I read about positive and negative testing. I wondered how that related to fuzz.

Paul Butcher 00:02:35 Yeah, it is related. So, positive testing, I consider that to be more verification based. So, we have some requirements that our system has to satisfy and we can build up evidence to argue that we’ve met those requirements through positive testing mechanisms. I guess a way of considering this would be: we have a secure room, and we want to test that the entrance mechanism to this room works correctly. A positive way of testing this would be to take all of the authorized personnel and have them line up, and let’s see if they can indeed open this door and go through it. The flip side of that, to look at it from a negative testing aspect, would be to take every single human being that isn’t authorized to enter this room, have them line up, and let’s observe that they can’t get in. And herein lies the problem of negative testing. You can immediately see that the number of potential test cases can be very large, if not infinitely large.

Philip Winston 00:03:43 Okay, that makes sense. And I’m going to list a few practices that I discovered in my research, and we can go through them one by one to see how they connect to fuzz testing, or if they do. So, I think the four practices I saw are safety engineering, security engineering, robustness engineering, and vulnerability testing, and feel free to lump these into categories or divide them up differently. But let’s start with safety engineering. What does that mean to you?

Paul Butcher 00:04:12 For me, safety engineering, if we’re talking in the software sense, is producing a software system that is responsible for a critical aspect of the application such that, should that aspect fail, there will be a high chance of a loss of life. And you know, this is very prominent in the aerospace industry. Safety engineering would play a key role in the flight management system, the avionics of the aircraft. And fuzz testing can play a role here, but it’s more traditionally associated with security testing. But of course the disciplines do overlap. So, if you have a vulnerability within your system that can be exploited, and that can lead on to a safety implication, it could trigger a sequence of events that could lead to a safety hazard, then fuzz testing could identify that vulnerability.

Philip Winston 00:05:19 So you touched on security engineering. My feeling there is that’s more in line with an actual attempt to subvert the software by an attacker. Is that what security means in that context?

Paul Butcher 00:05:33 We tend to think of our software systems as if they’re security critical. They need to be free of exploitable vulnerabilities.

Philip Winston 00:05:44 And that ties into vulnerability testing. So, I guess these are sort of overlapping categories, but I was interested in your definition of safety. That’s a more dire sense of safety than I’ve dealt with in my own applications. So how about robustness engineering? That was the fourth one on the list.

Paul Butcher 00:06:00 I think it depends on the certification you’re trying to achieve with your software system. My experience tends to come with defense and aerospace, and we have guidelines like DO-178C, which is a guideline for producing safe software systems. Robustness engineering does come under DO-178C, but it’s more of a requirements-based activity and it’s perhaps not suitable to satisfy those objectives with fuzz testing. So, I would say fuzz testing is more for vulnerability identification. And once you’ve identified a vulnerability in your system, depending on where you are in your software development life cycle, you may be able to correct that vulnerability, but if you can’t and you’ve identified it, then you’ll need to put security measures in place. And fuzz testing can also play a role in determining whether those security measures are sufficient for protecting your security assets.

Philip Winston 00:07:07 So you touched upon something I was wondering about. If fuzz testing finds a crash in an application, is determining whether that crash is exploitable, whether it’s a real vulnerability, part of the fuzz testing? Or do you sort of hand that off to a second type of testing, which digs into these issues that you found?

Paul Butcher 00:07:29 So, that’s a great question. And I think it’s dependent on the particular capability. There are commercial fuzz testing tools out there that have that mechanism. They can identify vulnerabilities and then produce exploitations and try and see dynamically whether that vulnerability is exploitable. That’s quite advanced fuzz testing. Typically, it’s a mechanism for identifying bugs within your system, regardless of whether those bugs can be exploited or not. And I would argue that most software bugs should be fixed. There needs to be a very good reason for having software bugs within your system. So, there’s a benefit in identifying them regardless of whether they’re exploitable or not.

Philip Winston 00:08:16 I can see that. And you mentioned the software development life cycle. When in your projects, have you employed fuzz testing? Is it early or late in the process?

Paul Butcher 00:08:27 We’re looking at it from more of a developer’s perspective. And from my experience, it tends to come into play much later in the software development life cycle. I’ve spoken to companies that have fuzz testing teams that will build up the fuzz testing campaigns, and they’ll be given a software system prior to deployment and they will do that security analysis at that point. Like all testing, the earlier you can get it into the development life cycle, the more benefit you’re going to get from the capability.

Philip Winston 00:09:03 So maybe before we dive into the details of fuzz testing, you could just relate an anecdote or a story about a time that fuzz testing found a surprising vulnerability. I know you may not be able to give us all the details, but something that surprised you, that you wouldn’t have been able to sort of discover on your own.

Paul Butcher 00:09:23 Yeah, absolutely. I first came into contact with this technology about three years ago, during my interview process with AdaCore, the company that I’m with now. And I was asked what my understanding of fuzz testing was, and I didn’t have a huge amount. So, I was given the task of going away and investigating fuzz testing with the aim of presenting back my research as part of the interview process. And whilst doing that, I took a software application that I found, an open source piece of software on the web, that read an XML file. It verified that XML against an XML schema, did some processing on the data, and produced some output. After setting up the fuzz testing harness and executing the fuzz test, I left it running for three days and it came back with a test case that was crashing the system.

Paul Butcher 00:10:24 And on further inspection, it was producing a valid XML file where one of the data elements should have been identified by the schema as being invalid data. And once I started investigating further, it turned out that it was a fault within a regular expression library. And this was interesting to me because this is kind of the journey that fuzz testing can often take you on; it isn’t the one you expect to go on. Rather than find a bug within the actual software system, it was a bug within a secondary library that the software system was using.

Philip Winston 00:11:04 Okay, building off that, can you walk us through a typical fuzz testing scenario from the start in terms of the developer’s experience? What tools do they specifically use or could use, and how much time is spent setting things up versus actually running the testing? And maybe a final thing there is, do you use local testing resources, or is this something that can be done in the cloud, or needs to be done in the cloud, based on the amount of computation needed?

Paul Butcher 00:11:32 There is always effort involved in setting up a fuzz testing campaign. I wouldn’t say it’s proportionate to the amount of time you then need to spend leaving the test executing. The actual measurements of time are going to be based on the complexity of the system you’re testing and indeed the complexity of the input data that that software application receives. But a typical scenario for a developer who wants to fuzz test some software would be: they would choose a fuzz testing tool. A typical open source fuzz testing tool that is, I would say, regarded as the de facto fuzz testing capability out there is American Fuzzy Lop, or AFL, which is a fuzzing engine that’s come out of Google labs. The software developer would need to identify the test injection point. So, the definition of the input data into this system needs to be understood such that the developer can build a starting corpus. And what we mean here is that we need to provide the fuzz testing engine with an initial set of test cases that it can then work on and mutate, such that it can produce more test cases and inject them into the system whilst observing the behavior of the system and detecting if any crashes or hung processes have occurred.
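
The workflow described above (a starting corpus, a mutation step, injection, and observation) can be sketched roughly as follows. This is not AFL’s implementation; `target`, its `HDR` input format, and the three mutation operators are assumptions made up for the illustration.

```python
import random

def target(data: bytes) -> None:
    """Hypothetical application under test: fails on one malformed header."""
    if len(data) >= 4 and data[:3] == b"HDR" and data[3] > 0x7F:
        raise RuntimeError("length field out of range")

def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Apply one small change to a seed: flip a bit, insert a byte, or delete a byte."""
    buf = bytearray(seed)
    choice = rng.randrange(3)
    if choice == 0 and buf:
        pos = rng.randrange(len(buf))
        buf[pos] ^= 1 << rng.randrange(8)
    elif choice == 1:
        buf.insert(rng.randrange(len(buf) + 1), rng.randrange(256))
    elif buf:
        del buf[rng.randrange(len(buf))]
    return bytes(buf)

def fuzz(corpus: list[bytes], iterations: int, seed: int = 0) -> list[bytes]:
    """Mutate corpus entries, inject them, and collect the inputs that crash."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(iterations):
        case = mutate(rng.choice(corpus), rng)
        try:
            target(case)      # the test injection point
        except Exception:
            crashes.append(case)
    return crashes
```

Calling `fuzz([b"HDR\x10payload"], 5000)` starts from a single valid seed; every input it returns is a reproducer for the failure. Hang detection is left out here to keep the loop short.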

Philip Winston 00:13:03 So you mentioned mutation and I read a little bit about a mutation engine. Is that what AFL is? Is it generating these test cases by mutation?

Paul Butcher 00:13:13 AFL is a really good example of a mutation-based test case generation strategy. So, mutation algorithms differ from a much simpler form of fuzz testing that is commonly known as brute force or black box testing, or even blunt force. And this is where you’re just producing random inputs to fire into your software system. And typically, a large proportion of those randomly generated test cases are going to be syntactically invalid and your software system is going to throw them away at the boundary. And that’s not to say there isn’t a benefit in testing the interface of a system in a brute force way. But what gets more interesting is when we introduce mutation-based strategies, and this allows us to get much deeper into the control flow of the software application.
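
A small experiment makes the difference at the boundary concrete. The input format below (a `FUZZ` magic, one length byte, then a body of that length) is a hypothetical one chosen for the sketch; the point is only to count how many generated cases survive the syntactic check.

```python
import random

MAGIC = b"FUZZ"

def passes_boundary(data: bytes) -> bool:
    """Hypothetical syntactic check: magic, one length byte, body of that length."""
    return len(data) >= 5 and data[:4] == MAGIC and len(data) - 5 == data[4]

def random_case(rng: random.Random, size: int = 9) -> bytes:
    """Blunt-force generation: every byte is random."""
    return bytes(rng.randrange(256) for _ in range(size))

def mutated_case(seed: bytes, rng: random.Random) -> bytes:
    """Mutation: overwrite a single byte of a known-valid seed."""
    buf = bytearray(seed)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)

def accepted(cases) -> int:
    """Count the cases that get past the boundary check."""
    return sum(passes_boundary(c) for c in cases)

rng = random.Random(1)
seed = MAGIC + bytes([4]) + b"body"
blunt = accepted(random_case(rng) for _ in range(10_000))
mutated = accepted(mutated_case(seed, rng) for _ in range(10_000))
```

A fully random case has to guess all five header bytes correctly, so almost none are accepted, while a single-byte mutation of the valid seed is accepted whenever the changed byte lands in the body, which is a little under half the time for this seed.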

Philip Winston 00:14:14 So one thing I was thinking about with this brute force testing is it’s really not possible to try all possible inputs. If the input to your function or your network packet is even one 64-bit integer, there are trillions upon trillions of possible values. So is brute force necessarily testing only a small subset? Because from the name, you might think it’s testing all possible inputs.

Paul Butcher 00:14:39 It’s a really good observation that it’s not a great name. And that’s typically why I tend to prefer thinking of it more as blunt force. You’re absolutely right. You very, very quickly start riding the steep incline of an exponential curve when you have complex data inputs. And it doesn’t take much for you to start getting to a point where even with something like quantum computing and a huge amount of resources and time, you still won’t get anywhere near the number of test case permutations. I guess a good analogy of this is the infinite monkey theorem, where if you take an infinite number of monkeys and give them all a typewriter, eventually they will produce the full works of Shakespeare. But in reality, even if you filled the entire observable universe with monkeys and typewriters, you’re probably not going to get Hamlet out at any point in the near future. And perhaps a better way of doing it would be to give the monkeys a library of texts, and let’s see if they can produce some interesting novels by taking those texts and swapping words around in them. This is the direct benefit of mutation algorithms.
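
The scale of the problem is easy to put into numbers. The short calculation below assumes a throughput of one million executions per second, which is a round figure picked for the illustration rather than a measured rate for any particular fuzzer.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

def years_to_enumerate(bits: int, execs_per_second: float) -> float:
    """Years needed to try every value of an input that is `bits` wide."""
    return 2 ** bits / execs_per_second / SECONDS_PER_YEAR

# One 64-bit integer at an assumed one million executions per second.
one_integer = years_to_enumerate(64, 1e6)

# Two 64-bit integers: the search space is squared, not doubled.
two_integers = years_to_enumerate(128, 1e6)
```

Under that assumption a single 64-bit input already takes on the order of half a million years to enumerate, and every extra input bit doubles the figure, which is why exhaustive testing stops being an option almost immediately.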

Philip Winston 00:15:57 Okay. So, let’s go back to the example of the developer performing fuzz testing. Once everything is set up and, say, you do find a crash, and then the other developers or you iterate to mitigate that crash, is it pretty easy to re-run the fuzz testing, or do you have to kind of start over setting things up?

Paul Butcher 00:16:18 Ordinarily it is easy. If you’re using an approach like AFL, it will build up a set of test cases that it considers interesting as it’s executing, and you can pull those test cases out of its queue and they become the starting corpus for the next test run. So, you can kind of kick off where you left off.

Philip Winston 00:16:41 And in terms of duration, how long, in hours or days, have you seen the time period that you need to execute the fuzz testing? Whether it’s AFL or some other system, how long typically do you want these execution times to be?

Paul Butcher 00:16:56 You ultimately want to be able to identify as many vulnerabilities as you feasibly can within a short space of time. That’s the goal of this. It’s going to be very much dependent on the complexity of the system and the capability of the fuzz testing mechanism that you’re using. There are ways of speeding up that process by using additional side tools like symbolic execution, or other tools that allow you to instrument the code and give you a sense of a feedback loop, such that you can understand when you’ve hit new paths through the execution.

Philip Winston 00:17:33 So I read a little bit about divergent paths and you mentioned code instrumentation. What type of instrumentation are we doing here? Is it a question of adding calls to a library or marking up the code? How invasive is this instrumentation?

Paul Butcher 00:17:49 The way AFL does this, and this is where we’re typically talking about smart gray-box fuzz testing. And just to break that term down, it’s gray box in the sense that we have an understanding, we need to go inside the application and add some instrumentation. We’re not white box because we don’t understand the functional behavior of the program, and we’re not black box. And what AFL will do is that during the compilation phase, there’s a compiler pass that adds instrumentation points around the basic blocks of the code. And what these instrumentation points are doing is they are writing into a shared memory area and they’re basically incrementing a value within a buffer. And as the test is executing, these values will be incremented. And this allows the fuzzing engine to understand when a new path of execution has just been found through the control flow graph of the program you’re testing. And it will then take that test case and say, well, you made progress. You got us deeper into the diverging paths within this application. I’m going to put you back on the queue and take you further.
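
The feedback loop described here can be imitated in a toy form. In the sketch below, the explicit `hit()` calls stand in for instrumentation that a compiler pass would insert automatically, a `bytearray` stands in for the shared memory area, and the nested `F`, `U`, `Z` checks are a made-up target. It tracks basic blocks rather than the transitions between them, which is a simplification of what AFL records.

```python
import random

MAP_SIZE = 64
coverage = bytearray(MAP_SIZE)  # stands in for the shared memory area

def hit(block_id: int) -> None:
    """Stand-in for a compiler-inserted instrumentation point."""
    idx = block_id % MAP_SIZE
    coverage[idx] = min(coverage[idx] + 1, 255)

def target(data: bytes) -> None:
    """Hypothetical program: each nested branch is a deeper basic block."""
    hit(0)
    if len(data) > 0 and data[0] == ord("F"):
        hit(1)
        if len(data) > 1 and data[1] == ord("U"):
            hit(2)
            if len(data) > 2 and data[2] == ord("Z"):
                hit(3)
                raise RuntimeError("deep bug reached")

def run(data: bytes) -> tuple[frozenset, bool]:
    """Execute one test case; return the blocks it hit and whether it crashed."""
    coverage[:] = bytes(MAP_SIZE)  # reset the map before each execution
    crashed = False
    try:
        target(data)
    except RuntimeError:
        crashed = True
    return frozenset(i for i, n in enumerate(coverage) if n), crashed

def fuzz(seed_corpus, iterations, rng_seed=0):
    """Keep any mutated input that reaches a block not seen before."""
    rng = random.Random(rng_seed)
    queue = list(seed_corpus)
    seen = set()
    for entry in queue:
        seen |= run(entry)[0]
    for _ in range(iterations):
        buf = bytearray(rng.choice(queue))
        buf[rng.randrange(len(buf))] = rng.randrange(256)
        child = bytes(buf)
        blocks, crashed = run(child)
        if crashed:
            return child, queue
        if not blocks <= seen:      # new coverage: this input made progress
            seen |= blocks
            queue.append(child)     # put it back on the queue
    return None, queue
```

Starting from `fuzz([b"AAAA"], 200_000)`, the queue grows one entry at a time as each branch is passed, so the fuzzer only ever has to guess one byte at a time instead of all three at once.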

Philip Winston 00:19:08 That’s very interesting. So, you’re able to detect that this test case has found a new path in the code, and now you want to explore even deeper and maybe find yet another or follow on path. So, without this instrumentation is the only thing you can do with a fuzz test is check if the application has crashed or are you looking at the output of the application to see that you’ve altered it in general?

Paul Butcher 00:19:33 Here we’re talking about the anomaly detection capability of a fuzz testing campaign. You commonly are checking for whether the process has crashed, but there’s a lot more we can actually do here. There’s always a very easy, or very crude, way of detecting whether the program has hung, in that we can just have a timeout on the execution and say, well, if it exceeded so long, we’ll call that a hung process. If you start to look at the runtime exception checking that many programming languages have, then there are additional constraints we can check as that program is executing. And typically here we can identify buffer overflows or buffer underflows that actually may not have caused the execution to crash. You could have a buffer overflow that writes into the next area of stack, but the program is now operating in an unknown and potentially dangerous state, but it’s still running and it hasn’t crashed yet. And things like using an address sanitizer can identify that.
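
The crude crash and hang checks mentioned above might look something like the following. This is a sketch, not how any particular fuzzer does it: the child program, the two-second timeout, and the outcome labels are assumptions, and treating a negative return code as a signal-induced crash is specific to POSIX systems.

```python
import subprocess
import sys

# Hypothetical program under test, run as a child Python process:
# it hangs on b"loop" and aborts on b"boom".
TARGET = [sys.executable, "-c", (
    "import os, sys\n"
    "data = sys.stdin.buffer.read()\n"
    "if data == b'loop':\n"
    "    while True: pass\n"
    "if data == b'boom':\n"
    "    os.abort()\n"
)]

def classify(argv: list[str], data: bytes, timeout_s: float = 2.0) -> str:
    """Run one test case in a child process and classify its behavior."""
    try:
        proc = subprocess.run(argv, input=data, capture_output=True,
                              timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "hang"        # exceeded the time budget: call it a hung process
    if proc.returncode < 0:
        return "crash"       # on POSIX, a negative code means killed by a signal
    if proc.returncode != 0:
        return "error-exit"  # the program noticed a problem and exited cleanly
    return "ok"
```

Sanitizers and language runtime checks extend the same idea: they turn silent corruption into an abnormal exit that a classifier like this one can see.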

Philip Winston 00:20:39 Yeah, that’s interesting. I can see there’s a lot to study the results of these fuzz tests to determine the best technique going forward.

Philip Winston 00:21:27 I guess, to go back to duration, what would you consider a very long fuzz testing process? Is this something you would parallelize with multiple instances? How do you scale up this testing if you do have long durations that you need to test over?

Paul Butcher 00:21:45 If you’re looking at fuzz testing a complex system, then you’ll need to accept that it may take a long time, and you can reduce that time by executing multiple fuzzing sessions at the same time. So, you can utilize a multi-core system and run a fuzzer on each core. Typically this will be done on a high-end server away from the developer’s workstation, maybe via a continuous integration mechanism. There are ways of invoking fuzz tests on software applications as pull requests are submitted into, you know, Git repositories and things like that. But if it’s a complicated system, you could be expected to execute for days before you see anything useful back.

Philip Winston 00:22:38 Yeah. So, in that case, it’s maybe not part of the continuous integration with every check-in, but it’s run at some other interval or it’s run sort of as, as often as it can. What is a fork server? And it sounds like that might relate to the specifics of testing a running application or scaling up the testing.

Paul Butcher 00:22:57 One of the aspects of fuzz testing that you need to achieve is a very, very fast mutation phase and a very fast injection of that test case into the software system. And then you want that software execution to be very quick. We need to accept here that we are going to be producing a lot of test cases that may be invalid and a lot of test cases that may not be particularly interesting. So, we want to get through them as quick as we can to try and get to the real juicy, interesting stuff. So, this tends to be a mechanism built into an operating system, and it allows you to spawn processes. And fuzz testing is all about the generation of test cases and the injection of those test cases into a software application as fast as possible. And we need it to be fast because a large proportion of those test cases may be invalid, in that they’re going to get thrown away at the boundary of the system, or they may just not be interesting and they may not be taking us to new areas of the program. So, a fork server is typically used to spawn a new process. And in this case, it’s the software application under test. And then we will inject the test case into that process and observe the behavior of the task. There are ways of moving away from that mechanism, and AFL offers up a persistent test case mode where you can run multiple tests within the same process, and this rapidly speeds up the testing, but then there are complexities involved with the retained state of the software application in between test runs.
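
The idea behind forking per test case can be shown with `os.fork`, which is available on POSIX systems only. This is a simplified sketch: the startup work is done once in the parent, and each test case runs in a freshly forked child so a crash cannot take the harness down. `expensive_startup` and `target` are hypothetical, and a real fork server is injected into the target program and talks to the fuzzer over pipes, which is not modeled here.

```python
import os

def expensive_startup() -> dict:
    """Hypothetical one-time initialization that every test case would otherwise repeat."""
    return {"table": list(range(1000))}

def target(state: dict, data: bytes) -> None:
    """Hypothetical application under test: aborts on inputs starting with 0xFF."""
    if data.startswith(b"\xff"):
        os.abort()

def run_forked(state: dict, data: bytes) -> str:
    """Run one test case in a forked copy of the already-initialized process."""
    pid = os.fork()
    if pid == 0:                      # child: inherits `state` without redoing startup
        try:
            target(state, data)
            os._exit(0)
        except BaseException:
            os._exit(1)
    _, status = os.waitpid(pid, 0)    # parent: observe how the child ended
    if os.WIFSIGNALED(status):
        return "crash"
    return "ok" if os.WEXITSTATUS(status) == 0 else "error-exit"
```

Because every child starts from the same clean snapshot, no state leaks between test cases; that isolation is exactly what a persistent mode gives up in exchange for speed.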

Philip Winston 00:24:40 Let’s talk about an endeavor that I think your company is involved with and see how it relates to fuzz testing and how it maybe relates to other software practices for high availability, high critical systems. So, I came upon this acronym HICLASS, and I’ll read what it stands for. High Integrity, Complex Large Software and electronic Systems. Is this something that relates to fuzz testing, or is it a bigger endeavor than that?

Paul Butcher 00:25:08 It’s a bigger endeavor than that. It’s a UK government sponsored research and development program that has a main focus on cybersecurity within civilian aerospace. So, the goal behind HICLASS is to bring together tier one aerospace manufacturers within the UK and tool providers like AdaCore, but also universities. And the research group is split into a number of work packages that are looking at things like cybersecurity within the aerospace industry, as far as guidelines and standards go. And then there are aspects of how to implement security measures for the industry, and how to detect vulnerabilities, which is where things like fuzz testing come in. There are other aspects of the research program as well that people are working on.

Philip Winston 00:26:02 What does the term compiler hardening mean, and how does that relate to these efforts?

Paul Butcher 00:26:08 I probably should start this by saying that one of the main areas of development tools that AdaCore works in is compilers. We produce compilers for Ada, C, and C++ across multiple platforms. Compiler hardening is a mechanism for accepting that the generated software system, the compiled software system, could be subjected to hardware attacks and things like side channel attacks. What we’re looking into is whether we can add security measures at the compiler level that can counter these particular types of security attacks. An example of this is clearing the stack on a function exit, and ensuring that your valuable data can’t be read from those memory areas.

Philip Winston 00:27:06 Maybe this segues into formal methods. That’s another term that I came across related to HICLASS. My feeling is it doesn’t directly relate to fuzz testing, but maybe it’s in the same category of, you know, extensive, involved processes that we can do to guarantee critical software functions correctly. Have you come across formal methods?

Paul Butcher 00:27:27 Yeah, absolutely. And it fits within HICLASS in that we very much think of security measures as being a layered approach. We talk about defense in depth with security. We accept that one security measure may not be enough, and there are things like common mode failures, which are all about one attack bringing down multiple security measures. Formal methods is a way of mathematically proving that your software application is functionally correct and absent of runtime errors. Hence, if you can take some of the high-criticality aspects of the application and formally prove them to be functionally correct and bug free, and then, for some of the lower criticality aspects, you can fuzz test them and gain assurance that they are secure, you can come up with a convincing argument that your overall system is secure.

Philip Winston 00:28:30 Great. That makes sense. So, we’ve talked about fuzz testing and we talked about this HICLASS initiative. The name of your company is AdaCore. We haven’t talked about Ada. That’s a programming language I used very briefly in school, but I have not seen a lot recently. Can you tell me what domains Ada is still used in today, or whether you consider it a language that’s still under active development, or is it in a different stage of its life cycle?

Paul Butcher 00:28:58 It’s still very much in active development, and my own experience is I studied Ada in university. The first job that I had was on an Ada-based program, working on the Eurofighter program. And I then went through a bit of a pilgrimage of working on multiple different programming languages before coming back to working on Ada again. And my personal experience is that it’s a fantastic language, but you tend to see it in software applications that have either a mission-critical, safety-critical, or security-critical need. It’s widely used within the defense industry across Europe and the US and all over the world, and in the automotive sector, the rail industry, nuclear, and space. As a language, the reference manual is an ISO standard. And there’s an ISO working group that’s constantly working on improvements to the language to ensure it stays competitive with the capabilities of the other existing languages out there.

Philip Winston 00:30:09 So you mentioned your company makes compilers for Ada, but also C and C++. It seems like this would give you a perspective. If we tie it back to fuzz testing, how do fuzz testing strategies or capabilities differ between those three languages?

Paul Butcher 00:30:25 This is a really interesting subject for me. Ada has a very rich runtime testing capability that’s built into the semantics of the language. It’s defined within that ISO standard that if you’re going to write an Ada compiler, you have to have runtime constraint checking for buffer overflows, and it’s a strongly typed language. So, if you try to mix types in assignment code, the runtime is going to pick up on that. And what this runtime does is that it will raise exceptions if any of the runtime checks fails. We can capture these within the fuzz test harness, and we can indicate to the fuzzing application that a runtime check has failed. So, what you typically tend to do is you switch on as many of the runtime checks as you can before executing the fuzz test.

Paul Butcher 00:31:21 And this is where we can not only just check for crashes, we can check for the system having entered an unknown operating state. In addition, programming languages that support a design-by-contract capability are really interesting for fuzz testing. If you’re fuzz testing a C application, you could put assertions in there to say, if this happens, raise this assertion. The fuzz test will pick that up as an anomaly, and it will give you the test case to reproduce it. In Ada, you can write pre- and post-condition contracts on your subprograms within your application. So, a very simplistic example is, if you have a subprogram that takes two parameters and returns the summation, then your post-condition would say the output has to be equal to the two inputs when they’re added together. If that contract fails, the runtime check will detect that, and the fuzz test will pick up on it. This is where we can start to move towards some aspects of fuzz testing for functional correctness as well.
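
The summation example translates into a short sketch. Python has no built-in contract syntax, so a plain `assert` stands in for the post-condition here; in Ada this would be a `Post` aspect on the subprogram. The 32-bit wrapping implementation is a deliberately buggy stand-in invented for the illustration.

```python
import random

INT32_MIN = -2**31
INT32_MAX = 2**31 - 1

def add_wrapping(a: int, b: int) -> int:
    """Hypothetical buggy sum: silently wraps the result to 32 bits."""
    return (a + b + 2**31) % 2**32 - 2**31

def add_checked(a: int, b: int) -> int:
    """The same sum guarded by a post-condition, as a contract would be."""
    result = add_wrapping(a, b)
    # Post-condition: the output equals the two inputs added together.
    # (Python skips assert statements when run with -O.)
    assert result == a + b, f"post-condition failed for {a} + {b}"
    return result

def fuzz_contract(iterations: int, seed: int = 0):
    """Feed random 32-bit operands and report the first contract violation."""
    rng = random.Random(seed)
    for _ in range(iterations):
        a = rng.randint(INT32_MIN, INT32_MAX)
        b = rng.randint(INT32_MIN, INT32_MAX)
        try:
            add_checked(a, b)
        except AssertionError:
            return a, b   # the reproducing test case
    return None
```

Nothing crashes in the unchecked version; it simply returns a wrong answer. It is the contract that converts the wrong answer into an anomaly the fuzzer can detect, which is the step towards testing functional correctness.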

Philip Winston 00:32:38 So talking about Ada’s features for reliability and safety and security, to what degree would you say those features are coming from the language itself? And to what degree are they sort of layers of effort that are going on top of the language, such as fuzz testing or formal methods?

Paul Butcher 00:32:56 Some aspects of the language are there to ensure that the developer writes their code in a very structured way. And there are lots of aspects of the semantics of the language that stop you from doing things that you may not have even appreciated could be non-secure or unsafe. And typically, these are things like the strong typing mechanism that Ada supports, and aspects like the extensive runtime checking. But in addition to that, if you want to get to the highest level of assurance for your software system, and this is where we’re talking about mainly safety-critical software, so an example could be an emergency braking system on an automated train where it just absolutely has to work, this is where tools like SPARK play a role. SPARK is a language in its own right, in that it’s a subset of Ada, but it’s also a static analysis capability that can analyze the code and tell you whether it’s functionally correct and absent of runtime errors.

Philip Winston 00:34:10 Okay, great. I wanted to ask about your C and C++ compilers, but I was wondering, if you had to cite a drawback of development in Ada, what would you say is a challenge or a negative?

Paul Butcher 00:34:20 The language itself, my personal experience is it’s great. I love developing in Ada. The issue that we actively work on is building the community of Ada developers. And there is a community out there, but where other languages, perhaps like C++ and Java, have grown in a huge way and there are more extensive resources available, we’ve got to put more effort into ensuring the same resources are available for Ada. And they are, but it would be nice to grow that community more.

Philip Winston 00:34:57 So moving on to the C and C++ compilers, what features does AdaCore add that maybe are not available in a typical compiler?

Paul Butcher 00:35:06 I would say, and I should probably caveat this with the fact that I’m not an expert in the C and C++ compilers: we have some quite interesting technology around converting Ada programs to C programs and vice versa. And what this capability allows you to do is that you can use the formal proof tools on the Ada version, and then you can generate C from that Ada, and then you cross-compile that onto a processor that perhaps an Ada compiler wasn’t supported on.

Philip Winston 00:35:45 So we’ve talked about Ada and C and C++. Let’s move on, just to mention a little bit about other languages. You may or may not be personally familiar with these, but what is the role of fuzz testing in so-called safe languages like JavaScript, Python, Rust, or Go, where the runtime is not supposed to allow buffer overflows and overruns? Is there still a role for fuzz testing in these languages?

Paul Butcher 00:36:10 There absolutely is, I would say, because although the runtime doesn’t allow these things to happen, I believe what we’re talking about here is that the runtime can detect that these things have happened. And what we don’t have is the capability to detect that it’s going to happen at compile time. And I mean, you get some sense of this with some compilers; they may issue warnings to say, hold on, this is looking a bit dodgy, what you’re trying to do here. Or you can run static analysis that can identify these sorts of issues. But if we’re relying on the runtime to catch them, then we need something like a fuzzer to trigger those events.

Philip Winston 00:36:52 Great, okay. Now let’s start wrapping up. What would you say is the future of fuzz testing, or some endeavors related to the field that you think are promising or interesting? Looking out, say, a few years from now, where would you like to see the practice of fuzz testing?

Paul Butcher 00:37:10 So for me, the research that’s going into this area is incredible. It’s a very hot topic. In the future I would expect much more automation; anything that takes away the complexity of setting up the tests is definitely a step in the right direction. In addition, you’ve got supporting tools like symbolic execution that can help you explore more of that code base. I think we touched on diverging paths, and symbolic execution is a mechanism that can get you through bottlenecks in your control flow. An example would be an if statement that says: if variable X equals some obscure floating-point number. It’s quite difficult for a fuzz test mutation algorithm to randomly stumble across that number to be able to get through that bottleneck. Whereas symbolic execution allows you to symbolically execute a program, and then when you get to a divergence, it can use theorem solvers,

Paul Butcher 00:38:21 such as Microsoft’s Z3, to take the execution back to the starting point and calculate what the input should have been to take you down the diverging branch. These capabilities can suffer from issues like branch explosion and the amount of memory that they can consume while they’re executing. And the length of time it can take to determine the inputs that you need can be quite long, but they’re being actively worked on, and I’d expect these capabilities to improve in the future as well. For me, combining fuzz testing with other techniques like symbolic execution is really going to be where the future goes.
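The bottleneck described above can be sketched with a toy Python example. The names and the constant are made up for illustration, and the branch compares an integer rather than a floating-point number to keep the arithmetic exact. Note that `solve_branch` is a hand-written stand-in for what a constraint solver such as Z3 would compute; it is not a call into any real solver.

```python
import random


def target(x: int) -> str:
    """Hypothetical program under test with one hard-to-hit branch."""
    if x * 3 + 7 == 3703708:      # only x == 1234567 satisfies this
        return "deep"             # code a random fuzzer rarely reaches
    return "shallow"


def random_fuzz(tries: int, seed: int = 0) -> bool:
    """Blind random search over 32-bit inputs; True if the branch was hit."""
    rng = random.Random(seed)
    for _ in range(tries):
        if target(rng.randrange(2 ** 32)) == "deep":
            return True
    return False


def solve_branch() -> int:
    """Stand-in for a solver: invert the condition x * 3 + 7 == 3703708."""
    return (3703708 - 7) // 3


found_by_random = random_fuzz(100_000)
solved_input = solve_branch()
print("random search reached the branch:", found_by_random)
print("solved input reaches:", target(solved_input))
```

With one satisfying value among 2^32 candidates, 100,000 random tries are very unlikely to reach the branch, while solving the branch condition produces the required input directly. A hybrid tool would hand such solved inputs back to the fuzzer as new seeds.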

Philip Winston 00:39:05 That sounds really promising. Is there anything we haven’t talked about that you’d like to bring up at this point?

Paul Butcher 00:39:11 There’s one aspect of fuzz testing that we’re researching at the moment that I’m quite interested in, and this is moving away from fuzz testing at the system level and moving the capability more towards the hands of the developer. What we’re looking at producing here is a technique for the point where the developer’s code is compiling but is not yet ready to be identified as a full system, although there are aspects of it that are working. We’re looking at whether we can then identify functions within the program that we can fuzz test, and we’re looking at building a tool that can produce values for the parameters of those functions from the fuzz test mutation engine. This opens up quite a lot of potential for the technology, because you can scope the tests to whatever level you want. You don’t have to be doing it at the system level. You can tighten it down to a subsystem or even a very small subprogram.
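One way to picture function-level fuzzing is a small harness that decodes the raw bytes from a mutation engine into typed parameter values. The Python sketch below is only an assumption about how such a tool might look, not AdaCore's implementation: `scale_reading`, the parameter layout, and the simple byte-flipping `mutate` function are all hypothetical.

```python
import random
import struct


def scale_reading(raw: int, gain: float, offset: int) -> float:
    """Hypothetical subprogram under test."""
    return (raw - offset) / gain      # ZeroDivisionError when gain == 0.0


# Assumed parameter layout: int16 raw, float32 gain, int8 offset (no padding).
PARAM_FORMAT = "<hfb"
PARAM_SIZE = struct.calcsize(PARAM_FORMAT)


def harness(data: bytes):
    """Decode fuzzer bytes into typed parameters and call the function."""
    if len(data) < PARAM_SIZE:
        return None                   # not enough bytes to build the arguments
    raw, gain, offset = struct.unpack(PARAM_FORMAT, data[:PARAM_SIZE])
    return scale_reading(raw, gain, offset)


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Toy mutation engine: overwrite one to three random bytes."""
    buf = bytearray(data)
    for _ in range(rng.randrange(1, 4)):
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz_function(iterations: int = 5000, seed: int = 0):
    """Mutate a valid seed input and collect inputs that raise."""
    rng = random.Random(seed)
    seed_input = struct.pack(PARAM_FORMAT, 100, 1.0, 0)
    findings = []
    for _ in range(iterations):
        data = mutate(seed_input, rng)
        try:
            harness(data)
        except Exception as exc:
            findings.append((data, type(exc).__name__))
    return findings


findings = fuzz_function()
print(f"{len(findings)} parameter sets raised an exception")
```

Because the harness targets a single subprogram, it can run as soon as that subprogram compiles, long before a full system exists. The same pattern scales up by pointing the harness at a subsystem entry point instead.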

Philip Winston 00:40:21 Where can listeners find out more about you, AdaCore or fuzz testing in general? And I’ll put the links in the show notes.

Paul Butcher 00:40:28 We’ve got a few blog posts that we’ve produced about the fuzz testing work we’re doing. There’s the AdaCore blog: if you do a search for AdaCore blogs, there’s a whole host of posts up there that our engineers produce on a regular basis. The company website is an excellent way to find out more about what we’re doing.

Philip Winston 00:40:49 Great. I’ll look up those URLs and put them in the show notes. So, let me mention three previous shows that might be related. There is 453, Aaron Rinehart on Security Chaos Engineering; 390, Sam Procter on Security in Software Design; and 309, Zane Lackey on Application Security. Okay, well, I think that wraps it all up. Thanks a lot for your insights here. And it was nice to have you on the show.

Paul Butcher 00:41:16 Oh, thank you, Philip. It was an absolute pleasure, and thank you for taking the time to invite me on the show.

Philip Winston 00:41:22 This is Philip Winston for Software Engineering Radio.

[End of Audio]

SE Radio theme: “Broken Reality” by Kevin MacLeod (incompetech.com — Licensed under Creative Commons: By Attribution 3.0)
