Robert Seacord, the Standardization Lead at Woven by Toyota, the convenor of the C standards committee, and author of The CERT® C Coding Standard, Effective C, and Secure Coding in C and C++, speaks with SE Radio host Gavin Henry about What’s New in the C Programming Language. They start with a review of the history of C and why it has a standard, and then they discuss what C23 brings and how programmers can take advantage of it. They consider the sectors in which C is most used and whether you should use C to start a brand new project in 2025. Seacord discusses 8 new things that C23 brings, use case examples, must haves, floating point numbers, how automotive systems use C, why C is used there, Rust vs C, compile time checks vs static analysis, all the various safety standards they can use, why you should use the right tool for the job and never trust user input no matter the language.
Show Notes
Related Episodes
- SE Radio 414: Jens Gustedt on Modern C
- SE Radio 494: Robert Seacord on Avoiding Defects in C Programming
IEEE Computer Society Digital Library
- Secure Coding in C and C++: Of Strings and Integers
- Effective C
- Detecting type errors and secure coding in C/C++ applications
Other References
- Twitter – Robert C. Seacord (@[email protected]) (@RCS) on X
- SEI CERT C Coding Standard – Confluence
- Wikipedia – Robert C. Seacord
- Effective C, 2nd Edition
- Secure Coding in C and C++, 2nd Edition | InformIT
- TIOBE Index C position (4th Oct in 2024, 1st in 2021)
- ISO/IEC 9899:2024
Transcript
Transcript brought to you by IEEE Software magazine and IEEE Computer Society. This transcript was automatically generated. To suggest improvements in the text, please contact [email protected] and include the episode number.
Gavin Henry 00:00:18 Welcome to Software Engineering Radio. I’m your host Gavin Henry. And today my guest is Robert Seacord. Robert Seacord is the Standardization Lead at Woven by Toyota and is the convener of the C Standards committee. Previous industry experience includes roles at IBM and the X Consortium. He was a researcher at the Carnegie Mellon University Software Engineering Institute and a professor at the Carnegie Mellon School of Computer Science and Information Networking Institute (well, that’s a mouthful) and the University of Pittsburgh. His previous books include The CERT C Coding Standard, Effective C, which is now in its second edition, and Secure Coding in C and C++. Robert, welcome to Software Engineering Radio. Is there anything I missed in your bio that you’d like to add?
Robert Seacord 00:01:06 No, that wasn’t bad. The most recent book was the second edition of Effective C published recently by No Starch Press.
Gavin Henry 00:01:12 Yeah, it’s a good one. I’ve got that. I’ve gone through a bit of it for this show, so I’m excited to dig into bits with you.
Robert Seacord 00:01:18 Cool.
Gavin Henry 00:01:19 For the listeners, the goals that we’re trying to achieve today are to have a short refresher on C, to understand what the C Standard, specifically C23, brings, and to explore different sectors where we might find these standards and C in general. So let’s start with a brief history of the C language, what the C Standard is and what the CERT C Coding Standard is. Okay, are you ready?
Robert Seacord 00:01:45 Sure.
Gavin Henry 00:01:46 So when was C created? You should know this instantly.
Robert Seacord 00:01:50 Oh yeah, because I was there. I think I was seven. But it was developed in the early 1970s at Bell Labs as a system implementation language for Unix when it was initially being developed there at Bell Labs.
Gavin Henry 00:02:04 And what is a standard, what is the C Standard? How do they relate?
Robert Seacord 00:02:08 Yeah, so originally there was no C Standard. There was what we call K&R C, which was named after the authors of the C language book back in the ’70s, Kernighan and Ritchie. And in the late ’70s, folks got together and decided, ANSI in particular decided, it would be useful to standardize the language. The first standard was created by ANSI in 1989 and that’s referred to as C89. And then the next year it was published by ISO as C90. And those two standards are identical. They just have different cover sheets. And there have continued to be standards every decade or so, since it’s not a quick process. And the standard can kind of be thought of as sort of an instruction manual for implementers, but it’s also a contract between compiler implementers and users of the language. So the C language is defined by the standard as opposed to a particular implementation of that standard.
Gavin Henry 00:03:12 And that’s something you can work on. So say you’ve got the standard and a PDF, you can kind of work and know that as long as you’re compliant with it, things are going to work.
Robert Seacord 00:03:24 Yeah, it’s useful to have a copy of the standard. I would say for the entire period of time I was writing operational code in C, I did not own a copy of the standard. And so probably the biggest disadvantage of not having the standard is not knowing how to write portable code. So, you can sort of experiment with a given compiler and get your code to work through trial and error and testing, but you don’t really know if that code is fully portable unless you are familiar with the standard. So that’s probably where the standard provides the most benefit.
Gavin Henry 00:04:02 Are you aware of any other languages that are standardized like this?
Robert Seacord 00:04:05 Oh, C++ is standardized like this. Ada, COBOL, Fortran, these are all ISO languages. There are languages like Java and C# where the implementation itself defines the language, and if you have a sort of difference between the C# standard and the C# compiler, the standard is considered to be wrong. So that’s the opposite approach from what C and other ISO languages take.
Gavin Henry 00:04:33 Before I move us onto the CERT C Coding Standard, are you familiar with what it takes for there to be a desire for a language to be standardized? Why isn’t there a Java standard or why isn’t there a Golang standard or a Rust standard or something?
Robert Seacord 00:04:48 Yeah, well Java was originally developed by Sun Microsystems and sort of controlled by Sun Microsystems. So they wanted to sort of keep control of the language and the process. So they did things like have community groups to provide them input, but it’s still roughly under the control of Sun, and then Oracle. C was implemented by different companies at the same time and what was beginning to happen was there was a certain amount of divergence in the implementations. So the typical reason you standardize something is to sort of limit the divergence and try to have a portable version of the language that you could sort of easily move between different platforms and different compilers.
Gavin Henry 00:05:35 So you’re saying the standardization was done there because of divergence of the language?
Robert Seacord 00:05:41 Yeah, different implementations were going in different directions and there was a desire to sort of try to maintain the portability of code written in the C language so that you could easily port it from one compiler implementation to another and from one platform to another without too much effort.
Gavin Henry 00:06:00 And does it make a big difference on the compiler that you use for C?
Robert Seacord 00:06:05 Well, sure. So to give an example, going into the C standardization process with ANSI and C89, there’s this concept called integer promotions in C. And roughly half the implementations at that time had taken an approach called value preserving and the other half took this approach of unsigned preserving. So the committee eventually agreed on using the value preserving approach, and then the roughly half of the compiler vendors who took the other approach had to change their compilers. And that also has a subsequent impact on the users of the language, right? Because now their compilers have changed, their version of the language has changed, right? And so they have to now kind of retest all their code and make sure it’s still correct given the changes required to standardize.
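To make the two approaches concrete, here is a minimal sketch of how the value-preserving rule behaves in standard C today, next to the wrap-around result that arithmetic on unsigned operands produces. The function names are hypothetical, and the first result assumes the usual case where int is wider than unsigned char.

```c
/* Value preserving (the rule C89 adopted): an unsigned char operand is
   promoted to signed int when int can hold every unsigned char value,
   so this subtraction is done in int and yields -1. */
int promoted_difference(void)
{
    unsigned char small = 1;
    unsigned char large = 2;
    return small - large;
}

/* For comparison: when the operands are unsigned int, the arithmetic is
   unsigned and wraps around instead of going negative. An unsigned-preserving
   compiler would have treated the unsigned char case in a similar way. */
unsigned int unsigned_difference(void)
{
    unsigned int small = 1u;
    unsigned int large = 2u;
    return small - large;   /* wraps to the largest unsigned int value */
}
```

Code that compared such a difference against zero could silently change meaning when a compiler switched approaches, which is the retesting cost described above.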
Gavin Henry 00:06:57 And when was that example from?
Robert Seacord 00:06:59 That was from the late 1980s.
Gavin Henry 00:07:02 That makes sense. Okay, so the CERT C coding standard.
Robert Seacord 00:07:06 Yeah, so CERT C coding standard was something I worked on at Carnegie Mellon University in the CERT Division of the Software Engineering Institute. That actually came about, I had started working in the C Standards committee in 2005 and about a year later I went to a meeting in Berlin and Dr. Thomas Plum approached me. Tom just recently passed away, which is sort of sad.
Gavin Henry 00:07:31 Sorry to hear that.
Robert Seacord 00:07:32 Yeah, he was a really great man. So he approached me with the idea of CERT creating a secure coding standard and I thought right away that’s a great idea. And the concern at the time was the only thing out there was from MISRA, which is more of a safety related standard, and they had their outs with the C committee at the time. And so I started the project back at the SEI and we published a couple of books with Addison-Wesley. The first edition was not very good, so if you own that, I’m sorry, but it sort of got me a seat at the table and we started a study group in WG14, the C Standards committee. And we met every couple of weeks for about three and a half years with security experts and analyzers and compiler vendors. And we kind of ironed out a much better version of the CERT C Standard. So we published that in two places. One is the second edition of The CERT C Coding Standard with Addison-Wesley. And we also published it as a technical specification with ISO called ISO/IEC TS 17961. And the ISO technical specification was more targeted towards sort of the analyzable subset of the rules, while the CERT C Coding Standard is a little more targeted towards developers and what they needed to do to ensure the security of their systems, even say in cases where the enforcement of those rules was not easily tool analyzable.
Gavin Henry 00:09:00 So if you had to program in C to the C Standard, your program might not be secure?
Robert Seacord 00:09:06 Oh yeah. It’s pretty easy to write an insecure program in C. It’s pretty easy to write an insecure program in any language. There’s a lot of things people aren’t aware of until they sort of develop a bit of a security mindset. And the start of that is that the user is actually out to get you in many cases. So whenever you take an untrusted input from a user/possible attacker, you have to be very careful with those inputs. You have to make sure you validate them; you have to be very careful how you use them, or they might result in some sort of exploit. And that’s true of really all languages, all programming languages.
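As one small illustration of validating untrusted input before using it, the sketch below parses a string into a bounded integer with the standard strtol function. The helper name parse_bounded_int and its interface are hypothetical; this is one possible shape for such a check, not a rule taken from the CERT C Coding Standard.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: accept text only if the whole string is a decimal
   integer within [min, max]. On success, store the value and return true. */
bool parse_bounded_int(const char *text, int min, int max, int *out)
{
    if (text == NULL || out == NULL || *text == '\0') {
        return false;               /* nothing to parse */
    }

    char *end = NULL;
    errno = 0;
    long value = strtol(text, &end, 10);

    if (errno == ERANGE) {
        return false;               /* value does not fit in a long */
    }
    if (end == text || *end != '\0') {
        return false;               /* no digits, or trailing characters */
    }
    if (value < min || value > max) {
        return false;               /* outside the range the caller allows */
    }

    *out = (int)value;
    return true;
}
```

A caller would reject the request when the function returns false rather than falling back to a default value.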
Gavin Henry 00:09:47 So why is C still so popular if it was created when you were seven years old?
Robert Seacord 00:09:53 Well, it’s been doing the job. I mean the kind of advantage of C that makes it attractive is that it’s a small, relatively simple language. It’s very fast and it just does what you tell it to do. It doesn’t do anything else. And so frequently people need that level of control over their programs because they might have to run fast, they might have to be close to the hardware. So, C fits a niche that is still sort of important today.
Gavin Henry 00:10:27 Maybe that’s like in software engineering every 20 years we kind of reinvent things with a new generation. So maybe with the evolving landscape of languages getting taught at universities and startups and things, it’s just not getting so many new projects. I mean should someone start a new project in C today or is C just there to be maintained? What’s your view on that?
Robert Seacord 00:10:51 Yeah, I mean I’m going to give an answer that sounds like a non-answer, but you should start a new project in C if C is the right language for that project. And so there are areas where C is quite, well I’ll say it’s sort of best suited still, right? So for example, if you are developing safety related systems, it’s still a little premature to build those systems in Rust because the ecosystem around Rust is not fully developed yet. So you don’t really have the Rust standards that are required for certification. You don’t have the certified libraries and a lot of the components that you need to build safety critical software. So, if you have a team of developers and they’re C language experts and you have to develop a safety related system, then C is your best choice. You can’t underestimate skill sets and things like that with the developers you have. If you took a bunch of expert C programmers and you asked them to build a system in Rust, chances are good that that system won’t be as good as the C system would have been, just because of the familiarity with the language by the developers.
Gavin Henry 00:12:02 So are we saying if I understand that right, pick the right tool for the right job?
Robert Seacord 00:12:07 Yeah, that’s a good summary.
Gavin Henry 00:12:08 The niche might be safety related systems. And what’s the definition of safety related systems? Is that medical, aviation or…?
Robert Seacord 00:12:16 Yeah, medical and aviation are both good examples. I work in automotive, so automotive is a good example. So yeah, there’s still a variety of safety related domains in which C and C++ are the safe choice, right? And Rust would be considered a really sort of risky, bold choice to make at this time, and eventually Rust will get there, right? But it’s not necessarily there quite yet.
Gavin Henry 00:12:43 Yeah, that surprised me because normally you’d think with a safety standard and all the things you hear about the exploits in C that it wouldn’t be used in a safe environment, but because you’ve got the libraries that are certified and things to reference to create and sign off in medical, I completely understand what you’re saying about the immaturity of some ecosystems, right?
Robert Seacord 00:13:05 Right, and Rust has some advantages over C and C++. It’s designed to be sort of a memory safe language. The joke is that programming is hard and Rust enforces that at compile time. Maybe the joke was programming is impossible and Rust enforces that at compile time; that would’ve been funnier. So yes, there are advantages of Rust there, but many of the common sort of vulnerabilities are still quite possible. So, no language that I’m aware of enforces input validation, right? That’s always left to the programmer. And so, there are no secure languages, right? They’re all susceptible to exploits in one sense or another.
Gavin Henry 00:13:50 Yeah, I suppose there’s a tradeoff between developing a safety related piece of software and the extra time that takes to do, versus some of the quick wins you get in the other languages that aren’t certified but give you a lot more out of the starting gate, as it were.
Robert Seacord 00:14:07 Right? Yeah, I mean C is not a scripting language, right? So you would probably be more likely to use Python or something like that. It’s not a web development language you would be more likely to use JavaScript or TypeScript or something like that. So, languages all have kind of their sweet spot and that’s still true of C today. There’s still various applications which are really best suited for C. And of course there’s the however many 50 years of legacy code out there, which basically is what the world runs on is old C code.
Gavin Henry 00:14:43 Yeah, exactly. And if somebody asks you where they shouldn’t use C, is there an answer for that, or is there never a wrong place to use C if you know what you’re doing?
Robert Seacord 00:14:53 Yeah, there is. I mean I’ve never seen, well I guess never is a strong word, right? It’s pretty unusual to see C used for a browser, a web browser application, right? That’s pretty uncommon and I wouldn’t recommend it. And it’s probably not the best language for scripting type programs, things that you have to kind of get up and running quickly. So, different languages again have their roles and there are applications that are not well suited to C.
Gavin Henry 00:15:21 Well thank you. I’m going to move us on to our next section, which is everything new in C, given it’s so old, we’re always adding new things with the standards. So the latest standards I didn’t clarify before, but you called it C89, so I presume that’s 1989 and now C23, which is the latest standard is obviously 2023.
Robert Seacord 00:15:43 Right? We have our own Y2K problem with the naming of the standards. So once we get up to 2089, there’s going to be some ambiguity in the names and, given how long COBOL has been around, I wouldn’t be surprised if we did get there eventually one of my grandchildren may be running the C committee by then.
Gavin Henry 00:16:04 It’ll definitely still be around, because they’re not going to rewrite some of the things that are written in C, are they?
Robert Seacord 00:16:09 Right.
Gavin Henry 00:16:09 So now that we’ve had a good refresher on C bearing in mind that we last spoke in 2020, four years already, what does C23 give us? Or in fact as we’re discussing C23, if there’s something that you want our listeners to be highlighted on that’s new from C21. So the standard from 2021, if you have any time feel free to mention that.
Robert Seacord 00:16:31 Oh well, there is no C21. So the current version of the C Standard is C17. So that’s the last published version. C23, we finished work on that in 2023 and it’s almost published. So I think we have a commitment from ISO to complete the editing process by December of this year. But it’s been a long road with ISO. This has kind of been sort of an ISO issue, because it’s not just our committee; other committees are also being affected by these long editing schedules.
Gavin Henry 00:17:09 And that’s ISO, the International Organization for Standardization, isn’t it?
Robert Seacord 00:17:13 Right, over there in Geneva.
Gavin Henry 00:17:15 So attributes, you were looking at them in C11, but they didn’t get standardized then.
Robert Seacord 00:17:22 Yeah, we ended up adding some keyword sort of specified attributes. C++ on the other hand, did wind up adopting a standard attribute syntax, but we did get around to them finally in C23. And it’s a useful feature. It’s kind of new so people keep coming up with new things to do with it, right? Because it’s got that shiny new car smell.
Gavin Henry 00:17:46 I think since you’ve mentioned cars, everyone listening to this podcast in a car, or on a walk or a run, is going to be good. What is an attribute? So if you want to define it for us, that’d be cool.
Robert Seacord 00:17:55 Yeah, so an attribute is a way to give sort of information to the compiler in a way that if the attribute isn’t supported, it’s not going to affect the outcome of the program, right? So a lot of times they’re just sort of hints to the compiler that it can use for optimization, things like this. So examples of C attributes which are also present in C++ include maybe_unused and nodiscard, meaning that you shouldn’t discard a value returned from a function. And if you do, the compiler should issue a diagnostic. Again, it changes the behavior of the compiler but it doesn’t affect the type of code that’s generated. There’s a deprecated attribute, which kind of has a similar effect. There’s a noreturn attribute that you can use to indicate, say for example, that the function calls abort along all possible control flows. So there’s no possible way for that function to return. And as a consequence the compiler can now sort of make optimizations based on that information.
Gavin Henry 00:19:06 So is this syntax above a function signature or something like that?
Robert Seacord 00:19:11 It changes a little bit depending on what it applies to. So the syntactic location of the attribute determines, what it’s applying to. So it might apply to the function or it might apply to a function parameter sort of depending on where it’s positioned. So you need to look at some examples or look at the grammar in the standard to make sure that you’re placing the attributes correctly.
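The placement rules are easier to see in code. The following is a minimal sketch of the four attributes mentioned above, assuming a compiler that accepts the C23 double-bracket attribute syntax; all function names are hypothetical.

```c
#include <stdlib.h>

/* Before the declaration specifiers, the attribute applies to the function:
   callers that ignore the result should get a diagnostic. */
[[nodiscard]] int reserve_slot(int requested)
{
    return requested > 0 ? requested : -1;
}

/* Every path through this function ends in abort(), so it never returns. */
[[noreturn]] void fatal(void)
{
    abort();
}

/* Uses of this function should be diagnosed as deprecated. */
[[deprecated("use reserve_slot instead")]] int old_reserve(int requested)
{
    return reserve_slot(requested);
}

/* Here the attribute sits in front of one parameter, so it applies to that
   parameter only: no unused-parameter warning for debug_level. */
int checked_slot(int requested, [[maybe_unused]] int debug_level)
{
    int slot = reserve_slot(requested);
    if (slot < 0) {
        fatal();
    }
    return slot;
}
```

A compiler that does not recognize one of these attributes may warn and ignore it, and the program should still behave the same, which matches the definition given above.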
Gavin Henry 00:19:38 Perfect. So the next item I have on my list is keywords.
Robert Seacord 00:19:42 Oh yeah, keywords. So C does something a little bit different than C++, which is, we have a reserved namespace of identifiers that we expect the users not to use. And so we’re pretty comfortable just clobbering those, right? We’ll just take it over for our own use because we reserved them. And the problem is that if a user, if a programmer, has used that identifier, now you’ve got multiple definitions and that could cause problems. So we also sometimes want to use a keyword that’s not in the reserved space. And good examples of that are when we added the _Bool type, when we added static assertions, the static_assert macro, alignof and things like that. So what C does to try to not break user code is a process called uglification. And we really do call it that. So for example, _Bool: most developers would expect that to be spelled B-O-O-L, sort of like int, spelled bool.
Robert Seacord 00:20:44 But we put an underbar and then we capitalized the first letter. So in C17, it’s spelled underbar, capital B, lowercase o-o-l. So it’s a sort of very ugly version. And also, the underbar followed by a capital letter is in the reserved identifier space. So if you’ve used that, we don’t feel bad about clobbering it because we’d reserved it. So we had all these kind of ugly keyword spellings in C17, and as part of C23 we replaced those with sort of the modern spelling. So now _Bool is bool and _Static_assert is static_assert. And so now you can just use those keywords without having to include any particular headers. And again, we try to be very careful with C not to break existing code because, as I pointed out earlier, the world runs on C and no one on the committee wants to be responsible for breaking the world. So we try to be quite careful with these things.
Gavin Henry 00:21:42 And that was Jens Gustedt?
Robert Seacord 00:21:45 Yeah, Jens Gustedt I believe wrote that paper. Yeah.
Gavin Henry 00:21:49 In, yeah, I spoke to him I think six months or a year before I spoke to you in 2020.
Robert Seacord 00:21:53 Yeah, I think I saw him on your channel. Yeah, he’s a very strong contributor to C Standards.
Gavin Henry 00:21:59 That’s brilliant. Okay, so the next one, integer constant expression sounds exciting.
Robert Seacord 00:22:08 I’m glad you can be excited by integer constant expression, not many.
Gavin Henry 00:22:11 I’m hoping it spills over.
Robert Seacord 00:22:12 Not many people can, but yeah, so in C23 we added constexpr, which is familiar to C++ developers. We added it only for object definitions and not yet for function definitions, which fans of constexpr are immediately disappointed by, that we didn’t go further. But part of the problem with integer constant expressions is that they’re not a portable construct. So vendors are allowed to extend it. So you could declare a function with, say, a const int and then use that object to provide the size of an array. And on some implementations that array might be a statically sized array and on other implementations it might be a variable length array, or VLA. So this is a good application for these new constexpr objects, where if you declare your size as a constexpr object, now you’re portably guaranteed to not have a variable length array. So it improves the portability of your code.
Gavin Henry 00:23:17 Yeah. Because on some platforms, integers could be treated differently.
Robert Seacord 00:23:21 Yeah, even const. So a const object might be a constant expression on one implementation but not a constant expression on another implementation, which is confusing. And hence that’s sort of where the problem lies.
Gavin Henry 00:23:33 There’s no point in thinking of every scenario until it comes across and then you need to deal with it.
Robert Seacord 00:23:38 Right.
Gavin Henry 00:23:39 So the next one, uh, another exciting one for me, enumeration types.
Robert Seacord 00:23:43 Yeah, I like this change. So most of this change was just the ability to have typed enumerations where the developer explicitly says what type it is. So prior to this change you sort of had to guess. So it really could be any type up to int or unsigned int. And some common implementations, I believe for example, Microsoft Visual C used a signed int and GCC used an unsigned int, right? So you got different behaviors on different platforms and that could of course affect portability and how you write your code. And so now with C23 you can give a type, so you can specify this enum is an unsigned short, this enum is an unsigned int, and now you’ve got more portable behavior because you know exactly what type is being used to represent the underlying enumeration object.
Gavin Henry 00:24:37 If you were to use these things now, I mean we’ve only gone through half of the list, is it just a case of using a compiler that supports that or do you need to do something else?
Robert Seacord 00:24:46 Well you need a compiler that supports the C23 features, and you need to change your code. So you would have to go through your source code and add the type of specification to each enum that you’ve defined.
Gavin Henry 00:25:01 And the binary that’s produced is just a binary as how that’s always produced.
Robert Seacord 00:25:06 Yeah. It’ll compile down to a binary and the binary could well be different, right? If you’ve specified a type for that enum that’s different from what the default type would’ve been under C17 or older versions. The other thing to be concerned about is before you sort of modernize your source code to C23, you want to make sure that those C23 features are available on all the possible platforms that you’re targeting, right? Because otherwise you wind up doing more work with having defines and things like that and different sort of configurations of your program depending on your target compiler.
Gavin Henry 00:25:46 Yeah. And you end up writing three times the amount of code just to do the one thing depending on where it’s deployed.
Robert Seacord 00:25:53 Exactly. But it is a good feature and it does improve portability, and it improves safety and security because it makes your program better defined, and it’s always good to know what your code is actually doing.
Gavin Henry 00:26:05 Yeah. Makes it easier to read as well because you’ve explicitly said what it is.
Robert Seacord 00:26:10 Right.
Gavin Henry 00:26:11 Next one is type inference.
Robert Seacord 00:26:13 Yeah. Type inference, it’s sort of surprising to me that this became one of the more controversial new features in C. There are a lot of people who really don’t want C to change at all and a few of them are on the committee. So this is the use of auto, and it’s the same idea as in C++, but we don’t allow it in function signatures. And so what you can do is you can say auto i = 0L and the compiler will infer the type of the object based on how you initialize it. So in this case, if we initialize it to 0L, the L is the long constant, right? So it will declare this as a long, so it’s sort of a convenient feature. It is sort of susceptible to abuse, which is why maybe some folks are not super fond of it, but it’s useful in sort of macro definitions, function-like macros where you don’t know what the type of the parameters are but you want to declare an object of that type. You could simply use auto there. And it’s also useful in sort of generic programming, which, in case you haven’t used a new version of C in a long time, we’ve had for a while now in the language.
Gavin Henry 00:27:29 Yeah, I was going to ask about it. Because that sounds like in Perl or Python, depending on whether the variable looks like a string or looks like a number or looks like an array, the variable behind the scenes will change, right?
Robert Seacord 00:27:42 Right.
Gavin Henry 00:27:43 Right. So I’m just trying to understand what the reason for that was.
Robert Seacord 00:27:46 Yeah, mostly to support generic programming if I had to give a one-line answer.
Gavin Henry 00:27:53 And what’s the short definition of generic programming for those not familiar, including myself?
Robert Seacord 00:27:58 So there are two sort of generic features. So there’s the old school generic feature, which is function-like macros, right? Where say you define a swap function, a swap macro, and most experienced C programmers will know that the actual arguments that you pass to a macro can be any type at all, because it’s just going to be sort of textual replacement. And as long as the resulting code makes sense, it will compile, and things will be fine. So that’s sort of the old school way of doing it. But in C11 we introduced generic selection (_Generic), where now you can actually sort of branch on the type of the parameters and invoke different code, and that code will now be type checked and all those good things that come with not using macros.
Gavin Henry 00:28:47 Nice. And macros are the syntax you use in pre-processing, isn’t it?
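The two styles can be put side by side in a short sketch that needs only C11. All names here (TWICE_MACRO, twice, and the helper functions) are hypothetical.

```c
/* Old school: pure textual replacement. Any argument type works as long
   as the expanded text compiles; nothing is checked against a prototype. */
#define TWICE_MACRO(x) ((x) + (x))

/* Ordinary, type-checked functions, one per supported type. */
int twice_int(int x)
{
    return 2 * x;
}

double twice_double(double x)
{
    return 2.0 * x;
}

/* C11 generic selection: pick a function based on the argument's type,
   then call it. An unsupported type is a compile-time error. */
#define twice(x) _Generic((x), int: twice_int, double: twice_double)(x)

int generic_int_result(void)
{
    return twice(21);        /* selects twice_int */
}

double generic_double_result(void)
{
    return twice(1.25);      /* selects twice_double */
}

long macro_long_result(void)
{
    return TWICE_MACRO(4L);  /* expands to ((4L) + (4L)) */
}
```

The generic version still uses a macro as its entry point, but the code that actually runs is a normal function with a checked parameter type.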
Robert Seacord 00:28:52 Pre-processing, right. Yeah, pre-processing macros. So the pound sign define basically.
Gavin Henry 00:28:57 Perfect. Typeof operators. That’s one word, typeof.
Robert Seacord 00:29:02 typeof, yeah, there’s typeof and typeof_unqual. These are similar to decltype in C++. So it’s another way to let you specify a type in your code based on another type or the type of an expression. And so the difference between those two operators (and these are both operators) is that typeof retains whatever qualifiers the original type has, like volatile or const or _Atomic, and typeof_unqual strips the qualifiers, including any atomic qualifier.
Gavin Henry 00:29:34 And that might be where you’re trying to define a variable to live in one source code file?
Robert Seacord 00:29:39 Right, right. One program. It’s another useful feature for macros or for generic functions.
Gavin Henry 00:29:46 Perfect. And second last is bit and byte utilities.
Robert Seacord 00:29:52 Yeah, this was done by JeanHeyd Meneide. So there’s a new header called <stdbit.h> and there’s just a ton of new functions. So we have functions that let you count the number of ones or zeros in a bit pattern, count the number of leading or trailing ones or zeros, test whether a bit is set, determine the smallest number of bits required to represent a value. Just a bunch of different bit-twiddling types of functions that are now standardized. And we also have a feature test macro, __STDC_ENDIAN_NATIVE__, that lets you determine whether your integers are represented using either big- or little-endian representation.
Gavin Henry 00:30:34 This is an area of C, and probably software engineering in general, where I’ve personally not done a lot. I’ve not done any sort of byte operations at all. Where is that used?
Robert Seacord 00:30:44 Primarily in sort of low-level programming. There’s a book, I’m forgetting the name, I think it was called “Hacks” or something like that, a really very clever book. But it talked about how you can write really efficient code with a variety of sort of these bit level hacks. So these sort of functions make that easier, but also just for this very kind of low level code where you’re dealing with hardware masks and those types of things.
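A hedged sketch of a few of these utilities follows. Because the header is new and not every C library ships it yet, the example checks for it and otherwise falls back to hand-written loops; the HAVE_STDBIT macro and the three wrapper function names are hypothetical, while stdc_count_ones, stdc_bit_width, and the __STDC_ENDIAN_* macros come from the C23 header.

```c
/* Use <stdbit.h> when the toolchain provides it. */
#ifdef __has_include
#  if __has_include(<stdbit.h>)
#    include <stdbit.h>
#    define HAVE_STDBIT 1
#  endif
#endif

/* Number of one bits in value. */
unsigned int count_set_bits(unsigned int value)
{
#ifdef HAVE_STDBIT
    return (unsigned int)stdc_count_ones(value);
#else
    unsigned int count = 0;
    for (; value != 0; value >>= 1) {
        count += value & 1u;
    }
    return count;
#endif
}

/* Smallest number of bits needed to represent value (0 for value 0). */
unsigned int bits_needed(unsigned int value)
{
#ifdef HAVE_STDBIT
    return (unsigned int)stdc_bit_width(value);
#else
    unsigned int width = 0;
    for (; value != 0; value >>= 1) {
        ++width;
    }
    return width;
#endif
}

/* Returns 1 on a little-endian target, 0 otherwise. */
int native_is_little_endian(void)
{
#ifdef HAVE_STDBIT
    return __STDC_ENDIAN_NATIVE__ == __STDC_ENDIAN_LITTLE__;
#else
    const unsigned int probe = 1u;
    return *(const unsigned char *)&probe == 1u;
#endif
}
```

The fallback branches show the kind of loops that the standardized functions replace; with the header available, each wrapper reduces to a single call.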
Gavin Henry 00:31:12 The first thing that comes to my mind is where you’re doing embedded programming and you’re sending a one or zero to light up an LED or something different.
Robert Seacord 00:31:21 Yeah, I just had this thought for the first time, which is always scary to then record that on a radio show. But this idea that you can, that there’s a function to determine the smallest number of bits required to represent a value that could be quite useful with the new bit precise integer types we have in C23 where you can specify the exact size, the exact number of bits that’s going to be used to represent that type.
Gavin Henry 00:31:45 Well that’s a good idea. So we’re on to our last one now and I’ll need to get that link to that book afterwards so we can put it in the show notes. So our last one is the IEEE Floating Point support.
Robert Seacord 00:31:59 So we have a C Floating Point study group and thank God we have them because they handle all this floating point stuff that befuddles the rest of the committee. But the big change in C23 is that there are a number of technical specifications, TS 18661-1, -2, and -3, and those have all now been folded into C23. And so part one of those technical specifications deals with binary floating point. Part two deals with decimal floating point and part three deals with interchange. And so those changes are very extensive and have introduced all sorts of new identifiers and it’s very complete. It also sort of updates C to work with the IEEE 754-2008 version of the Floating Point Standard. So that’s good. You kind of do this thing in the standards world where you leapfrog each other, right? When we come out with a new standard, we see what other standards have published new editions since the last time we published and we try to update our standards to work with the latest, greatest things.
Gavin Henry 00:33:14 How many hours of your life do you have to commit to this to know and read off those numbers by heart, like you just did with the standards?
Robert Seacord 00:33:21 Oh, listing them off is pretty easy. Like understanding floating point that takes your entire life. I mean there’s a really very small number of people who completely understand floating point.
Gavin Henry 00:33:34 And how do you describe it?
Robert Seacord 00:33:36 Well, floating point is a model that approximately models the behavior of real numbers but doesn’t really. It’s not the same as performing arithmetic with real numbers, but it comes close to it, right? And so that’s always good and bad, right? So people will use it to implement arithmetic using real numbers and normally it works and then sometimes it doesn’t. And where and how it doesn’t behave the same is why you need to be an expert in floating point when you use floating point.
Gavin Henry 00:34:12 Floating point, is that like 10.2, 5, 3, 2, 1? Is that a floating point number or?
Robert Seacord 00:34:17 Any real number. So even like say 0.1 or 1.0 can be a floating point number if it’s in a floating point type such as float or double. And if I just use 0.1 as an example, when you look at that number, you think that’s a really simple number. But it turns out in binary floating point, that number cannot be exactly represented. And so there’s a certain...
Gavin Henry 00:34:45 Yeah, but it’s 0.1. That’s what it is.
Robert Seacord 00:34:47 That’s what you think it is. But when you represent it as a floating point number, it’s not exactly that number, it’s something close to that number. And so there’s a certain lack of precision there which can kind of wind up biting you. It’s a weird story to tell, it’s a real-world example except it turns out it’s not quite true. It was different, but it’s a good story anyway. But back in the 1990s when we had Desert Storm, the US invasion of Iraq, there was a Scud missile launch on some barracks in Saudi Arabia and the barracks were defended by these Patriot batteries. And it turned out the Patriot batteries had failed to intercept one of these Scud missiles, which hit the barracks and caused some deaths among the US troops there. And so this problem, this defect, was originally attributed to floating point representation.
Robert Seacord 00:35:44 And that because these batteries had been continually operational for a long period of time, as it ran the floating point values became increasingly imprecise to the point that eventually the missiles failed to intercept the Scud missiles coming in. The reality of it is very close to that. The reality of it is that it was actually an imprecision in these floating point numbers, but it then caused the conversion to an integer to be incorrect. And so the solution to that was to turn the batteries off and on every so often, so these imprecisions didn’t accumulate, and then eventually there was a software fix to solve the problem.
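The two effects described above, 0.1 having no exact binary representation and small errors accumulating until a conversion to an integer goes wrong, can be reproduced in a few lines. This is only an illustrative sketch, not the Patriot software: the function names are hypothetical, and it assumes `double` is the usual IEEE 754 64-bit binary format.

```c
#include <stdio.h>

/* Add 0.1 to a running total `ticks` times, the way a clock that counts
   tenths of a second might. */
double accumulate_tenths(int ticks) {
    /* volatile keeps every partial sum in a real double, so extended
       precision registers cannot hide the rounding. */
    volatile double total = 0.0;
    for (int i = 0; i < ticks; i++) {
        total += 0.1;
    }
    return total;
}

/* Print the value actually stored for 0.1, then the total after ten
   additions and what that total becomes when truncated to an integer. */
void show_tenth(void) {
    double ten_tenths = accumulate_tenths(10);
    printf("0.1 is stored as   %.20f\n", 0.1);
    printf("ten tenths sum to  %.20f\n", ten_tenths);
    printf("converted to int:  %d\n", (int)ten_tenths);
}
```

On such a platform the ten additions land just below 1.0, so the conversion to `int` truncates to 0 rather than 1. Counting ticks in an integer and converting once, at the point of use, is one common way to keep the error from growing with uptime.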
Gavin Henry 00:36:29 That sounds like the age old IT fix, just reboot it.
Robert Seacord 00:36:34 Yeah, just turn it off and turn it on again. Yeah.
Gavin Henry 00:36:36 Okay. That’s the last one in C23, but have we missed anything?
Robert Seacord 00:36:41 You know, you probably missed my favorite and maybe many people’s favorite, which is that starting with C23, the C language only supports two’s complement integer representations. So, as of C17, the C language supports two’s complement, one’s complement, and sign and magnitude. And this is one of these changes that I didn’t think I would live to see.
Gavin Henry 00:37:06 Yes. It’s a backwards step, is it not?
Robert Seacord 00:37:09 I don’t think so, because sign and magnitude representation, that’s very, very outdated. One’s complement is one that we thought was kind of still around, but we determined that really it’s not, it doesn’t exist in any current implementations. And so when you can narrow what’s allowed by the standard, that’s really the purpose of standardization, right? Now the standard provides more portability guarantees, right? So all you as a developer have to do is write code that will work with a two’s complement representation and that code will now function correctly on any C23 conforming compiler.
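As a small illustration of what that guarantee means in practice, the sketch below checks two properties that hold under two’s complement but would not hold under one’s complement or sign and magnitude. The function names are made up for this example, and the first check assumes `int` and `unsigned` have no padding bits, which is the case on mainstream platforms.

```c
#include <limits.h>
#include <string.h>

/* Returns 1 if the stored bit pattern of -1 is all ones, as it is in
   two's complement. One's complement would store ...1110 and sign and
   magnitude would store 1000...0001. */
int minus_one_is_all_ones(void) {
    int value = -1;
    unsigned bits;
    memcpy(&bits, &value, sizeof bits); /* copy the representation, not the value */
    return bits == UINT_MAX;
}

/* Returns 1 if there is exactly one more negative value than positive,
   another signature of two's complement. */
int min_is_one_below_negated_max(void) {
    return INT_MIN == -INT_MAX - 1;
}
```

Code that relies on either property was already portable in practice; with C23 it no longer needs a comment explaining the assumption.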
Gavin Henry 00:37:50 Yeah, that’s a big win.
Robert Seacord 00:37:51 Yeah, I think so. So we did that and yeah, some other things but too numerous to talk about here.
Gavin Henry 00:37:57 Well, the last question of this section was to talk about anything in C21 that is a must-have, but C21, it doesn’t exist?
Robert Seacord 00:38:05 Doesn’t exist.
Gavin Henry 00:38:06 We’ll leave that question, that’s a new one for me. Okay. So as we said in the intro, you work for Woven by Toyota. So I was wondering if we could talk about that your work there, how C fits in, obviously don’t talk about things you can’t talk about, but what is Woven and what’s their goal, as an elevator pitch?
Robert Seacord 00:38:26 Yeah, so Woven is in the Toyota group of companies. So my office is actually in Tokyo, although I work remotely from my home in Pittsburgh. Woven is basically trying to define what we call the software-defined vehicle. And so historically cars have been sort of built from a collection of components which are developed by sort of a diverse ecosystem of vendors and then integrated by the OEM, which might be Toyota or Ford or Mercedes or what have you. And the software is developed for each component and then basically discarded, and then for the next version of the vehicle or a different vehicle a new set of components is developed. And that’s sort of an increasingly unviable way to build cars, which are increasingly software reliant. And so we’re trying to sort of flip the script on that and sort of define the vehicle in terms of the software and then provide the hardware that can then run that software in a safe, reliable way.
Gavin Henry 00:39:43 So that was very interesting about Woven. So they’re designing, just to summarize, they’re designing the software first, so it kind of grows with the vehicle rather than develop software for a component, then the new version of that vehicle has to get the same process done again and again.
Robert Seacord 00:39:58 Right, so this is the goal basically to be able to sort of preserve the software, evolve it over time and have sort of increasingly complex systems that have sort of like reusable software kind of following the concept of product lines if you’re familiar with that.
Gavin Henry 00:40:17 Okay. And are they working with other manufacturers to create some type of standard around this as well or is that too early?
Robert Seacord 00:40:23 Yeah, it’s probably too early. I mean, right now this concept is competing with sort of the current model, right? Where existing vendors are sort of entrenched in the current approach and we sort of have to still move the industry in this direction, right? So it’s a goal, it’s our goal, but it’s by no means an accomplished goal yet.
Gavin Henry 00:40:48 Understood. And where does C fit into all this?
Robert Seacord 00:40:52 Well C and C++ are probably the primary languages in which we develop automotive software just because, there is this established ecosystem around these languages and companies like Toyota are very comfortable building or developing automotive software in those languages in C and C++.
Gavin Henry 00:41:14 Is C kind of the power of the operating system or is that C++ or C just speaks to the individual OEM parts? I’ve always wanted to know how the systems in a car are linked and hopefully you are privy to that type of thing.
Robert Seacord 00:41:28 Yeah, to a degree. They don’t really let me write any software anymore. So, I’m mostly involved in the coding standards and so forth, but on an embedded processor you don’t really have an operating system, right? You’re just kind of on the metal. But there are things like Automotive Grade Linux which might be on a car, and in that case the operating system’s written in C because Linus Torvalds can’t stand C++, so you won’t have it in Linux. In terms of how the systems are linked, there can be up to say a hundred ECUs in a modern vehicle. So the vehicle internally has something called a CAN bus and that’s how the ECUs communicate. And the more modern vehicles have sort of more complex networks, which I’m not exactly sure what those look like, but there’ll be sort of subnets which are gatewayed off of other networks. So potentially your cyber-physical safety components, ECUs, will be gatewayed from the infotainment ECUs for example.
Gavin Henry 00:42:36 So you are involved in the standards around all the C code that is used in these safety related systems.
Robert Seacord 00:42:42 Yep, no, really all of it. So, the first coding standards I wrote for Woven were for C++, for C++14 and C++17. And I’m just actually now completing a C standard around the C17 version of the standard. And these are coding standards which integrate safety-related coding standards such as MISRA and AUTOSAR and also integrate the CERT standards, which thankfully Toyota had already adopted before I began working there. So I didn’t have to sort of do this immodest thing of promoting my own creations. So that was nice.
Gavin Henry 00:43:24 And these standards, are they coding standards that the developers follow by reading them, or are they helped by the compilers or the IDEs, or how are they enforced?
Robert Seacord 00:43:34 So for these MISRA-based standards and AUTOSAR-based standards, we have something called a Guideline Enforcement Plan which goes through each of the rules and talks about how it’s enforced, and normally it’s enforced by some sort of static analysis tool. And so examples of those include CodeQL or Parasoft C/C++test or Helix QAC, I think it’s called. Those are some examples. LDRA also has a conformance analysis tool. And so we go through each rule and we point at which checker can check conformance with that rule. And then in some cases some of the rules which aren’t automatable are enforced through code reviews and other quality assurance processes.
Gavin Henry 00:44:22 So these are done not at compile time but through a separate tool, the static analysis, is that correct?
Robert Seacord 00:44:28 Right. Yeah, so, static analysis is run separately, typically after you can successfully compile the code. And the reason for that is that some of these analyses can take quite a bit of time and the compilers are really focused on sort of quick turnover, right? Because people have a lot of edit, compile, test cycles, right? And they don’t want to wait very long for their compilations to complete. So yeah, it is typical to sort of break those out into separate tools.
Gavin Henry 00:45:01 And this would be the tradeoff that for example, Rust made where they try and do as much as they can upfront in the compile bit versus, but that’s not safety though. It’s not safety related.
Robert Seacord 00:45:12 Yeah, I’d say that’s true. I mean Rust tries to prevent you from getting any sort of incorrect code to compile, and the C ecosystem is not necessarily less safe, but it requires that you have a little bit more discipline. Say your code compiles and it has a bunch of warnings, right? It still generates an executable, and if you’re a really bad programmer, you might decide to deploy that right onto a system. That’s poor practice, right? So first, compiler warnings are important; you should address all the warnings first. And then you want to do additional analysis, both static analysis and dynamic analysis, and testing, to make sure that you’ve eliminated other categories of errors that you don’t want to deploy to your system.
Gavin Henry 00:46:08 Yeah, exactly. Just a question that’s popped out of that last conversation. Have you seen, or is there, or do you ever envisage a way that you could plug in these standards at compile time to C, or for example Rust, so that instead of just enforcing what the language does you can enforce other rules that are safety specific? Or will there always be static analysis, because that ecosystem is very mature?
Robert Seacord 00:46:34 I mean there’s probably no inherent reason why you can’t do it. You could quite possibly. Clang has the Clang analyzer, which is a static analysis tool, right? I could envision Clang introducing a compiler flag that says also invoke the analyzer. It’s not really necessary, right? I mean it makes more sense to my mind, right, that you do the compilation and at this point you’re trying to fix warnings, you’re trying to get kind of obvious syntax errors, this kind of thing. So you don’t want to spend the time waiting for a complete analysis to finish. You just want to kind of fix these problems quickly. And then once you get to the point where it’s free from warnings and it’s sort of passing some unit tests, maybe before the unit tests, I don’t know, then you can do the static analysis and you can look for additional harder-to-find problems.
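A short sketch may help picture the discipline described above. The functions `percent_used` and `report` are hypothetical. The code is shown in its cleaned-up form, and the comments point at drafts that would still have produced an executable: two that GCC or Clang commonly warn about when built with options such as `-Wall -Wextra`, and one that compiles silently and is left to static analysis, dynamic analysis, or testing.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: percentage of a buffer in use, or -1 if the
   arguments make no sense. */
int percent_used(size_t used, size_t capacity) {
    /* A draft that wrote `if (capacity = 0)` would still compile;
       compilers commonly warn about an assignment used as a condition. */
    if (capacity == 0) {
        return -1;
    }
    /* A draft without this range check would compile without any
       warning. Finding it is a job for later analysis and testing. */
    if (used > capacity) {
        return -1;
    }
    return (int)(100.0 * (double)used / (double)capacity);
}

/* Print a one-line usage report. */
void report(size_t used, size_t capacity) {
    /* %zu matches size_t. A draft that used %d here would typically
       draw a format warning while still building. */
    printf("%zu of %zu bytes (%d%%)\n",
           used, capacity, percent_used(used, capacity));
}
```

Treating warnings as errors during development (for example with `-Werror`) is one way to make sure the first category never reaches the later stages.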
Gavin Henry 00:47:27 Yeah, it might slow down productivity because the warnings are so not related to what you’re working on, but you’re going to fix them at some point.
Robert Seacord 00:47:34 Right, right.
Gavin Henry 00:47:34 Obviously C is still and always will be a very powerful language, with a strong history and deployment base. And if there is one thing that you’d like our listeners, software engineers, to remember from the show, what would you like it to be, Robert?
Robert Seacord 00:47:49 C is a strong and flexible language. It’s sort of a sharp tool and you can get a lot done with it, but you need to be informed and have a good understanding of the language and program safely.
Gavin Henry 00:48:04 Perfect. And finally, is there anything that we missed that you think we should have mentioned?
Robert Seacord 00:48:09 Yeah, run out and get a copy of Effective C, 2nd Edition.
Gavin Henry 00:48:13 Of course. How can I forget Effective C, 2nd Edition?
Robert Seacord 00:48:17 It makes a great Christmas present.
Gavin Henry 00:48:19 I’m looking at my one on the shelf and it’s not too thick, but it’s packed with so much information.
Robert Seacord 00:48:25 Well thank you very much. Appreciate it. That’s brilliant.
Gavin Henry 00:48:27 Okay, so people can follow you on X, I suppose, now. Twitter to us old school.
Robert Seacord 00:48:33 I’m still on Twitter for the time being, and over there on Mastodon and LinkedIn, and I’m not too hard to find. People regularly shoot me emails or complain about the C language on Twitter and I’ll engage sometimes.
Gavin Henry 00:48:50 And if there’s an acronym that I’ve forgot to write down or put in the show notes and they want to reach out. Any of those in particular that you’re more fond of or hang around more or doesn’t matter?
Robert Seacord 00:49:01 I sort of look at them all. So however you’d like best, whatever system you’re on, whatever social media you’re on that you’d like to contact me, that’s fine.
Gavin Henry 00:49:09 Okay, Robert, thank you for coming on the show. It’s been a real pleasure. And this is Gavin Henry for Software Engineering Radio. Thank you for listening.
[End of Audio]