SE Radio 644: Tim McNamara on Error Handling in Rust

Tim McNamara, a well-known Rust educator, author of Rust in Action (Manning), and a recipient of a Rust Foundation Fellowship in 2023, speaks with SE Radio host Gavin Henry about error handling in Rust. They discuss the errors that Rust prevents, what an error is in Rust, what Tim classes as the “four levels of error handling,” and the lifecycle of your journey reaching for them. McNamara explains why Rust handles errors as it does, how it differs from other languages, and what the developer experience is like in dealing with Rust errors. He advocates best practices for error handling, what Result<T> is, the power of Rust Enums, what the question mark operator is, when to unwrap, what Box<dyn std::error::Error> really means, how to deal with errors across the FFI boundary, and the various Rust error-handling crates that you can use to give you more control. Brought to you by IEEE Computer Society and IEEE Software magazine.

Transcript

Transcript brought to you by IEEE Software magazine and IEEE Computer Society. This transcript was automatically generated. To suggest improvements in the text, please contact [email protected] and include the episode number.

Gavin Henry 00:00:18 Welcome to Software Engineering Radio. I’m your host Gavin Henry. And today my guest is Tim McNamara. Tim is a well-known Rust educator. He wrote the book Rust in Action and received a Rust Foundation Fellowship in 2023 in recognition of his efforts to promote the language. He also runs Accelerant.dev, a consultancy focused on the Rust programming language. Tim hosts a YouTube channel and a podcast and is active on most social media platforms as Timclicks. Tim, welcome back to Software Engineering Radio. We last spoke in 2021. Is there anything I missed in your bio that you’d like to add?

Tim McNamara 00:00:58 No, no. Thank you very much for the introduction.

Gavin Henry 00:01:01 No problem at all. Okay, so today the show is about error handling in Rust. I really enjoyed your talk for Rust Nation UK, which was about Rust errors, so I thought I’d like to get you on the show again because it really helped me understand them. I think it’ll be great for our listeners to not only explore errors in software and bugs in general, but also how Rust helps us avoid them and gives us the flexibility to do what we need to do.

Tim McNamara 00:01:33 Well look, it’s a pleasure.

Gavin Henry 00:01:35 So I want to lay the foundation before we move into what you called the four levels of errors in Rust in that talk you did for Rust Nation UK. So let’s start with a bit of a high-level one. What is an error?

Tim McNamara 00:01:52 This is a deceptively difficult question to answer. That’s probably why you’re starting with it, because there are multiple different definitions, right? Let’s just say that we think of a simple program as starting, going through a series of instructions, and then terminating. We could define an error as essentially this: at one of those steps, the program receives some input that it does not understand or does not expect, or there is some precondition that isn’t satisfied, and then the program cannot continue to the next step. Essentially you can think of an error as a fork in the control flow, and error handling as a way to split the program apart. Down the happy path, let’s say, or the main path, is the control flow that you intended, and then there are all of these error paths, for which we could also use the term exceptional paths.

Tim McNamara 00:03:05 Those, as the name “exceptions” kind of hints, are smaller, less likely cases which we have thought about beforehand as programmers, essentially ways of dealing with problems that we expect we might encounter. For example, I talked about input, and I meant that in a very general sense, but if you are a web framework, you expect that people will send you all sorts of nonsense from the network. And so your web framework should deal with input that does not satisfy, let’s say, HTTP, and the definition of what is valid input and what is invalid input slightly changes at each step. But the other side of what an error might be is something that is completely unexpected, or the program might encounter a state in which it’s impossible to continue down any path. The typical response of most programs at that stage will be to just crash. The operating system will just close everything down, close all of its files, shut down all of its network sockets, free all the RAM, and essentially ask the user to restart the program. And the way that I think of errors is to think of them as forks in the road away from what we intend or expect.

Gavin Henry 00:04:36 Perfect. And with a program in general, or Rust in particular, assuming that an input is a certain thing could also be covered by what you’re saying, where the inputs are wrong, as in dynamically typed languages, where you’re assuming it’s something but it’s not.

Tim McNamara 00:04:56 Right. We have more inputs than just the data itself, in that every block or every statement or every expression essentially has some extra material, like the data type that it expects. And if you pass a string into a function which is really trying to perform mathematical operations, you’re likely to have a very bad time. Perhaps not, though, because there are some languages which have essentially said that their approach is to just accept that and do the best that they can irrespective of the input. And so my understanding of Perl, for example, is that if you pass a string to a function that accepts something numeric, and it uses ASCII literals, so essentially you have numerals in the string or at the start of the string, then Perl will just accept that as a number itself. And so there are whole philosophies of software, and what is acceptable or permissive sits on a large spectrum.

Tim McNamara 00:05:59 And that word dynamic is really important because Rust is a very static programming language. It’s very fussy about data types. It’s easy to laugh at Perl and say I can’t believe that you treat a string as a number. But Rust on the other hand is almost the opposite. It’s embarrassingly fussy: it distinguishes between the specific bit widths of different integer types. And so a 32-bit integer and a 64-bit integer are just as different as, let’s say, a string and an int in Python or anything else. In fact, it’s worse than that. In Rust, you could have a data type which is defined as being 64 bits wide. So it has exactly the same representation as the native bit width of the CPU, since most programs are running on 64-bit CPUs. So there’s a data type which refers to an integer that is exactly 64 bits wide on every CPU architecture.

Tim McNamara 00:07:05 And there’s a data type which assumes the native bit width of the CPU that you are being compiled for. Sorry, that’s a very long couple of sentences to say that you could have two values that both hold, let’s say, the number 42, one encoded as an unsigned integer with 64 bits and the other as this pointer-width data type, which we call usize. Every single bit is identical. All of the widths are identical, and yet the Rust compiler will refuse to allow them to cooperate. You couldn’t add one to the other. There is no implicit integer casting or integer promotion in the Rust language. And so Rust is very strict, and it makes things like error handling very frustrating for beginners because it’s so pedantic.
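
The point about `u64` and `usize` can be sketched in a few lines. This is only an illustration; the helper name `add_mixed` is hypothetical and does not come from the episode. It shows that the conversion between the two integer types has to be written out by hand.

```rust
/// Hypothetical helper: adds a fixed-width u64 to a pointer-width usize.
fn add_mixed(fixed: u64, pointer_width: usize) -> u64 {
    // Writing `fixed + pointer_width` here is rejected by the compiler,
    // even on a 64-bit target where both values have identical bits.
    // The conversion must be spelled out explicitly.
    fixed + pointer_width as u64
}

fn main() {
    let a: u64 = 42;
    let b: usize = 42;
    println!("{}", add_mixed(a, b));
}
```

An `as` cast is used here for brevity; `u64::try_from` is the stricter alternative when a conversion could lose information.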

Gavin Henry 00:08:07 And why does Rust handle errors the way it does?

Tim McNamara 00:08:11 Because essentially the language itself, or the language ecosystem, has said that we push as much work as we can to the compiler: if we can avoid problems, then we should. Take a case where, for example, I might compile my program on a 32-bit CPU, and so now the bit representations of my 64-bit type and my pointer-width type are different. This might imply some different semantics. And so now the inputs to the program actually include the CPU and the operating system that the program has been compiled for. Rust is attempting to reduce the possibility of error by enforcing as much as it can at compile time. Once we remove the possibility of errors from our program, and that unfortunately involves being fussy sometimes, then we have programs which run exceptionally fast, as fast as they could if written in native assembly or, let’s say, C code, without the possibility that it could explode in your face, essentially. We don’t need error checking in so many places because there are guarantees which are provided by the language itself and are checked before the program is even run.

Gavin Henry 00:09:40 So I presume the Rust creators did it that way because they’ll have had experience with C++ and other languages where at the first pass you’ll have written your program to do what you’re trying to do, but then on the second pass you might be thinking, oh hold on, if this is running on a different architecture, then we need to cater for what you’ve just explained, because things might be different. So we need to do some checks for that and we need to do some checks for this. And the Rust creators pushed that all to compile time because they’d just been caught out so many times. Do you have any insights as to the history there?

Tim McNamara 00:10:17 So I know a little bit about the history, but I am probably second or third hand, but let’s go. The Rust programming language, in its very early stages, far before 1.0, was essentially an experiment in running programs concurrently and safely. There was an actor model, so it looked much more like an Erlang or Go than what emerged after the project was adopted by Mozilla for its experimental replacement for the Firefox web browser engine. It became, and “infected” is a terrible word, but all of a sudden a large number of C++ developers swooped in on the project and sort of said, look, if you really care about this thing, it needs to go super quick and we need to prevent a whole bunch of bugs that C++ does not. And one class of bugs that C++ allows because of its C heritage is subtle problems related to what is technically called integer promotion, or essentially implicit type casting.

Tim McNamara 00:11:27 And if you allow that in the language, then you will encounter situations at runtime when the assumptions that you have as the programmer and the assumptions that the runtime has are going to differ. And so you’ll encounter bugs. The Rust designers at that point made a decision, after essentially using the language to build this new browser engine, that we should be quite strict on people, because the language itself will find a home in the ecosystem of programming languages as being as fast as native code, but completely memory safe. And part of being memory safe, or part of the philosophy of the language of providing speed and safety, is that we should provide a very low-overhead mechanism to handle errors. So in some sense, the Rust approach is a reaction against the C++ model by a large number of programmers who essentially became exhausted with trying to find concurrency bugs, trying to find bugs with exceptions just popping out of nowhere, or other irksome things that they had encountered along the way.

Gavin Henry 00:12:57 Okay, so thanks for that, Tim. So now that we have had Rust in the community for some years, well quite a few years, at least 10, and people have experience with the Rust approach, would the creators say it worked out, or have there been changes in the way things are done since they started?

Tim McNamara 00:13:15 Yeah, I think that there has been a general trend in the industry towards, I would say, a functional style. The other thing which I haven’t mentioned is that Rust incorporates quite a few concepts from functional programming languages. And what we’ll see shortly is that Rust uses a very functional style, actually from the ML family of languages that include Haskell and OCaml, for dealing with its error types or the error process itself. And I generally think that a lot of ideas from functional programming languages have been adopted or incorporated into more traditional programming languages, let’s say a Java or a C# or C++, these languages which have typically been associated with either imperative or object-oriented styles used heavily in quote unquote enterprise software.

Gavin Henry 00:14:17 Yeah, I spent a lot of time the last year with Elixir and then as I started getting into Rust more this year, it felt pretty similar with a lot of stuff. And I quite like that when you’re talking about functional side of things.

Tim McNamara 00:14:32 Yeah, Elixir is a fun programming language. I have quite a few, I’d say months verging onto years, of mothballed Erlang and Elixir experience.

Gavin Henry 00:14:42 That’s a good lead into my next question. So as polyglot, multi-language programmers, which I would say we both are, we have opinions about dealing with errors in languages other than Rust, be they good or bad. If we were to step back and push aside that we love Rust, how would we classify the developer experience for handling errors in Rust? Not knowing the whys that we already know.

Tim McNamara 00:15:12 Yeah, I think that the developer experience hurts in two places. It’s uncomfortable when you start. It’s different, and it is slightly unclear how to structure your programs because you don’t get the same thing as exceptions. You’re in this unfamiliar territory where it’s difficult to think about the way that your program is structured and the way that you want to implement something that would be easy in Python, let’s say, or easy in whatever programming language. You come into Rust and suddenly you can’t do that. And specifically I’m talking about handling exceptions. The other thing that’s quite challenging is that once you have written quite a lot of Rust and you are starting to write your own libraries, you need to start to learn how to compose different error types. Now this sounds slightly absurd to people that are unfamiliar with what I’m talking about, but the very end, probably the last 15 or 20 minutes of what we’re going to be chatting about, is this fourth level of error handling, where essentially a large number of third-party crates, or open-source libraries, have entered the ecosystem of Rust programmers to plaster over it.

Tim McNamara 00:16:35 It’s more than just gentle sandpapering, it’s essentially rethinking how errors work. And so there have been several attempts to create an ergonomic layer of error handling over the top. And the things that we started talking about beforehand, about Rust being very pedantic, actually become exacerbated when you have upstream errors that come from different places. Just very generally, there is a different category, we might use the word class except Rust doesn’t have classes, but let’s call it a class of errors that are generated from I/O, input/output, such as when we don’t have permissions to read the file that we’re trying to read from, or the file may not exist. That’s a different class of error, sourced from a different module, than an error that is related to parsing numbers and strings or formatting output, essentially converting data to a string. And so once you start trying to knit all of these different errors together, that’s a place where Rust feels very clumsy, it’s very bureaucratic.

Gavin Henry 00:17:59 My next question, are there any best practices for error handling?

Tim McNamara 00:18:05 I think the best practice is to work incrementally and expand your knowledge: rather than jumping to an ideal state, stick within your mental model and then expand it as you learn more about the language. For example, take Rust’s trait system, where a trait is sort of like an interface or an abstract base class. What it means to say something is an error, that is, that it implements the Error trait, is actually subtly different than what you might expect. And so if you try to go directly to an idealized best practice without developing a mental model, you’ll probably end up becoming lost and frustrated. Hopefully that isn’t me just kicking the question away. Another way to answer your question much more quickly is to say that the four levels that we’re touching upon are a progression and that eventually you’ll end up at the fourth level.

Gavin Henry 00:19:19 Yeah, that’s how I understood it. That’s perfect. Okay, so briefly, Tim, what are the four levels in relation to the best practices that we’ve just spoken about?

Tim McNamara 00:19:30 Yeah, I mean if we were to think about this as progressing towards some idealized state, which I said is the nice way to do it, we could start essentially by ignoring errors and just crashing the program. From there we move on to using essentially untyped information and providing a very, very basic something that feels like try-catch with a very generic catchall error. And then we can add some specificity. And the Rust way that we do this is typically through creating enums, and enums in Rust are a little bit special. They are richer than named constants. And then the last level is to essentially, as I said, plaster over some of the rough surfaces that the language exposes. The problem with using an enum at level three is that we need to implement every trait or every interface for every combination of error that we might encounter. We also need to teach Rust how to print the thing to the screen. And that’s quite clumsy. And so the last level is essentially an extension of the third with simplified ergonomics.

Gavin Henry 00:21:02 Yeah, I think I made a mistake personally and went straight for “I should be using this library” without fully understanding what I was doing with it. The only thing I know is which crate to use if you’re doing a library and which crate to use if you’re doing a binary for users. But we’ll get to that. So let’s start with ignoring Result<T>. Is that a Result type, a type of Result?

Tim McNamara 00:21:26 Yeah, it’s tricky, right? So in source code, we’ve got capital Result and then angle bracket, capital T, close bracket, and that T represents the type that the Result contains when some operation is successful. But it’s annoying, or frustrating, because the initial T feels like it should stand for “type,” and I mean it is a type, but it isn’t directly clear what that actually means. So yeah, it’s the language, or at least it’s the convention. So blame the standard library authors for that.
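
As a rough sketch of what that T stands for: the standard library's `Result` actually takes two type parameters, written `Result<T, E>`, where T is the success type and E is the error type. The function name `describe` below is hypothetical and only serves to show both sides being handled.

```rust
use std::num::ParseIntError;

/// Hypothetical helper: reports which side of the Result came back.
fn describe(input: &str) -> String {
    // For `str::parse::<i32>`, T is i32 and E is ParseIntError.
    let parsed: Result<i32, ParseIntError> = input.parse::<i32>();
    match parsed {
        Ok(number) => format!("ok: {}", number),
        Err(error) => format!("error: {}", error),
    }
}

fn main() {
    println!("{}", describe("42"));
    println!("{}", describe("forty-two"));
}
```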

Gavin Henry 00:22:11 So what do you mean by ignoring this? I mean what would it, if you’re using it, what would it look like and what are you ignoring here?

Tim McNamara 00:22:18 A Result is kind of a meta type, where it’s a type that has two sides. One is an Ok side, which contains this T that we’ve been talking about, and the other is an error side. We could think of a Result as just a wrapper type. It essentially is a container of something, and we hope it’s the T, but it might be the E. And so we call a method on our Result object, and I’m using the term object just to mean value: we call unwrap. And this essentially opens the Result up and plucks out what we hope is the T. We need to remember that Rust’s type system is extremely pedantic, and so we say we expect that the value inside the Result will be T, and then I can assign some value of type T to a variable. And if I’m wrong in that assumption, then I crash.

Tim McNamara 00:23:31 That is one way to do it. Another is that there are some methods which take in a reference to something else. Let me have an example. Let’s say I’m writing to a file. No, we’ll go the other way round. We’re going to read from a file. So I start by initializing a buffer, and I have a buffer variable and a file variable. When I call read on file, I pass in a reference to the buffer and then the method will return a Result. If the Result is Ok, it’s likely to tell me the number of bytes that it was actually able to read. And the error side will have some specific error message about the category of error. So for example, if I didn’t have read permissions on the file, the buffer will remain empty. One way to handle the error is to just go ahead and hope that your buffer has been filled up, and essentially silence the compiler by assigning the Result that is returned from the read call to a variable which is never read again. In Rust, errors are values; they do not interfere with the rest of the program.

Tim McNamara 00:25:11 They are not like exceptions that will interrupt and break the flow. And so one way to deal with errors is assign them to variables which are never used again. And we can be very explicit with the Rust compiler, which will be very keen to give us a warning to say that it’s an unused variable by prefixing the variable name with an underscore to say that we essentially are discarding this value.

Gavin Henry 00:25:42 Okay, I’ll summarize that, if I understood it correctly. So a Result is something that is returned by a function that either we write or we use from the standard library or another library, so it’s not just the result of the function. The Result is a thing, like you said, think of it like an object, and inside it could be the happy path, the Ok, or the other path, which is an error. We’re expected to act on that Result because that’s why it’s been designed the way it is. But if we choose the simple way to handle errors, we just ignore it by assigning it to a variable that we don’t use, and we prefix that with an underscore so the compiler shuts up.

Tim McNamara 00:26:30 That’s right. Yeah. The difference between those two approaches, where one was to call unwrap on the Result and the other was to use a variable which we never read from again, is that if you call unwrap and you are wrong and there really was an error, your program will crash. Whereas if you assign a Result object which is an error to a variable which is never used, then your program will proceed as if there was nothing wrong at all. So they are subtly different approaches.
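
The two level-one approaches can be sketched side by side. This is a minimal illustration under stated assumptions: `DeniedReader` is a hypothetical stand-in for a file we lack permission to read, and the function names are invented for the example. Only `std::io::Read` and `std::io::Error` are real standard library items.

```rust
use std::io::{self, Read};

/// Hypothetical reader that always fails, standing in for an unreadable file.
struct DeniedReader;

impl Read for DeniedReader {
    fn read(&mut self, _buf: &mut [u8]) -> io::Result<usize> {
        Err(io::Error::new(io::ErrorKind::PermissionDenied, "permission denied"))
    }
}

/// Discard the Result: the program carries on with whatever is in the buffer.
fn read_and_ignore(mut source: impl Read) -> [u8; 4] {
    let mut buffer = [0u8; 4];
    // The underscore prefix tells the compiler the value is deliberately unused.
    let _ignored = source.read(&mut buffer);
    buffer
}

/// Unwrap the Result: returns the byte count, or panics if the read failed.
fn read_and_unwrap(mut source: impl Read) -> usize {
    let mut buffer = [0u8; 4];
    source.read(&mut buffer).unwrap()
}

fn main() {
    // A byte slice implements Read, so it can play the part of a readable file.
    let data: &[u8] = b"rust";
    println!("bytes read: {}", read_and_unwrap(data));
    // The failed read goes unnoticed and the buffer stays zeroed.
    println!("buffer: {:?}", read_and_ignore(DeniedReader));
}
```

Calling `read_and_unwrap(DeniedReader)` instead would panic, which is the crash described above.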

Gavin Henry 00:27:12 And we would potentially reach for this technique if we’re just prototyping something or arranging something quickly for ourselves that we don’t need to be bombproof.

Tim McNamara 00:27:23 Precisely. We would use an approach like that in, I guess, two situations. One, you’re just writing something that’s 12 lines long, and the other would be that there genuinely is a situation where you don’t care if an error occurred because either side of the Result is fine, let’s say. I don’t know when that would be the case, but it’s available to you. In practice, this is essentially just us getting started with the idea that a Result could be either this happy path or this exceptional case, the error side.

Gavin Henry 00:28:20 Okay, so that’s level one. Level two would be what we’re calling or what you’ve called returning strings as errors. What does that look like?

Tim McNamara 00:28:31 Okay, so let’s say we decide to define a Result and we are not exactly sure what the caller, or the people that will be accepting a Result as the return value, would want, because we are still designing our API. One thing that we can do is to just push a string in there, and then if we want to quote unquote raise an exception, all we need to do is put any text we want that’s appropriate for the moment inside the error variant of Result. Then what callers will need to do is interpret the text on the calling side. So for example, let’s talk about that case of reading from a file. You could imagine a case in which the read method on file returns a Result which contains an error that is just a string. And then your question is, well, what was wrong? And so you go and look in the string and do your string operations. Does it start with “permission denied”? Well okay, if it contains that substring then we assume that the error from the operating system was some sort of permission error.
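
A small sketch of this string-based level two follows. The function `load_config`, its fake permission rule, and the exact message text are all hypothetical, invented only to show a caller inspecting error text.

```rust
/// Hypothetical loader whose error side is plain text.
fn load_config(path: &str) -> Result<String, String> {
    // Stand-in rule for illustration: anything under /root/ is off limits.
    if path.starts_with("/root/") {
        return Err(format!("permission denied: {}", path));
    }
    if path.is_empty() {
        return Err(String::from("file not found"));
    }
    Ok(format!("contents of {}", path))
}

/// The caller has to interpret the text to learn what went wrong.
fn is_permission_error(result: &Result<String, String>) -> bool {
    match result {
        Err(message) => message.starts_with("permission denied"),
        Ok(_) => false,
    }
}

fn main() {
    let outcome = load_config("/root/secret.toml");
    println!("permission problem? {}", is_permission_error(&outcome));
}
```

A typo in the prefix being matched would still compile, so the check is only as reliable as the spelling on both sides.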

Gavin Henry 00:30:12 Perfect. So if we’re looking into the string as an error, we would be returning that type of error to a user. So it’s probably like a command line thing or something we would be writing.

Tim McNamara 00:30:25 Oh, so user in that sense is subtly different. This is the application code that is importing your library. So the user is not the end user but some other programmer, or sorry, some other code, which is probably code that you’ve written. So if you are defining a Result type, which is quite easy to do, then it works in conjunction with methods on types that someone later on is going to be calling. And so the Result that you are defining now will be interpreted by some other code that you probably don’t have control over, especially if you’re writing open-source code or you’re releasing the library to the world.

Gavin Henry 00:31:20 Given your example there, where you’re interpreting the string for the error, trying to understand it, you’d be at a point where you’re trying to look for something else. Say, oh yeah, I understand it’s permission denied, but what’s the actual error number that I could look up or understand? Because you’ve got different levels for out of memory, or can’t write to the file system, or wrong user group, that type of stuff.

Tim McNamara 00:31:48 Right, and instead of returning a string, and it’s the same concept, we could return an integer. In fact, because we’re talking about file I/O, we could return the libc error codes themselves that are defined by the operating system. Now the interesting thing there with what you’ve just said is that you kind of need to look up the docs to figure out what the specific number means. Essentially the code is not self-documenting, or another way of looking at that is that it’s open to a different class of errors. If you parsed an error message as a string or as an integer but misread something, or you had a typo, then suddenly you’ve made a decision based on a misinterpretation of the code that is being provided upstream. You open the door for another error, which is the error of handling your error. Now that’s a motivating factor for the next step. Is that clear?

Gavin Henry 00:33:04 Yeah, I mean so we’re at level two, returning strings as errors. So we’ve gone past the step of just ignoring things, we’re feeling a bit braver, a bit cleverer. We thought, oh yeah, let’s put some meaning into that error, and then we’ve hit the next level where, oh, actually we need to encapsulate some other type of information in that error. That then leads us to level three, which is enums I think. Is that right?

Tim McNamara 00:33:30 Yes. And the reason why enums work in Rust is because enums have two properties. One is that they are these named constants that you’ve known from every language where, instead of having a raw numeric code, we could give it a name and we could group all of those error codes into some category or, let’s not use the word, but maybe a class. Now that’s one type of enum, and Rust provides some extra richness to this by actually allowing you to pack data into each of these variants. If you have ever used C, it has the idea of a tagged union, whereby one byte is used as a numeric tag and then there’s a sort of a struct, or you can push whatever you want into the remaining data. It could be essentially any type inside the same sort of packet.

Tim McNamara 00:34:40 Now in Rust you also have one extra very nice property of enums. Say the compiler knows that there are nine different error types in the enum that we’re defining and you only handle six in your code. It will refuse to compile until you have added handling for the other three. And allow me to say that again. Rust will require that you handle all possible cases and will enforce this at compile time. And so your programs that use, let’s say, an I/O error type will require that you have thought about essentially every class of error that could occur. And this is in some sense much more powerful than just passing around an integer, because there the compiler doesn’t give you a lot of support. The C compiler doesn’t really care if you’ve dealt with the integer correctly. Whereas in Rust we can actually ensure that you have handled the specific error variant in the way that potentially was intended, because you have no need to parse any strings. The compiler can guarantee that you have at least thought about the cases that you’re likely to encounter. You can have a catchall pattern match which essentially says I don’t care. But you’ve opted into that behavior, which is a stronger position than the compiler assuming that you don’t mind.
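
The two enum properties described here, variants that carry data and exhaustive matching, can be sketched as follows. The `ReadError` type, its variants, and the `advice` function are hypothetical names chosen for illustration.

```rust
/// Hypothetical error type: each variant can carry its own data.
#[derive(Debug)]
enum ReadError {
    PermissionDenied,
    NotFound { path: String },
    OutOfMemory { requested_bytes: usize },
}

/// Every variant must be covered. Deleting any arm below makes the
/// program fail to compile with a non-exhaustive patterns error.
fn advice(error: &ReadError) -> String {
    match error {
        ReadError::PermissionDenied => String::from("check the file permissions"),
        ReadError::NotFound { path } => format!("no such file: {}", path),
        ReadError::OutOfMemory { requested_bytes } => {
            format!("could not allocate {} bytes", requested_bytes)
        }
    }
}

fn main() {
    let errors = [
        ReadError::PermissionDenied,
        ReadError::NotFound { path: String::from("data.csv") },
        ReadError::OutOfMemory { requested_bytes: 4096 },
    ];
    for error in &errors {
        println!("{:?} -> {}", error, advice(error));
    }
}
```

A catchall arm written as `_ => ...` would also satisfy the compiler, but it has to be typed out, which is the explicit opt-in mentioned above.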

Gavin Henry 00:36:30 If we just go back to returning strings as errors, when we’re returning something from a function, we have to act on that or assign it to a variable and ignore it. So if we’re trying to pull it apart, like you’ve mentioned, where you return a Result and then you pattern match to act on it and check that things are as you expected, we can get in a mess if we’re trying to match on strings, is how I would understand that. But with enums we can define what we want in that enum to mean what we want, and then Rust will complain if we don’t handle it when that enum is returned from the function we’ve called.

Tim McNamara 00:37:14 Yeah, exactly correct. Let’s say that we keep talking about the same scenario. Again, we’re reading from a file. We might have permission denied, we might run out of memory. There’s a lot of different things that could go wrong. The file could have been deleted. We don’t want to have to interpret anything when the operating system itself is giving us the information, and what the compiler does with the type system is lift you out of the problem of needing to interpret what it is that the operating system has intended. It’s completely unambiguous, and it’s a very liberating feeling. Instead of matching on strings, does it start with this or that, as we often code in, let’s say, TypeScript or JavaScript or Python or anything else, where you’ll just compare against a string, is it equal to “permission denied” or what have you, Rust can provide something that is significantly more robust.

Gavin Henry 00:38:50 Yeah, so I was thinking in my head, going back to your C example, normally you’d do a match on whether it’s error code one or two or nine, or if it’s an even one, it’s okay, if it’s odd, it’s this, if it’s negative, it’s bad. So you have to think about all that and capture all that. Whereas that’s forced on you with Rust, if you don’t just do an unwrap, i.e. level one, and ignore the Result. If you’re doing a match on that because you want to be robust, you pull apart that enum and act on every variant of it, or decide, like you said, to opt in to ignoring three of the nine possibilities you’ve created. Is that correct?

Tim McNamara 00:39:35 Yeah, that’s right. There is another thing that you’ve snuck in there, which I am quite keen to chat about. In many languages, and I was a Go developer for about a year and a half at one stage, people used to return negative numbers to indicate that there was an error. And the other thing that they might do, especially in Go, is use the zero value to indicate that there is a missing, essentially an uninitialized, value. So for example, the empty string is essentially the same thing as saying no string at all. And the Rust way to do that is essentially to wrap the integer that you are dealing with in another type and push a lot of work to the compiler and into the type system. Because if you forget to check that negative numbers mean errors, then you have a real problem later on in your program: the thing is numerically valid, and your problems will only emerge when some crash occurs because something was not expecting a negative number, and now you have to trace back, well, where on earth did this negative number appear?

Tim McNamara 00:41:06 Like, oh well, by the way, I actually should have dealt with that error when it was first created. And Rust is not permissive in that way, and I think this is a huge benefit to robustness in software.
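
To make the contrast concrete, here is a minimal sketch of a sentinel-style function next to the wrapped-value style described above. All three function names are made up for illustration.

```rust
/// Sentinel style: -1 means "not found". The return value is still a
/// perfectly valid integer, so nothing forces the caller to check it.
fn find_sentinel(haystack: &[i32], needle: i32) -> i64 {
    for (i, &item) in haystack.iter().enumerate() {
        if item == needle {
            return i as i64;
        }
    }
    -1
}

/// Wrapped style: "not found" is a separate variant (None), so the caller
/// cannot use the index without first dealing with the missing case.
fn find(haystack: &[i32], needle: i32) -> Option<usize> {
    haystack.iter().position(|&item| item == needle)
}

/// The same idea for failures: a parse error is carried in the type
/// instead of being hidden behind a zero value.
fn parse_port(text: &str) -> Result<u16, std::num::ParseIntError> {
    text.trim().parse::<u16>()
}
```

With `find_sentinel`, a forgotten check only shows up later, when the `-1` is used as an index. With `find`, the compiler rejects code that treats the `Option<usize>` as a plain number.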

Gavin Henry 00:41:24 When do you reach for the enum method, i.e. level three, versus a string?

Tim McNamara 00:41:30 Yeah, I think if you’re a proficient Rust programmer, very quickly. If you are just learning, or if you’re still designing your library, you want to give yourself enough flexibility to be able to change things quite quickly, and so you are spending more time generating errors than handling them. And so the string method, this loosely typed or stringly typed kind of error type, is fine, but I would generally recommend that most programs that expect to live more than two hours should have a custom error type defined for them.

Gavin Henry 00:42:20 I like it. Okay, so I’m going to move us on to the fourth level. I think we’ve covered what an enum is well, and how they help, because they force us to deal with things. Which can get in the way, but that’s a good thing when we get to that level. Is there anything else you wanted to cover off in enums? Because they are pretty core to everything before we move on to the library section?

Tim McNamara 00:42:45 Yeah, maybe a motivation of why you would go further. One of the things that you could do with this is use the upstream error as one of the variants in your own error type. So essentially you’re creating a super error that allows you to compose other error types together. And this has two problems. One, it’s actually a lot of work to define; it’s kind of cumbersome in the source code. And then you have a lot of fairly mechanical work to provide a representation of the error type as a string. One of the things that we haven’t talked about is that the Error trait in Rust, that is, the error interface, is primarily concerned with the ability to serialize the data type itself. That is, all errors should be able to be printed to a log file, which is one of the reasons why a string is able to be used as an error type. And in the case of an enum, a new type that you define, a custom user-defined data type, has nothing implemented at all. When you create a type, essentially when you define it, you also need to define every single interface that it implements, every single trait that the type actually works with, and that includes how to print the thing to the screen and to serialize it as a string.
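
The mechanical work described above can be sketched by hand using only the standard library. The `ConfigError` type and the `parse_limit` and `load_limit` functions are invented for this example; the traits (`Display`, `Error`, `From`) are the real standard-library ones.

```rust
use std::error::Error;
use std::fmt;
use std::fs;
use std::io;
use std::num::ParseIntError;

/// A "super error" that composes two upstream error types as variants.
#[derive(Debug)]
enum ConfigError {
    Io(io::Error),
    BadNumber(ParseIntError),
}

// How to print the error: this has to be written out for every variant.
impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::Io(e) => write!(f, "could not read config: {}", e),
            ConfigError::BadNumber(e) => write!(f, "config is not a number: {}", e),
        }
    }
}

// The Error trait itself; source() exposes the wrapped upstream error.
impl Error for ConfigError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            ConfigError::Io(e) => Some(e),
            ConfigError::BadNumber(e) => Some(e),
        }
    }
}

// From impls let the `?` operator convert upstream errors automatically.
impl From<io::Error> for ConfigError {
    fn from(e: io::Error) -> Self {
        ConfigError::Io(e)
    }
}

impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::BadNumber(e)
    }
}

fn parse_limit(text: &str) -> Result<u32, ConfigError> {
    Ok(text.trim().parse::<u32>()?)
}

fn load_limit(path: &str) -> Result<u32, ConfigError> {
    let text = fs::read_to_string(path)?; // io::Error becomes ConfigError::Io
    parse_limit(&text)
}
```

Every new variant means touching the enum, the `Display` impl, the `source` method, and usually a `From` impl, which is the repetitive part that derive macros are meant to remove.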

Gavin Henry 00:44:37 Thank you. So you’re now elite, you know what you’re doing with all the errors, you’ve hit level four and then you realize, oh, I don’t need to do this anymore. I can just reach for one of these libraries or I can completely rethink how I do it because somebody else has created this slide.

Gavin Henry 00:44:56 Is that how we get to level four?

Tim McNamara 00:44:56 Yeah, that’s exactly right. So you’ve just gone and listened to nearly an hour of me saying, oh, you’d better do all this heavy work, and now I’m going to tell you that, by the way, most of the work has actually been dealt with, and the process of defining your own error type is quite simple these days with the help of macros, essentially, which do the job of telling the compiler what to do so that you don’t need to write all of that code yourself. And I hate being one of these people that does the same thing that I remember really disliking, which is essentially teaching all of the fundamentals before teaching the advanced cheat. However, I feel like in the case of something as fundamental as error handling, it really does help to understand what these libraries are doing; otherwise you mishandle them, or otherwise it becomes magical, and I am strongly opposed to magic in software.

Gavin Henry 00:46:03 Yeah, it’s the same as any computer book. You need to learn the fundamentals before you can use the abstractions, I think. I agree. So in your talk we’ve got three main ones. The first one is called thiserror, no spaces. The second one is anyhow, also one word, and then eyre is the third one. We’re getting close to the end of the show, so if we could spend a couple of minutes on each of the three and where you would use them, then I’ll move us into our wrap up.

Tim McNamara 00:46:32 Yeah, no please, that’s absolutely fine. thiserror is a library for defining an enum, which represents different variants. It has little helper macros for saying, this is the string that I want to print out if this variant is encountered, or I am wrapping an upstream error, and so the thing that I want the compiler to do is to dispatch to the upstream error, for example.

Gavin Henry 00:47:09 And would you be able to change the upstream error to something that means something?

Tim McNamara 00:47:14 Yeah. Essentially, it provides you with a couple of hooks to be able to simplify things or to make them more specific. So for example, if you look at the front page of the documentation, there’s an example which is a data store error. So we are writing a library that is a client library for some sort of database. And one of the types of errors that our enum has is a disconnect. So the data store, the underlying network socket, has shut down. In the library that we are writing, we don’t know why that shut down. And so we essentially inherit from the standard library’s own error type. Inherit, though, is a very muddled word in that case. But essentially we wrap the standard library’s own io::Error and then we provide our own context, which is “data store disconnected”, and then essentially it prints out the secondary, or sorry, the inner error.

Gavin Henry 00:48:24 Yeah, I was thinking of something like where you’re using a third-party JSON API or something, and then, you know, when it returns a 500 it’s actually okay or something. You can trample over that error.

Tim McNamara 00:48:38 Oh yeah, no, absolutely. That would be a perfect thing that you could do. You would be given the opportunity to go and inspect whatever has been received. It’s like, oh, well, actually it’s a 500, but let’s say 500s in this instance are actually acceptable because the API does not respect the standard. So 500s actually are signals that there is an empty data set, for example.
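
A small sketch of that idea, assuming a made-up `ApiError` enum and a hypothetical `normalize` function that post-processes the result of some API call. Because the error is an enum, one specific case can be picked out and turned into a success while everything else passes through untouched.

```rust
/// Hypothetical error type for a client of a third-party JSON API.
#[derive(Debug, PartialEq)]
enum ApiError {
    Status(u16),
    Timeout,
}

/// Treat HTTP 500 from this (imagined) non-standard API as "empty data set".
/// Every other outcome, success or failure, is returned unchanged.
fn normalize(response: Result<Vec<String>, ApiError>) -> Result<Vec<String>, ApiError> {
    match response {
        Err(ApiError::Status(500)) => Ok(Vec::new()),
        other => other,
    }
}
```

The same decision would be much more fragile if the upstream error were only a string such as "server returned 500".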

Gavin Henry 00:49:07 And I understand thiserror is for you to use if you’re speaking to users of your program. Are the error messages for the end user, something like that?

Tim McNamara 00:49:19 Yes. There’s sort of a strange debate about which library should be used for application code versus libraries, which is a distinction that I actually don’t think is particularly helpful. But the latter two libraries that we’re going to talk about now would probably in most instances be more appropriate for, quote, application code. The thiserror crate is a way to define your own error type, which provides a lot of specificity, but it has the downside that your code involves a kind of bureaucratic approach where every single instance needs to be handled. And in application code sometimes you don’t care exactly why there was a problem; you just want to know that there was a problem and maybe annotate the output with some sort of context, like, well, I was trying to do this. And that’s the log line that is printed out. For those kinds of use cases, the latter two libraries that we’re going to be talking about are probably better suited.

Gavin Henry 00:50:42 Okay. And the next one would be anyhow.

Tim McNamara 00:50:45 Right, so I would like to actually chat about anyhow and eyre, as in Jane Eyre, the E-Y-R-E spelling, as essentially being two subtly different versions of the same idea, and you should use whichever flavor of that idea you prefer. They are conceptually identical; they’re just different takes on the same idea.

Gavin Henry 00:51:11 So Tim, could you just redefine that for me? You just said that anyhow and eyre rely, in Rust, on subtly different mechanisms for enum representation. What does that mean?

Tim McNamara 00:51:24 Okay, so what I meant there, and I apologize for my voice, I’m just feeling a little bit sick. The difference is that those two rely on something known as a trait object, which essentially means something that implements an interface. In our case it implements the error interface, or in Rust terminology it implements the Error trait, and the Error trait simply means that the thing can be represented as a string and can then be printed out to the console or logged to a file. But if you receive a trait object, say the error might have been generated somewhere in the call stack, it arrives at your function and you’re trying to handle it, the only thing that you can do is essentially go and read the message.

Tim McNamara 00:52:42 The actual concrete type that generated the error has been erased and you are only left with access to the interface. Whereas thiserror relies on a concrete type itself, some new type that you’ve created, which can wrap anything else that you’ve got. And this new type is represented as something which you can go and inspect quite thoroughly. If you have ever written a C program, you might have encountered the term tagged union, whereby you have some numeric tag and then there is a data field, or sort of a struct field, right next to each other. That is how enums work inside Rust, and you have access to go and check which tag the error has, specifically. So thiserror provides a significantly richer experience, so you have much more control.
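
The difference between the two representations can be sketched with the standard library alone. `StoreError` and the four functions below are hypothetical; `Box<dyn Error>` stands in for the type-erased style, and the hand-written enum stands in for the concrete style.

```rust
use std::error::Error;
use std::fmt;

/// Concrete error type: a tagged union whose variants can be inspected.
#[derive(Debug)]
enum StoreError {
    Disconnected,
    KeyNotFound(String),
}

impl fmt::Display for StoreError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            StoreError::Disconnected => write!(f, "data store disconnected"),
            StoreError::KeyNotFound(key) => write!(f, "key not found: {}", key),
        }
    }
}

impl Error for StoreError {}

/// Returns the concrete enum (this toy store never finds anything).
fn get_typed(key: &str) -> Result<String, StoreError> {
    if key == "offline" {
        Err(StoreError::Disconnected)
    } else {
        Err(StoreError::KeyNotFound(key.to_string()))
    }
}

/// Returns a trait object: `?` boxes the error and erases its concrete type.
fn get_erased(key: &str) -> Result<String, Box<dyn Error>> {
    Ok(get_typed(key)?)
}

/// With the enum, the caller can branch on the exact variant.
fn handle_typed(key: &str) -> String {
    match get_typed(key) {
        Ok(value) => value,
        Err(StoreError::KeyNotFound(_)) => String::from("<default>"),
        Err(StoreError::Disconnected) => String::from("<retry later>"),
    }
}

/// With the trait object, the straightforward option is to print the message.
fn handle_erased(key: &str) -> String {
    match get_erased(key) {
        Ok(value) => value,
        Err(e) => format!("failed: {}", e),
    }
}
```

The standard library does offer `downcast_ref` on `dyn Error` to recover a concrete type when the caller already knows which one to expect, but the compiler no longer lists the possibilities the way it does for an enum.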

Tim McNamara 00:53:57 However, the downside of that is sometimes you don’t actually want control. You would like to be able to exit the program as quickly as possible, or to just know that something went wrong and it’s now time to abort the program and return a message to the user. In which case you might have more success just with, let’s say thiserror, sorry, with eyre, as in Jane Eyre, or anyhow, which essentially is a way to be able to take whichever error was generated, because errors are values, and return that very quickly back up, except we can annotate the string with some extra information. So often these other libraries have methods to be able to annotate or add context or kind of append to the message.
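
To show the shape of that idea without depending on either crate, here is a standard-library-only sketch. `WithContext`, `context`, and `read_certificate` are invented names; this is not how anyhow or eyre are implemented, only an approximation of the "pass it up, with a note attached" pattern they provide.

```rust
use std::error::Error;
use std::fmt;
use std::fs;

/// Hypothetical wrapper: any error plus a human-readable note about
/// what the program was trying to do when it failed.
#[derive(Debug)]
struct WithContext {
    context: String,
    source: Box<dyn Error>,
}

impl fmt::Display for WithContext {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}: {}", self.context, self.source)
    }
}

impl Error for WithContext {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(self.source.as_ref())
    }
}

/// Attach a message to the error side of any Result.
fn context<T, E>(result: Result<T, E>, message: &str) -> Result<T, WithContext>
where
    E: Error + 'static,
{
    result.map_err(|e| WithContext {
        context: message.to_string(),
        source: Box::new(e),
    })
}

/// Application-style code: it does not care why reading failed,
/// only that it failed while reading the certificate.
fn read_certificate(path: &str) -> Result<String, Box<dyn Error>> {
    let pem = context(fs::read_to_string(path), "reading TLS certificate")?;
    Ok(pem)
}
```

Printing the error returned by `read_certificate` for a missing file gives the note followed by the operating system's message, which is typically all an application needs for its log line.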

Gavin Henry 00:54:58 Okay. I’ve got a question. I’m doing a lot of work on using Rust from C and I’ve just hit this point where I’m looking to pass some more information back from Rust onto the C side, and I am not sure if any of these techniques are applicable, so I’ll just explain what I’ve done. So we’ve got an example of which one we could reach for, if any. I’ve got this project; it’s as low as you can go in C. It listens on a socket for TCP and UDP, pretends to be a telephone system, and then logs bad actors trying to get into it to make a Voice over IP phone call. Now, for a long time I wanted to add TLS support, so SIP TLS, because that’s a modern feature. I’m totally scared to do it in C.

Gavin Henry 00:56:02 It’s a lot to do in Rust. So my journey so far has been pretty crazy. Learning Rust this year, learning how to add things like extern "C" and no_mangle, how things compile, calling Rust functions through the use of a header that I generated, initially manually, and calling them from C. Then I’ve switched to bindgen to generate it, then I got into build.rs, which is a way to automate things with Rust. And then, because this listen TLS function actually logs things, I need to then go back on the Rust side to call existing C code to log things to syslog, a file, etc., rather than rewrite the same code. So it’s like a double foreign function interface.

Tim McNamara 00:57:02 Right, so you’re inside, right. So you’re coming back and asking for more information?

Gavin Henry 00:57:09 Yeah, and then I’ve got to generate C-style strings on the Rust side, which puts them on the heap. Then I’ve got to free them on the Rust side, and then I need to deal with errors.

Tim McNamara 00:57:25 I do not envy you, this sounds like, I can imagine why you’re asking for help.

Gavin Henry 00:57:31 So that was one of the drivers for the show because the initial thing is to just stick a Box<dyn std::error::Error> on main, and that is no good when you’re calling a function from C.

Tim McNamara 00:57:43 Sadly not. No.

Gavin Henry 00:57:45 So what I’ve done on the Rust side, and maybe some of these crates can help me, is I’ve just made any function I call from the C side return a 32-bit integer, and I’m just using libc’s EXIT_SUCCESS and EXIT_FAILURE and then printing an error to standard error on the Rust side. But what I’ve just reached for on the Rust side is just unwrapping things and letting it abort, because it’s an important thing, like it couldn’t find a TLS certificate and a private key. But then, from some of the stuff we’ve chatted about today, we’ve got the question mark operator that passes the error message up the chain a bit, which I can’t really use. So I’m just looking for any advice, given what we’ve said here. Do any of these crates pass the FFI boundary?

Tim McNamara 00:58:40 Not really. So what you’re really asking is, can we use these nice, rich features that Rust provides and pass that across the FFI boundary and essentially apply that in C. We can’t do that in a way that makes the C code aware of what Rust is doing. So I’ll just possibly explain things to your listeners. For example, a couple of the terms there were extern "C" and no_mangle. The idea here is that extern "C" is an annotation on the Rust code which tells the Rust compiler to use C’s calling conventions. So we are essentially writing C code; no, we are writing a function in Rust that obeys C’s rules. For example, C defines exactly the order in which local variables appear and how they’re passed into registers and these sorts of things, and bit packing, and a bunch of other stuff that most programmers really don’t care about.

Tim McNamara 00:59:52 The no_mangle tells the compiler not to alter the names of the symbols away from the names that have been defined in the source code, such that you can go and access the same symbol names from the C side. So what we’re actually saying is, now I’ve got this really difficult interaction. Correct me if I’m wrong, but you are establishing a connection. So you are writing some software in Rust that interfaces with a C library that does your Voice over IP, and then you want to be able to essentially use the TLS, but you want to pass in, say, the TLS connection that was generated in Rust into the C library. Is that how it’s working?
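
The two annotations just described, combined with the integer return codes Gavin mentioned, might look roughly like the sketch below. The `TlsConfig` struct, the `example_listen_tls` symbol, and the validation logic are all assumptions made for illustration; they are not taken from Gavin's project. The attribute is written as `#[unsafe(no_mangle)]`, the spelling accepted by recent compilers; older toolchains write it as `#[no_mangle]`.

```rust
use std::ffi::CStr;
use std::os::raw::{c_char, c_int};

/// Hypothetical configuration shared with C.
/// #[repr(C)] gives the struct the field layout a C compiler expects.
#[repr(C)]
pub struct TlsConfig {
    pub debug: c_int,
    pub cert_path: *const c_char,
}

// Same values as EXIT_SUCCESS / EXIT_FAILURE on common platforms.
const EXIT_SUCCESS: c_int = 0;
const EXIT_FAILURE: c_int = 1;

/// extern "C" selects C's calling convention; no_mangle keeps the symbol
/// name exactly as written so the C side can link against it.
#[unsafe(no_mangle)]
pub extern "C" fn example_listen_tls(config: *const TlsConfig) -> c_int {
    // Inside Rust we can still use Result and `?`; only the boundary
    // function flattens the outcome into an integer.
    match start(config) {
        Ok(()) => EXIT_SUCCESS,
        Err(message) => {
            eprintln!("example_listen_tls: {}", message);
            EXIT_FAILURE
        }
    }
}

fn start(config: *const TlsConfig) -> Result<(), String> {
    if config.is_null() {
        return Err("null config pointer".to_string());
    }
    // SAFETY: the caller promises that a non-null pointer refers to a valid
    // TlsConfig that outlives this call.
    let config = unsafe { &*config };
    if config.cert_path.is_null() {
        return Err("no certificate path given".to_string());
    }
    // SAFETY: the caller promises cert_path is a NUL-terminated C string.
    let path = unsafe { CStr::from_ptr(config.cert_path) }
        .to_str()
        .map_err(|e| format!("certificate path is not UTF-8: {}", e))?;
    if path.is_empty() {
        return Err("certificate path is empty".to_string());
    }
    // A real implementation would start listening here.
    Ok(())
}
```

On the C side this would be declared as something like `int example_listen_tls(const struct TlsConfig *config);`. All the C caller learns is success or failure, which is the limitation discussed in this part of the conversation.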

Gavin Henry 01:00:54 Well, the C is a binary.

Tim McNamara 01:00:57 Okay.

Gavin Henry 01:00:58 It calls a Rust function called listen TLS, which is a Tokio thing. So it just spawns off some threads. I pass a configuration pointer into that function, so on the Rust side it understands whether we’re in debug mode, whether we need to print error messages, where the certificates live, et cetera. So I was just looking to understand if there’s any sort of more richness that can go between Rust and C.

Tim McNamara 01:01:34 Probably not at the raw level, but I think there is the possibility of creating an abstraction layer between the two sides, sort of as glue or something. And this is just a guess after literally hearing your problem, so this is an idea. You define a struct, which might be like deep error report or something, and it could be either on the C side or on the Rust side, whichever is more convenient. You’ll need to give it, like, a C string; on the Rust side, the Rust standard library has a CString type. So essentially it’s a pointer, and then you’ll need a length field. And what you should probably do in this abstraction layer, sort of as a link between the two, is that before returning from the function on the Rust side, you go and check whether this global variable is non-null. This is a very C convention, where we’ve defined some kind of hidden symbol, a global that we’re passing around, and your Rust code very quickly checks whether something is defined there. And if it is, we then return an actual error that we can interpret based on some information. It might be that you send through an actual string, or it might be error codes, or it might be something else, but you want to put something in the middle.

Tim McNamara 01:03:27 And this actually is relatively common. If you expose yourself to a fair number of crates in the Rust ecosystem, you’ll see that they normally define two crates to interface with one C library. Let’s say that we had some library that was VoIP, just for Voice over IP. On the Rust side there would be the VoIP crate. Then on the C side it might be libvoip; there’s probably a better name, but, okay, so those are what the APIs expose. However, the actual linking of the two sides is usually defined on the Rust side in a second crate called -sys. So VoIP-sys, which will read all of the C header files with a tool called bindgen, which you mentioned before, and generate corresponding structs on the Rust side and use unsafe a lot.

Tim McNamara 01:04:38 And all of the unsafe code will be tucked into this -sys thing. The sys crate’s job is to expose, sort of, an easy-to-use API, and the easy-to-use API is actually defined in the other crate. So a relatively common pattern is to start with the API on the Rust side that you want to use. Then you define an intermediary crate that uses the API that it must use, but it exposes something nice, or nice enough. And your job will be to try and provide some sort of context struct, probably through a global variable, but it might not be; it might be an extra parameter if you can sneak one in there somehow. And again, it’s this kind of typical thing: if it’s non-null, then we need to consider the operation has failed, and essentially we discard everything else. Unfortunately, when you are navigating these two worlds, we can’t push Rust into some other world. We need to accept that, when Rust is pretending to be C, it needs to use C’s conventions. I’m very curious as to whether or not anyone listening actually has a better suggestion, because I think that this is the kind of discussion where people who do know a lot about this area have very strong opinions, and I’m very interested to hear from them.
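
One possible reading of that suggestion is the "last error" convention common in C libraries: the failing function returns a plain code, and a separate function hands back a description if one was recorded. The sketch below is an assumption-laden illustration, not Tim's or Gavin's code. Every `example_*` symbol is invented, and a thread-local slot stands in for the global he describes. As in the earlier sketch, older toolchains spell the attribute `#[no_mangle]`.

```rust
use std::cell::RefCell;
use std::ffi::CString;
use std::os::raw::{c_char, c_int};
use std::ptr;

thread_local! {
    // The "hidden global": the most recent error message, if any.
    static LAST_ERROR: RefCell<Option<CString>> = RefCell::new(None);
}

fn set_last_error(message: &str) {
    // C strings cannot contain interior NUL bytes, so replace them first.
    let cleaned = message.replace('\0', " ");
    let c_message = CString::new(cleaned).expect("NUL bytes were replaced above");
    LAST_ERROR.with(|slot| *slot.borrow_mut() = Some(c_message));
}

/// Hypothetical operation: fails when the key length is not positive.
#[unsafe(no_mangle)]
pub extern "C" fn example_load_key(len: c_int) -> c_int {
    if len <= 0 {
        set_last_error("private key is empty");
        return -1;
    }
    LAST_ERROR.with(|slot| *slot.borrow_mut() = None);
    0
}

/// Returns a heap-allocated copy of the last error message, or null if the
/// last operation succeeded. The caller must hand a non-null result back to
/// example_free_string, because Rust allocated it and Rust must free it.
#[unsafe(no_mangle)]
pub extern "C" fn example_last_error() -> *mut c_char {
    LAST_ERROR.with(|slot| match slot.borrow().as_ref() {
        Some(message) => message.clone().into_raw(),
        None => ptr::null_mut(),
    })
}

/// Frees a string previously returned by example_last_error.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn example_free_string(s: *mut c_char) {
    if !s.is_null() {
        // SAFETY: the pointer came from CString::into_raw in this library
        // and has not been freed yet.
        drop(unsafe { CString::from_raw(s) });
    }
}
```

The C side checks the return code, then asks for the message only when it is non-zero: a non-null pointer means the operation failed, which mirrors the null check described above. Passing a pointer and a length instead of a NUL-terminated string would work the same way.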

Gavin Henry 01:06:35 Perfect. Okay. I think we better wrap up. Thank you. I’m glad we did this and hope it helps others like me. Was there anything that we missed that you’d like to mention?

Tim McNamara 01:06:47 No, the only thing would be if you’re having trouble with Rust to give yourself a breather and essentially allow your body and your brain to absorb the language piece by piece. It introduces a lot of semantics that are different than other languages. And Rust can feel very strict and, in some cases, pedantic. And I think that’s because it applies subtly different rules than what you’re used to. And so, it will take time for your brain to kind of adjust and be patient with yourself as you’re learning.

Gavin Henry 01:07:26 Cool. Well people can follow you on Twitter. Is there any other way you’d like them to get in touch or?

Tim McNamara 01:07:32 Yeah, look, if you’re off Twitter or X, I’m also on Mastodon and also YouTube. I’m increasingly making my content available on YouTube. And that’s a very good way to follow me if you like worked examples, because about once every couple of weeks I put on a live stream where we go through a small exercise and implement it, and I try and fumble my way through any compiler errors that appear.

Gavin Henry 01:07:58 Yeah, I’ve watched some of those and joined them. They’re really good. Thanks for doing this. Thanks for coming on the show. It’s been a real pleasure. This is Gavin Henry for Software Engineering Radio. Thank you for listening.

[End of Audio]
