Marco Faella, author of Seriously Good Software, discusses how to create good software using six different ‘qualities’ of code. Felienne spoke with Faella about reliability, space and time complexity, readability, reusability, and thread safety. They also discussed how these qualities are often at odds with each other when creating software, and how to deal with that. Properties of different programming languages and their relation to these qualities were also addressed.
Show Notes
Related Links
- Episode 400 with Michaela Greiler
- Episode 367 with Diomidis Spinellis
- Episode 295 with Michael Feathers
- Book, https://www.manning.com/books/seriously-good-software
- Marco’s website, http://wpage.unina.it/m.faella/
- Marco on Twitter, https://twitter.com/m_faella
Transcript
Transcript brought to you by IEEE Software
Felienne 00:00:52 Hello everyone. This is Felienne for Software Engineering Radio. Today on the show with me is Marco Faella. He’s a professor of computer science at the University of Naples Federico II in Italy. Marco has contributed to a well-known scientific library for symbolic manipulation of polyhedra. He has taught classes on Java programming, compiler construction, and software engineering to undergraduate and graduate students since 2005. Marco is also the author of today’s topic, the book Seriously Good Software. Welcome to the show, Marco.
Marco Faella 00:01:26 Hi Felienne. Thank you very much for hosting me.
Felienne 00:01:30 You’re welcome. So your book is called Seriously Good Software. I think that’s something that we all strive for, but what is good, exactly? What do you mean by seriously good software?
Marco Faella 00:01:44 I was trying to fill a gap that I felt in some types of computer science education. I have the impression that a lot of people come into programming with a sort of simplistic view of the job, where you just have to find a way of achieving a certain task, achieving a goal that you’ve been given, without a global view of the different forces, of the different qualities that software may or may not have, and without a view of the trade-offs that are involved in writing good software. So that was the starting point for the book. In particular, I think that a lot of curricula, and I’m talking especially about computer science curricula, have deep classes about specific topics like algorithms or operating systems and so on, but I think students may end up with a very vertical view of different topics, without a global overview of their relationships, of how to compare and contrast these different techniques, and of when it’s more appropriate to focus on one aspect versus another aspect of the software. So I tried to provide a more horizontal point of view, across disciplines, across topics.
Felienne 00:03:17 Yeah. I want to zoom into that vertical and horizontal a little bit, because it might not be clear to all of our listeners, what exactly you mean by that? So could you briefly point out what vertical and horizontal mean here?
Marco Faella 00:03:30 Yes. Let me give a brief overview of the book for our listeners. The idea for this book is the following. I have one single example that’s developed in many different ways throughout the book. In each chapter, I focus on a different software quality. For instance, I have a chapter on time efficiency, then I have a chapter on memory efficiency, and I keep refactoring the same example with respect to different objectives, with respect to different software qualities, trying to emphasize the trade-offs between them. So that’s what I call horizontal. It’s across disciplines: you have little bits of algorithms, data structures, operating systems, concurrency, and software engineering mixed together and compared on the same example. Whereas by vertical, I mean the classical computer science curriculum, where you take a full class on algorithms and you focus for an entire semester, or something like that, on big-O complexity and the fastest way to do certain things.
Felienne 00:04:45 Yeah. So what you’re saying, I think, is that university often prepares students for really deep tasks like algorithms or databases, but what sometimes might be missing from a curriculum is how to apply them and how to make the right selection of what to use where. Is that a fair summary of the problem that the book addresses?
Marco Faella 00:05:05 Yeah, yeah, absolutely.
Felienne 00:05:07 So you mentioned this already a little bit, is that your book talks about different qualities that software can have. And while reading the book, I found it hard to really define what a quality is. So can you briefly explain what is a quality before going into detail of course, into the different qualities that the book talks about?
Marco Faella 00:05:28 Sure. So by quality, and I know it’s a very general term, I mean any desirable property of software, any desirable property that a piece of software may or may not have, or may have to a different degree. Most of these properties are traditionally called nonfunctional requirements, but I’m not fond of this functional versus nonfunctional distinction, I have to say. When you say a property is nonfunctional, it sounds to me as if you are diminishing its importance. You’re saying it’s optional, it’s nonfunctional. Whereas I think they’re important. I think most of them, in fact, become functional properties beyond a certain threshold. For example, take time performance. Time performance is usually, in most systems, a nonfunctional property, because you don’t care whether your software gives you a service in one millisecond or two milliseconds.
Marco Faella 00:06:31 Most of the time. Of course it depends on the context, but most of the time you don’t care. But still, even if it’s a user-oriented service, an interactive program, well, you may not care about one millisecond versus two milliseconds, but you certainly care that the service works within one second or within one minute. And if you get that service in one hour instead of one second, of course that very poor performance becomes a functional issue. I think this applies to most so-called nonfunctional requirements. They can all become functional beyond a certain threshold if they’re not properly satisfied.
Felienne 00:07:14 Yeah. So to summarize, I think what you’re saying is that qualities are quite related to nonfunctional requirements, but that you’re not fond of the term nonfunctional, because it seems to indicate not important or not core to the design. Whereas you rightly pointed out that even something like reliability or time optimization can get functional once it gets too much out of hand.
Marco Faella 00:07:39 Yes. Yes. It’s a thin distinction. A requirement can jump from one side to the other quite easily. It’s not clear-cut.
Felienne 00:07:52 Yeah. I actually think that’s a really important distinction, where we try to say this is functional and this is nonfunctional, but it isn’t as binary as that. It is possible for these to move across the spectrum from functional to nonfunctional. So I was wondering, what was your goal in describing these qualities? Why was it needed to formulate the book in that way, along the lines of these different qualities? What does that add?
Marco Faella 00:08:21 First of all, I think it adds awareness for programmers. Of course, the book is mainly intended for junior programmers or people fresh out of their computer science or other kind of school. In organizing the book around these qualities, my intent at least was to make them aware of these different points of view on software, of these different requirements they may have to satisfy, and to give them, in a way, a vocabulary of qualities to think about. So I’m very focused on this awareness idea. Before you can improve your software, before you can write good software, first of all you have to be aware that there’s a problem, or you have to be aware that there are different ways to measure, in the best case, or somehow evaluate your software. There are different points of view.
Marco Faella 00:09:37 Then, and only then, you should be aware of the techniques that you can use to improve those qualities. And finally, on a kind of third level, you should have the experience and the capacity to judge which quality to focus on among all those qualities that are always present somehow. So your software has to do its job. That’s the correctness, that’s the functional requirements. Then it has to do it in an efficient manner, and so on and so forth. So you have to be aware of these dimensions, then you have to know specific techniques to improve on those dimensions. And then finally you, or your coworkers, or your boss have to have the vision to pick the right balance among these properties.
Felienne 00:10:31 Yeah. So it adds a vocabulary to help people to make the right decisions. And you mentioned trade-offs also earlier in the show between all of those qualities.
Marco Faella 00:10:41 Oh, one more thing. This idea is also an invitation to improve specifications, because I think specifications often focus too much on the functional aspects and too little on those other requirements. Then after the fact, perhaps during development or perhaps after deployment, at some point you realize that you overlooked some nonfunctional requirement that turns out to be important. And perhaps it turns out to be functional. It turns out to be crucial. So another direction where I hope the book can improve standard practice is to improve specifications and push people to be more complete in their specifications, to include these so-called nonfunctional requirements in the initial software specifications explicitly.
Felienne 00:11:42 Oh yeah. I really like that. I like that you’re saying this isn’t only relevant for the process of creating software. It’s also going to be important for the process of specifying and designing. That’s also where you can use these qualities, maybe even as something like a checklist: have we thought about all six qualities?
Marco Faella 00:12:01 Right. Yeah, exactly. As a reminder: well, what do we need about space efficiency? Do we need anything? What do we need about readability? That’s the tricky part. But yeah.
Felienne 00:12:11 So I think it’s now time to zoom into the qualities. I would like to go over them one by one, where we can talk about what the quality is. Maybe you also have examples of codebases in which this quality was done well or not so well. I would like to start with time efficiency. This is one that has already come up a bunch of times. There’s really a difference between a query taking one second and a query taking one hour. So what is your definition of time efficiency?
Marco Faella 00:12:39 Well, of course, time efficiency is simply achieving your task in as little time as possible. I think it needs very little introduction to any programmer. I think most programmers are naturally interested in it, but perhaps too much. So putting it as my first quality is a kind of trick to get people interested in the book, because I know that so many programmers are just passionate about and fascinated by time efficiency. I mean, I’m one of them, and I’m so happy when you manage to get a big speedup on your implementation. It’s great. It’s good for everyone. So it needs very little introduction. If you want a specific example or a small anecdote, just these days I was browsing Twitter, and a guy was complaining about the reaction time when he clicks the right mouse button on Windows.
Marco Faella 00:13:44 In most cases, you get some kind of context-aware menu. So depending on where you click, on your desktop or on your start menu, you get some sort of special menu when you click the right mouse button. And he noticed that on modern hardware, on a modern machine, sometimes you need to wait for, I don’t know, one second or two seconds to get that menu. He felt that was crazy, and I kind of agree with him. I mean, it’s one of the basic user interactions for Windows or any operating system these days. So it should be really fast. It should be instantaneous from your point of view. So the operating system, and other software that’s cooperating with it, should cache those lines before you even click the right button, and the menu should come out immediately.
Marco Faella 00:14:42 And it was interesting, because then this guy went into a very deep debugging of the issue, and he was a very capable debugger. He could find the source of this lag. I think he posted a bug report to Windows, and I think maybe they’re working on it, or maybe they already fixed it. I don’t know. But that’s a very simple example of something that we all interact with. Perhaps we’re used to it, we’re used to having to wait a couple of seconds to get that menu, but it shouldn’t be like that. I mean, there’s no reason why a modern, I don’t know, eight-core, five-gigahertz machine should have you wait a couple of seconds to show a list of 10 or 20 items.
Felienne 00:15:40 Yeah, well, you said in the beginning that time efficiency might be an easy thing to measure. However, I do think it’s more complicated than it used to be. And now you’ve got me thinking of this context menu. I also recently was wondering why it was so slow. And then I realized that one of the reasons it’s slow is that if you are connected to Dropbox, then Dropbox will go through the internet to see if you’re actually connected. And if you are, you get extra elements in your context menu, like “share this link on Dropbox” or stuff like that. So something where we might think, oh, this is a local operation and the operating system should be able to do this really quickly. Oftentimes nowadays there’s also some network connectivity in places where you don’t necessarily expect it. So maybe time efficiency used to be easy, because you could just look at your source code and sort of count the number of iterations in your for-loop. But now we have async/await. So from my perspective, time efficiency was easier in the old days than it is now. Do you agree? And is this something that the book also talks about, the difference between measuring efficiency in your own codebase and having stuff in different places?
Marco Faella 00:16:54 Yeah, that’s a very good observation. In the specific case of the context menu, I’m sure there’s a rather complex architecture behind it because, as you mentioned, different programs can plug into that menu and add their own stuff. Still, you have the example of Dropbox checking on the internet whether it’s connected or not, but perhaps then the architecture should be changed. I mean, I’m sure there are good reasons for it to be like it is, but there’s nothing preventing the architecture from having Dropbox check the connection beforehand, in the background, before you click. And then when you click, it just gives you the latest status. And if the latest status is not connected, then too bad, you will click again and maybe you’ll be luckier. But perhaps Dropbox could update this little cache of items, I don’t know, 10 times a second. Anyway, in general, about the book and what I do about time efficiency,
Marco Faella 00:18:12 I don’t have the space to cover all these very interesting aspects you mentioned, like the difference between doing something locally and having some uncontrollable lag in between. That’s certainly an interesting issue in practice. On the other hand, the book tries to cover all of these different topics in a short number of pages. It’s a 300-page book that tries to cover everything from algorithms to software engineering, and so on and so forth. So it should be clear that every chapter is just a short introduction to the topic. For every chapter, for every topic, for every software quality, I don’t go into the details, but rather I give an overview of the topic. I apply it to my recurring example, and then I apply it to some other examples. And then I give pointers to other great books where readers can really fill in the rest, fill in the details. So it really works like that, due to space constraints.
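The background-check idea from the context menu discussion can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Windows or Dropbox actually work: the class name `CachedStatus`, the function `is_connected`, and the menu items are all hypothetical. A slow probe is run ahead of time (here, optionally from a background thread), and the click handler only reads the last stored result.

```python
import threading


class CachedStatus:
    """Keeps the latest result of a slow probe so that reads never wait on it."""

    def __init__(self, probe, initial=None):
        self._probe = probe              # slow call, e.g. a network check (hypothetical)
        self._value = initial
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = None

    def refresh(self):
        """Run the slow probe once and store its result."""
        value = self._probe()            # the slow part runs outside the lock
        with self._lock:
            self._value = value

    def get(self):
        """Return the most recently stored value immediately."""
        with self._lock:
            return self._value

    def start(self, interval_seconds):
        """Refresh in a background thread every interval_seconds until stop()."""
        def loop():
            while not self._stop.is_set():
                self.refresh()
                self._stop.wait(interval_seconds)

        self._thread = threading.Thread(target=loop, daemon=True)
        self._thread.start()

    def stop(self):
        """Ask the background thread to finish and wait for it."""
        self._stop.set()
        if self._thread is not None:
            self._thread.join()


def is_connected():
    # Hypothetical stand-in for a slow connectivity check.
    return True


status = CachedStatus(is_connected, initial=False)
status.refresh()                         # normally done by start(0.1) in the background

# The "click handler" only reads the cached value, so it returns at once.
menu_items = ["Open", "Copy"] + (["Share link"] if status.get() else [])
```

The trade-off is the one mentioned in the conversation: the menu may occasionally show a slightly stale status, in exchange for never waiting on the network at click time.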
Felienne 00:19:29 Yeah. Yeah, of course we understand that even the book is limited and this episode is even more limited in time. If you could share with us just like one takeaway from time efficiency, what is the most important thing that our, our audience should realize when considering time efficiency?
Marco Faella 00:19:47 Right. It’s not like a great discovery, but it’s a little observation that perhaps is not present in traditional discussions about time efficiency. In the whole book, I focus on small code samples, not large systems. I focus on single classes. Most of the time, when you study algorithms at school or at university, you focus on a single input-output function. So, given a graph, you want to output, I don’t know, the shortest path from one vertex to another vertex. So simple input-output problems: given this input, produce this output. And that’s great. On that kind of problem, you can apply all the asymptotic analysis, complexity analysis, big-O notation, and so on. My small observation is about what happens in practice, even on a single class.
Marco Faella 00:20:53 So even in a very small unit, you have different methods operating on that class, and each method may have a different complexity. You may have one method taking linear time, and another method that does something more complex taking quadratic time, and so on. And my little observation is that sometimes you cannot optimize the complexity of all methods at the same time. Sometimes you have to choose. You can make one method faster at the cost of another method becoming slower. So that’s another level of trade-off that you may have between different methods within a single quality, which is time performance. You may have to find a balance between the complexity of different methods. That’s something that’s shown in the book.
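A minimal Python sketch of this kind of per-method trade-off follows. It is not the example from the book; the class names `FastAddScores` and `FastMedianScores` are hypothetical. Both classes offer the same two methods, but each one makes a different method cheap. Here `median` returns the upper median when the number of values is even, and both versions assume at least one value has been added.

```python
import bisect


class FastAddScores:
    """add is O(1) amortized; median is O(n log n) because it sorts on demand."""

    def __init__(self):
        self._values = []

    def add(self, value):
        self._values.append(value)

    def median(self):
        ordered = sorted(self._values)
        return ordered[len(ordered) // 2]


class FastMedianScores:
    """add is O(n) because it keeps the list sorted; median is O(1)."""

    def __init__(self):
        self._values = []

    def add(self, value):
        # Binary search finds the slot, but inserting shifts later elements.
        bisect.insort(self._values, value)

    def median(self):
        return self._values[len(self._values) // 2]


fast_add, fast_median = FastAddScores(), FastMedianScores()
for score in [5, 1, 4, 2, 3]:
    fast_add.add(score)
    fast_median.add(score)
```

Which version is preferable depends on the workload: many insertions with rare queries favor the first, frequent queries favor the second. Other designs (for example, two heaps) strike yet another balance between the two methods.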
Felienne 00:21:51 Great. Yeah. That is a good thing to take away: that you can’t magically make stuff better. Often, if you make something better here, then you might actually make something worse in another place of the codebase. So I do think that’s a good realization to make.
Marco Faella 00:22:06 Yeah. And I make this observation on a small unit because that’s the style of the book, but you can apply the same observation to larger systems, where sometimes you can make one feature or one module faster at the expense of another one that’s going to be slower.
Felienne 00:22:24 Yeah. The same can also be generalized from two methods to two classes, or two modules, or two microservices that are related to each other.
Marco Faella 00:22:35 Right.
Felienne 00:22:37 Then the next quality is quite related: it’s space efficiency. So there too, I was wondering, what is your definition? And do you have another great anecdote about space efficiency and how it impacts the performance of codebases?
Marco Faella 00:22:53 Right. So, space efficiency. Of course, by space I mean memory occupancy. Well, okay, this is already a little bit trickier, because a very basic definition would be to use as little memory as possible. That’s a very simple definition, but that’s usually not what you want in software development. A better definition would be: make good use of the memory you occupy. That’s a better definition, but it’s also more vague, because what do you mean by good use? Let me give my favorite example about space efficiency, to clarify a little bit. My favorite example is video games. I think video games are fascinating pieces of software, not only as a player, but also as a programmer and as a professor.
Marco Faella 00:23:57 In fact, I’m also teaching a game development class. They’re very complex software systems. And one of the characteristics of video games is that no matter how much memory your architecture or your device has, you can be sure that people will write video games that saturate that capacity, that saturate the whole computational power of your device. Perhaps even when you come up with a new architecture, a new gaming console, or a new device, even before the device comes out, there are already video games in production that saturate the capacity of that machine. The same can be said about graphics cards, of course. And this has always been true, from the first video games, like Spacewar from the sixties on the PDP computer, to the latest games.
Marco Faella 00:24:58 It’s always been like that. So video games really push the hardware to its limits. In particular, they fill up the memory, and they always have a lot more data than can fit into memory. So in video games, you find that there’s a lot of effort to exploit all the memory available, to push as much data as possible into RAM, as much data as possible into video memory, and so on and so forth. And what happens when you try to squeeze as much information as possible into your memory is that you start to use special encodings for data. That’s something I also show in the book. So perhaps you have small integers, so you take a 32-bit integer and you put two 16-bit integers into one, even if your language doesn’t have a short data type, for instance.
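The packing trick can be sketched as follows in Python. This is an illustration of the general technique, not code from the book, and the names `pack`, `unpack`, and `MASK16` are hypothetical. Python integers are not fixed-width, so the sketch only shows the bit layout: the first value occupies the upper 16 bits and the second the lower 16 bits of a 32-bit quantity. It assumes both values are unsigned and fit in 16 bits.

```python
MASK16 = 0xFFFF  # sixteen 1-bits


def pack(high, low):
    """Pack two unsigned 16-bit integers into one 32-bit integer."""
    if not (0 <= high <= MASK16 and 0 <= low <= MASK16):
        raise ValueError("both values must fit in 16 unsigned bits")
    return (high << 16) | low


def unpack(packed):
    """Recover the two 16-bit values from a packed 32-bit integer."""
    return (packed >> 16) & MASK16, packed & MASK16


packed = pack(300, 7)
high, low = unpack(packed)   # (300, 7)
```

Even in this tiny form, the readability cost is visible: every reader of the code now has to know which half holds which value.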
Marco Faella 00:25:59 So you start using crazy representations that squeeze as much information as possible into your bits. In other words, you try to exploit every single bit in your memory. No bit is wasted. That’s very interesting. That’s an interesting exercise. It’s also a very extreme path that, in practice, you should only take if you have no other choice, because when you start using those crazy ad hoc representations, you lose on many other software qualities. You are probably losing readability, because your software becomes tricky to read and understand, and therefore you’re decreasing your maintainability, and so on and so forth. So taken to the extreme, space efficiency is a very drastic measure. It really conflicts with other software qualities. On the other hand, in a normal program, you don’t want to use too much memory without a good reason.
Marco Faella 00:27:14 So without going into those extreme representations, it’s still good to have a general understanding of how much memory a hash table takes compared to how much memory an array takes. In most implementations, a hash table is of course much more efficient for many operations, but it’s also many times larger than a simple array. Those are the kind of simple comparisons that every programmer should be aware of, regardless of the specific domain.
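A rough way to see the array versus hash table difference is to compare container sizes directly. The sketch below assumes CPython, where `sys.getsizeof` reports the shallow size of a container (the container itself, not the objects it refers to); exact numbers vary by version and platform, so none are quoted here. The variable names are illustrative only.

```python
import sys
from array import array

n = 1000
values = list(range(n))

as_array = array("i", values)   # compact array of C ints
as_list = values                # array of references to int objects
as_set = set(values)            # hash table

# Shallow sizes in bytes (CPython-specific measurement).
sizes = {
    "array": sys.getsizeof(as_array),
    "list": sys.getsizeof(as_list),
    "set": sys.getsizeof(as_set),
}
```

On CPython, the hash-based `set` comes out several times larger than the `list` holding the same elements, which in turn is larger than the typed `array`. In exchange, membership tests on the set take constant time on average instead of a linear scan.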
Felienne 00:28:21 Yeah, I really like this example of games, because it’s sort of the reverse, right? Normally you have something you want to make, and then it takes up the memory or the space that it needs. But for a game it’s just: this is the memory, and this is the hard disk size, and you’re going to fill it as efficiently as possible. So I think that’s an excellent example of a domain where space efficiency is really, really core, and where it may overpower the other dimensions. One thing you just mentioned, and this is a nice segue into the next quality: you said that sometimes being really space efficient, like using a very special encoding, comes to the detriment of readability, and readability is also one of the qualities in the book. So of course, here we also want to know: what is readability? And can you share another of those anecdotes or domains about readability to contextualize it?
Marco Faella 00:29:17 Readability is a kind of hypothetical measure, because you can’t really measure it in any easy way. But in principle, you try to imagine the effort needed to understand the code, particularly for someone who hasn’t written it. But as all programmers know, even your own code, if you don’t look at it for a couple of months, is basically as if it was written by someone else. You quickly forget the mental model that brought you to that piece of code, and therefore you basically have to start from scratch, as if someone else wrote it. The amount of effort you need to put into understanding that code is related to its readability. And readability, I think, is a very interesting quality, very different from the first two we talked about.
Marco Faella 00:30:22 So, time and space. Readability is different just because it’s so hard to measure. At the same time, it’s also very important. And the reason it’s very important is that readability is in turn a proxy for maintainability. Maintainability means how easy it is to modify, to fix, to extend your program. And of course, every commercial software needs to evolve in time. It needs to have bugs fixed, it needs to be extended, and so on. So having code that is easy to read and understand is a good proxy for its future maintainability. The easier it is to read, generally speaking, the easier it is to modify it without introducing new bugs, because you manage to understand it and you know what you’re doing.
Marco Faella 00:31:24 I think it’s very important as a quality, but at the same time there’s little effort in academia, and in general in schools, in programming schools, towards this objective. There’s very little effort in teaching readability. And once again, that’s because it’s not a fully developed topic. It’s not easy to measure, and there’s no well-developed theory of readability. So what we have are examples, what we have are guidelines. And in fact, that’s what I focus on in the book, because that’s the best we have right now. I’m thinking about the famous refactoring rules by Martin Fowler and Robert Martin’s Clean Code guidelines. Those are practical, time-tested guidelines that are very successful in practice. They don’t constitute an overall theory. It’s a set of guidelines. They work well in practice, and lots of people agree with them.
Marco Faella 00:32:33 And I think they’re very valuable, but we don’t have a systematic theory. I know that there are people in academia trying to actually measure readability. Sometimes I even read scientific papers about taking brain scans of people reading code, to try to scientifically measure the effort they’re putting into understanding it. But I don’t think these extreme attempts are going to solve the problem. I think, as a discipline, we need time to figure this out, to establish a consensus over what readable code is. And also, I think at some point it becomes a little subjective, so that you cannot have a hundred percent agreement on the rules of readable software. At some point, what’s the better style becomes subjective.
Felienne 00:33:37 Yeah. I think you’re making lots of interesting points there that I want to follow up on. Of course, it isn’t possible to measure the readability of code in the same way as space efficiency or time efficiency, so maybe different things would be needed. And I think it’s really interesting that you said that, as a field, we don’t really have a good theory for the readability of code. I think I agree with you there, but I’m wondering: what should such a theory look like? What are the boundary conditions of such a theory? What is the thing we’re missing there? From resources like a library of refactorings, we have a bunch of things, so what are we missing?
Marco Faella 00:34:19 Right. So let me mention a couple more things that we do have, which of course you’re well aware of, but let me mention them nevertheless. We have a lot of proxy measures, like cyclomatic complexity, or simply method length. You know, we say that methods should be short, no more than, I don’t know, 20 lines, something like that. Lots of people have proposed different metrics. So it’s numbers, it’s something easy to measure. And then they try to connect these numbers, these quantitative, objective measures, to some other independent measure of readability, like asking someone after he or she has read a piece of code. So they ask them how easy it was to understand, or maybe they ask them: what does this code do, please describe it, or answer this question about the code, and so on.
Marco Faella 00:35:20 And they correlate these measures with the answers to the questionnaire, the answers given by the person. I think that’s a very reasonable way to proceed, and I know that we have some results. Even so, you’re comparing something objective and measurable: everyone agrees on what the cyclomatic complexity of a given function is. You just count the number of nested loops, nested conditionals, and so on. But on the other hand, you are comparing it with subjective judgments made by people, which means that sometimes one paper will claim something, and then another paper will claim almost the opposite, because it asks the questions in a slightly different way to its subjects, or maybe the people answering the questions are different and have a different level of experience. So it’s very hard to reach a consensus this way, but I don’t see any other way to move forward. I think it is the right way, but we need time to establish standard protocols for these experiments. Then maybe we can move towards a consensus about what the best metrics are, which metrics correlate best with actual readability, with people’s effort to understand.
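To make the “objective and measurable” side concrete, here is a simplified Python sketch of a cyclomatic complexity counter built on the standard `ast` module. It follows the usual McCabe convention of counting decision points (whether nested or not) and adding one. The choice of which node types count as decisions is an assumption of this sketch, and real tools differ in such details; the names `cyclomatic_complexity`, `_BRANCH_NODES`, and `SAMPLE` are illustrative.

```python
import ast

# Node types that add one decision point each (a simplification).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source):
    """Approximate McCabe complexity: decision points in the source, plus one."""
    tree = ast.parse(source)
    decisions = 0
    for node in ast.walk(tree):
        if isinstance(node, _BRANCH_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" has three operands and adds two extra paths.
            decisions += len(node.values) - 1
    return decisions + 1


SAMPLE = '''
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0 and d > 1:
            return "composite"
    return "other"
'''

score = cyclomatic_complexity(SAMPLE)   # 2 ifs + 1 for + 1 "and" + 1 = 5
```

A number like this is easy to compute and to agree on. Whether it tracks how hard the code actually is for a person to understand is exactly the open question described above.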
Felienne 00:36:45 Yeah. Or if I may offer an alternative theory or an alternative suggestion, maybe we should accept that these qualities, time and space efficiency on the one hand and readability on the other hand, are just different things. Philosophy has this idea of incommensurability: that it is just impossible to measure two things with the same framework. And I think maybe one of the mistakes that people make here is that they try to reduce readability to also being a number, whereas an alternative way could be just to accept the fact that some things can be measured in a number and other things are vague. Like, is this art piece nice? Well, it depends on a number of things, including personal taste. So we can try to make the one more numeric, or we can try to accept that maybe it isn’t very numeric. Let’s move on to another quality. You also have the quality reusability, and reusability might suffer from the same issue as readability, in that reusability is also hard to quantify. So again, I want to hear what your perspective is on what reusability is and how we measure it. Should we measure it? And if we want to, how would we do that?
Marco Faella 00:38:05 So by reusability, and perhaps generality is also a good word for this quality, I mean: how easy is it to take your implementation, and once again I focus on small units, so let’s talk about a single class, how easy is it to pick your class and put it in a partially or completely different context, in a different system, and make it useful there? My perspective on this is that focusing on this quality, first of all, is simply a great opportunity to learn something. What I mean is, it gives rise to very interesting programming exercises for people who are learning. For example, you can tell them: pick some class that you designed and turn it into a library. That’s the most reusable code.
Marco Faella 00:39:05 That’s what we call completely reusable code. We call it a library. That’s the code that’s meant and intended from day one to be put in completely different contexts and still be useful in those contexts. So I think that’s a very useful exercise for people that are learning to program. Because what you do, and this also happens in the book, is that you take this class. Of course, you have to pick the right class; you have to pick a class that’s amenable to being generalized. But if you pick a good class, then this class will have data and, of course, operations, methods. And if you think about turning it into a library, you’ll probably want to generalize the kind of data it supports or contains. And then of course you want to generalize the operations.
Marco Faella 00:39:57 So what you end up doing is probably start from some concrete data types that are in the class, and you’re probably going to replace them with type parameters. So you’re going to use generics in Java terms, templates in C++ and so on. And then you have to generalize the operations. And that’s also a very useful exercise, because you probably end up using interfaces, complex interfaces, to represent what this abstract data type is capable of doing. And therefore you end up using what in Java are called bounded type parameters and bounded wildcards. So these pretty advanced language features that you don’t get to use every day will come in handy in this kind of context. I think C++ is also adding this: this year, C++20 should add something called traits or concepts; they’re called either traits or concepts.
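[Editor’s sketch] The generalization step described here can be shown in a few lines of Java. The class `Generalize` and both method names are hypothetical, chosen only for this illustration. The first method is tied to a concrete type; the second replaces that type with a bounded type parameter and accepts its input through a bounded wildcard.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical example: from a context-specific method to a library-style one.
class Generalize {
    // Concrete starting point: works only for non-empty lists of Integer.
    static Integer maxOfInts(List<Integer> values) {
        Integer best = values.get(0);
        for (Integer v : values) {
            if (v > best) {
                best = v;
            }
        }
        return best;
    }

    // Generalized version. "T extends Comparable<? super T>" is a bounded type
    // parameter: any T that can be compared with itself or with a supertype.
    // "List<? extends T>" is a bounded wildcard: any list whose elements are T
    // or a subtype of T.
    static <T extends Comparable<? super T>> T max(List<? extends T> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("empty list");
        }
        T best = values.get(0);
        for (T v : values) {
            if (v.compareTo(best) > 0) {
                best = v;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(max(Arrays.asList(3, 9, 4)));
        System.out.println(max(Arrays.asList("pear", "apple")));
    }
}
```

The generic signature is noticeably harder to read than the concrete one, which is a small instance of the tension between reusability and readability.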
Marco Faella 00:41:03 I’m not sure which, but that’s related to this. That’s a new kind of abstraction, a very expressive way of generalizing operations to multiple data types. So it’s something that leads to using interesting and advanced language features, and that’s useful for learners. Whereas in practice, on the job, I think that the situation is quite reversed, because in practice I think programmers are too keen to generalize their software. Most programmers are known to over-generalize their software, whereas on the job, when you have little time to achieve your goal, it’s perfectly fine to write your context-specific class in a rather non-reusable way. So you should distinguish theory and the learning stage from practice. In most cases I would be careful about pushing reusability in practice, on the job, unless of course you are actually developing a library. Then reusability should be your primary goal.
Felienne 00:42:25 Well, actually I really like this frame of thinking. I think it’s a really great question you could ask students, but also people in a company: if this would become a library, what changes would you make now? Just to get people in the mindset of reusability. This might also enable reuse: if people would write their code in a more reusable way, then maybe more reuse would happen. For me this is already the highlight of this interview, because now I’m thinking of my own code base. I’m like, oh, but what if part of this would be a library now? What changes would I make? So I actually think that’s a really, really good way, and also a concrete way, of thinking about it.
Marco Faella 00:43:04 I think that designing libraries is a lot of fun for experienced programmers, but it may be a nightmare for, for junior programmers.
Felienne 00:43:13 Yeah. But then it is a nightmare that, you know, they need to get used to at some point, because you are going to release something either as an open-source library, or there will be someone in your company that says, oh, you’ve done this before, can I look at your code, or can I reuse your code? So it is good to practice it, because it’s likely to occur later in life.
Marco Faella 00:43:33 Once again, when I say it may be a nightmare to junior developers, I mean, it’s not their fault. It’s our fault as educators. I think that’s another area where, in my experience in academia, we don’t put enough stress and focus on designing good APIs, which is a crucial part of designing a library. We focus a lot on designing algorithms and implementing them. That’s great. And on good data structures. That’s also great. But I think we focus too little on APIs, so on larger-scale issues, on reusability issues.
Felienne 00:44:20 Yeah. We have two more qualities left, but this was actually one of the questions I still have on my list. So let’s zoom into this a little bit more, because why don’t we teach that? I mean, you’re a professor,
Felienne 00:44:32 and as professors, we have limited power to change this. Why do you think it is that most universities, also like my university, don’t teach this? Is it hard? I don’t even know myself why I don’t pay more attention to this.
Marco Faella 00:44:51 We have little time, right? So we have to make a choice. And I think a lot of it is tradition. Take some areas like algorithms, and I mean, I’m totally fond of algorithms and algorithm theory, and I’ve also worked research-wise on different types of algorithms, so I completely appreciate the area. Algorithms and data structures, first of all, have a somewhat longer history than software engineering. These other concerns we are talking about, like good APIs and reusability, would be called software engineering topics, and software engineering is a little younger than the more established areas like algorithms and data structures. I would say that’s because software engineering came with large systems, and large systems took more time to be developed than single algorithms. So there’s tradition, there are areas that have a stronger tradition. And those areas are more mathematical, and therefore they have a more developed theory, an overall organic theory. And that’s somehow easier to teach and also easier to test.
Felienne 00:46:22 Yeah. Yeah. I definitely think that testing has something to do with it. It’s also easier to create examples for algorithms. Like, here’s a two-by-two matrix and now you have to multiply it with something else, or here’s a maze and you have to find the shortest path or the best way through.
Marco Faella 00:46:40 Yeah, everything is simpler, in the sense that it’s easier to compare two algorithms. You have all the structure: you have asymptotic analysis. It’s precise, it’s mathematical. It gives you an exact answer to the question of which of these two algorithms is better. Whereas if you’re asking which of these two APIs is better, that’s a completely different matter.
Felienne 00:47:09 It comes back to the measurability. When the students have proposed a solution, we can say: you have to do this within five seconds, otherwise your answer’s wrong. Or: you can use just 10 megabytes of memory. Whereas with readability, it’s way harder to judge, and maybe the student has really good arguments to do something in a certain way. And then
Felienne 00:47:30 sometimes you might even be convinced. I mean, I’ve had discussions with coworkers where I was really like, why would you do it like this? And then they say, well, because of this, or this, or this, and you’re like, oh, well, this is not my opinion, but this is actually also a reasonable stance. So it really has to do with this measurability as well: it’s just really hard to judge if a student is doing it right. And then, in addition, of course, there are the things you mentioned: lacking a theoretical framework, probably lacking enough faculty with enough understanding of this to teach it, and a short history. Those are probably all contributing to this.
Marco Faella 00:48:07 I agree.
Felienne 00:48:08 Well, luckily there’s your book. And then if students have graduated and they still miss some of these qualities, we can say, well, you should read this book. So let’s come back to the book, because we also want to talk about the other two qualities that we haven’t talked about. One of them is reliability. And I think reliability is interesting because it falls a bit in between the things we’ve discussed. In a sense, reliability is measurable: you can say this server is up 99.9% of the time. But on the other hand, it is harder to predict, because we can’t really calculate how often the server will be up. It might go down tomorrow for 10 days, and then everything is different again. So reliability is in the middle. So what do you consider to be reliability? And please give us another juicy anecdote or story around it, because we love to hear that.
Marco Faella 00:49:04 Yeah, I agree with your assessment. It is somewhat in between, because basically you can only measure reliability after the fact, but that’s kind of too late. You’d like to measure it beforehand. And once again, you have some proxies here, some proxy measures that you can take beforehand. For example, test coverage is the measure that comes to mind. That’s a proxy that you can measure before deployment, and it’s a rough indication of reliability. It’s one of the few things you can measure that is at least related to reliability. In the book I take two different points of view, so I have two chapters about reliability. One is called reliability through monitoring. And by that I basically mean what’s commonly known as defensive programming, or rather an expanded form of defensive programming, because usually by defensive programming you mean checking the preconditions of methods at runtime. That’s the basic version of defensive programming, but that’s only a part of a larger framework.
Marco Faella 00:50:14 That’s called design by contract, a more general idea. And within this idea, you also have invariants, class invariants, and you have postconditions. So if you’re really serious about reliability and you want to achieve it through monitoring, that is, through online checking that everything is going fine, then you may also want to check invariants while the program is running, and you may also want to check postconditions while the program is running. So every method checks that it’s called with the right parameters: that’s its precondition. Then it checks that the current object is in the right state before it starts: that’s checking the invariant. And then, after the method finishes, it also checks that the method itself has done its job properly, which sounds kind of crazy, because you’re writing both the method and the procedure that checks that the method is correct.
Marco Faella 00:51:17 But sometimes it can be useful. It’s very expensive, though, in terms of both time and space most of the time, if you want to add all these checks. So this kind of extreme defensive programming is reserved for special contexts like, let’s say, safety-critical applications. It can be software for an industrial plant, software for the automotive industry or the health industry: those applications where a bug could be catastrophic, could lead to loss of life or the loss of huge amounts of money, huge damage. And in this respect, since I’m talking about safety-critical software and speaking of reliability techniques, let me also mention that you can go one step further, which is static analysis and software verification. Those are techniques that I’ve worked on research-wise, where you actually analyze the source code without running the software.
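[Editor’s sketch] The three kinds of runtime checks mentioned here (precondition, class invariant, postcondition) can be seen together in a small Java class. `CheckedAccount` is a hypothetical example written for this transcript, not code from the book, and it uses explicit exceptions rather than any particular contract library.

```java
// Hypothetical class showing design-by-contract checks performed at runtime.
class CheckedAccount {
    private long balanceCents;

    // Class invariant: the balance is never negative.
    private boolean invariant() {
        return balanceCents >= 0;
    }

    void deposit(long cents) {
        // Precondition: the caller must pass a positive amount.
        if (cents <= 0) {
            throw new IllegalArgumentException("deposit must be positive");
        }
        balanceCents += cents;
        if (!invariant()) {
            throw new IllegalStateException("invariant broken after deposit");
        }
    }

    void withdraw(long cents) {
        // Precondition: positive amount, no larger than the current balance.
        if (cents <= 0 || cents > balanceCents) {
            throw new IllegalArgumentException("invalid withdrawal: " + cents);
        }
        // Invariant check on entry: the object must already be in a valid state.
        if (!invariant()) {
            throw new IllegalStateException("invariant broken before withdraw");
        }
        long oldBalance = balanceCents; // extra space, kept only for the postcondition
        balanceCents -= cents;
        // Postcondition: the method checks that it did its own job.
        if (balanceCents != oldBalance - cents) {
            throw new IllegalStateException("postcondition violated in withdraw");
        }
        // Invariant check on exit.
        if (!invariant()) {
            throw new IllegalStateException("invariant broken after withdraw");
        }
    }

    long balance() {
        return balanceCents;
    }

    public static void main(String[] args) {
        CheckedAccount account = new CheckedAccount();
        account.deposit(500);
        account.withdraw(200);
        System.out.println(account.balance());
    }
}
```

Even in this tiny class the checks outnumber the lines doing real work, and the saved `oldBalance` costs extra memory, which gives a feel for the time and space price of full monitoring.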
Marco Faella 00:52:30 And you try to prove formally, mathematically, certain properties. That’s something I only briefly mention in the book, because it’s a very niche, specialized area. And then in the book I move to testing, which of course is the main validation technique everyone is using, because it’s the most effective, practical way to catch bugs. Once again, as with all chapters, I have a short page span, and since testing is so widespread, there’s a huge literature on testing techniques, tips and tricks and so on. So my chapter is just a brief introduction to unit testing techniques, because I’m always focusing on a small unit of code. And then I point to deeper and wider treatments in other books where you can get the details.
Felienne 00:53:36 Yeah. But I think what you mentioned here was really interesting: if you take really good care of reliability, that comes at the cost of time and space efficiency being decreased by all of these runtime checks, whereas you might be more efficient if you check less. And it depends on your domain. As you mentioned, healthcare might be something where you really want to have that reliability, whereas in a game, for example, speed is more important. And then, okay, maybe sometimes it crashes and that’s too bad, but that’s a whole different situation. So it’s interesting, again, how all of these have trade-offs with each other.
Marco Faella 00:54:11 Exactly. Yeah. That’s, it’s, there’s trade offs everywhere. That’s computer science.
Felienne 00:54:16 Yeah. And then the final quality that you mention in the book is somewhat surprising to me, because it’s so detailed: it’s thread safety. Readability is like the super generic concept that you can apply to almost anything within a project. Readability is not even limited to code; it also holds for documentation and for APIs, as we’ve talked about. But thread safety is so specific, and I would say it’s somewhat related also to reliability. So why did you decide to make thread safety its own quality rather than integrate it into reliability?
Marco Faella 00:54:51 Right. Yeah, you’re absolutely right. Thread safety stands out as a little different from the others. First of all, as you say, at first sight at least, it’s more specific than the others. Up until a few years ago, it would have been considered a really very specific quality. But these days, as you know, more and more software is inherently concurrent. And that’s because of hardware development, where every machine is concurrent these days, it’s actually parallel, and computers are not getting much faster from a single-core point of view. So more and more software is becoming concurrent. So I think thread safety is going to become more and more important and should be considered almost as important as those other qualities. But of course, it’s a part of reliability.
Marco Faella 00:55:54 Thread safety is about having your class able to be used by several threads at the same time without data races, which means without unexpected behaviors, without getting into an inconsistent state. And I also have to admit another reason why I put it there: it fits very well with my recurring example. We haven’t talked about it, so I’ll just say a few words. The recurring example is a simple and, I hope, very intuitive water container example. You have a system of water containers. You can pour water into one container, and you can connect different containers with so-called pipes, with imaginary pipes. And when you pour water into one container, this water that you pour should disperse equally among all the containers that you have previously connected to this one with pipes. That, I think, turns out to be a very interesting example, and it’s very tricky to make it thread safe. So another reason why I chose this particular quality, besides the fact that it’s becoming more and more important every day, is that it fits very well with this example, because it’s very tricky to make this water container class thread safe. And it allows me to talk about different synchronization techniques, including lock-free synchronization, which is a pretty advanced topic.
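[Editor’s sketch] A deliberately simple way to make a water container class thread safe is to guard every operation with one global lock. The Java code below is an assumption-laden illustration based only on the description in this interview: the class name `WaterContainer`, its method names, and the single-lock strategy are chosen here and are not the book’s implementation, which discusses finer-grained and lock-free techniques.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: containers joined by pipes always hold equal amounts of water.
class WaterContainer {
    // One lock shared by all containers. Safe and easy to reason about,
    // but it serializes every call, even on unrelated containers.
    private static final Object LOCK = new Object();

    // All containers directly or indirectly connected to this one (including itself).
    private Set<WaterContainer> group = new HashSet<>();
    private double amount; // water currently held by this container

    WaterContainer() {
        group.add(this);
    }

    double getAmount() {
        synchronized (LOCK) {
            return amount;
        }
    }

    // Pours water into this container; it spreads equally over the whole group.
    void addWater(double liters) {
        synchronized (LOCK) {
            double share = liters / group.size();
            for (WaterContainer c : group) {
                c.amount += share;
            }
        }
    }

    // Connects this container to another one and levels the water across both groups.
    void connectTo(WaterContainer other) {
        synchronized (LOCK) {
            if (group == other.group) {
                return; // already connected
            }
            double total = amount * group.size() + other.amount * other.group.size();
            group.addAll(other.group);
            for (WaterContainer c : other.group) {
                c.group = group; // every member now shares the merged set
            }
            double each = total / group.size();
            for (WaterContainer c : group) {
                c.amount = each;
            }
        }
    }

    public static void main(String[] args) {
        WaterContainer a = new WaterContainer();
        WaterContainer b = new WaterContainer();
        a.addWater(12.0);
        a.connectTo(b);
        System.out.println(a.getAmount() + " " + b.getAmount());
    }
}
```

The difficulty alluded to above shows up as soon as the global lock is replaced by one lock per container or per group: `connectTo` then touches two groups at once, and two threads connecting the same groups in opposite order can deadlock unless the locks are acquired in a consistent order.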
Felienne 00:57:39 So the final thing I want to talk about, and people that listen to my episodes often know this topic always comes up, is the topic of different programming languages. You mentioned these qualities and the tips in your book. Are they as applicable in, let’s say, Haskell as they are in Node.js, or is it the case that they fit a little bit more with certain languages or paradigms than they fit with others?
Marco Faella 00:58:06 Right. So on the positive side, I would say that the overall message of the book is completely general. And the overall message, as I said several times, is about being aware of these different forces that pull your software in different directions, being aware that you’re always choosing a trade-off. You cannot optimize all of them at the same time. There are trade-offs, as usual in computer science, and you have to pick a trade-off. But before that, you have to be aware of these different dimensions of quality. And this, I think, is completely general and could be useful to someone programming in any language. The specific techniques that I describe in the book, on the other hand, are mostly about object-oriented programming languages. So not just Java, but certainly C#, because, as you know, it’s very similar to Java. In fact, I have little sidebars here and there in the book relating my Java techniques to their C# counterparts. But with a little effort, most of those techniques can be applied as they are in, I would say, any object-oriented language. If you move to a completely different paradigm, if you move to Haskell or functional programming, or, I don’t know, logic programming, then of course the qualities are still there, but the techniques you use to optimize those qualities, to improve those qualities, are quite different.
Felienne 00:59:39 And do you think some programming languages also make inherent decisions for some qualities over other qualities?
Marco Faella 00:59:47 Yeah, sure. Different languages are useful for different things, which means that they emphasize some qualities over others. For example, take Java and C#, which are the two languages that are explicitly mentioned in the book. One of the main design objectives of those languages was simplicity and ease of learning, which translates to having pretty good readability and also pretty good reliability. For example, just because they are garbage collected, that somehow tends to improve the reliability of the code. Of course, that was one of the design objectives 20 years ago, when Java started. But then it evolved, and nowadays it’s not as simple as it was 20 years ago. So it moved a little bit away from its initial simplicity. Now it keeps getting new stuff to stay modern in the face of new languages.
Marco Faella 01:01:04 Of course, there’s kind of a competition between different programming languages. On the other hand, to mention another object-oriented language, C++ has very different objectives and a very different history. It aims at succinctness. It aims at the highest possible performance, in line with its predecessor, C. So you can achieve the highest possible performance, you can basically write software as fast as the hardware allows, but you lose some readability, for example, and therefore some maintainability. A simple way to look at the difference between C++ on one side and Java and C# on the other is to look at the amount of symbols they use. In C++ you have a lot more symbols, and a lot more meanings for the symbols, like colon, semicolon, or dot.
Marco Faella 01:02:10 And in fact, you even have operator overloading, where you can give your plus and your minus any meaning at all that you want. So there are lots of meanings for symbols. That helps succinctness, but it tends to decrease readability, because you read a plus b and you don’t know what the plus means. You have to go figure out what the plus between those two objects means. That’s a simple example. And then, of course, there are thousands of languages focusing on different things. Most functional languages have very strong type systems, and those type systems tend to improve reliability. Some people say that if your functional program compiles, then it’s, quote unquote, correct. That’s the experience of many people who love programming in functional languages. Your type system is so powerful that your compiler can check a lot more things than an object-oriented compiler. And so you have a lot more checks done beforehand by the compiler and a lot fewer runtime errors. And of course you have more specific languages. Let me mention just one example: Go, Google’s Go language, which has some special machinery for concurrency. So it certainly helps make thread-safe software, for instance, because it has these special communication channels between threads that help avoid race conditions and so on. Of course, the list goes on and on.
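[Editor’s sketch] Java has no built-in channels, but the idea of threads communicating through a channel instead of sharing mutable state can be approximated with a `BlockingQueue` from the standard library. This is only an analogy to the Go feature mentioned above, not equivalent machinery, and the class `ChannelSketch` and its sentinel convention are assumptions made for this illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: a producer thread sends numbers to a consumer over a queue.
class ChannelSketch {
    private static final int DONE = -1; // sentinel meaning "no more values"

    // Sums the integers 1..count, which are produced on another thread.
    // The two threads share no mutable variable; they only exchange messages.
    static long sumViaQueue(int count) throws InterruptedException {
        BlockingQueue<Integer> channel = new ArrayBlockingQueue<>(16);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= count; i++) {
                    channel.put(i); // blocks while the queue is full
                }
                channel.put(DONE);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        long sum = 0;
        // take() blocks until a value is available.
        for (int v = channel.take(); v != DONE; v = channel.take()) {
            sum += v;
        }
        producer.join();
        return sum;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(sumViaQueue(100));
    }
}
```

Because `sum` is touched by the consumer thread only, there is nothing to race on; the synchronization lives entirely inside the queue.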
Felienne 01:03:58 So to summarize, the trade-offs between these qualities are language independent and they always exist. However, if you’re choosing a language, then one language might value a certain quality: Go might value something and Haskell might value something else. Some languages might value speed, some languages might value readability. So when choosing a language, there are definitely differences. But once you’re in a paradigm, there will always be this trade-off between space and reliability, and these things are almost always at odds.
Marco Faella 01:04:30 Yeah, indeed.
Felienne 01:04:33 Great. Well, I think we covered everything I wanted to ask. Do you think there’s anything we missed? Anything that you wanted to add to this?
Marco Faella 01:04:40 No, I think it was great. I really enjoyed our discussion. I think it also helped me clarify certain points.
Felienne 01:04:50 Great. Well, thanks. We’ll make sure that we put the link to the book, of course, in the show notes, so people can check it out if they want to. Are there any other places on the internet where people might want to follow you or keep up to date with things that you’re doing?
Marco Faella 01:05:04 I mostly use Twitter and my homepage. So if people Google my name, they will get pretty quickly to my homepage, and then they can also find my Twitter handle.
Felienne 01:05:18 Yeah, and we can also add those to the show notes, so your Twitter handle and your website. We can make sure that they’re linked with the episode so people can find them easily. That’s great. Well, thanks for being on the show.
Marco Faella Thank you very much for having me.
Felienne This is Felienne for Software Engineering Radio.
[End of Audio]
This transcript was automatically generated. To suggest improvements in the text, please contact [email protected].
SE Radio theme: “Broken Reality” by Kevin MacLeod (incompetech.com — Licensed under Creative Commons: By Attribution 3.0)
I very much appreciate the example of quality of being thread safe … this shows that qualities are not orthogonal. It is not spelled out clearly in the interview, but this example shows that you are allowed to invent your own qualities and try to improve them.
Typical example for this from Europe: The first time we included people from a neighboring country into our team, we found a few locations where we had not used locale settings but just copied text. Our neighbors had the characters for thousands separator and the comma the other way round, and we had not been aware of this … From then on, this quality became important.
And there are infinitely more “qualities”.
I really enjoyed listening to this episode! It targeted a lot of points I have encountered in my daily work to be important when it comes to “seriously good software”. 🙂
When the talk reached software engineering (ca. minute 44), the same question came up that I have had for a long time: why is software engineering not taught more extensively at universities? When I look back on my work since I left university, most of the effort goes into software engineering and very little into algorithms (<1% probably) and data structures (<5% probably). Most of the time it is about organizing software and teams, and about decomposing and composing software systems out of components and libraries that are not well decoupled and delimited, with dependencies everywhere, which makes them complex to reason about over time.
Some years ago I came across an idea by Ralf Westphal (https://ralfw.de/) how to approach the dependency problem and system composition in a generally different way. He postulates the principle of mutual oblivion as central principle which leads at the end to a new architectural approach solving the dependency problem as far as possible. We are applying this approach quite successfully in our daily work, already.
Here is an article I wrote, as an intro and with references to Ralf's articles: Go beyond object oriented design — Let it flow! (http://bit.ly/2yJBnzX)
But maybe it is worth getting an interview with him on SER about his idea and his unique approach. And maybe this could also be an approach for teaching software engineering…