
SE Radio 637: Steve Smith on Software Quality

Steve Smith, founder and principal architect at Nimble Pros, joins host Jeff Doolittle for a conversation about software quality. The episode begins with a discussion of why software quality matters for businesses, customers, and developers. Steve explains some patterns and practices that help teams design for quality. They discuss in detail the practices of testing and quality assurance, and the conversation wraps up with suggestions for fostering a culture of quality in teams and organizations. Brought to you by IEEE Computer Society and IEEE Software magazine.




Transcript

Transcript brought to you by IEEE Software magazine and IEEE Computer Society. This transcript was automatically generated. To suggest improvements in the text, please contact [email protected] and include the episode number.

Jeff Doolittle 00:00:33 Welcome to Software Engineering Radio. I’m your host Jeff Doolittle. I’m excited to invite Steve Smith as our guest on the show today for a conversation about software quality. Steve is founder and principal architect at Nimble Pros. Steve has over 20 years of experience building custom software solutions using Microsoft technologies and speaks internationally to software developers about ASP.NET, domain-driven design, design patterns, SOLID programming principles, and how to improve quality through refactoring. Steve, welcome to the show.

Steve Smith 00:01:04 Hi. Thanks for having me.

Jeff Doolittle 00:01:06 So glad you’re here. I love that your bio mentions quality as something that you’re passionate about, which is exactly what we’re here to talk about today. And so, from a high level, we’re going to talk about how quality relates to design, to testing, to culture and other key aspects of that concept. But I want to start with asking you, why does software quality matter and how did you come to care about it?

Steve Smith 00:01:27 I started caring about software quality because I wanted my software to be correct. I wanted it to work and not have too many bugs. And early on in my career I was building a product, an ad server, because we had an advertising network of developer websites, and I wanted it to work, right? I wanted to serve the ads properly and not crash and not slow down everybody else’s websites. And so the idea that if there was anything I could do to make it so that it was right the first time and I didn’t have to go rebuild it or revisit it, was very important to me. And we’ll talk about this in a moment I’m sure, but one of the things that I realized very early on is that if you can automate your quality checks by writing automated tests, that’s going to get cheaper and cheaper over time because of Moore’s law. Versus if you have to have manual testing, that’s going to get more and more expensive, because developers and even QA personnel tend to be more and more expensive over time. So it just made sense as a business owner for us to focus on that type of building-quality-in approach.

Jeff Doolittle 00:02:28 Absolutely. So why do you think some people sometimes may be resistant to quality related practices?

Steve Smith 00:02:36 I think mostly it’s the idea that you can go fast, and the whole ‘move fast and break things’ thing from the West Coast has to do with some of that. If you move fast and just make a mess, then eventually that’s going to come back to haunt you, right? There’s this whole technical debt metaphor we could talk about, where you’ve left this big mess and it’s either going to get in your way, or cleaning it up is going to take even more time, because it’s such a mess, than it would have taken to just clean things up as you go. And imagine that instead of building software, you’re running a kitchen or running a machine shop. If you just left a mess everywhere, it would make it much more difficult for you to serve a quality product. And that happens in our industry too.

Jeff Doolittle 00:03:17 Absolutely. So where does quality start in the software development lifecycle in your experience?

Steve Smith 00:03:22 I think it’s important to have a really good idea of what you’re building. Something I like to say is that as software developers we fail in two ways: either we build the thing wrong, or we build the wrong thing. And building the wrong thing is incredibly common and generally a more expensive mistake than just implementing something incorrectly.

Jeff Doolittle 00:03:41 Absolutely. But as far as where it begins, I mean, are we talking about when you start coding? Are we talking about when you start putting issues in the backlog? Where should quality actually enter into the minds of people who are building software?

Steve Smith 00:03:55 Yeah, so to make sure that you’re building the right thing, I’m thinking it goes to, even before you enter things in the backlog, it’s this conversation about what is the overall product that we’re trying to build? What is the problem that we’re solving for users? Lead with that, make sure the developers understand that this is why you’re building this, this is the problem it’s solving. Many times developers don’t have that connection. They never talk to an end user or they don’t even have any idea of how this thing is used. And there’s plenty of stories about developers that build things that work great for them, but don’t necessarily work for their users, right? Maybe they’re building a web application and they’ve got a huge desktop monitor and it works great there, but most of their users are on their phone and the developer never tested on a phone. Things like that where you really want to understand how it’s going to be used and the problem it’s going to solve even before you think about how to write the code for it.

Jeff Doolittle 00:04:44 Do you have some examples you can share? Maybe one example where you’ve seen a lack of attention to quality that’s negatively impacted the ability to deliver some systems?

Steve Smith 00:04:53 I see this pattern a lot with clients of my company Nimble Pros, because one of the things we focus on is helping customers that are in this position where some founders build a system. Either they’re self-taught developers, or a lot of times they’ve outsourced development, oftentimes to offshore teams that are able to quickly build software. And for the first year or two or even three, they’re cranking out code and they’re getting new features and they’re starting to get some customers and they’re starting to grow. And by year four they notice that everything is slowing down and they’re in firefighter mode all the time. And at first they attribute it to, oh, we grew too fast, we have too many customers, maybe we need to focus more on DevOps or infrastructure or things like that. But even after attending to that in many cases, and I have a few customers like this right now, things just get so slow that they can’t ship new features.

Steve Smith 00:05:40 And the reason is that the developers went super-fast, but the reason why they were fast is because it was copy paste, copy paste, copy paste. Oh, we need another feature? Copy paste that one we had. Oh, it needs to be a little different? Well, just change this one a little bit. And we don’t introduce any commonality or a single way of doing things, right? We don’t follow any of the principles like separation of concerns or the DRY principle, and you shouldn’t apply DRY religiously, but there are times when it makes sense not to repeat yourself. And so you end up with systems where it’s like, hey, we need to change how sales tax works, but we’ve implemented purchasing stuff in literally 10 different places in this website, and they’re all huge, complicated messes that are almost the same but not quite. And so now just that simple thing of changing how sales tax works becomes like a three-month project, where it would’ve been an hour in the first year of the system’s development. So that’s super common.

Jeff Doolittle 00:06:31 But Steve, someone in a suit will say we never would’ve gotten to year four if we had thought about these things in year one.

Steve Smith 00:06:37 I think that’s myopic. Myopic? I’m not sure.

Jeff Doolittle 00:06:40 Yeah, Well I agree, but I’m just saying what somebody will probably say. Right? So how do you respond to that?

Steve Smith 00:06:45 There’s a mantra that if you go slow, it helps you go fast, right? So moving deliberately instead of rushing around is the better way to have consistent progress. One of the things that tests will do is keep you from going down rabbit holes, trying to chase bugs that you didn’t even know you created in parts of the system that you didn’t think you were affecting. Catching those regressions very quickly. And it doesn’t take long for that to pay for itself, right? And so if you work on building quality early on, it’s like having a ratchet on a strap, or anchors as you’re climbing, right? Every so often as you’re climbing up a mountain or a rock wall, you tie into a new spot. So if you fall, you only fall down from that spot, you don’t fall all the way to the bottom.

Steve Smith 00:07:26 Right? And so does that take you more time than just free climbing the wall? Yes. But does it make it so you’re much less likely to die as a result? Yes. So these are the things where it’s a risk-reward calculation, and sure, some startup founders might say we don’t need that because we just want to get market share and prove that we have a market, and there’s something to that. But if you have developers that have the skill to build software with tests or with quality, I believe, and I’ve seen it, that they can go just as fast or even faster writing high quality software than they could just cranking out stuff that they themselves consider to be low quality.

Jeff Doolittle 00:08:02 Yeah. And that speaks to culture, which we’ll get to in a little bit I believe. Because when you start talking about the kinds of developers you have and what their capabilities are, that’s obviously going to have an impact on how much you can deliver quality software.

Steve Smith 00:08:14 Yeah, definitely.

Jeff Doolittle 00:08:15 Before we get to that, let’s shift gears a little bit and let’s talk about design, which is something we mentioned before we started recording is designing for quality is something that people have to consider. So maybe speak a little bit to what that means in your experience and then let’s talk a little bit about how principles of good design improve software quality and maybe what some of those principles are in your experience.

Steve Smith 00:08:37 Sure, there are whole books on principles for quality. One of the ones that I use a lot is separation of concerns. And that can mean different things. It’s kind of vague what the concerns are, but in many cases it’s technical concerns. You want the UI and, say, persistence to not be coupled to your business logic. In a lot of legacy applications, one of the ways that teams might go faster is to just put a whole bunch of logic in stored procedures, because they’re easy to change, and you can even change them at runtime, and you don’t have to worry about that pesky source control thing in many cases, right? But that eventually can come back to bite you. So, keeping business logic out of the database will allow you to more easily switch up databases. And there’s plenty of folks that’ll be saying, oh, that never happens.

Steve Smith 00:09:18 We never changed our production database. Well sure, but you might want to change your database in different environments, right? When I run it on localhost, maybe I just want to spin up a cheap SQLite database or something, and when I run it on a staging environment, something else, and in production, something else. And so, having that flexibility unlocks a lot of different ways that you could run the application, in different containers or different scenarios in your build pipeline, et cetera. Different regions. Yeah, yeah. In different regions. Separation of concerns is huge. And then the flip side of that: when a lot of developers think about the structure of a system, they think about horizontal layers. You visualize a user interface layer on top, and then it talks to a business layer of sorts, and then some kind of data access layer, right?

Steve Smith 00:10:02 Well then the other side of that is these vertical slices, where a feature is a vertical slice through that whole thing. Keeping those as separate concerns from one another, and organizing your code so that pretty much all the things you need for a given feature are easy to find and located near to one another, helps as well. The SOLID principles I think are often a good guide. Things like classes not being too big, so having only one responsibility. The open-closed principle, so that you can change the behavior of something without actually changing its code. That leads to a practice that I really like to share with developers, which is when you’re working on big, perhaps unmaintainable, legacy systems, one of the ways that you can make them more maintainable is to only add new classes with new code in them.

Steve Smith 00:10:50 Don’t go in and change a thousand-line-long method and make it 1,100 lines by adding another conditional to it. Create a new class that has the new condition in it, and instantiate that and call it, maybe from that thousand-line-long method. But now you’re not making the problem worse, right? Now you’re starting to make it so that things are smaller and easier to follow. And that new class, maybe that one’s unit testable even if that other big method wasn’t. And so the open-closed principle is about that. It’s about being able to create new implementations of things rather than having to do surgery on stuff that’s already there. And then the dependency inversion principle is huge as a way to keep your code from being too tightly coupled to its infrastructure. Code that’s tightly coupled to infrastructure is often extremely difficult or impossible to unit test. And so you have to resort to integration tests or manual tests, which are way slower and way more expensive. So yes, you still want to do some of that for sure, but if you can unit test all your business logic without having to rely on infrastructure, you’ll go a lot faster and you’ll have a lot more confidence that your business logic works correctly.
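
To make the ‘only add new classes’ practice concrete, here is a minimal C# sketch of the kind of thing Steve describes; the rule interface and discount class names are hypothetical, invented for illustration rather than taken from the episode.

```csharp
using System;

// Instead of growing a thousand-line legacy method with one more
// conditional, the new behavior gets its own small, separately
// testable class (open-closed principle).
public interface IDiscountRule
{
    bool Applies(Order order);
    decimal Apply(decimal subtotal);
}

public class HolidayDiscount : IDiscountRule
{
    // All of the new condition lives here, in new code.
    public bool Applies(Order order) => order.PlacedOn.Month == 12;
    public decimal Apply(decimal subtotal) => subtotal * 0.90m;
}

public class Order
{
    public DateTime PlacedOn { get; set; }
}

public static class LegacyCheckout
{
    // The big legacy method changes by only one call site:
    public static decimal ApplyDiscounts(Order order, decimal subtotal)
    {
        IDiscountRule rule = new HolidayDiscount();
        return rule.Applies(order) ? rule.Apply(subtotal) : subtotal;
    }
}
```

The legacy method stays the same size, and HolidayDiscount can be unit tested in isolation even if the method that calls it cannot.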

Jeff Doolittle 00:11:52 Absolutely. I call that separating infrastructure from implementation, which I think is what you’re saying. And then you can touch your business logic directly and get a lot of value out of that. And to your point, you do want to test the infrastructure and do full integration tests, but if that’s all you have, you’re going to have some issues, right? Any other principles of good design? So far you’ve mentioned separation of concerns and talked about the SOLID principles. Anything else you’ve seen in your experience, principles that have improved software quality?

Steve Smith 00:12:18 Don’t repeat yourself, which I mentioned earlier, the DRY principle, which is very similar to once and only once, an older extreme programming principle. The idea is that things that represent one rule or one setting in your system should only live in one place, should only exist in one place. So that if you need to change it, you only have to change it in one place. You never have to go, oh, I made this change to something, now I have to go touch 20 different files. The fact that you’re having to go do shotgun surgery on your system and change it in 20 places is telling you that, hey, you really should have just put this in one place and had those other ones all reference it somehow.

Steve Smith 00:12:55 And that reminds me of new is glue, which is a blog post I wrote a long time ago. The idea is that when, in your code, you’re newing up some dependency, you as a developer should, every time you do that, say in your head ‘new is glue,’ because you’re gluing your class to that specific implementation of that other class. And so much of the time that’s not really what you want. You want to be loosely coupled to that thing, and if you use dependency injection or some other technique, then you can keep that coupling looser and not glue your code to that other code. So I definitely use that one a lot as well. I will point out that you can’t just DRY up all the things, because every time you remove duplication, you introduce coupling. You had these 20 different things that all had a copy of the same code at that moment.

Steve Smith 00:13:38 They were all not coupled to each other. They could all evolve independently, and maybe they should; maybe that’s the situation where they should all be separate from one another. And if you make them all the same, you introduce a method that they all call now, right? Then the challenge might be that a week later one of them needs to change, and since you collapsed it all into one method, how are you going to do that? Well, we’ll add a flag to the method and say, well, if it’s this case, do this other thing. And now you’ve got this tightly coupled method that everything calls, and it starts to become this complex mess of if logic. No, take the one that needs to be separate and distinct and don’t have it call the method anymore, right? Just have the special code that it does be in that one place, not in common with all the other ones.
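
A minimal sketch of the ‘new is glue’ point, using hypothetical names like ReportService and IReportStore; the pattern, not these specific types, is what Steve is describing.

```csharp
// Glued: this class can only ever talk to SqlReportStore.
public class GluedReportService
{
    private readonly SqlReportStore _store = new SqlReportStore(); // new is glue
    public void Publish(string report) => _store.Save(report);
}

// Looser: depend on an abstraction, and let the caller (or a DI
// container) decide which implementation to supply.
public interface IReportStore
{
    void Save(string report);
}

public class SqlReportStore : IReportStore
{
    public void Save(string report) { /* write to SQL */ }
}

public class ReportService
{
    private readonly IReportStore _store;
    public ReportService(IReportStore store) => _store = store; // injected, not glued
    public void Publish(string report) => _store.Save(report);
}
```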

Jeff Doolittle 00:14:16 Yeah, so a couple things have come up here, which I think are even more abstract than some of those design principles, but I think they’re a good starting point. You’ve mentioned coupling is one, and I think cohesion has also inherently come out of that as well. And also, to do a lot of what you’re describing requires people to consider encapsulation. So let’s talk about those a little bit and how they relate to quality: coupling, cohesion, and encapsulation. What happens to your software when you don’t consider those things, and what impact does that have on quality? And when you do consider them, why does it improve quality?

Steve Smith 00:14:45 Sure. So a really good book for coupling and cohesion that’s kind of a classic at this point is Steve McConnell’s Code Complete from back in the nineties. So I would recommend that for folks that haven’t read it, coupling refers to how things call one another or how they reference one another. And you can have loose coupling or tight coupling and your system is going to have coupling, right? It has to if it’s going to do anything useful. And so it’s not a matter of like coupling is bad, it’s like you want to make conscious decisions about where it makes sense for you to have tight coupling to certain things and, loose coupling to others to the extent that you can. You want to have your system be loosely coupled to the infrastructure and the environment in which it finds itself running, right? 20 years ago we weren’t anticipating docker containers as a thing, but now they’re commonplace, right?

Steve Smith 00:15:29 So having that ability to ship around executables in a stable environment using containers is an example of something where having loosely coupled code makes it really easy for you to swap out different infrastructure just as a different docker container but only if your code is written in a way that it could talk to something else, right? And if your code is hard coded to always talk to, let’s say a local SQL database, then it’s going to be really difficult for you to change that up. So using abstractions as a way to make your coupling looser and say I depend on a contract, I depend on an interface and abstraction, I don’t depend on the implementation that can be swapped out at runtime makes your code much more flexible and kind of future proof. Now you could also argue that YAGNI comes into play there.

Steve Smith 00:16:10 Like you don’t want to overly abstract things, but for basic stuff, like what are the UI concerns, what are the data concerns, having that minimal level of abstraction I have found to be extremely valuable. Now cohesion is kind of related to coupling, but cohesion is basically how related the things within some module or some class are, right? So if we’re talking object-oriented programming, you’ve got a class, it’s got a bunch of fields and a bunch of methods. If it has three fields that are used by two methods, and a couple other fields that are used by a different couple of methods, then it’s almost like you have two classes inside of one struggling to break apart, because they’re not cohesive, right? And so if you can split those up, it’ll usually lead to a better design. I work a lot with C# and .NET, and for the last 15 years they’ve been mostly doing MVC, Model View Controller patterns, for their web frameworks.

Steve Smith 00:16:58 And it’s super common, and I know this is true in other languages that use these patterns as well, for controllers to be super large, super wide, right? Bloated. And that’s a cohesion problem, right? You’ll go into a controller, it’s got like 10 different action methods, and you look at the constructor for all the dependencies and it’s got like 30 dependencies that are injected. You look at any one action method, it’s using three, four, maybe five of them, but it’s not cohesive at all, and you’re ending up having to inject all these things even for action methods that don’t need most of it. So splitting those up into endpoints is something that I do to try and make things more cohesive, more single responsibility. Those two kind of go hand in hand, the single responsibility principle and cohesion. And then what was the third thing?

Steve Smith 00:17:39 Encapsulation. What’s that? Oh, encapsulation is one of my favorite topics, and the idea with encapsulation is simply that you don’t know how the sausage is made on the other side of a call, right? And so it’s information hiding, and the implementation details are not something you know or care about. And by leveraging encapsulation, our designs are much better. It’s necessary for any kind of modularity, right? We wouldn’t be able to use wall plugs for electricity without encapsulation: everybody can design a cord that’s a certain thickness and has prongs in a certain orientation, and if they do, they can plug into this interface and get power in a way that works for them. And they don’t have to worry about, well, am I talking to solar power? Am I talking to a generator? Am I talking to the grid? Am I talking to a battery? It doesn’t matter; as long as it’s AC or DC or whatever it needs, and meets certain specs of that power, it’s happy and it’ll work. Our code operates under these same assumptions. So if you can code to an interface, a specification that says what you need and how you need it and not exactly how it’s done, your code is going to be more modular, which makes it more maintainable.
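
A small, hypothetical C# illustration of that encapsulation point: callers get a contract, not the implementation details behind it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class ShoppingCart
{
    // Hidden detail: callers never learn how items are stored.
    private readonly List<decimal> _prices = new();

    public void Add(decimal price)
    {
        if (price < 0) throw new ArgumentOutOfRangeException(nameof(price));
        _prices.Add(price);
    }

    // Callers ask "what's the total?" without knowing how the
    // sausage is made; the internals can change freely later.
    public decimal Total => _prices.Sum();
}
```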

Jeff Doolittle 00:18:50 In my experience, there are two hard problems in software engineering: information hiding and dependency management. And I think that’s kind of where we’ve landed here, because a lot of those design patterns have to do with how you manage your dependencies. You mentioned dependency inversion, things like that. And then information hiding. I really appreciate that you connected encapsulation with information hiding, because a lot of people, I don’t think, know that they are connected. And I’ll point our listeners to the 1972 paper by David Parnas called On the Criteria to Be Used in Decomposing Systems into Modules. For 52 years we’ve had the answer to how to hide information properly within systems, but so many people are unaware of how to do it. So I’d really encourage people to spend more time looking into that. But for the purposes of this conversation, Steve, speak a little bit to how cohesion, coupling, and encapsulation wrap back into quality, in case that’s not really clear to listeners at this point.

Steve Smith 00:19:44 Yeah, I think one of the things that helps to demonstrate quality is to know that your code does what it’s supposed to do, right? Now how do you know that your code does what it’s supposed to? Well, you run it and you go through the application, and it looks like it’s supposed to look, right? But that takes a lot of time, and if you’re having to do that over and over and over again, that’s eating up your productivity. So if there were a way to automate that, then you could know and have confidence that your code does what it’s supposed to do. And it’s not enough to know that it does what it’s supposed to do once, right? Every time you make a change, you want to know that it still does what it was supposed to do. So the short answer is, having some automated suite of tests gives you confidence that your code still does what it’s supposed to do.

Steve Smith 00:20:23 Could the tests be wrong? Yes. Could other things go wrong with that? Yeah, it’s not infallible, but in general, having those unit tests gives you this check, in a fast, cheap way, that things are working the way they were supposed to at the time that you wrote those tests. And so that’s huge. I have also found that there’s a huge overlap between the Venn diagrams of unit testable code and what I consider to be high quality, loosely coupled code. They’re almost the same circle, right? And so maybe you don’t have time to write unit tests for whatever reason, or this particular code is not something that you’re too worried about, you’re just going to knock it out, and you don’t have to have a hundred percent test coverage; that’s fine. But if you still write it in a unit testable way, then it’s still going to be high quality code in terms of being easy to test, easy to change, not tightly coupled to infrastructure, et cetera.

Steve Smith 00:21:13 And so later on, if it becomes more complex and you find yourself scratching your head about, well, why isn’t this working? I thought this was simple. Now you can write a couple tests and prove out what it’s doing. I find that writing tests is the best way for me to understand what my code is doing. And there was a conversation I had a few months ago on Twitter about folks that use the debugger all day long versus folks that almost never do. I very rarely use the debugger. Usually I’m debugging a test, to figure out why the test isn’t doing what I expect. But because I have all these tests, I know what the code does; I don’t have to debug through it to figure it out. Legacy code from a new client that I’ve never seen before? Yes, I’ll use a debugger to kind of step through and see what’s going on. But for code that I’ve been writing and I’ve got a bunch of tests for, I find I don’t need to jump into the debugger nearly as often.

Jeff Doolittle 00:21:55 So to summarize that, what I think you’re saying in general is testable code tends to have loose coupling, high cohesion and good encapsulation.

Steve Smith 00:22:04 Yeah. So it doesn’t necessarily have to have good encapsulation, but yes.

Jeff Doolittle 00:22:07 Tends to.

Steve Smith 00:22:08 Tends to, yes.

Jeff Doolittle 00:22:09 Tends to. And knowing what the interfaces are between things can make things more testable as well because you’re now depending on abstractions instead of concrete implementations, which hearkens back to the design patterns we were speaking to before.

Steve Smith 00:22:20 That’s right.

Jeff Doolittle 00:22:21 So you talked about legacy systems, and I do want to shift gears and talk a little bit more about testing and quality assurance. But let’s start there. Let’s start with a lot of our listeners are dealing with brownfield systems and maybe they’ve heard of the strangler fig pattern and things like this that Martin Fowler talks about in his website and I’m sure others have spoken to as well. But where do you start with introducing quality practices into an environment where maybe those have been lacking?

Steve Smith 00:22:48 Well, I would start with source control. Believe it or not, there are some clients out there that aren’t using source control. Some companies and teams, and occasionally they come knocking on our door. So the first thing we do is we get them on GitHub using Git if possible, because that’s our best recommendation, so they have some source control. Now you’ve got source control, that’s table stakes, that’s great. The next step is you want to make sure that stuff works on someone else’s machine besides yours. And the easiest way to do that is to set up a GitHub Action or an Azure DevOps pipeline, or whatever they’re called, or TeamCity or whatever you want to use for your CI server, and build the thing on another machine, right? That tells you that this thing isn’t dependent on some setting in your registry or some dependency that only you have installed.

Steve Smith 00:23:30 That goes a long way toward making sure you don’t have the ‘works on my machine’ syndrome that you run into a lot of times. Once you have that, and your initial CI script just builds the code, then you add: build the code and run the tests, right? And ideally the most important tests to run are your unit tests, because they’re going to be the easiest ones to put in your build pipeline; they don’t have any dependencies on anything. So you don’t have to say, oh, well, I would put that in the build pipeline, but I need a database and I need a web server and I need this and I need that. No, it’s just run your unit tests against your package of your code. That’s it. And that should be super trivial to do in your build server, and maybe initially you don’t have any tests, that’s fine.
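
As a rough sketch of that first CI step, here is what a minimal GitHub Actions workflow for a .NET project might look like; the file name and .NET version are assumptions, not details from the episode.

```yaml
# Hypothetical .github/workflows/ci.yml: build on another machine,
# then run the tests, which are exactly the two steps described above.
name: CI
on: [push, pull_request]
jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-dotnet@v4
        with:
          dotnet-version: '8.0.x'
      - run: dotnet build --configuration Release
      - run: dotnet test --configuration Release --no-build
```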

Steve Smith 00:24:08 Now that you have the infrastructure in place where your tests matter, right? I wouldn’t lead with tests. And the reason is, if one developer decides they’re going to write tests and nobody else has bought in and it doesn’t have anything to do with their build release cycle, then those tests are going to break from time to time and no one’s going to care except for that one developer and they might yell and say, hey, you broke the test. Like nobody cares, it’s not their problem. You wanted to write those tests; you go fix them. Whereas if you put it as part of the continuous integration script that is used as a gate before your code deploys to the next environment, now when tests are broken, everybody cares, right? You got to fix that or else the script isn’t going to publish your code to the next step. And so once you have that in place, now testing becomes important to the whole team and everyone can start to get bought into the value of them.

Jeff Doolittle 00:24:53 Unless they go and comment out the test. But that, we’ll–

Steve Smith 00:24:57 Talk about that, that does happen sometimes.

Jeff Doolittle 00:24:58 When we get to culture, we’ll talk about that, and that’s unfortunate, but it does happen. So a little bit more, though, about: I’ve got a legacy system, and I don’t have any unit tests, and we want to start testing the system. What would you do to start validating at least the current behavior of the system, so you could start introducing some of these changes in patterns and in testing with confidence?

Steve Smith 00:25:20 Right. Well, there’s this idea of a testing pyramid, where you have manual tests and user interface tests and integration tests and then unit tests. And the reason it’s shaped like that is that you generally want to have more of the things at the base of the pyramid, which in this case is unit tests, and fewer of the UI and manual tests. It may be that a legacy system is not very unit testable, and so they may only have manual testers, and that’s it. Maybe they’re using Playwright or some other tool to automate some UI tests, right? Take whatever automated tests you can find or quickly write, and put those into the build process as soon as you can. And then you’ll almost certainly be able to find some places where you can do unit tests, right? Pure functions that just rely on their inputs and give you some output as a result are always unit testable.

Steve Smith 00:26:03 And there’s usually a way that you can find some places in the code where you can extract something out and turn it into a pure function, right? Maybe it’s that if statement that’s seven lines long and tests a bunch of different random things. Take that if statement, put it into a pure function whose name says what that condition really means, and put all that logic in it. And now you can write a whole bunch of unit tests that say, hey, if I have this and that and the other, it should be true. And if I have that and the other, it should be false, right? Those are your first unit tests. Pick the easy things that you can pull out, and then you just kind of grow from there. You want to refactor, and we haven’t talked about refactoring yet, but you want to refactor that legacy system to make it easier to test over time.
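
Here is one way that extraction might look in C# with xUnit; the free-shipping rule and all the names are invented for illustration.

```csharp
using Xunit;

// The buried conditional, extracted into a pure function whose
// name says what the condition really means.
public static class ShippingRules
{
    public static bool QualifiesForFreeShipping(
        decimal orderTotal, bool isMember, int itemCount) =>
        orderTotal >= 50m || (isMember && itemCount >= 3);
}

// Pure functions rely only on their inputs, so the first unit
// tests practically write themselves.
public class ShippingRulesTests
{
    [Theory]
    [InlineData(60, false, 1, true)]   // big order qualifies
    [InlineData(20, true, 3, true)]    // member with 3+ items qualifies
    [InlineData(20, false, 1, false)]  // small non-member order does not
    public void Returns_Expected_Result(
        decimal total, bool member, int items, bool expected)
    {
        Assert.Equal(expected,
            ShippingRules.QualifiesForFreeShipping(total, member, items));
    }
}
```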

Steve Smith 00:26:40 And ideally when you’re refactoring you want to have some tests, and initially that might just be manual tests or UI tests. But as you progress you’ll start to have a larger and larger suite of unit tests that are helping you. And the thing to remind yourself or the client or the team is that you didn’t get here in a day, right? Usually many developer-years of effort went into building this legacy system with no idea about tests or even necessarily code quality, right? And so now we are going to start to turn the ship, but it’s a big ship and it’s not going to turn quickly. Or for another metaphor, if you have neglected your health for 20 years, and now you’re overweight and not as fit as you would like, and you decide to go to the gym, a week later you’re still going to be pretty much where the last 20 years left you, right? It’s going to take some time to change the direction there, and it’s the same for your code base. If your code base is unfit, it’s going to take a while for good habits to move it back onto a fit course.

Jeff Doolittle 00:27:38 I will point out for people just getting started too, one of the things I’ll often recommend, especially now with a lot of the advanced CI/CD things that you can do even with GitHub Actions, is that you can spin up Docker containers to provide some of those infrastructural pieces that you need in order for some large-scale infrastructure tests to run. And so I just encourage people: don’t run away from making a legacy system testable, even if it’s integration testing. While you need to have a larger quantity of more focused unit-oriented tests, you should still also be able to test the whole system, and there are ways to do that. So I definitely want to encourage people to still do that if they want to improve the quality of a legacy system.

Steve Smith 00:28:15 Yep. I agree. Definitely being able to put a database in a container and spin that up as part of the CI/CD process is great.

Jeff Doolittle 00:28:20 It’s amazing. You can get RabbitMQ running in those if you want to test your message bus, maybe using MassTransit or something like that. There are all kinds of things you can do now with Docker that were a lot more challenging to do even just 10 years ago, definitely 15 years ago.

Steve Smith 00:28:35 Yeah. One thing I’ll point out on that before we move on is: separate your tests into different projects if you can, or have a way to run different kinds of tests at different times. I usually will separate unit tests and integration tests at a minimum, mostly because of speed. I can run unit tests in any environment, whether it’s locally or in a pipeline, and they should all run in seconds. Whereas if we are talking about standing up Docker containers in a GitHub Action or in a pipeline, a lot of times it’s going to have to pull in that Docker container, and that’s going to take time. Not to mention even just running the test and then resetting the state of that infrastructure. So those integration tests will probably take minutes, not seconds. And so you’re not going to want to run those in the same scenarios necessarily. And if your test suite always takes 12 hours to run, it’s not as useful as a test suite that takes five or 10 minutes to run. So you want to be able to at least split those out, so that you get the fast feedback from the unit tests, and then in a pipeline somewhere, maybe taking 10 or 20 minutes or half an hour, you get the full suite of integration and functional tests.
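
In the .NET world, one way to get that separation without separate projects is xUnit traits plus a test filter; a hedged sketch with made-up test names:

```csharp
using Xunit;

public class CheckoutTests
{
    [Fact]
    [Trait("Category", "Unit")]        // fast, no external dependencies
    public void Subtotal_Sums_Line_Items()
    {
        Assert.Equal(30m, 10m + 20m);  // stand-in for real logic
    }

    [Fact]
    [Trait("Category", "Integration")] // needs a database container
    public void Order_Persists_To_Database()
    {
        // ...spin up infrastructure and run the slow test here...
    }
}
```

Then `dotnet test --filter "Category=Unit"` gives the seconds-long feedback loop locally, while the full suite, integration tests included, runs in the pipeline.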

Jeff Doolittle 00:29:35 Absolutely. And multiple ways to do that, right? You mentioned different projects; you can also use attributes to segment them and say, these are integration tests.

Steve Smith 00:29:43 Some way to be able to run them separately.

Jeff Doolittle 00:29:45 That’s right. That’s right. But again, do both. That ability to have the quick feedback is absolutely super helpful for teams, and I think for legacy teams who start to have that experience too, it can become addictive, in a good way. In a healthy, positive way, right?

Steve Smith 00:29:58 So for a good feedback loop, with certain tools, like I typically use Visual Studio, but there are other tools and you can do it from the command line in the .NET ecosystem, you can just have the tests run continuously, right? So there’s live unit testing as a feature in the IDE, but even at the command line you could say dotnet watch test, and every time you change a file it reruns all the tests. So even if you’re using VS Code on a Mac and don’t pay Microsoft anything, you’ve got this feature where you can see your unit tests running continuously as you save one line of code.

Jeff Doolittle 00:30:28 That’s right. Well, I use VS Code on a Mac, but it doesn’t prevent me from giving Microsoft money for other reasons. But that’s a whole other topic of conversation. So let’s switch gears a little bit to dive a little bit deeper into testing and how it relates to quality assurance. And to kick that off, first question: are there any misconceptions that people have about testing?

Steve Smith 00:30:45 I think there’s maybe just the misconception that it’s really hard, or that if you don’t have it, it’s hard to get started. Like if you’re the developer on the team that wants to try it, that it is this big uphill journey to try and get buy-in from everybody to add a class that runs some tests. It’s not a big deal, and you can start it on just your machine and see how it’s useful to you. Don’t even check it in, right? Just keep it locally, and then at some point when you’ve got a bunch of them, you can show everybody else how great they are, and you can kind of reduce the fear or the concern about how big a deal they are. Because for folks who have never done it, it may seem like this scary thing, right?

Steve Smith 00:31:21 Just because it’s not in their comfort zone. So being willing to just step in and try it I think is huge. The other thing that I find is a good way to get teams that aren’t used to tests to start using them, and kind of change that culture, is bug fixing, right? If you have a rule that says bugs are not closed out as fixed unless there’s a test showing that that bug no longer exists, then even if that’s the only time you ever write tests, your code is going to start to get much better quickly. Because guess what? All the places in your code that are most error prone are going to be the ones that have the most tests in a very short while. And that’s going to force you to refactor that code, to be looking at that code, trying to improve it, and of course adding tests to it to make sure that the next change you make doesn’t break it and reintroduce a regression that you already fixed last month, because now you’ve got a bug fix test in place.
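
A hedged sketch of that bug-fix rule; the bug number, class, and figures below are all hypothetical.

```csharp
using Xunit;

public static class SalesTax
{
    // Fixed implementation; before the fix, the rate was applied twice.
    public static decimal Apply(decimal subtotal, decimal rate) =>
        subtotal * (1 + rate);
}

public class SalesTaxTests
{
    // Hypothetical bug #1234: tax was applied twice. This test failed
    // before the fix (red), passes after it (green), and guards
    // against the same regression from now on.
    [Fact]
    public void Tax_Is_Applied_Exactly_Once()
    {
        Assert.Equal(108m, SalesTax.Apply(subtotal: 100m, rate: 0.08m));
    }
}
```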

Jeff Doolittle 00:32:08 Yeah. That was the aha moment for me when I first started testing years ago, when we started doing exactly what you said: okay, if a bug is found, the first thing you have to do is write a test that fails, that proves the bug exists. Once you have that, then you have to change the code to get the test to go from red to green. Yep. And the real aha moment was when one of us on our team did that, and something else that had been green turned red, and we went, uh, and we caught it. Yeah. And it was fantastic, because now we weren’t playing whack-a-mole and having a customer tell us, hey, you fixed that, but you broke this. Which is the worst. It’s a morale killer.

Steve Smith 00:32:45 All your tests just paid for themselves in that moment.

Jeff Doolittle 00:32:47 They did. And this was years ago, and it was when we were just getting started. But it was fantastic just to have that the whole team went ah, like okay. It’s like yes, this is now I get it.

Steve Smith 00:32:54 This is why people talk about this.

Jeff Doolittle 00:32:57 This is why we do this. Well and it’s funny too because it’s how the real-world works. I mean you don’t build an automobile and just crash test it, right?

Steve Smith 00:33:03 Yeah.

Jeff Doolittle 00:33:04 You tested fuel pumps; every component was tested.

Steve Smith 00:33:05 Yeah.

Jeff Doolittle 00:33:07 And you test materials, going all the way back up the supply chain. You’re testing every aspect of the supply chain, so that when you put the car together, and I think this is something that you’ve probably realized too, crash testing the car is not quality assurance. That’s quality control. Sure. Yeah. Well, testing your UI when you’ve already built the product is not quality assurance. That’s quality control. Right. And quality control is important, but it only tells you that you missed some serious problems you need to go back and fix. It doesn’t fix any problems.

Steve Smith 00:33:34 Right. At best it detects them.

Jeff Doolittle 00:33:36 Yeah. And that’s another misconception I think people have. How about this one? What are your thoughts on code coverage? Like, we’re going for a hundred percent on everything. Like is that

Steve Smith 00:33:45 For a lot of legacy systems, I like to just see code coverage going up. And that’s something the team maybe gets some additional confidence around. And for really big systems, right, it’s going to start out at like 0.01% because you added one test, right? And then over time it starts to grow. I am not concerned about a hundred percent code coverage. I want to test the parts of the code that are important, where it matters if they aren’t working. And almost always that means writing tests for things that have conditional logic in them. So a key metric that I use is cyclomatic complexity, but only at the method level. If you try to apply it at the class or higher level, it doesn’t make any sense. But at the method level, you want to keep cyclomatic complexity down, like under 10. And for folks that don’t know it, there’s a good definition on Wikipedia, but essentially it’s: how many different ways can logic flow through this method?

Steve Smith 00:34:34 And so if you have an if statement that says it’s either this or it’s that, then you have two ways, right? It’s going to hit one side or the other of that if statement. And many times we’ll find in legacy code that the cyclomatic complexity of some of the most important parts of the system is in the hundreds for a single method. And that’s just insane, right? Because you really want to have a test for every one of those cases. So you would need hundreds of tests for that method to fully exercise it, and you could get to a hundred percent code coverage on that method way before you covered all the different combinations of paths that might lead through it. So keeping that low pays big dividends. That’s pretty much where I try to focus, not so much on a particular percentage. More is usually better, but only if it’s interesting code that you care about, code that could fail, right? I don’t test properties. In .NET, C# code, we have properties as a built-in feature; they work. The language team proved that they worked, so I don’t have to write a test for something that’s already been tested. That’s not necessary.
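
For readers counting along, a tiny illustrative example of what cyclomatic complexity measures:

```csharp
// Two decision points plus the default path: cyclomatic complexity 3.
// Fully exercising this takes three tests, one per path, even though
// a coverage tool might report 100% line coverage with fewer.
public static class OrderClassifier
{
    public static string Classify(decimal total, bool isRush)
    {
        if (total <= 0) return "invalid";
        if (isRush) return "rush";
        return "standard";
    }
}
```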

Jeff Doolittle 00:35:30 No, that’s interesting that you correlated that with conditional logic, because that makes a lot of sense. That’s where you’re going to find those edge cases and things like that, and that’s where it might make sense to say, we’re going to aim for a hundred percent coverage on this conditional thing, maybe before we start refactoring it and trying to fix it. Because we want to have confidence as we refactor that we’re at least not regressing and making things worse than they were when we found them, for example.

Steve Smith 00:35:54 Even things that aren’t properties but just have only one code path through them. Like you extracted out a method that does four things, and it’s just some helper method that has no if statements in it; it does those four things, and there’s no other way it could possibly go. So you don’t necessarily have to test that method. You probably test the method that calls it, but you don’t necessarily have to test that separate method.

Jeff Doolittle 00:36:16 Yeah. That actually goes back to another design principle we didn’t mention before: command query separation, which I think is from Bertrand Meyer. The idea that a method should either change state or report on state, but not do both at the same time. And that’s another one of those principles that people can consider when they’re trying to make their systems more testable, for example. Because if I know this method’s going to change state, then I’m testing this state change, and if it’s only going to report on state, then I’m just testing that it’s in the appropriate…

Steve Smith 00:36:43 Result to expect or whatever.

Jeff Doolittle 00:36:44 Yeah, exactly. Exactly. So that can help as well. Any other misconceptions? Maybe people have, we’ve covered a few, but a lot–

Steve Smith 00:36:49 Of folks get hung up on mocking and some folks think that the only reason that you would ever have an interface in your, in your code or an abstraction is because it allows you to mock things for your test. And sometimes that’s useful, but that’s not the primary benefit or reason. I think too many folks kind of go down the rabbit hole with mocking and I look at their test and like one test method is like 50 lines long and the first 40 are setting up all the mocks. Like, that’s indicative that you have a problem with your code that you’re trying to test. And it’s like way too coupled to a whole bunch of dependencies and things. So if you can refactor the code so it doesn’t need that. And ideally try and use as many pure functions as possible. So because those are easy to test, you’re going to have a much easier time with testing. So don’t be afraid to mock a couple things here and there. If it makes it easier to make a test for something, but if you find that you’re spending an inordinate amount of time trying to deal with and set up mocks, that’s a smell that’s telling you your design could be improved.

Jeff Doolittle 00:37:41 Yeah. You mentioned before a controller that might have 13 dependencies injected into it or something like that. And if you are trying to test that controller, you have to mock out all 13 of those things. Or to your other point, if you’re only testing one method on it, but it only needs two of those 13 things and you only mock those two, something’s telling you that maybe you should refine your design, as opposed to just continuing to go down this path of complicated mock setups and things of that nature.

Steve Smith 00:38:09 Right. Another common thing I see, especially with folks new to testing, is that they don’t treat the test code like real code in terms of trying to refactor it and keep it clean and maintainable. A really common one is just the new keyword; again, I talk about new is glue. If you have the new keyword newing up the system under test in every single test for a certain class, and later on this big, very important class has 50 different unit tests and every single one of them instantiates it, then when you say, oh, we need this dependency, add it to the constructor, great, now you’ve got 50 tests you have to go fix, because every one of them says, oh, I don’t have a constructor that takes that thing. Don’t do that. There should be one place in your test suite where you instantiate the system under test, and then all the other tests use that. And just by doing that one simple thing, when you make changes to the system under test, it’s less painful, it’s less friction. You’re not going to be resentful of having to make a change because, oh, now I have to go fix all the tests. No, make your job easier. Make it so you only have to fix that one place where you instantiate that class, and not every place that you have a test for it.
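
A minimal sketch of the single instantiation point Steve recommends; OrderService and its fake dependency are invented names.

```csharp
using System.Collections.Generic;
using Xunit;

public interface IOrderStore { }
public class FakeOrderStore : IOrderStore { }

public class OrderService
{
    public OrderService(IOrderStore store) { }
    public IReadOnlyList<string> PendingOrders { get; } = new List<string>();
}

public class OrderServiceTests
{
    // The one place the system under test is constructed. When the
    // constructor grows a new dependency, only this method changes,
    // not all fifty tests that use it.
    private static OrderService CreateSut() =>
        new OrderService(new FakeOrderStore());

    [Fact]
    public void New_Service_Has_No_Pending_Orders()
    {
        var sut = CreateSut();
        Assert.Empty(sut.PendingOrders);
    }
}
```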

Jeff Doolittle 00:39:09 Yeah. We had Chad Michel on the show a couple months ago, and he wrote a book called Lean Software Systems Engineering for Developers. It’s a mouthful, but one of the concepts from the book, and I really appreciate how they put it, is what they call design for change. And I think that’s exactly what we’re all trying to do, and we’re trying to avoid that pain, as we should. Sometimes changes are painful, but we can make them less painful. And when we design for change, it just makes everybody’s lives better.

Steve Smith 00:39:33 Right. Yeah.

Jeff Doolittle 00:39:35 So let’s talk about testing best practices. So what does a robust testing strategy look like in your experience for a software development team?

Steve Smith 00:39:44 One of the things, speaking of teams, and I know we’ll get to culture here in a little bit, but something I often find as a problem in teams is that they have a QA team separate from the developers. And almost always I find it’s better if we’re able to integrate those and have QA on the team with the developers, building the tests as part of the iteration or sprint and working side by side with developers on automating tests and other things. So first off, if you’ve got a separate QA team, that’s usually a hindrance in my experience. Then, assuming that you’ve got folks on the team that are invested in quality and in writing tests, the way I would approach it for legacy code is to start with some tests that are probably high level, like Playwright tests or UI tests that prove the most important things work.

Steve Smith 00:40:30 Maybe these are like smoke tests that you run right before you go to production. Can the user log in? Can they add items to their cart? Can they check out with a fake credit card or whatever. Because those might be the most important things that the application does. And so once you’ve got some of the minimal stuff in place to give you some confidence that the code works, then you work down from there. And like I said, I’m a fan of writing unit tests for the business logic where the most important complexity lives. And so start to write tests for that. If you can’t write unit tests, write some integration tests for that, but eventually you want to get the code into a state where you’ve refactored it to where the business logic decisions live independent of the infrastructure.

Steve Smith 00:41:07 And now that unlocks the ability for you to unit test those, which pays huge dividends. Business logic changes often on these legacy systems. It’s like, we want to ship a new feature, or this regulation changed, or we need to add another payment provider, or we need to change how sales tax works because we have a customer in Brazil now and they have a whole different set of laws. That stuff’s going to happen. And if everything is just hard coded and coupled together and wired directly into a database or whatever, those changes are really hard to make. The first one or two might not be too bad; you’ll just add some more if logic in there. But after 10 or 12 locations and strategies and different approaches, it becomes a big spaghetti code mess, and suddenly all your progress grinds to a halt.

Jeff Doolittle 00:41:48 What about other kinds of testing, like performance testing and security testing? How do those relate to quality in your experience?

Steve Smith 00:41:54 I think those are important, but it’s a different skill set. For performance testing, I’ve done talks and courses on measuring performance and scalability with load tests and things. Not all companies have the resources for that, and a lot of times it’s more of a: if the customer’s telling you this page is slow, that’s when we need to worry about it. And in that case, before you go doing performance tuning, it’s just like when you are going to fix a bug: you should have a test for it. You should have a test that demonstrates that the performance isn’t satisfactory, so that when you tune it, you can rerun that test and say, look, I fixed it, right? It’s a bug that it’s not fast enough, so treat it like one and write a performance test for it, and then you’ll know when you’re done.
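
One hedged way to ‘write a test for the slowness’ is a crude wall-clock check like the sketch below; the 200ms budget and the Search call are hypothetical, and a dedicated load-testing tool would usually do this job better.

```csharp
using System.Diagnostics;
using Xunit;

public static class ProductCatalog
{
    public static void Search(string term)
    {
        // stand-in for the slow operation under investigation
    }
}

public class SearchPerformanceTests
{
    // Fails while the page is slow; passes once tuning brings it
    // under the agreed budget, which tells you when you're done.
    [Fact]
    public void Search_Completes_Within_200_Milliseconds()
    {
        var sw = Stopwatch.StartNew();
        ProductCatalog.Search("widgets");
        sw.Stop();
        Assert.True(sw.ElapsedMilliseconds < 200,
            $"Search took {sw.ElapsedMilliseconds}ms");
    }
}
```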

Steve Smith 00:42:35 Because otherwise it’s really hard to say when you’re done performance tuning, and developers can spend an infinite amount of time trying to make something a little bit faster, when really you get the biggest bang from some low-hanging fruit like adding caching or adding an index. You don’t have to go and tweak how every method is called and whether or not something is on the heap or the stack; most of that doesn’t matter when you’re talking over the network to a database or a distributed system. Security is definitely important, and there are third-party tools you can use to scan your code for known security issues. I would encourage every developer to keep an eye on the OWASP website; every two or three years they list a top 10 of security problems.

Steve Smith 00:43:13 Definitely keep an eye on that and understand how to look for those. One of the biggest ones remains SQL injection. So make sure that your site is not taking in user input and using it to build queries against your database. That’s still one of the biggest risks out there, so at least be sure you’re not introducing those into your system. But then there’s also a place for third party security audits from time to time, where folks that that’s all they do can come in and take a look at your code for a week or two and give you a laundry list of things that you could be fixing. And that’s probably more effective for most organizations than trying to have all of their developers be up to speed on every security exploit all the time. Because I don’t think that’s tenable.
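
For the SQL injection point, a before-and-after sketch using Microsoft.Data.SqlClient; the table and column names are made up.

```csharp
using Microsoft.Data.SqlClient;

public static class UserQueries
{
    // Vulnerable: user input is concatenated straight into the SQL,
    // so input like  ' OR '1'='1  rewrites the query itself.
    public static SqlCommand FindUserUnsafe(SqlConnection conn, string name) =>
        new SqlCommand("SELECT * FROM Users WHERE Name = '" + name + "'", conn);

    // Safe: the input travels as a parameter value, never as SQL text.
    public static SqlCommand FindUser(SqlConnection conn, string name)
    {
        var cmd = new SqlCommand(
            "SELECT * FROM Users WHERE Name = @name", conn);
        cmd.Parameters.AddWithValue("@name", name);
        return cmd;
    }
}
```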

Jeff Doolittle 00:43:49 Yeah. And we don’t have time to get into it in this episode, but I think it does somewhat relate to quality when we talk about something like threat modeling: actually considering the threat vectors for your system and what your strategies will be for mitigating or eliminating those threats. Which again, I think relates to quality, because if your system can be hacked and taken over, I think you’ve got some serious problems there.

Steve Smith 00:44:11 Sure. Things that are quality-related to that too: the more modular your system is, the more places you have where you can insert additional functionality. For instance, maybe initially you don’t have any way to know what an attacker did if they get into your system, because you don’t have any logging, right? Well, you can easily add a bunch of logging in, with decorators or the chain of responsibility pattern or behaviors, instead of having to go touch every single service and every single controller and every method and be like, hey, entered this method; hey, exited this method. There are ways, when you write your code using good patterns, that you can get that type of behavior across the board. Or do a security check as some middleware across the board, without having to go do surgery on every part of your application.
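
A compact sketch of the decorator idea Steve mentions, with hypothetical payment-service names: logging is added around an existing implementation without touching it.

```csharp
using System;

public interface IPaymentService
{
    void Charge(decimal amount);
}

public class PaymentService : IPaymentService
{
    public void Charge(decimal amount) { /* real work, unchanged */ }
}

// Wraps any IPaymentService and adds logging; register this in the
// DI container in place of the inner implementation, and every call
// site gets the behavior without any surgery on existing classes.
public class LoggingPaymentService : IPaymentService
{
    private readonly IPaymentService _inner;
    public LoggingPaymentService(IPaymentService inner) => _inner = inner;

    public void Charge(decimal amount)
    {
        Console.WriteLine($"Entering Charge({amount})");
        _inner.Charge(amount);
        Console.WriteLine("Exited Charge");
    }
}
```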

Jeff Doolittle 00:44:55 Absolutely. In fact, I forgot to mention it before, but I’ll mention it now. One of the first books I learned from on refactoring things to make them more testable was Refactoring to Patterns by Joshua Kerievsky. And I’ll put a link in the show notes, but for people who are wondering how to take that nasty if-else or switch-case statement that’s got a cyclomatic complexity of a thousand and start cleaning it up, books like that I think are a good place to get started, giving you some creative ideas. You mentioned the chain of responsibility pattern; again, we’ll put a link in the show notes. Just giving people creative ideas to think about: oh, maybe there are other ways to reduce the complexity in this code and make things more understandable and also more testable, which improves the quality of the system overall. Let’s talk about building a quality culture. I know you do a lot of consulting and you help companies to improve their development practices. So what, in your experience, does it take to help build a culture of quality in a software company?

Steve Smith 00:45:48 I mean, that’s a big topic. There’s whole books on that. There’s a great book that I would recommend the technical leadership in those companies read, which is Accelerate, which has a bunch of scientifically proven things that you can do that lead to better impacts and outcomes for your software process. You’ve probably talked to people on this show about that book in the past, so I won’t get into it too much, but a good place to start is the practices that are outlined in that book, and the buy-in that it’s not just voodoo magic and some consultant coming in because they get a paycheck saying you should do this. No, these are scientifically proven practices at this point. And so getting that buy-in from leadership is usually an important first step, because it’s difficult to drive change in an organization from the individual developer, the individual contributor, up the stack.

Steve Smith 00:46:35 But if the technical leadership buys into this, then that can help. Now the next step is, how do you get those individual contributors on board? And that can be a challenge, right? A lot of folks will be eager to learn new things, eager to write better code, eager to work on a system that’s more modular and isn’t a big, tangled spaghetti-code mess. Other developers, maybe they’re later in their career and they’ve been working at the company for decades, and they’re like, I don’t really see a need to change; I’ve been doing this for 20 years and it’s always worked for me. And that can be more of a challenge. So ideally you get everybody on board, and if you can’t, then sometimes you reorg a little bit, and the developers that aren’t as interested in learning these new techniques are put on teams or put on code where they can be successful without having to change, if that’s possible.

Steve Smith 00:47:17 Sometimes it’s just a matter of changing your team, and maybe you make sure that the next few developers you hire all have the same mindset that you’re looking for. And sometimes, unfortunately, you do have to let other developers go that don’t. But that should be a last resort. Ideally you can bring everybody along, and those developers that have been there for a long time also have a ton of understanding of how that legacy system works. So you definitely don’t want to lose them just because in the first week they don’t jump up and down to do unit tests. So I think those are like two ends of the spectrum, right? The management leadership side and the individual contributor side. If you can get the team to be self-empowered to the point where they can decide how they’re going to achieve some of these things, that can help a lot, right?

Steve Smith 00:47:56 So it’s not just dictated from on high: the CTO says we will now have code coverage of a hundred percent, period, or you’re fired. That doesn’t change the culture, and it doesn’t necessarily work great, because it doesn’t take long for developers to figure out that they can write one unit test that iterates through all the classes in the system and hits all their properties and then returns. Like, hey, you just got almost all the code coverage with that one test. Did it do anything useful? No, but you’re gaming the system, and developers are usually pretty good at that. So it’s more important that you show the results that you’re going for than that you have some arbitrary metric that may be difficult to hit.
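
For illustration only, here is roughly what that coverage-gaming anti-pattern looks like in C# with xUnit: a single reflection-driven test that touches every property in the assembly, inflating the coverage number while asserting nothing. This is the thing to watch for, not a recommendation.

```csharp
using System;
using Xunit;

public class CoverageGamingTest
{
    [Fact]
    public void TouchEveryProperty()
    {
        // Assumes the system under test lives in this assembly.
        var assembly = typeof(CoverageGamingTest).Assembly;

        foreach (var type in assembly.GetTypes())
        {
            object? instance = null;
            try { instance = Activator.CreateInstance(type); } catch { continue; }

            foreach (var property in type.GetProperties())
            {
                // Reading every property "covers" the getters...
                try { _ = property.GetValue(instance); } catch { /* ignore */ }
            }
        }
        // ...but with no assertions, the coverage report goes up while the
        // actual quality of the system is completely unverified.
    }
}
```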

Jeff Doolittle 00:48:30 Absolutely. Let’s pay people by lines of code. Yeah, the adage is, if it can be measured, it will be gamed, and to your point, right? Developers are really good at that. Yep. This does make me think of a concept, I don’t think it’s in his book, but Juval Löwy, who’s a trainer of software architects, he says that you should have zero tolerance for defects. And that doesn’t mean you necessarily have zero defects, because sometimes defects slip through, but you have zero tolerance: if a defect is found, stop the presses, we’re fixing it, we’re going to eradicate that defect.

Steve Smith 00:48:59 Right? Yep. And that’s a lean principle stop the line.

Jeff Doolittle 00:49:02 Stop the line.

Steve Smith 00:49:02 Jeffrey Palermo, a friend of mine, I don’t know if this was his original quote, but he said something that resonated with me, which is that if you have a process that produces defects, then you have a defective process.

Jeff Doolittle 00:49:12 Yeah, I like that.

Steve Smith 00:49:13 I like that. That’s very related.

Jeff Doolittle 00:49:14 Yeah. Well, let’s talk about pulling the cord, because maybe not everybody’s familiar with what you’re talking about.

Steve Smith 00:49:20 Sure. So there’s a great book by Mary Poppendieck called Lean Software Development. It has some principles, but one of the key practices that Lean follows in a manufacturing setting is that if there’s a defect, you stop the line, meaning stop the assembly line; everybody stops work. And you don’t just fix that one defect, you fix the thing that allowed that defect to happen, so that defect never happens again. And in software that means you write a unit test that says that bug is fixed and it won’t happen again. Or you change the build pipeline so that you actually run the unit tests, or whatever it might be. For that failure you’ve just discovered, you don’t just fix it, you fix the things that led to it being possible, and then your process is slightly better for the next time.
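
A minimal sketch of pinning a fixed bug with a test, in C# with xUnit. The PriceCalculator, the bug, and its number are hypothetical; the idea is that the regression test, named after the defect, makes that failure impossible to reintroduce silently.

```csharp
using Xunit;

public class PriceCalculator
{
    // Hypothetical bug #1234: an unguarded division threw when quantity was zero.
    // The fix guards the zero case explicitly.
    public decimal UnitPrice(decimal total, int quantity) =>
        quantity == 0 ? 0m : total / quantity;
}

public class PriceCalculatorRegressionTests
{
    [Fact]
    public void Bug1234_ZeroQuantity_DoesNotThrow()
    {
        var calculator = new PriceCalculator();

        // If anyone removes the guard, this test stops the line.
        Assert.Equal(0m, calculator.UnitPrice(100m, 0));
    }
}
```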

Steve Smith 00:50:00 And if you do that consistently, you’ll go faster, because you won’t have to keep fighting fires. Most folks that are dealing with legacy systems have management that’s trying to go faster, while the dev team and DevOps team are just trying to put out the fires as fast as they can, and they don’t have time to even think about system quality because they’re too busy dealing with all the bugs and critical issues happening all around them. So you want to build that quality in so that you never get to that state of constant firefighting.

Jeff Doolittle 00:50:25 I’ll point listeners, too, if you’re interested in digging deeper into this, to W. Edwards Deming, the father of total quality management, who was responsible for the Japanese transformation at Toyota in the 1950s, which basically took over the world. A lot of this Lean thinking and these kinds of principles come from what they called the Andon cord, which is a white cord that anyone on the factory line could pull, even the janitor. And this is the fascinating thing: if they found there wasn’t a defect, they still rewarded the person for pulling the cord. Because what they cared about, and I think this is really pertinent to our industry these days with all of these different methodologies and things we have, which we’re not going to get into now because we’re talking about quality, but in my experience, Steve, and I wonder if this gels with yours, it boils down to continuous process improvement. Whatever methodology you have or whatever practices you have, if you’re not doing CPI, I’d question whether what you’re doing is really helping you or not. What are your thoughts on that?

Steve Smith 00:51:17 Yeah, I would definitely agree. There’s various agile maturity models and things, and a lot of folks don’t like that term maturity in there. But if you look at those, you can see, organizationally and in the processes you follow, there’s things that are steppingstones to getting better and better. Like I was saying, some teams don’t even have source control, right? And if you don’t have source control, there’s a whole lot of activities that you can’t even start to think about doing until you get that in place, right? And then I was big, early on in our discussion, about continuous integration, or CI. Most people just say CI/CD now; it’s like this buzzword that I hear. But very few teams are actually doing the CD part, which is continuous delivery or continuous deployment.

Steve Smith 00:51:57 And to get to continuous delivery or deployment requires a whole additional set of capabilities in your team and in your DevOps to be able to support it. Generally you’re going to have to have support for feature flags. You’re going to have to be good at trunk-based development. And that’s not something teams necessarily understand or use or are aware of on the first day, right? And so with this idea that you’re continuously improving your process, some of that is going to be novel and unique to you and your team and your org, and some of it is going to be these well-understood steppingstones in the industry that we know we can get to. And again, I’ll point people at the Accelerate book. But being able to get to the point where you can do continuous delivery, where every time someone checks something in, if it goes through all the gates and passes all the quality checks, it’s live: that is kind of the epitome of where you can get with your software process.
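
A hedged sketch of the feature-flag idea in ASP.NET Core terms: unfinished work merges to trunk and deploys, but stays dark until a configuration flag turns it on. The flag name and service here are invented for illustration; dedicated libraries exist, but plain configuration shows the mechanism.

```csharp
using Microsoft.Extensions.Configuration;

public class CheckoutService
{
    private readonly IConfiguration _config;

    public CheckoutService(IConfiguration config) => _config = config;

    public string GetCheckoutFlow()
    {
        // The new flow ships with every deployment, but only runs when
        // "FeatureFlags:NewCheckout" is true in configuration. Flipping the
        // flag, not merging a long-lived branch, is what releases the feature.
        bool useNewFlow = _config.GetValue<bool>("FeatureFlags:NewCheckout");
        return useNewFlow ? "new-checkout" : "legacy-checkout";
    }
}
```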

Jeff Doolittle 00:52:46 Absolutely. And I think that’s the idea of the continuous: wherever you’re at right now, you’re continually improving your process. And then of course there’s metrics and tracking and things like that that all matter. Do you have any specific example of a team you worked with, you don’t have to say who of course, where the improvements in quality had such a marked impact that it was just amazing?

Steve Smith 00:53:08 Sure. This is kind of working at it from the other end, and I like to tell this story for folks. One of the things that I think helps a lot is shipping more often, right? We just talked about CD, continuous delivery. I think that’s crucial, but even if you can’t get to that, at least doing it more frequently can almost certainly help. There’s a principle, or a quote, I’m not sure where it’s from, that says: if it hurts, do it more often. If your process of shipping software right now is painful, it will get less painful if you do it more often. So I had a client a few years ago; they would ship software whenever the product manager felt like it was worth shipping, right? If it had enough features in it that users would benefit from it, then fine, we’ll schedule a release.

Steve Smith 00:53:46 And this worked out to be about every six or eight weeks. Their software was used seven days a week during business hours, so every time they did this they would have to deploy at three o’clock in the morning on a Tuesday and hope it worked, which it almost never did. So there’d be several attempts to deploy between 3:00 AM and 5:00 AM, and usually by 8:00 AM they could get the thing working well enough, and they might still have a few little minor things to fix; the next day, on Wednesday, they might get up at 6:00 AM and do a couple more little tweaks. And so I was trying to convince them that you should really be deploying more often, right? Well, the developer team, all of whom had to get up once every couple of months at three o’clock in the morning, and their families, who didn’t really appreciate that either, they didn’t really want to hear from me that, hey, we should do this more often.

Steve Smith 00:54:31 But what I got them to do is, for about six months they would record every time they deployed: when it happened, and whether or not it was successful. And it was successful if you didn’t have to roll it back and you didn’t have to immediately fix some bug that you discovered in production, right? And then relate that to how long it had been since the last deployment. And very quickly you look at the data, and it tells you: if it’s been more than an hour since the last deployment, it’s never successful, or maybe a day; if it’s been more than a day, it’s never successful, but if it’s been less than a day, it’s almost always successful. And so what we got them to do is start shipping again on Thursday.
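
A small sketch of the kind of analysis Steve describes, with an invented data shape: log each deployment’s time and outcome, then compare success rates by how long it had been since the previous deploy. The one-day bucket boundary is arbitrary, chosen to match the story.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public record Deployment(DateTime When, bool Succeeded);

public static class DeploymentStats
{
    public static void PrintSuccessByGap(IEnumerable<Deployment> deployments)
    {
        var ordered = deployments.OrderBy(d => d.When).ToList();

        // Pair each deployment with the time elapsed since the previous one.
        var withGaps = ordered.Skip(1)
            .Select((d, i) => (Gap: d.When - ordered[i].When, d.Succeeded));

        // Bucket by gap size and compare success rates.
        foreach (var bucket in withGaps.GroupBy(x => x.Gap.TotalDays <= 1 ? "<= 1 day" : "> 1 day"))
        {
            double rate = bucket.Average(x => x.Succeeded ? 1.0 : 0.0);
            Console.WriteLine($"Gap {bucket.Key}: success rate {rate:P0}");
        }
    }
}
```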

Steve Smith 00:55:08 Right? Not much has happened, right? It’s still not the same code that it was. Okay, now ship again on Tuesday. Right? And so they started shipping every Tuesday and Thursday and they found that it wasn’t that hard, it didn’t take that long. If something was broken, they knew exactly what it was because they just did it. And now they’re shipping over a hundred times a year. They don’t necessarily feel like they need to get beyond that, like that’s working for them. But it’s made a huge difference in the quality and the speed with which they’re able to move because they aren’t spending so much time trying to figure this out. And now they deploy it lunchtime, right? It’s not a big deal. So no one’s getting up at three o’clock in the morning and it’s been a huge quality of life improvement for the development teams as well.

Jeff Doolittle 00:55:45 Yeah. I remember that transition in one of my companies years ago as well, where everybody was afraid to deploy on Fridays, right? And so we switched the narrative and said we’re always going to deploy on Friday, no matter what. And it only took a few months, and the next thing you know, there were never any weekend interruptions anymore, and nobody was afraid to deploy on Friday anymore, because of all the things you just described. When you have those quality practices, those gates, those tests, everything in place, suddenly you’re like, oh yeah, this is no big deal, we do this all the time. And it’s transformative.

Steve Smith 00:56:14 If you do it more often, it’ll force you to improve it. If you only do it once in a blue moon, it’s not worth it to fix it, because you only do it once in a blue moon. But if you’re doing it a couple of times a week, it’s like, wow, we better get better at this, we better be more efficient.

Jeff Doolittle 00:56:26 That’s right. And when you’ve only changed three things instead of 300, it’s more likely you’re going to be able to figure out what went wrong and fix it quickly, instead of combing through and finding a needle in a haystack. And I think, as we wrap things up, that kind of speaks back to culture as well. What kind of company do you want to work for? Do you want to work at one that’s afraid to make changes, where anytime they do, everything breaks and you’re staying up late at night? That creates stress, and that leads to all kinds of problems. At the extreme end, it literally leads to ends of relationships and ends of lives. And so quality matters.

Steve Smith 00:56:58 Quality matters. And working on a code base where everything is modular, there’s tests for things, you’re able to check stuff in and all the tests pass and the code just goes through, like there’s no stress, right? Or there’s minimal stress; things don’t fail that often. It’s a night-and-day difference versus legacy code: no tests, lots of manual testing, lots and lots of crossing your fingers and praying whenever you do a deployment. It’s way more stressful in that environment than in the one that I would prefer to work in.

Jeff Doolittle 00:57:26 Absolutely. So as we wrap things up, if people want to find out more about what you’re up to, or maybe even ask you some more questions, or get your help in improving the quality practices in their company, where should they go?

Steve Smith 00:57:36 Sure. So you can find me online as Ardalis just about everywhere. That’s spelled A-R-D-A-L-I-S. Because my name is Steve Smith, it’s really hard to get a username that has any combination of that with consistency, so I’ve been Ardalis online for 20-some years. You can also reach out to me through my company, Nimble Pros; that’s like Nimble Professionals. And we’re happy to talk to you about your questions, about how to get better as a team or how to write better software, or to assess the software that you have and see what are those low-hanging fruit that we could tweak that’ll get you most of the gains that you’re looking for.

Jeff Doolittle 00:58:09 Cool. Well Steve, thank you so much for joining me on the show.

Steve Smith 00:58:11 Thank you. It was great to be here.

Jeff Doolittle 00:58:12 This is Jeff Doolittle for Software Engineering Radio. Thanks for listening.

[End of Audio]
