Cody Ebberson, CTO of Medplum, joins host Sam Taggart to discuss the constraints that working in regulated industries add to the software development process. They explore some general aspects of developing for regulated industries, such as medical and finance, as well as a range of specific considerations that can add complexity and effort. Cody describes how translating regulatory requirements into test specifications and automating those tests can help streamline software development in these regulated environments. Brought to you by IEEE Computer Society and IEEE Software magazine.
Show Notes
Related Episodes
- SE Radio 523: Jessi Ashdown and Uri Gilad on Data Governance
- SE Radio 571: Jeroen Mulder on Multi-Cloud Governance
- SE Radio 342: István Lam on Privacy by Design with GDPR
Transcript
Transcript brought to you by IEEE Software magazine and IEEE Computer Society. This transcript was automatically generated. To suggest improvements in the text, please contact [email protected] and include the episode number.
Sam Taggart 00:00:35 This is Sam Taggart for SE Radio. I’m here today with Cody Ebberson to talk about navigating regulated environments. Cody is a co-founder and CTO of Medplum, a developer platform that provides tools for security, interoperability and compliance in the healthcare sector. Cody began his career as a software development engineer at Microsoft over 15 years ago and has since held various roles in a variety of healthcare related tech companies such as director of engineering, COO and CEO. Welcome.
Cody Ebberson 00:01:01 Thanks Sam. Happy to be here.
Sam Taggart 00:01:03 Yeah. So we’re going to talk about regulated industries. Why don’t we start by just defining what we mean by regulated industries?
Cody Ebberson 00:01:09 Yeah, it’s a great question. I think it applies in quite a few different places. Typically, people think of it as, I mean, everywhere has regulations, but when we think of excessively regulated or high regulation, it’s places like healthcare or finance or security where there’s quite a few additional layers of requirements to make sure that you’re meeting legal requirements, ethical requirements, you’re protecting user rights, user data, user safety, and they’re all over the place. And I think there’s regulation everywhere, but maybe there’s a spectrum of some that are more or less regulated.
Sam Taggart 00:01:40 Yeah, I was going to say a lot around financial stuff I imagine and personal information and those type of things as well.
Cody Ebberson 00:01:47 Absolutely, yep. Certainly compared to something like video games where, it’s fun, but when you’re talking about financial data or healthcare data, I think people have a higher expectation of reliability and security.
Sam Taggart 00:01:57 Yeah. So who makes these regulations? Like where do they come from?
Cody Ebberson 00:02:01 That’s a great question. It’s typically a government or an industry standards body. So in our case, we’re primarily in healthcare. It’s usually originating from healthcare government bodies like HHS, Health and Human Services. And they provide a whole bunch of different regulations. Most famously HIPAA, but there’s quite a long list. There’s also quite a few industry standard bodies and customers will often dictate those. So where the government might not fully regulate something, the market ends up filling those as well.
Sam Taggart 00:02:30 Is that like the, what is it, PCI DSS for credit cards?
Cody Ebberson 00:02:33 That’s finance. Yeah, PCI DSS. And SOC 2 is a big one to make sure that you have adequate data and security controls, and those are typically driven by the market.
Sam Taggart 00:02:42 And what are these regulations generally trying to prevent?
Cody Ebberson 00:02:45 I think there’s a lot of it where you might hear about it and think, well, that’s kind of common sense, right? Like that the user’s data is being adequately protected, that you have sufficient controls to make sure that a rogue software engineer can’t go and steal data or do something malicious. It ranges from protecting the user’s data to protecting the user’s safety. On the ethical side, not that you can truly regulate ethics, but it’s trying to codify those into systems so that the software is going to be reliable. I think the CrowdStrike incident from, what was that, last week or two weeks ago, was a notable one where there was a lot of damage: financial damage, human time wasted. Those regulations are often put in place to try to protect against incidents like that.
Sam Taggart 00:03:27 So how do regulations vary over geography and what challenges have you run into with that?
Cody Ebberson 00:03:33 That’s a great question. Regulations are often quite different across country boundaries. The US healthcare system is perhaps one of the largest healthcare systems. It has its own idiosyncratic regulations, but there are quite a few that go across international boundaries as well. For example, sometimes there are data format requirements, and there are regulations to try to improve interoperability across those different systems. There is a push from ISO, so there are many of these ISO standards now, which are an attempt to standardize and regulate as an international standards body. There’s a long list of the ISO standards, and the US government is oftentimes steering toward those to align with the rest of the world.
Sam Taggart 00:04:14 So in the US we have the federal government, we also have state governments. Do you notice any differences between states that apply to healthcare in particular since that’s your area?
Cody Ebberson 00:04:23 There are definitely a lot of state-level requirements for the practice of medicine, so on doctors and nurses and the variety of different rules as a healthcare provider. With regards to healthcare technology, the rules are typically pretty similar. One thing that’s interesting for healthcare is there’s the notion of an HIE, a Health Information Exchange, which allows regional hospitals to share data with each other. And you can kind of think of those like a homeowner’s association where they have all their special rules that define who can share what data between different hospitals. And that’s very, very regional, but not typically so much from the state at that point. It’s typically going to be the market driving those rules.
Sam Taggart 00:05:01 So have you ever encountered regulations that are at odds with each other and what do you do in those situations?
Cody Ebberson 00:05:07 That’s a great question. In healthcare, a good example of this involves the US government, and a lot of governments, now pushing the right to be forgotten: for patients to be able to request their data to be deleted and purged, which is a very sensible consumer protection. But then think about the medical legal liability requirements, where you have to retain many of your records for seven years and up to 18 years in some cases. So now you have to retain records for legal purposes, and those two are totally at odds with each other. In practice, those don’t come into conflict very often. And when they do, that’s when you call up your lawyers and try to figure out how you’re going to handle that: putting data into vaults or escrow, or trying to semi-anonymize data. It’s often a long and complicated process, but when the conflicts come, you typically get the lawyers involved.
Sam Taggart 00:05:53 Yeah, I was going to say, I would imagine if you told your doctor you were allergic to something, you would probably want him to remember that.
Cody Ebberson 00:05:58 Yes. Or if you had a surgical operation, the statute of limitations means the process of suing for a complication can be available for up to seven years. But if you want to have your data deleted, how do you reconcile those things?
Sam Taggart 00:06:12 So have you noticed similarities across different industries and what are some of the common requirements?
Cody Ebberson 00:06:18 Yeah, absolutely. And that’s where, so SOC 2 is a big one that’s very cross-cutting and there’s a lot of common-sense things in there. Sometimes it’s difficult to operationalize, but it can be common sense of making sure that you have like you’re using SSL or TLS on all of your cloud endpoints and that you have adequate data protections. So those are cross-cutting. And then it certainly, if you’re going to do any work with the federal government, then you have something like FedRAMP, which is kind of the gold standard of all regulations and it’s going to make sure that you’re running continuous security training, you have all your security audits, your pen tests on a regular basis. You have to demonstrate that you can respond very quickly to security vulnerabilities or changing security requirements. And that’s not industry specific at all. Anytime you want to work with a federal agency, you have to make sure that you’re conforming with those requirements. And now the market has adopted those rules in some cases as well. So even if you’re not working with the federal government, you may need to say, okay, yes, we’re FedRAMP compliant.
Sam Taggart 00:07:16 Okay. How do these regulations affect developers? How do they translate down to the code level?
Cody Ebberson 00:07:22 Absolutely. That’s a great question. And I think that there’s the surface-level knee-jerk reaction that most software engineers have, which is: gross, I don’t want to deal with that, right? It’s going to be a long slog, a lot of complicated, boring requirements. I think that there’s a more positive interpretation. Maybe this is my personal bias, but I actually kind of like them because I view them as just a very clear and explicit bright-line set of constraints. And software engineering often is about working with constraints and using constraints to define project specs, with specs translating into feature requirements and feature requirements translating into code. So for example, in our organization, we’ve made it one of our core competencies to take those rules and regulations and translate them into unit tests and integration tests, so that every time we run our CI/CD pipelines we’re going to know very clearly whether this change breaks or violates one of those compliance and certification programs.
Sam Taggart 00:08:18 Do you have a specific example?
Cody Ebberson 00:08:20 Yes. So everything inside of our SOC 2 that we can translate into CI/CD checks, we have. A lot of those would be mostly on data privacy and protections, things like access controls. I think most systems inevitably have a gazillion checks for permissions and access controls, making sure that various groups can only access the data that they’re supposed to have access to, et cetera, et cetera. There are third-party tools here as well. We use SonarQube for a lot of our automated testing, which will run through a huge battery of security tests. Mozilla Observatory is a great one for all of the HTTP and TLS and SSL endpoints. All of those tools are great, and we try to run them in an automated fashion as much as possible.
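As a rough illustration of turning an access-control requirement into an automated check, here is a minimal sketch. It is not Medplum's code: the `User`, `Record`, `can_read`, and `check_access_control` names and the group-based rule are all hypothetical, chosen only to show the shape of a control expressed as a test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    # Hypothetical user model: a name plus the groups the user belongs to.
    name: str
    groups: frozenset


@dataclass(frozen=True)
class Record:
    # Hypothetical data record: an id plus the groups allowed to read it.
    record_id: str
    allowed_groups: frozenset


def can_read(user: User, record: Record) -> bool:
    """Allow access only when the user shares at least one group with the record."""
    return bool(user.groups & record.allowed_groups)


def check_access_control() -> list:
    """Return a list of violated expectations; an empty list means the control holds."""
    insider = User("nurse", frozenset({"clinic-a"}))
    outsider = User("contractor", frozenset({"clinic-b"}))
    chart = Record("chart-1", frozenset({"clinic-a"}))

    failures = []
    if not can_read(insider, chart):
        failures.append("authorized user was denied")
    if can_read(outsider, chart):
        failures.append("unauthorized user was allowed")
    return failures
```

A pipeline step could call `check_access_control()` on every commit and fail the build when the returned list is non-empty, so a change that loosens the rule is caught before deployment.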
Sam Taggart 00:09:04 So that brings me to another question. So when things go wrong and somehow one of these regulations gets violated, where does the liability typically lie? Does it ever lie directly with the developers or is it more of a company level thing or how does that work?
Cody Ebberson 00:09:16 Yes, it can lie with the developers, and so that makes us very keenly aware. It makes you sit up straight and pay attention. I think that there’s typically a remediation period, and then there’s an ongoing investigation to figure out how did this go wrong, why did it go wrong? And they do try to look at intent and with what level of seriousness you were operating. If it’s gross negligence, then that’s where the really big penalties and the big fines come out. If it’s something that was a very deep and complicated edge case, then there’s maybe a little bit more leniency in those situations. But typically a government agency will roll up their sleeves, dig into what happened and try to make that type of assessment before levying fines and penalties.
Sam Taggart 00:09:58 So something that just popped into my head is how do you balance, and maybe this isn’t a question for you, but just curious if you’ve seen it done balancing the desire for people to actually report incidents so that everybody can learn from them versus if you smack people really hard, they’re going to be reluctant to disclose some of those incidents?
Cody Ebberson 00:10:17 Yeah, the first thing that comes to mind there is the security industry at large and the culture that exists there of things like bug bounties. And I think that that’s been very fascinating to watch over the last 10 or 20 years, how that’s evolved, because I think that for a long time, software developers and software organizations generally wanted to swat those people away. If someone came to you and said, hey, I think I found a vulnerability, the first thing you’d want to do is try and shut them up and make them go away. And what I think the industry has found over time is that you should welcome those, you should invite those people and make sure that you have a responsible disclosure program so that the issue can get fixed in a timely fashion. So I think that that has evolved nicely and has done very, very well.
Cody Ebberson 00:11:04 With regards to how government agencies enforce this, it’s perhaps a little bit less mature because it happens less frequently. It is done on a somewhat case-by-case basis for sure. Certainly in the cases of gross negligence, there have been cases in the past where there’s just an open FTP server accessible on the internet where you can download lots of patients’ data, and it was just security through obscurity. If you knew the IP address, you could just log in and grab a bunch of data. That’s pretty bad. And I don’t think they hold back at all when it comes to penalties and fees and fines. But if it’s that you were using an outdated encryption library, like an old version of SSL that had some long-tail vulnerability, I think that they’re generally pretty forgiving and understanding in those situations.
Sam Taggart 00:11:45 Great. So what are some of the challenges that you’ve seen complying with regulations?
Cody Ebberson 00:11:53 I think it often just feels like a large body of work associated with it, right? Sometimes you get the packet and it’s, okay, we’re going to start up a new compliance program for a new certification, or we have a new customer that’s based in Europe and now we need to figure out how to make sure that we’re meeting all the requirements for the EU or for Switzerland or whatever. Typically, there’s a variety of vendors that you’re going to work with who may or may not have automated tools. If you’re lucky, there’s a high degree of overlap with some of your existing compliance programs. If you’re unlucky, that could be hundreds of pages of very technical and dry legal documentation that you just need to start going through and translating into product and feature requirements and process requirements. It’s a form of work, but that’s part of the value proposition of the industry too. So it’s definitely a challenge. I’d say, if you think about the umbrella of software engineering, we often think about challenges as being hard technical problems, but more often than not, I think software engineering is working with process problems and cross-functional coordination problems, and compliance certification is just another layer of that. It’s another kind of role, another party in the room with your compliance and security folks, and it’s just making sure that all of those requirements are represented.
Sam Taggart 00:13:09 Okay. Yeah. My next question that popped into my head was about the process for taking these regulations and figuring out how we have to implement them in the software. How does that work? Is that mostly the security people or the compliance people making those decisions, or are they doing it with the developers? Are they just pushing it off on the developers, being like, you’ve got to comply with this? How does that work?
Cody Ebberson 00:13:28 Yeah, that’s a great question. It’s a little bit of all the above. There are some that are just global, you just have to do it, right? In our case, in healthcare, HIPAA is just table stakes. And so you pretty much have to have that as part of your process on day one. I think SOC 2 is kind of getting to that level. It is becoming a table-stakes certification that you just pretty much have to start and do. In other cases it might be at the time that you’re evaluating some new business opportunity, a new customer, a new sale, or moving into a new market. And then it’s perhaps a little bit more of a cost-benefit analysis. The lawyers and the compliance team will be in the room and part of that conversation, but you can boil that down to a dollars-and-cents pros-and-cons analysis.
Cody Ebberson 00:14:11 Then there’s the fact that these regulations are not static; they change over time and new rules can get added. And so typically every year or so the rules may change and you have to recertify or go through an audit again. That always happens as well. And in that case it just depends. Sometimes it is a long process of sitting down and translating requirements. I will definitely give a shout-out to various US government agencies that have now started publishing automated testing tools. So for example, Health and Human Services has started publishing some of their tools. So you can actually take your public HTTP API endpoints and plug them into some of their automated tooling, and they’ll run their battery of tests against your publicly accessible endpoints. And that’s a fantastic way to do it. It takes a lot of the guesswork out of it. It takes away a lot of the legal costs of poring through boring documentation, but that’s still relatively rare. I think it’s a great model for the future, and whenever it happens it’s fantastic.
Sam Taggart 00:15:06 Yeah. One question I had was kind of how you maintain it over time. Do the agencies publish when they change regulations? Is it your responsibility to constantly stay up on that? How does that work?
Cody Ebberson 00:15:18 They do publish new versions and technically yes, it is our responsibility to stay on top of all that. And as we recertify, that’s on us. In most of the certification programs there is an audit step, and so there are approved third-party auditors that will go through your product and all of your documentation, all of your policies, to make sure that you’re up to date and you’re in compliance. In practice, most organizations will use a security tool or provider. So for example, Vanta is the big player in this space. There are quite a few other competitors as well who provide automated tooling that will scan your environments, like actual agents that will run inside your cloud environments or agents that will run on your servers inside your private networks, to ensure that this whole battery of checks is actually being met. Then they’ll turn around and produce a lot of that documentation for you. It doesn’t produce a hundred percent, maybe 80 or 90%. So it does a lot of the work, but ultimately as a provider, our organization is still responsible. It’s still our butt on the line to work with the auditors and make sure that all of their requirements are satisfied.
Sam Taggart 00:16:23 So you mentioned cost earlier. Do you have any idea how much cost the regulations add? And then as an add-on to that question, have you ever run into situations where you’ve had to cancel a project due to cost, or do you usually figure out a way to pass that on to the customer? How does that work?
Cody Ebberson 00:16:39 Okay, well, one at a time. An approximation of cost is kind of tricky, and I think that it does change depending on the size of your organization as well. But it’s easily measured in the cost of full-time employees. So it’s a non-trivial cost for sure. But it’s, I guess, part of the cost of doing business for working in the space. With regards to customers, the easiest example would maybe be that you’re going to move into a new market, in particular going into Europe or Australia. These are places that often have very strict data sovereignty requirements, and their domestic government agencies will have their own requirements. And then yes, at that point it usually is a costing exercise to go through the requirements and try to make a guesstimate of time, money, engineering effort: what’s it going to take? Sometimes you can pass that on to the customer. That’s usually part of the overall enterprise deal negotiations, and it’s kind of a mixed bag where that’s all going to net out. I think for most organizations that becomes more of a strategy question than a technical question: is this market, is this geography important to us, and do we think it’s worthwhile moving into it? If it is, you go for it.
Sam Taggart 00:17:47 Okay. So bringing things back to developers, how can developers be quote “agile,” for whatever that means anymore, and maintain velocity in regulated environments? Yeah. Do you have any specific examples of that?
Cody Ebberson 00:17:59 I think it comes back to that idea of constraints. There is this popular perception that compliance, certification, and regulation are slow. And I would maybe try to reframe that just a little bit: it can be slower to start, but I personally believe, after years of doing this, that it’s worth going through that initial pain of building out all those unit tests and integration tests, taking all those regulations and translating them into product requirements and then into technical infrastructure that’s going to run on every build or every deployment. I actually find that if you’re willing to kind of pinch your nose and power through on that, you end up in a place where you actually have very, very high velocity, because you now have all these systems in place in an automated fashion. You can ship with confidence, you can build and add new features with confidence, you can refactor with confidence. Over time, that is the truest form of velocity: as you grow and scale, you can ship faster and faster. And I think that there’s an element of just software engineering best practices there, of making sure that you have your product requirements represented with some form of automated testing so that it doesn’t have to be a big manual effort. It doesn’t have to be a big legal or compliance push; your tools and your processes and your systems are the ones that are really keeping you accountable over time.
Sam Taggart 00:19:18 So you mentioned requirements. What happens when the requirements change? And along those lines, do you separate out product requirements from compliance requirements?
Cody Ebberson 00:19:29 We use a variety of techniques to try to stay organized within that realm. There’s a very high degree of overlap between those two, as you can imagine. We use a combination of code organization, putting various compliance packages into different folders or different packages or modules. With our documentation and issue tracking, we use various sub-projects or tags to represent all the different compliance and verification programs. Some of the programs are more specific on this point than others. For example, I’ve gone through FDA medical device before; that’s a whole separate process for clearing a medical device. And the FDA has a set of requirements regarding what they call the traceability matrix. So for any given feature, you have to go and make sure that you’ve identified and enumerated all the potential risks. And for every single one of those risks, you need to make sure that you’ve enumerated all the different tests that you can use to try to mitigate those risks.
Cody Ebberson 00:20:26 And you have to have a full tagging system that connects every single one of those concepts. And that has to be represented both in your documentation and also in your processes and in your code. So our current product is not FDA regulated, but we’ve tried to take a lot of those same concepts and apply them to our processes, just because once you get over the initial hump of doing all that work, you end up in a good place where you have all that infrastructure, those systems in place, and you can really crank through the documentation and add new programs and new certifications relatively quickly.
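The feature-to-risk-to-test linkage described here can be sketched as a small data structure plus a completeness check. This is only an illustration under assumed names (`Feature`, `Risk`, `untraced_items`, and the string IDs are all hypothetical); it is not the FDA's format or any specific tool's schema.

```python
from dataclasses import dataclass, field


@dataclass
class Risk:
    # One identified risk and the IDs of the tests meant to mitigate it.
    risk_id: str
    description: str
    test_ids: list = field(default_factory=list)


@dataclass
class Feature:
    # One feature and the risks that have been enumerated for it.
    feature_id: str
    risks: list = field(default_factory=list)


def untraced_items(features, known_tests):
    """Return (feature_id, risk_id, reason) tuples for every gap in the matrix."""
    gaps = []
    for feature in features:
        if not feature.risks:
            gaps.append((feature.feature_id, None, "no risks identified"))
        for risk in feature.risks:
            if not risk.test_ids:
                gaps.append((feature.feature_id, risk.risk_id, "no mitigating test"))
            for test_id in risk.test_ids:
                # A tag that points at a test which does not exist is also a gap.
                if test_id not in known_tests:
                    gaps.append((feature.feature_id, risk.risk_id, f"unknown test {test_id}"))
    return gaps
```

Running a check like this in the build, with `known_tests` collected from the test suite, is one way the tags in documentation and code could be kept from drifting apart.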
Sam Taggart 00:20:57 It’s interesting you mentioned traceability matrixes, because that was going to be my next question because we used to do those in the nuclear industry a lot.
Cody Ebberson 00:21:04 Yes. It’s daunting when you first see that, right? I mean I remember as I was first entering healthcare and first encountering FDA and you kind of get those packets of regulation, it does kind of hit you in the face with, whoa, this is going to be a lot of work. But I do believe that if you can power through on it, it creates a lot of value and both like financial value for your organization, but just also technical value in terms of really maturing and leveling up processes and systems.
Sam Taggart 00:21:33 Yeah, I think you built a lot of tooling around that pretty quickly, because otherwise I think you would be hating life.
Cody Ebberson 00:21:38 Yes, yes. You would need a lot of manual effort to go through and try to reverse engineer all that documentation and all those requirements.
Sam Taggart 00:21:45 So what role does CI/CD play in all of this?
Cody Ebberson 00:21:48 We’re big believers in CI/CD and we have embraced that since day one, so that on every commit it’ll go through the battery of tests, and if it satisfies everything it rolls straight out to production. And so we deploy dozens of times per day. That is, it’s not required anywhere. And I still think that it’s probably not quite the norm, especially in healthcare, just yet. It’s changing. I personally believe that CI/CD has a huge role to play if you’re willing to go through that pain of translating all those requirements into tests. And that is work for sure, and I don’t think there are any shortcuts there, but it’s trending in a good direction.
Sam Taggart 00:22:23 So do you find you’re generally able to make those tests reusable across projects or are they specific to a project?
Cody Ebberson 00:22:28 Certain tests, yes. Like anything where it’s, oh, this is an HTTP endpoint, therefore it has to meet all these basic security requirements. That generalizes really nicely across the board. When you get into the more nuanced and feature-specific, then maybe not. Some of the data requirements mandate specific data formats. In our world in healthcare, things like HL7 and FHIR are very common as standard and required data formats and data representations, so those become all of our schema validation, object validation, and data type validation, and all that is very deeply ingrained in all of our tests. And so that applies across the board. So that’s another good one.
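To make the idea of format validation as a reusable test concrete, here is a deliberately simplified sketch. It checks only three things about a FHIR-style Patient resource (the `resourceType`, the `birthDate` shape, and the `gender` code); real FHIR validation covers far more, and the `validate_patient` function name is an assumption for illustration.

```python
import re

# FHIR's `date` type allows YYYY, YYYY-MM, or YYYY-MM-DD.
# This pattern is a simplified form of that rule and does not check calendar validity.
DATE_PATTERN = re.compile(r"\d{4}(-(0[1-9]|1[0-2])(-(0[1-9]|[12]\d|3[01]))?)?")

# Administrative gender codes used by the FHIR Patient resource.
GENDER_CODES = {"male", "female", "other", "unknown"}


def validate_patient(resource: dict) -> list:
    """Return a list of human-readable validation errors (empty if none found)."""
    errors = []
    if resource.get("resourceType") != "Patient":
        errors.append("resourceType must be 'Patient'")

    birth_date = resource.get("birthDate")
    if birth_date is not None:
        if not isinstance(birth_date, str) or not DATE_PATTERN.fullmatch(birth_date):
            errors.append("birthDate must be YYYY, YYYY-MM, or YYYY-MM-DD")

    gender = resource.get("gender")
    if gender is not None and gender not in GENDER_CODES:
        errors.append("gender must be one of male, female, other, unknown")
    return errors
```

Because the check depends only on the data format and not on any one feature, the same function could be reused across every endpoint that accepts or returns that resource type.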
Sam Taggart 00:23:10 Good. So I’m personally a big fan of test-driven development and this idea of emergent or just-in-time design. Do you find that the regulations force you to do more design upfront, or are you able to kind of design as you go, or how does that work?
Cody Ebberson 00:23:23 So our team includes mostly experienced industry veterans. And I think that after being in the space, you develop some scar tissue, and you develop a strong desire to do some of that work upfront to make sure that your systems are going to support it. It’s like you’re laying the foundation of the house: you want to make sure that you’re building on a rock-solid foundation. With the evolving nature of requirements, it’s, oh shoot, we need to add a garage to the side of the house, and we didn’t have a foundation, there was no concrete there. That never feels great, and it feels like you have to do this mad dash and lots of organizational refactoring and code refactoring and project refactoring. So that doesn’t feel as good. I think that when we do embrace it and we know our list of requirements upfront and we can plan accordingly to have everything properly organized, it definitely gives a better sense of security and sustainability as you think about the future and ongoing growth of the project.
Sam Taggart 00:24:17 Yeah, that just brought up an interesting thought in my head for some reason, and that is if you have multiple groups working on different parts of a project, how do you coordinate between them and how do you make sure that they’re all staying in compliance and that like the workflow between the groups stays in compliance? I would think that would be a challenge.
Cody Ebberson 00:24:34 And I think that one of the hardest parts of any software system to test in an automated fashion is your contract testing or boundary testing, right? And that’s not even so much a unique challenge to regulated environments. That’s just a hard software engineering challenge. And I think what that comes down to is making sure that your tests are represented at multiple different layers. So it’s not just unit tests for each of the component modules, but also, as you’re deploying to canary environments and staging environments and whatnot, that you’re running integration tests that properly test the full end-to-end requirements or user stories or what have you. That has to be tested at that holistic level as well.
Sam Taggart 00:25:13 So what mix would you say are unit tests versus end-to-end tests in general?
Cody Ebberson 00:25:18 Good question. I personally don’t love that distinction because I think that there’s kind of a blurry area in the middle there too. For things like data formats or basic algorithms that conform nicely to a unit test, that’s always better, right? You’re closer to the code, it’s going to run faster, the development and test cycle is just a tighter loop, and you can just move so much faster. When you get to very complex systems and you have microservices, or any degree of multiple services that are interacting with each other, now you basically force yourself into a world of requiring those integration tests and end-to-end tests. I’d say if I were to just put some simplified numbers on it, we’re probably not too far off from a 50/50 blend of lots of small unit tests and then lots of integration tests. And those integration tests have to go all the way to making sure that you’re testing against a proper database and a Redis cluster and multiple servers and multiple database instances, to make sure that the full complexity of the system is being represented there.
Sam Taggart 00:26:22 Yeah, so do you find then you don’t do a lot of mocking, or do you do a lot of mocking and test stubs and things like that?
Cody Ebberson 00:26:28 We try to avoid mocking as much as possible, just as a general principle. The key exceptions are where you’re using various third-party cloud services, for sending emails or for storing objects in AWS S3 or something like that. We mock those endpoints, but for everything else we have a tendency to avoid mocking as much as possible. I think perhaps one of the more controversial ones would be whether you mock out your database calls. For most of our server-side application code that would ever result in running a SQL query against a Postgres database, we have a strong preference for making sure that those are actually hitting Postgres as we run our test suites. I personally believe that trying to mock database queries is, for us at least, not a good fit, and we get a lot more confidence when it’s actually touching the database. And where that has proven to be very valuable is as our project grows in age and you start to go through various versions of Postgres or various versions of Redis. The Postgres 12 to 14 upgrade, say, uncovers a whole bunch of subtle changes and various syntactic differences that we would not have caught if we had been mocking out those various infrastructure components. So integration testing at that level, I’m a huge fan of.
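A small sketch of that split, mocking only the third-party boundary while running real SQL, might look like the following. Everything here is an assumption for illustration: SQLite in memory stands in for Postgres so the example is self-contained, and `register_patient`, the `patients` table, and the email client's `send` method are hypothetical.

```python
import sqlite3
from unittest.mock import Mock


def register_patient(conn, email_client, name, email):
    """Insert a patient row and send a welcome email; returns the new row id."""
    cursor = conn.execute(
        "INSERT INTO patients (name, email) VALUES (?, ?)", (name, email)
    )
    conn.commit()
    # Third-party service call: this is the only part the test replaces with a mock.
    email_client.send(to=email, subject="Welcome")
    return cursor.lastrowid


def run_integration_check():
    """Exercise register_patient against a real SQL engine and a mocked email client."""
    # Real database engine (SQLite as a stand-in for Postgres), so the SQL is actually executed.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE patients ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT NOT NULL)"
    )
    email_client = Mock()

    new_id = register_patient(conn, email_client, "Ada", "ada@example.com")
    row = conn.execute(
        "SELECT name, email FROM patients WHERE id = ?", (new_id,)
    ).fetchone()
    conn.close()
    return row, email_client
```

In a real suite the connection would point at the same database engine and version used in production, since version-specific behavior is exactly what a mocked query layer would hide.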
Sam Taggart 00:27:41 Good. So how does regulation affect maintaining legacy code?
Cody Ebberson 00:27:46 That’s a good question, and I think maybe it’s a case where sometimes the regulation can be at odds with what the market wants and certainly what customers want. Sometimes the regulation is going to push you to upgrade to newer features, newer security protocols, newer encryption protocols, et cetera. And from a software engineering perspective, I think that’s great. We should keep moving the line forward. But oftentimes, for a variety of reasons, maybe a customer or a partner does not want to upgrade. So now you have a bit of a pickle: how far back should you maintain backwards support, legacy support? That’s kind of a case-by-case basis. I think that typically there are outs for grandfathering things in. In healthcare, one of the funniest examples is faxing.
Cody Ebberson 00:28:34 Typically we think of faxing as crazy outdated legacy technology, but it’s still very widely used in healthcare as an interoperability format primarily because it’s grandfathered in. And I personally think it’s hilarious when you read through these various like healthcare requirements and it’s, oh, mandating this form of encryption and this security protocol, da dah dah, dah, and then you get to the section on faxing and it’s like, try to make sure the fax machine is in the back room so people can’t see the pages as they’re getting printed out. And so that’s a pretty stark juxtaposition. But I think it’s just acknowledging the truth that sometimes legacy systems are important, and you have to maintain them just as a pragmatic matter.
Sam Taggart 00:29:13 Yeah. So how do you balance security versus stability? Because I would think that, and I think maybe you hinted at this a little bit, security says that you should update your things to patch the latest vulnerabilities, but then that has to affect stability. And in addition to that, I assume there’s some cost to revalidate stuff after you change it.
Cody Ebberson 00:29:31 Yes, absolutely. And it’s a very common pattern for us. As we’re introducing a new feature, we believe strongly in things like feature flags and deploying to production but in a disabled fashion, or running multiple implementations in parallel. You have your existing implementation, which is providing the customers with whatever services you’re providing, but you’re running your next version in parallel as a bit of an internal smoke test or an internal canary, and comparing the results: here’s what the old version did, here’s what the new version did, are they producing the same results? If yes, great. If no, okay, let’s set off some alarm bells. Either that comes back to our team and we need to check our assumptions and see whether this is breaking something, or, if it’s something about how the client was interacting with the service, then we might need to engage them in a conversation and start a migration process.
Cody Ebberson 00:30:26 So obviously it’s a known bad thing to test in production and have your customers do your testing for you. But I think that when you deploy multiple versions and have one in a kind of shadow-enabled fashion, you can, quote unquote, let your customers test for you. You’re doing it in a log-only fashion until you’ve achieved some level of confidence that, okay, for all of our existing customers and all of our current usage, we can flip the switch with high confidence that it’s going to work as expected.
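The parallel, log-only comparison described in this answer might look roughly like the following Python sketch. It is an illustration under stated assumptions rather than the actual implementation: `ShadowRunner`, its field names, and the use of a request identifier are all hypothetical. The existing implementation always serves the response; the candidate runs alongside it, and any difference or failure is only recorded.

```python
import logging
from dataclasses import dataclass, field
from typing import Any, Callable, List

logger = logging.getLogger("shadow_compare")


@dataclass
class ShadowRunner:
    """Serve results from `current`; run `candidate` in log-only mode."""

    current: Callable[[Any], Any]
    candidate: Callable[[Any], Any]
    enabled: bool = True  # acts as the feature flag for the shadow path
    mismatches: List[str] = field(default_factory=list)

    def handle(self, request_id: str, payload: Any) -> Any:
        result = self.current(payload)
        if self.enabled:
            try:
                shadow = self.candidate(payload)
                if shadow != result:
                    # Log only the identifier, never the payload itself.
                    self.mismatches.append(request_id)
                    logger.warning("shadow mismatch request_id=%s", request_id)
            except Exception:
                # A crash in the candidate must never affect the caller.
                self.mismatches.append(request_id)
                logger.exception("shadow failure request_id=%s", request_id)
        # The caller always receives the existing implementation's result.
        return result
```

Once the list of mismatches stays empty across representative traffic, the switch to the new implementation can be made with more confidence.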
Sam Taggart 00:30:53 Yeah, I’ve heard of that before. I believe I’ve seen that referred to as the strangler fig pattern.
Cody Ebberson 00:30:58 Strangler, I’m not familiar with that term.
Sam Taggart 00:31:00 Basically, yeah, I think there’s these fig trees that grow up and they grow around an existing tree and then eventually the existing tree dies, and you just end up with this hollow shell or something.
Cody Ebberson 00:31:09 I love it. Yes, it’s a great metaphor and it kind of perfectly describes it.
Sam Taggart 00:31:14 You also mentioned logging. Can you talk about some of the challenges with logging?
Cody Ebberson 00:31:19 I mean, one of the big challenges is that we log a ton of data. If a user runs a search and that search includes data from potentially multiple patients or multiple concepts or entities or resources or whatever, we’re going to log a whole bunch of stuff in that moment. That’s mostly due to compliance or regulatory requirements, but often it’s what customers want as well, because if anyone has ever seen any bit of data, they want to be able to track that, record it, analyze it, and look for anomalies, et cetera. So we do log a ton, primarily for security, access controls, and safety. But setting aside the compliance and regulatory side, I’m a big believer in logging a ton: both log lines and also telemetry and metrics, like OpenTelemetry, system health, system performance, dashboards, dashboards, dashboards, and alerting on all of that. That’s more of a system health and system stability perspective, but it all comes together in kind of a unified package.
Sam Taggart 00:32:18 Yeah, I’ve got a bunch of questions about logs. So one question I have is, how do you avoid logging sensitive information?
Cody Ebberson 00:32:24 We treat our logs as PHI, which in healthcare is Protected Health Information. So the log data itself has the exact same protections and controls as the production database has. And I think this is relatively common in healthcare: you can have a kind of subscriber relationship to subsets of logs. So as a service provider, we can pipe the relevant logs to our customers as well. And that becomes part of that handoff agreement. We can strip all of the protected information, so rather than showing a person’s name or a lab result, everything can be distilled down into just identifiers and abstract values like that. And if that’s what they want, that’s what they want. It’s also possible to have hydrated data as well. That’s more of a question of what the use case is for the data.
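The choice between stripped and hydrated log feeds can be illustrated with a short Python sketch. The field names and the `for_customer_feed` function are hypothetical, chosen only for the example; a real system would derive its list of protected fields from its own data model and compliance review.

```python
from typing import Any, Dict

# Hypothetical field names treated as protected for this illustration.
PROTECTED_FIELDS = frozenset({"patient_name", "birth_date", "lab_result"})


def for_customer_feed(event: Dict[str, Any], hydrated: bool = False) -> Dict[str, Any]:
    """Return a copy of a log event suitable for a customer's log feed.

    With hydrated=False, protected fields are dropped so that only
    identifiers and abstract values remain. With hydrated=True, the full
    event is passed through, which assumes the receiving side applies the
    same protections as the production database.
    """
    if hydrated:
        return dict(event)
    return {key: value for key, value in event.items() if key not in PROTECTED_FIELDS}
```

For example, an event containing `patient_id`, `patient_name`, and `lab_result` would be reduced to just the `patient_id` and any non-protected metadata in the stripped feed.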
Sam Taggart 00:33:14 Yeah, I would imagine the full data is probably better for developers trying to debug stuff.
Cody Ebberson 00:33:20 I mean, it’s always a trade-off. Yeah. And developers accessing that data is a long conversation all by itself. I think there’s a kind of funny irony when it comes to data controls and access. If you think about the access controls that exist at Google for something like Gmail, there’s famously a high, high, high bar for protected data so that no one can ever access that data. If, for some kind of system stability issue or a bug or an investigation, you do need to get access to that data, it’s a major process. There’s a long list of processes and controls you need to go through to kind of break glass and investigate there. I think that the rest of the software engineering world should aspire to get to that level of controls, and we certainly have an eye towards that future where everything is at that level of protection. I think it’s really the gold standard that we should all be striving for.
Sam Taggart 00:34:12 Yeah. I have another question around logs. How do you ensure immutability for like auditing purposes and things like that?
Cody Ebberson 00:34:19 So first, for our production databases, we are strong advocates for the WORM model, right? Write once, read many times. So most of our databases are append-only, and a quote unquote update operation is really creating a new version of a resource. So for any given notion of a patient or notion of an appointment, every single change is represented as a new version. You have perfect historical tracking of every change and who made that change. And that’s in the production database itself. Then as that flows through the system and ends up in logs and analytics and data warehouses, that’s more of a process question than it is a strict technical question. But everything is configured to be append-only, right? Making sure that that data stays perfectly protected and hygienic.
Cody Ebberson 00:35:07 Some of the compliance programs we talked about earlier, in particular something like SOC 2, come with a list of controls, and that translates into configurations. For example, inside of our Amazon Web Services environment, there are various AWS features like CloudTrail and, I think, GuardDuty. With these AWS features you can put various services into a kind of lockdown mode. And if anyone tries to make any changes to anything, it’s going to set off a whole bunch of alarm bells. It’s going to send emails, it’s going to send Slack notifications saying, hey, someone’s in there trying to make a change to one of these protected services. Someone should probably check this, if they had access at all in the first place.
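The append-only model from this answer, where an update creates a new version rather than overwriting the old one, can be sketched in memory with Python. This is a toy illustration under assumptions, not a description of the production schema: `AppendOnlyStore`, `Version`, and their fields are hypothetical names, and a real system would enforce the same rule in the database itself.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, List


@dataclass(frozen=True)
class Version:
    """One recorded version of a resource; fields cannot be reassigned."""

    resource_id: str
    version: int
    author: str
    written_at: datetime
    data: Dict[str, Any]


class AppendOnlyStore:
    """Every write appends a new version; nothing is updated or deleted."""

    def __init__(self) -> None:
        self._log: List[Version] = []

    def write(self, resource_id: str, data: Dict[str, Any], author: str) -> Version:
        # An "update" is just the next version number for the same resource.
        next_version = 1 + sum(1 for v in self._log if v.resource_id == resource_id)
        entry = Version(resource_id, next_version, author,
                        datetime.now(timezone.utc), dict(data))
        self._log.append(entry)
        return entry

    def latest(self, resource_id: str) -> Version:
        for entry in reversed(self._log):
            if entry.resource_id == resource_id:
                return entry
        raise KeyError(resource_id)

    def history(self, resource_id: str) -> List[Version]:
        # Full change history, oldest first, including who made each change.
        return [v for v in self._log if v.resource_id == resource_id]
```

Reading the current state means taking the latest version, while the history gives the audit trail of every change and its author.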
Sam Taggart 00:35:50 Okay. And then I guess that would also apply to configuration stuff in general. In your case I think you guys are doing more web service stuff, but if you had some sort of device that you gave to somebody and it had some configuration on it, you would somehow validate that as well?
Cody Ebberson 00:36:04 Yes. It’s like the golden bits model: you’ve produced an artifact, whether that’s a binary or a distributed bit of code, and it goes through its own validation. In our case, we have our primary web services, and we have some on-premises agents for connecting with legacy devices that are not necessarily cloud enabled. Same thing there. And I think there’s a set of industry best practices: you produce the binaries, you produce the artifacts, you also produce the checksums and publish the test reports that are associated with those. We’re just trying to follow best practices across the board on things like that.
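Publishing a checksum next to an artifact, and checking it before the artifact is used, can be sketched with Python’s standard library. SHA-256 is an assumption here, since the conversation does not name a specific algorithm, and the function names are hypothetical.

```python
import hashlib
import hmac


def sha256_hex(artifact: bytes) -> str:
    """Compute the checksum that would be published alongside the artifact."""
    return hashlib.sha256(artifact).hexdigest()


def verify_artifact(artifact: bytes, published_checksum: str) -> bool:
    """Check downloaded bytes against the published checksum.

    compare_digest performs a constant-time comparison of the two
    hex strings.
    """
    expected = published_checksum.strip().lower()
    return hmac.compare_digest(sha256_hex(artifact), expected)
```

A consumer of the artifact would call `verify_artifact` with the downloaded bytes and the checksum from the release notes, and refuse to install on a mismatch.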
Sam Taggart 00:36:35 Great. So we mentioned the CrowdStrike incident earlier, and I saw an interesting take and wanted to get your thoughts on it. There were a lot of people saying that they should have done more manual testing. And someone else pointed out that if you have manual tests and then something falls through the cracks, the solution is then to add another checkbox. But how do you verify that that checkbox actually gets checked and the work actually gets done? Versus if it’s automated, you update your script once and then hopefully that should never happen again because it’s now in the script. Do you have any thoughts on that?
Cody Ebberson 00:37:05 I would tend to agree with the belief that it should have been an automated test. I mean, human testing and manual validation have been with us for a long time, and they will probably be with us for a long time into the future. But I think that that’s getting pushed further and further out. And as an industry it’s just better if we embrace more of the automated testing. The automated testing certainly gets more complicated, especially in their case, where there’s a huge number of different devices and network configurations and deployment configurations. But that’s also, I guess, part of the job, right? To make sure that you’re getting your services and your code into environments that are representative of the real world as much as possible.
Sam Taggart 00:37:47 So, I guess in the medical and other more highly regulated industries, you try to control that hardware as much as you can if you have the option, I would imagine.
Cody Ebberson 00:37:54 Yeah, absolutely. I mean, the extreme case here would be things like embedded devices like a pacemaker, right? Where you have old school software engineers who are writing code that blurs the lines of application code and operating system code, because it needs to run for a period of years in a completely automated and offline environment. And that’s probably where some of the most extreme regulations and testing requirements exist: for those embedded medical devices. And then the spectrum kind of goes from there.
Sam Taggart 00:38:25 Great. All right. Well thank you very much.
Cody Ebberson 00:38:29 Thank you so much, Sam. It’s great chatting with you.
Sam Taggart 00:38:31 For SE Radio, this is Sam Taggart.
[End of Audio]