Joe Kutner, Software Architect for Heroku at Salesforce.com, discusses the twelve-factor app. The twelve-factor app is a methodology that aids development of modern apps that are portable, scalable, and maintainable. Host Kanchan Shringi spoke with Kutner about the origin of these principles; their continued and growing importance with advances in microservices, DevOps, and containerization; and why developers should adopt the principles to build modern apps.
- The Twelve-factor website
- Joe Kutner’s Blog
- Joe’s Github
- Joe Kutner on Twitter https://www.twitter.com/codefinger
- Joe’s talk on twelve-factor app
- “Beyond the Twelve-Factor App,” by Kevin Hoffman
- Sample App using twelve-factor principles with Docker and Node JS
- Docker Run
References by Public cloud vendors
- A dozen reasons why Cloud Run complies with the 12-Factor App methodology
- Building Microservices with the 12-Factor App Pattern on AWS
- What about Cloud-Native applications?
- Get your enterprise apps ready for the cloud
- Cloud Native Application Development – A New Computing Paradigm
- Episode 268: Kief Morris on Infrastructure as Code
- Episode 220: Jon Gifford on logging and logging infrastructure
- Episode 375: Gabriel Gonzalez on configuration
Transcript brought to you by IEEE Software
Kanchan Shringi 00:51 Hello, this is Kanchan Shringi for Software Engineering Radio. My guest today is Joe Kutner. Joe is a software architect at Salesforce.com working on the Heroku platform. He’s a founding member of the Cloud Native Buildpacks project. This project is based on Heroku technology that helped make 12-factor methods popular. Joe has written three books for the Pragmatic Bookshelf, including Deploying with JRuby and The Healthy Programmer. Joe has spoken at several conferences on the 12-factor app. Welcome to Software Engineering Radio, Joe.
Joe Kutner 01:28 Hi, I’m glad to be here.
Kanchan Shringi 01:31 Today we’ll be talking about the 12-factor app, what it is, and how it can help developers and teams building modern applications. I would like to start by asking you, Joe, what is the app in the 12-factor app?
Joe Kutner 01:45 Well, the 12-factor app is a methodology or a manifesto, and it was written or published by Heroku back in, I think, 2012. So it’s getting pretty old now, eight or nine years old. It described a way of building and deploying applications that was better suited to the emerging cloud technologies at the time. You have to remember that before 2012, most of us were deploying our applications in in-house data centers, and very few folks were even using AWS. So the tools and technologies that we used to build our apps and put them into production were more reminiscent of the early 2000s. When Heroku came along, they introduced a new paradigm for deployment, the platform as a service. And with that came a new, more modern way of defining your application, preparing it for production, and ultimately deploying it.
Kanchan Shringi 02:56 I see. So did supporting the apps at Heroku get better after the methodology was published? And could you also maybe give us a couple of examples of what triggered writing the factors?
Joe Kutner 03:12 Yeah. So at Heroku today we host, I think, more than 10 million applications, and we handle about 23 billion requests per day. So we’re seeing a lot of different apps on the platform, and we have just our company essentially operating them. So one of the real tenets of Heroku, when it was first created, was to be opinionated about how you should use the platform and how you should create your apps to run on it. The 12-factor app was really a part of that set of opinions that we had, to encourage our users, our customers, to build their apps in a certain way that worked better on the platform, not only for us to operate them, but also for our customers, to ultimately scale better and to have better security and better maintainability characteristics.
Kanchan Shringi 04:05 So did that start happening after this was published?
Joe Kutner 04:09 Yeah, it did. In fact, the ecosystems that we supported at Heroku originally, which were exclusively the Ruby language and the Ruby on Rails ecosystem, were very quick to adopt these principles. They were just well suited to those technologies. But over time, over several years, we’ve now seen that pretty much every ecosystem, whether it’s Java or Python or Node.js, is really embracing these principles. Part of that is because of the initiative that Heroku took to push these ideas. But it’s also just the fact that it led to these better characteristics. And the end result is, yeah, when you deploy an app on a platform like Heroku, and there are other platforms that are just as suitable for 12-factor apps, it ends up being a better experience for both the customer, the user of the platform, and the folks operating it.
Kanchan Shringi 05:12 As you mentioned, it’s been some time since these were published. So how applicable are these now, with all the advances we’ve seen with microservices, DevOps, and containerization?
Joe Kutner 05:25 I think there’s a lot of things that have changed, but I think the most remarkable thing is how much is still the same, how much these principles really still apply. Many of them, at Heroku, we’ve definitely revisited and sort of reconsidered, but at the end of the day, these are still the design principles that we use for the majority of our applications. I think there’s a few areas, especially when it comes to microservices, where we’ve loosened up our opinion. For example, the first factor, which is about code and using source code version control systems, says that you should only have one application per code repository. I still feel that way, but we definitely use monorepos internally that include many different microservices. So there are times where I think you need to know when to step away from the paradigm and do something a little bit different. But for the most part it holds up, even in a world where we’re deploying to Kubernetes with Docker, which again is very different from the cloud ecosystem when Heroku was first created and when the 12-factor app was first published.
Kanchan Shringi 06:39 So would you add more principles at this time?
Joe Kutner 06:43 Yeah, I think so, but 12 is such a good number. There are definitely some principles that I think would be included if it were written today. One of those is security. And I should say, the idea of adding more principles really came from a book called “Beyond the Twelve-Factor App,” written by Kevin Hoffman. So one of those principles is security: the idea that the security of our applications should come first. It shouldn’t be an afterthought. We shouldn’t tack security onto the end of an application. This is, I think, not something we considered in the same way 10 or 15 years ago, when HTTPS was not everywhere. Now it’s very easy to get a certificate for your application with Let’s Encrypt. But it’s also things like using role-based access to your resources, so that you can have a good audit trail.
Joe Kutner 07:41 These are things that need to be built into your application’s design up front. Another principle that I think is really worth including in the 12-factor app is telemetry. In “Beyond the Twelve-Factor App,” Kevin makes this analogy to a space probe: your application is like this craft that you’re sending into space, and it’s not coming back to Earth. So if there’s a problem with it when it’s in orbit, you can only rely on the instrumentation that you installed before it was launched. We need to think about our apps that run in the cloud the same way. If there’s a problem in production, you need to have the metrics, the telemetry, the tools that you need to debug it right there with it, designed and included up front.
Kanchan Shringi 08:29 So we’ll put a reference to the book “Beyond the Twelve-Factor App” in the show notes so that readers can reference that too. My next question is: is building a 12-factor app easier with specific programming languages, platforms, or architectural styles?
Joe Kutner 08:47 Yeah, it definitely lends itself well to certain language ecosystems, but again, eight or nine years later, most ecosystems have picked up these principles. In the Java ecosystem, I think the vast majority of Java developers are working with Spring Boot, which is very intentionally designed to embrace the 12-factor principles. So everything that’s on the happy path for a Spring Boot application is just 12-factor, right? And that makes it very easy. There are other ecosystems, I think like PHP, that have not fully come around to these ideas, and you have to embrace technologies that might not be as mainstream as others. But for the most part, I think it’s permeated just about every major popular programming language and programming language ecosystem.
Kanchan Shringi 09:42 That makes sense. I personally checked and I saw that all the major cloud vendors actually reference these principles. So we’ll have some of those references in our show notes as well. Could you talk about single-page apps, mobile apps, or serverless apps? Do all the principles apply to these, or potentially a subset?
Joe Kutner 10:06 Yeah, it depends. For things like serverless, I’m a little biased, but I think serverless is not that different from what we’ve had on a modern PaaS for a really long time. It’s just sort of a new way of thinking about how you write the code. So certainly the idea of disposability, which is one of the 12-factor principles, is really what serverless is all about: I can dispose of this function, or this unit of work, on a whim. I can scale to zero or whatever it may be. And then of course there’s how you manage your backing resources and your configuration. I think that’s all still very much the same. When it comes to single-page apps, it’s definitely a little different. You don’t have exactly the same concerns about scalability on the backend. The idea of processes or disposability might be very different, but certainly how you manage your code and how you manage your configuration, I think all of those things still apply.
Kanchan Shringi 11:09 So you mentioned a few things, configuration and disposability, and these are all principles that are outlined in the 12-factor app. Later in the show we will drill into some of these principles, but before that, could you give us the broad areas that they cover?
Joe Kutner 11:33 Yeah, I think that’s actually very doable because so many of them are interrelated. I think there’s two parts to the factors. One is about how you manage your configuration and your deployment, such that you aren’t hard-coding secrets and you don’t have these snowflake builds that ship to production. And then there’s another set of principles that are more related to the operational side of running in production. Those are things like disposability and stateless processes, where it’s really intended to favor an application that is scalable, that can run in the cloud, and that you’ll be able to maintain for a very long time. So I think that’s really the foundation of the 12-factor app: scalability and maintainability.
Kanchan Shringi 12:23 Okay. Let’s shift gears a little bit now and talk about the team. Would adopting these principles change the development process or the team composition?
Joe Kutner 12:36 I’m not sure about the team composition. There are definitely elements of this that affect how you work on your applications. One of the principles is dev/prod parity, and this is a principle that encourages you to make your development environment, your staging environment, and all the other environments as similar as possible to your production environment. For a lot of developers, this is different from how we typically do local development, where you have an instrumented process that can hot load code. That is still fine, but in the 12-factor way of doing things, you’re going to run it a little bit more like you would run it in production. That can be a big shift for people. And I think it’s one of the principles that we, again, have kind of backed off a little bit on, because those local development tools that sort of break 12-factor really increase people’s productivity. So I think it’s important to be able to run an application locally with dev/prod parity. But again, you need to know when to eject and use the tools that just help you get your job done.
Kanchan Shringi 13:49 Can the principles be categorized into mandatory versus very important versus nice to have? The reason I ask is, can a team pick a few of these principles to start, let’s say if they’re refactoring an app, or is it necessary to implement all of them to see the benefits?
Joe Kutner 14:10 I think it really depends on your deployment target, on what the platform that you’re going to run on in production supports, right? On a platform like Heroku, which is very opinionated, you don’t have a great deal of choice. This is the way Heroku works, and if you want to do logging a different way, it might just not work. Now, other platforms give you a little bit more flexibility. Logging is probably the best example, and maybe admin tasks; a platform might handle these in a different way. Then you can choose whether you want to log to standard out as a sort of event stream, or whether you want to log to files and have something rotate them. You could also choose, rather than running admin tasks in their own container, to create a shell into a container and run them there. So if the platform gives you the ability to do those things, yes, you have a choice. But I think most platforms that are similar to Heroku, and the ones that come to mind are Cloud Foundry and Google’s App Engine, are essentially requiring the same kinds of things, the same principles.
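As an illustration of the first logging option Joe describes, the following minimal Python sketch treats logs as an event stream on standard output and leaves routing and rotation to the platform. The function name `build_logger`, the logger name `app`, and the line format are assumptions made for this example; they do not come from the episode.

```python
import logging
import sys


def build_logger(name: str = "app") -> logging.Logger:
    """Return a logger that writes one event per line to standard output."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Attach the handler only once so repeated calls do not duplicate lines.
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

Calling `build_logger().info("request handled path=/health status=200")` emits a single line on standard output. The application never opens or rotates a log file; whatever runs the process decides where the stream goes.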
Kanchan Shringi 15:26 So it sounds like the team would need to decide based on the platform, and maybe prioritize based on what the platform offers, but implementing even a few is beneficial. Is that fair?
Joe Kutner 15:38 Yeah, I think that’s fair.
Kanchan Shringi 15:40 Okay. We’d like to now drill into some of the factors. It may not be possible to cover all of them, so let’s start with the ones that I either found hard to understand or that seem more important. Let’s start with the declaration and isolation of dependencies. Could you describe this one?
Joe Kutner 16:04 Right. This principle states that the dependencies of your application should be explicitly declared and managed. I think the best way to understand this principle is to look at the inverse, which is where you have dependencies that are expected to be on the host, or that you’re not defining as part of your application; you’re defining them somewhere outside of your application. For example, take the runtime that executes your code, whether it’s a JVM or a Node.js runtime. If you don’t have an explicit version and you’re not managing how it is installed, it becomes just this global dependency that’s not well managed. In the Java ecosystem, we also see people checking compiled binary JAR files into their source code repository rather than using a proper dependency management system like Maven or Gradle to include them, which is both a better development experience locally and easier to manage as part of your CI/CD process.
Kanchan Shringi 17:13 Would you have examples of dependency management for some other languages other than Java?
Joe Kutner 17:19 Yeah. If you’re using npm and a package.json for a Node.js application, then as long as you are including all of the things that your application needs to run in production as part of that dependency management solution, npm, you’re abiding by this principle. And it’s also okay to combine technologies. We see a lot of applications that have a Ruby Gemfile and a package.json, or a pom.xml and a package.json, because they have some front-end assets that need compiling or preparing for production. So every ecosystem has some dependency management solution for this. It’s really about using it the way it’s supposed to be used, rather than depending on the implicit presence of some execution engine that you’re going to call out to, or something like that.
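The same idea can be sketched for Python, which the episode does not cover: declare exact versions, then check that the running environment matches the declaration. The helper names below (`parse_pins`, `installed_version`, `find_mismatches`) and the `name==version` pin format are assumptions for illustration, and the sketch skips package-name normalization, so treat it as a starting point rather than a complete audit tool.

```python
from importlib import metadata
from typing import Callable, Dict, Optional, Tuple


def parse_pins(requirements_text: str) -> Dict[str, str]:
    """Parse 'name==version' lines, ignoring blank lines and comments."""
    pins: Dict[str, str] = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, sep, version = line.partition("==")
        if not sep:
            raise ValueError(f"dependency is not pinned exactly: {line!r}")
        pins[name.strip().lower()] = version.strip()
    return pins


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(
    pins: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each unsatisfied pin to (declared version, actual version)."""
    mismatches: Dict[str, Tuple[str, Optional[str]]] = {}
    for name, wanted in pins.items():
        actual = lookup(name)
        if actual != wanted:
            mismatches[name] = (wanted, actual)
    return mismatches
```

An empty result from `find_mismatches(parse_pins(text))` means every declared dependency is present at the declared version. It does not prove that nothing undeclared is being used, which is the harder problem discussed next in the conversation.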
Kanchan Shringi 18:16 What can happen if the development team does not pay attention to this principle?
Joe Kutner 18:22 Well, certainly you can run into cases where you expected one version of a dependency to be present and it was not; there was some other version, and then you get an interface incompatibility at runtime. In the worst-case scenario, that dependency you expected to be there is not there, and things break pretty quickly. But I think it also leads to problems with security, in terms of being able to audit what is actually running in your production environment. And sometimes, again, with your CI/CD process: how do you make sure that the deployment target has all of the things that you need in order to get the application working?
Kanchan Shringi 19:03 So how does a team validate or be sure that they have actually followed this principle?
Joe Kutner 19:09 I don’t know if there’s a really good way. I think this is one of those problems that many people have tried to solve by instrumenting the environment to see if you’re calling outside of the application. But I think it’s a real challenge to correctly verify this. That’s why we see a lot of security folks working on mechanisms for image scanning, to see if they can just catalog everything that’s in the image. It’s a difficult thing to know whether you have some dependency that you haven’t really captured well, so I’m not sure there is a good way.
Kanchan Shringi 19:47 So you mentioned image. Can you define that a little bit further?
Joe Kutner 19:51 Yeah. When I say image, I mean essentially a Docker image or, more generically, an OCI image. But the same principles are true for any kind of image format, which could be a virtual machine image, or even something as simple as a tarball in some cases. The process that I was referring to was some tool that can scan the image, inspect what is inside of it, and catalog that so you can audit it.
Kanchan Shringi 20:19 Let’s talk about the configuration principle, which says to store configuration in the environment. Why do that?
Joe Kutner 20:27 Right. So the main purpose of this principle is to encourage you to decouple configuration from the application. I think the principle states specifically that you should not check your secrets, your passwords, into your version control system. I know most people would not check their own personal password into their Git repository, or at least I hope they wouldn’t, but many of us have checked in an AWS token or a database password, and doing this creates two problems. One, the source code repository is not encrypted, so when you store it somewhere, even if it’s on a private server, it’s still stored in cleartext. So it’s a security problem. But it’s also a problem for portability, because when you couple your configuration, whether it’s credentials or some other aspect of how the application executes, to the application, it becomes difficult to move that application to a different environment without actually changing the code. In the worst cases, changing your database requires making a commit and then deploying that commit, which greatly complicates the process. If you have, for example, a security problem with your database and you need to remediate it by rolling the credentials, you now have to deploy your application rather than just changing your configuration. And that can be very problematic.
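A minimal Python sketch of this decoupling might look like the following. The variable names `DATABASE_URL`, `AWS_TOKEN`, and `DEBUG`, and the `Settings` and `load_settings` names, are assumptions chosen for the example; the point is only that the code names the configuration it needs and never contains the values.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str
    aws_token: str
    debug: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from the environment and fail fast if any are missing."""
    required = ("DATABASE_URL", "AWS_TOKEN")
    missing = [key for key in required if key not in env]
    if missing:
        raise RuntimeError("missing required configuration: " + ", ".join(missing))
    return Settings(
        database_url=env["DATABASE_URL"],
        aws_token=env["AWS_TOKEN"],
        # Optional flag with a safe default when the variable is absent.
        debug=env.get("DEBUG", "false").lower() == "true",
    )
```

With this shape, rolling a database credential means changing `DATABASE_URL` in the environment and restarting the process. No commit is involved, and the repository could be made public without exposing the credential.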
Kanchan Shringi 21:59 Are there any recommendations about handling sensitive information like certificates or API keys? Are they handled differently than other configurations?
Joe Kutner 22:08 Yes, it depends on the type of configuration. I think the 12-factor app explicitly states that these things, whatever they are, should be stored as environment variables. And this is the way Heroku works. Even for certificates and things like that, we’ll often store them as a text string in an environment variable, and then some other process will take that and write it to disk, or do whatever needs to be done. We see this with things like Java KeyStores; they’re an old specification and they’re not really designed to work well with environment variables. But as time has gone on, and we have new technologies like Kubernetes and Docker, the need to store these secrets as environment variables has kind of changed. We’re starting to see new mechanisms where file mounts can be used to inject secrets as files into the container.
Joe Kutner 23:05 This opens up a lot of other possibilities for how you store your secrets, how you store your configuration, and how you consume them. One of the advantages of storing them as files is that it makes it possible to reload them without restarting your application. This was always kind of a problem with environment variables: you start your application process in an environment, and it has a database URL or an AWS token environment variable, and that’s it. There’s no way to swap that out without restarting the process. But storing them as files gives us the opportunity to hot reload a new database connection, or something like that.
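The hot-reload idea can be sketched in Python as a small wrapper that re-reads a mounted secret file whenever the file changes. The class name `FileSecret` and the change check based on modification time and size are assumptions for this example; real platforms differ in how they update mounted files, so this is an illustration rather than a recipe for any particular one.

```python
import os
from typing import Optional, Tuple


class FileSecret:
    """Serve the current contents of a secret file, re-reading on change."""

    def __init__(self, path: str) -> None:
        self._path = path
        self._stamp: Optional[Tuple[int, int]] = None
        self._value = ""

    def get(self) -> str:
        info = os.stat(self._path)
        stamp = (info.st_mtime_ns, info.st_size)
        # Re-read only when the file's modification time or size has changed.
        if stamp != self._stamp:
            with open(self._path, encoding="utf-8") as handle:
                self._value = handle.read().strip()
            self._stamp = stamp
        return self._value
```

Code that needs the database URL calls `get()` each time it opens a connection, so a rotated credential is picked up by the next connection without restarting the process.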
Kanchan Shringi 23:53 Is this one easier to test, to check that you have actually followed it?
Joe Kutner 23:57 Yes, I think so. The litmus test for this principle is: if you can open-source your code at any moment and not compromise the credentials to your systems, then you are likely satisfying this principle. Now, determining that is not necessarily simple, but there are some great tools that you can use with Git as a pre-commit hook. They’ll scan your code to check for things as simple as “password equals” and then see if there’s an actual value there. So there are some tools that a lot of IT teams will recommend you have installed on your local development machine, to make sure that you don’t get those secrets into your version control system, even by accident.
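The kind of check Joe describes can be approximated in a few lines of Python. This is a deliberately naive sketch, not one of the tools he has in mind: the keyword list, the pattern, and the function name `find_hardcoded_secrets` are all assumptions, and a real scanner would handle many more formats and produce fewer false positives.

```python
import re
from typing import List, Tuple

# A suspicious name, an assignment, then a non-empty quoted literal.
SECRET_PATTERN = re.compile(
    r"(password|passwd|secret|token|api_key)\w*\s*[:=]\s*[\"']([^\"']+)[\"']",
    re.IGNORECASE,
)


def find_hardcoded_secrets(text: str) -> List[Tuple[int, str]]:
    """Return (line number, line) for each line that looks like a literal secret."""
    findings: List[Tuple[int, str]] = []
    for number, line in enumerate(text.splitlines(), start=1):
        if SECRET_PATTERN.search(line):
            findings.append((number, line.strip()))
    return findings
```

A pre-commit hook could run this function over the staged files and refuse the commit when the result is non-empty. Lines that read the value from the environment, or that assign an empty string, are not flagged by this pattern.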
Kanchan Shringi 24:46 Okay. Let’s move on to the next one, which is treating backing services as attached resources. I wasn’t going to pick this, but you mentioned backing services earlier, so I’d like to drill into that. Can you give us some examples of what backing services are?
Joe Kutner 25:04 Yeah. Backing services are any services that your app needs to consume as part of its operation. This could be a database. It could be a caching service. It could be an APM that’s instrumenting your application and collecting metrics. It could be a service that you use to send emails, or one that you use to collect social media data. These are all examples of backing services, and they’re all essentially external to your app. They’re external resources that are likely out of your control, or in some cases that you’re choosing to keep out of your control, like a database, because you want to decouple the application from it. For each of these backing resources, we want to treat them as just that: attachable and detachable components that we can plug in and unplug. What this allows us to do is swap them in and out if we need to. For example, if you have a database that is unhealthy and it needs to be backed up and replaced with a follower database that mirrors the data, you can do this by creating the follower database, setting up the new environment variable, or the secret on the file system, with the database URL, and then having the app reload that or restarting the application. But you don’t have to commit to your code or do anything else like that. You’re just swapping out this pointer to the service.
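The "pointer to the service" can be sketched in Python as a single URL that the application parses at startup. The variable name `DATABASE_URL`, the function name `connection_params`, and the example hosts are hypothetical; the sketch only shows that swapping the primary for a follower changes the environment, not the code.

```python
from typing import Dict, Mapping, Optional, Union
from urllib.parse import urlsplit


def connection_params(
    env: Mapping[str, str], key: str = "DATABASE_URL"
) -> Dict[str, Optional[Union[str, int]]]:
    """Split a service URL from the environment into connection parameters."""
    parts = urlsplit(env[key])
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "user": parts.username,
        "password": parts.password,
        "database": parts.path.lstrip("/"),
    }
```

Pointing the same key at `postgres://app:s3cret@db-follower.internal:5432/orders` instead of the primary’s URL is the whole change. The application code that calls `connection_params` stays identical.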
Kanchan Shringi 26:40 So the backing service could be something external or something that was developed by the team or related teams. So is it fair to say that a backing service itself could be a 12-factor app?
Joe Kutner 26:52 Oh, absolutely. And I think a lot of microservice architectures treat their services this way. They might even be developed by the same person, right? But there’s a need to decouple them for whatever reason. Maybe they’re doing different jobs. Maybe they have different scalability requirements. You want them to be separate things, but you still want that ability to replace them with a different implementation, or to have a failover instance, or something like that.
Kanchan Shringi 27:51 Okay, next we will pick stateless processes. It states that you should execute the app as one or more stateless processes. What does that mean?
Joe Kutner 28:02 This means that there shouldn’t be anything held in memory by the application that needs to exist beyond the life of that process. The best example is sticky sessions, sometimes called session affinity, where a user who is browsing a website has a session that’s stored on the server side, and it needs to be stored in the memory of a single process. This is really a matter of scalability again, because if a single user needs to return to that process in order to get their session, it means that you’re going to have challenges distributing your users across the many processes that you have. It also means that you can’t have a disposable process, right? If that process contains a user session, and a user may be using your website, a shopping cart or e-commerce website, for half an hour, then shutting that process down will degrade their experience.
Kanchan Shringi 29:07 So the stateless processes principle reads as: execute the app as one or more stateless processes. Why multiple? What is the significance of having the application be more than one process?
Joe Kutner 29:22 Well, this is really about scalability. If your application is a single process, you might not mind having some session or some other data in the memory of that process. But as soon as you scale out to two processes, you now have to figure out a way to share that memory between the two processes, because otherwise it’s local to that container or to that machine. So the reason that we want stateless processes is to decouple the parts of our application’s memory that need to live beyond the life of that process, so that we can use them from other processes. You can imagine a user on a website has a session. When they come back to the website a couple of minutes later, their request might be handled by a separate process. But the ID that represents their session can still be used to go retrieve their session data from some external thing that’s holding on to that state and that memory. And usually that thing is a backing service, a Memcached or a Redis.
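The session example can be sketched in Python as follows. To keep the sketch runnable without a server, `FakeSessionStore` is an in-process stand-in for an external store such as Redis or Memcached, and it serializes values to JSON the way a networked store would force you to. All class and method names here are hypothetical.

```python
import json
import uuid
from typing import Dict, List, Optional


class FakeSessionStore:
    """In-process stand-in for an external session store (assumption)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[dict]:
        raw = self._data.get(key)
        return None if raw is None else json.loads(raw)

    def set(self, key: str, value: dict) -> None:
        # Serialize on write, as a networked store would require.
        self._data[key] = json.dumps(value)


class WebProcess:
    """A web process whose only state is a reference to the backing store."""

    def __init__(self, store: FakeSessionStore) -> None:
        self._store = store

    def start_session(self, user: str) -> str:
        session_id = uuid.uuid4().hex
        self._store.set(session_id, {"user": user, "cart": []})
        return session_id

    def add_to_cart(self, session_id: str, item: str) -> List[str]:
        session = self._store.get(session_id)
        if session is None:
            raise KeyError("unknown or expired session")
        session["cart"].append(item)
        self._store.set(session_id, session)
        return session["cart"]
```

Because no session data lives inside a `WebProcess`, a session started on one instance can be continued on another, and any instance can be shut down without losing the user’s cart.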
Kanchan Shringi 30:28 Moving on to port binding. It says export services via port binding. Could you clarify?
Joe Kutner 30:37 Yeah, I think this one is confusing, because you have to understand what we were doing with our applications before this principle came around. The best example is in the Java ecosystem, where as far back as the 1990s we had these application servers that we would start up, and they would bind to a port and do the work of handling a request. But the actual business logic was contained in a unit of distribution called a WAR file that we would drop into that running application server. The application server would then carry the requests into the web application that was contained in the WAR file, and it would do something. The idea was that we could swap out that WAR file with a new one, reload it, hot deploy it, but that never worked because of memory issues and crashes and whatever else.
Kanchan Shringi 32:47 So you had mentioned the book “Beyond the Twelve-Factor App.” In that, Kevin Hoffman says that this one may be a little difficult to build, depending on the infrastructure or the type of application. He says what’s most important is that there should be a one-to-one correlation between the application and the application server. Would you agree with this simplification?
Joe Kutner 33:15 I think I probably would not. If I know what he’s talking about, I think he’s giving some room to use those application servers. Again, this comes out of the Java ecosystem primarily; that model worked well for people and they didn’t want to change it. But in a world of containers and Kubernetes and cloud-native applications, I think the self-contained process is really what’s important. So he does mention the one-to-one mapping. I think that’s a characteristic of these application servers. It was very common in the application server world to have one port and then a whole bunch of applications behind it, with some kind of router saying that this kind of request, at maybe this path or whatever it may be, would be routed to this application. So again, that starts to muddy the waters of portability and dev/prod parity, where it’s difficult to reproduce that environment in other places.
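For contrast with the application-server model, here is a minimal Python sketch of a self-contained process that binds its own port, using only the standard library. The `PORT` variable name, the default of 8000, and the names `Handler` and `make_server` are assumptions for the example.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional

RESPONSE_BODY = b"hello from a self-contained process\n"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(RESPONSE_BODY)))
        self.end_headers()
        self.wfile.write(RESPONSE_BODY)


def make_server(port: Optional[int] = None) -> HTTPServer:
    """Create an HTTP server bound to the port named in the environment."""
    if port is None:
        port = int(os.environ.get("PORT", "8000"))
    return HTTPServer(("0.0.0.0", port), Handler)
```

Starting the service is then `make_server().serve_forever()`. There is no container to deploy into: the process is the web server, and the same command works on a laptop, in staging, and in production.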
Kanchan Shringi 34:24 The next factor is concurrency, which says to scale out via the process model. That sounds quite related to what we talked about earlier with processes. Is it just building upon that?
Joe Kutner 34:37 Yeah, I think it is related. Especially in a pre-cloud-native world, where we were managing our own servers and resources were limited, in order to scale up you needed to buy a new server. We had a tendency to scale by making bigger and bigger processes with larger thread pools and more memory, and many of the technologies that we use, like the JVM, are very good at this. But we also need to consider that some types of applications need to be able to scale out: more processes, more instances of your application, rather than bigger processes. It’s also about creating different types of processes that do different jobs. So you might have a pool of processes that are doing some kind of background work, like worker processes, and then a pool of processes that are handling just web requests. This allows you to scale them independently.
Joe Kutner 35:34 You may need hundreds or even thousands of instances to handle your web requests, but if you have a nightly job, you only need one or two processes to do that. When we were building these kinds of monolithic applications, relying on bigger processes with large thread pools, it became difficult to separate those different jobs and scale them independently. So that’s what this principle is encouraging: split those jobs up and run them as processes. Again, in the cloud, containers and processes are relatively cheap compared to that old way of doing things.
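One way to picture independent scaling is as a mapping from process type to instance count. The Python sketch below is purely illustrative: the process type names, the commands, and the `plan_formation` function are hypothetical and do not describe how any particular platform schedules work.

```python
from typing import Dict, List, Tuple

# Hypothetical process types, each with its own start command.
PROCESS_TYPES: Dict[str, str] = {
    "web": "python server.py",
    "worker": "python worker.py",
    "nightly": "python nightly_report.py",
}


def plan_formation(scale: Dict[str, int]) -> List[Tuple[str, str]]:
    """Expand per-type instance counts into (instance name, command) pairs."""
    plan: List[Tuple[str, str]] = []
    for name, count in scale.items():
        if name not in PROCESS_TYPES:
            raise KeyError(f"unknown process type: {name}")
        if count < 0:
            raise ValueError(f"instance count must not be negative: {name}")
        command = PROCESS_TYPES[name]
        plan.extend((f"{name}.{index}", command) for index in range(1, count + 1))
    return plan
```

Scaling web traffic from three instances to three hundred changes one number in the `scale` mapping and leaves the worker and nightly counts alone, which is the independence the principle is after.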
Kanchan Shringi 36:15 Dev/prod parity. I think I’ve heard in a couple of your talks that you believe this is the most important factor. Is that still true?
Joe Kutner 36:24 Yes, it’s true. I think this is one of the most important principles. Unfortunately, it’s also a lie; I don’t think it’s possible anymore. Maybe dev/prod parity is the right spirit, but I think it’s inevitable that there are going to be differences, especially in your local development environment. The spirit of this principle is still true. You should strive to be capable of running your application locally the exact same way that you run it in prod and staging. This is important because it allows your app to be more portable, which allows you to reproduce new environments. So you can stand up new environments. You can get new developers started very quickly, because if you can stand up a new staging environment very quickly, why not get a new developer ramped up? And all of that leads to another one of the factors, which is disposability, because if you can stand these up very quickly, well, you can dispose of them when you don’t need them anymore. So that’s why I feel that it’s this sort of linchpin for the whole methodology. But at the same time, I really like it when my local development process hot-loads my code changes, and I don’t have to do anything clever to restart it or wait for that whole restart cycle to load my code changes. There’s nothing wrong with that, but I do think the ability to have dev/prod parity is still important, even if you’re not using it every step of the way in your development process.
Kanchan Shringi 38:04 Maybe one of the things that makes this difficult to do is data. At least that’s definitely been my experience.
Joe Kutner 38:13 Oh, the databases and the backing services essentially, right.
Kanchan Shringi 38:17 Not just the backing service, but the actual data. Is that important?
Joe Kutner 38:22 Oh yeah, I see what you mean. Yeah, it’s hard to reproduce certain cases without the right kind of data. I never recommend connecting to your production database from your local development environment. I want to make that very clear. But a common pattern that I do use is to have a remote development environment that’s very similar to my staging environment. So very often when I’m doing development, I don’t even have my database running locally. I’m actually connected to a cloud database. And in that case, it’s sometimes easier to have either a very large dataset that is reminiscent of the production data, or just to have systems that are handling real-world transactions, or whatever it may be, to get data that is more typical of what will actually be there in production.
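One way to picture the remote development database pattern is that the application code stays identical and only the connection value changes per environment. The sketch below assumes a `DATABASE_URL` environment variable and uses made-up `example.com` hosts; neither is specified in the conversation, and the `database_settings` helper is hypothetical.

```python
import os
from urllib.parse import urlparse


def database_settings(environ=os.environ):
    """Build connection settings from DATABASE_URL (an assumed variable name)."""
    url = environ.get("DATABASE_URL")
    if not url:
        raise RuntimeError("DATABASE_URL is not set")
    parsed = urlparse(url)
    return {
        "scheme": parsed.scheme,
        "host": parsed.hostname,
        "port": parsed.port,
        "name": parsed.path.lstrip("/"),
        "user": parsed.username,
    }


# Hypothetical URLs: development and staging run the same code path and
# differ only in the value supplied by the environment.
dev = database_settings(
    {"DATABASE_URL": "postgres://app@dev-db.example.com:5432/app_dev"}
)
staging = database_settings(
    {"DATABASE_URL": "postgres://app@staging-db.example.com:5432/app_staging"}
)
print(dev["host"], staging["host"])
```

Failing loudly when the variable is missing is a deliberate choice in this sketch: it avoids silently falling back to a local default that would differ from staging.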
Kanchan Shringi 39:24 As far as dev/prod parity goes, the recommendation was that the developer should write code and, ideally, it should be deployed hours or minutes later. If the gap is long and you don’t pay attention to that, that’s a problem. It also talks about who deployed the code: was it a completely different DevOps engineer or the developer himself? Can you talk more about that?
Joe Kutner 39:53 Yeah, I think when there is a lack of dev/prod parity, there is a tendency to deploy less frequently, or even to decouple the deployment process from the person who developed the code. What that results in is a very long iteration, a very long cycle between writing the code and then actually verifying that it’s working in production. So part of what dev/prod parity strives to do is tighten that cycle, tighten that loop, so that as a developer, after you write code and run it locally, because of dev/prod parity you have a high level of confidence that what you’re running is exactly, or very nearly, the same as what’s running in production. That allows you to verify that your code is working, validate that you don’t have any bugs or have very few bugs, and ultimately be more productive, with a high level of confidence that the code you’re writing will work in production.
Kanchan Shringi 40:59 So there’s no mention of automation in the 12 factors, but to reduce these gaps, automation of the deployment seems very, very critical. Is that right?
Joe Kutner 41:11 Yeah. Even though it’s never called out explicitly, I think a lot of the 12-factor principles really lend themselves to continuous deployment and continuous integration, where these processes are just happening. You push your code to GitHub, and it’s automatically picked up, compiled or packaged, and built into an image that’s then executed on some container environment. Again, that’s closing the loop very quickly on your development cycle. I don’t remember, and I kind of wonder, whether the CI/CD moniker or buzzword was used at the time the 12-factor app was written, but I suspect it just wasn’t something that was talked about in the way that we talk about it now. It’s similar to many other technologies that have come around, like Docker, that are inspired by these principles. So I think that even when the 12-factor app doesn’t explicitly call out a connection to Docker containers or CI/CD, it’s the spirit of these principles that led to the way that we do the things we do.
Kanchan Shringi 42:26 Let’s talk about logs now. The principle is to treat logs as event streams. Can you elaborate?
Joe Kutner 42:38 Yeah. The best way to think about this is how you would do it otherwise. It’s very common to write your logs to files that then need to be rotated and moved off of disk, or whatever it may be. In that model, you have this operational burden of dealing with those log files. What the 12-factor app encourages instead is that these sort of ephemeral logs flow through standard out and are captured by some other system that you don’t care about. The container is running on a platform, and that platform knows how to capture the output of your application and send it somewhere, whether it’s to a system that knows how to back up your logs, or to a system that knows how to parse the logs for metrics information or other key markers that you can use to monitor the application. And that’s where treating it as an event stream really comes in. Treating the logs as events decouples you from the various jobs that need to be done with those events, whether it’s collecting metrics, storing them, scanning them, or searching them.
Kanchan Shringi 43:57 Is that also related to telemetry? You mentioned that was something you would add now.
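A minimal sketch of writing logs to standard out as an event stream, rather than to rotated files, might look like the following. The one-JSON-object-per-line format, the `JsonLineFormatter` class, and the `make_logger` helper are assumptions for this example; the conversation only says that logs flow through standard out and are captured by the platform.

```python
import json
import logging
import sys


class JsonLineFormatter(logging.Formatter):
    """Render each record as one JSON object per line (an assumed format)."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def make_logger(name="app", stream=None):
    """Return a logger that writes events to a stream, stdout by default.

    The application never opens, rotates, or ships a log file; whatever
    runs the process is expected to capture the stream.
    """
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonLineFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


log = make_logger()
log.info("order %s accepted", 42)
```

Because each event is a single line on the output stream, downstream systems can store, search, or parse the events without the application knowing which of those jobs exist.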
Joe Kutner 44:02 Yeah. I think that’s a type of telemetry, and certainly in the past at Heroku we’ve encouraged using your logs as a vehicle for sending metrics to other systems. However, it’s not the only way to do that. There are times when you have so many metrics that it puts a strain on your logging system, so sending metrics through logs has problems of its own. Internally at Heroku, we’ve debated whether this is the right thing to do or not, and I think we’ve gone back and forth. But there are definitely other ways to collect metrics. There are systems like Prometheus that are part of the cloud-native ecosystem, which I think provide a better mechanism than actually putting the metrics into your logs. But certainly, your logs are one of the tools you have for telemetry. They’re very useful for security purposes. For example, I often recommend logging security markers for any kind of privilege escalation, account login failure, or anything like that. It’s one of the best ways to get an early heads-up on a problem.
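To make the "metrics through logs" idea concrete, here is a small sketch of a downstream consumer that pulls numeric measurements out of log lines. The `measure#<name>=<number>` convention, the sample lines, and the helper names are assumptions chosen for illustration, not a format described in the conversation.

```python
import re

# Assumed convention: a metric appears in a log line as measure#<name>=<number>.
METRIC_PATTERN = re.compile(r"measure#([\w.]+)=(\d+(?:\.\d+)?)")


def extract_metrics(line):
    """Return the metrics found in one log line as {name: value}."""
    return {name: float(value) for name, value in METRIC_PATTERN.findall(line)}


def aggregate(lines):
    """Average each metric across a stream of log lines."""
    totals, counts = {}, {}
    for line in lines:
        for name, value in extract_metrics(line).items():
            totals[name] = totals.get(name, 0.0) + value
            counts[name] = counts.get(name, 0) + 1
    return {name: totals[name] / counts[name] for name in totals}


# Hypothetical log stream: two request lines and one security marker.
stream = [
    "at=info path=/orders measure#db.latency=12 measure#render.time=3.5",
    "at=info path=/orders measure#db.latency=18",
    "at=warn event=login_failure user=alice",
]
print(aggregate(stream))
```

Every metric here costs a log line that the logging system must carry, which is the strain mentioned above; a dedicated metrics system avoids that at the price of another integration.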
Kanchan Shringi 45:14 So the focus for logs is more on the aggregation. Is that fair to summarize that principle?
Joe Kutner 45:21 Yes, I think so. The aggregation, which leads to searching, storage, and other responsibilities.
Kanchan Shringi 45:29 So we’ve actually covered almost all the principles except the first and the last. Let’s just complete it. The first one says one codebase tracked in revision control, many deploys. The codebase and revision control part certainly seems like a no-brainer, but can you talk about the recommendation to have a separate repository for each app?
Joe Kutner 45:55 Yeah. There are two ways you can look at this. One is that you will occasionally see patterns where developers are forking their repository for the different environments that they would deploy to, so you have code-specific changes that are mapped to those environments. That’s fairly uncommon, but it’s something you want to avoid. I think what this principle is really driving against is the monorepo pattern, where you have multiple applications in the same code repository, and then they’re deployed to different places. And this is something that, again, internally at Heroku, even though we wrote the 12-factor app, we don’t necessarily agree on. We do have some monorepos that we use internally. Personally, I avoid the monorepo pattern. I find that you end up having commits that are tied to multiple applications within that repository, which entangles their history. And it makes it difficult to roll back versions of an app without messing something else up. So it’s generally something I avoid, but I definitely acknowledge that there are plenty of use cases where it makes sense.
Kanchan Shringi 47:13 Okay, let’s talk about the last one now, which is related to running admin processes. So this one is confusing. Why would you have one-off admin processes?
Joe Kutner 47:27 The purpose of the one-off admin process is isolation. And I think this comes back to thinking about your processes as being disposable, they’re cheap, right? Containers are these lightweight things that we can spin up without a lot of resources. So let’s take advantage of that because before we had containers, we would SSH or log into a machine to perform a database job or a database migration or something like that. And this has definitely happened to me. I’ve logged into that machine. I’ve innocently run a database migration, and then accidentally killed the process that was running in the same machine that was serving real requests, right? And we can avoid that mistake simply by putting each job, whether it’s the web process or the worker process, or the administrative task, that one-off task for database migrations, put it in its own container. They’re cheap.
Joe Kutner 48:24 They don’t require a lot of resources. We have that image that we created because of how we built our application and these immutable images, and we might as well take advantage of that. If you’ve used technologies like Docker, then you’re already familiar with this. Every time you use docker run to start a new container, that’s a one-off task. And that docker run command really comes from the 12-factor app, because this is the model that Heroku started with its heroku run command, really starting to popularize containers. But now you can have that experience on your local machine, in your cluster, or wherever it may be.
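The isolation argument can be sketched without any container tooling: run the administrative task as its own short-lived process rather than inside a process that serves requests. The `run_one_off` helper and the inline "migration" command below are hypothetical stand-ins; on a container platform the equivalent step would start a fresh container from the same image.

```python
import os
import subprocess
import sys


def run_one_off(args, environ=None):
    """Run an admin task in its own short-lived process and return the result.

    The task gets the same environment as the application by default, so
    it sees the same configuration, but it cannot disturb the web or
    worker processes because it does not share a process with them.
    """
    return subprocess.run(
        args,
        env=dict(os.environ if environ is None else environ),
        capture_output=True,
        text=True,
        check=False,
    )


# Stand-in for a real migration script; the inline program is illustrative only.
migration = [sys.executable, "-c", "print('migration applied')"]
result = run_one_off(migration)
print(result.returncode, result.stdout.strip())
```

If the task fails or is killed, only that one process exits, and the caller can inspect the return code to decide what to do next.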
Kanchan Shringi 49:04 So what are the recommendations around testing these one-off admin processes? Is there a process to test them in stage? Is there a recommendation around that?
Joe Kutner 49:16 Well, I guess by definition, they’re these unique little operations that need to be run. I certainly recommend having tests for the jobs that you want to run. I don’t think this principle is trying to encourage you to create a session where you have an interactive console and just poke around at your production data. Be careful there. But if you have a rake task or some other script that is well tested and is meant to be run in its own isolated process to perform that administrative task, I think that gives you the appropriate mechanisms to ensure you’re not going to do something terrible.
Kanchan Shringi 50:04 To me, it sounds like this principle is just admitting, and being pragmatic, that there will be a need for these one-off admin processes. And if there is, then follow these principles.
Joe Kutner 50:16 Yeah, I agree.
Kanchan Shringi 50:19 So now that we have covered the 12 factors, do you want to take another stab at why developers should adopt these?
Joe Kutner 50:27 It really comes down to scalability and maintainability. By following these principles, you’re just inherently going to create applications and architectures that are more resilient to changes in web traffic or changes in load. And that’s going to make it easier for you, as a developer or as an operator, to maintain that application. When you need to roll out security patches or do database maintenance, if you’re following these principles, the operational burden that you experience will be reduced. And I think the best way to prove that is to look at a platform like Heroku. We host more than 10 million applications, and we have a couple hundred engineers running the platform. I don’t think that would be possible if it wasn’t for the fact that every single app on this platform is abiding by these principles.
Kanchan Shringi 51:25 Thanks for that, Joe. My last topic is really community. How can folks learn more about the 12-factor app?
Joe Kutner 51:36 Well, there’s of course the 12factor.net website, which has the original manifesto. But nowadays there are many sample apps, and even many of the frameworks that you’re going to use lend themselves to these principles just naturally. I mentioned Spring Boot before, which embraces these principles sort of out of the box. But there are also the underlying technologies, because they favor it. So for example, docker run, which I mentioned, and some of the other Docker commands really map well to these principles. I think the Docker documentation has some examples, like a Node.js example, which you can put a link to in the show notes. Those are probably the best way to get started and actually see what a 12-factor app looks like. And I think you’ll find that it looks like most other apps. It’s really about how you operate on it and what development processes you use to change it and to deploy it.
Kanchan Shringi 52:35 We’ll definitely put a link to the app that you point to in the show notes, as well as a link to your blog. Is there a way for people to contribute, if somebody wants to do so, to the effort of popularizing these factors?
Joe Kutner 52:50 For anyone that’s passionate about encouraging this pattern, certainly link back to the 12factor.net website. I think it’s a great topic for a first-time conference speaker who wants to give a conference talk, because there are a lot of materials out there to learn from. And I feel like it’s the kind of topic that a lot of both new developers and old developers like myself need to hear, because these principles seem to be timeless in a way. Granted, eight or nine years is not an eternity, but in tech, that can be a pretty long time. So I think it’s a great topic for people to take a stab at practicing their blogging skills or conference speaking skills and share the ideas of the 12-factor app.
Kanchan Shringi 53:41 How can people get in touch with you?
Joe Kutner 53:43 So I go by codefinger on most social media and internet places, except GitHub, where I am jkutner. I’m happy to field questions there, or any other place you can find me, whether it’s IRC or Slack.
Kanchan Shringi 53:59 We’ll put that in the show notes. Joe, thank you very much for coming on the show and talking about the 12 factors.
Joe Kutner 54:06 Thank you for having me.
Kanchan Shringi 54:07 This is Kanchan Shringi for Software Engineering Radio.
[End of Audio]
This transcript was automatically generated. To suggest improvements in the text, please contact [email protected].