SE Radio 477: Josef Strzibny on Self Hosting Applications

Josef Strzibny, the author of Deployment from Scratch, discusses how and why it’s valuable to learn how to self-host applications. Host Jeremy Jung spoke with Strzibny about why self-hosting can be overwhelming; choosing a Linux distribution; Security-Enhanced Linux (SELinux); systemd; running background services; installing language runtimes; where to put application files; why his book uses shell scripts for deployment; managing credentials; reducing dependencies; why email and error reporting should use managed services; and incrementally adopting containers.

This episode sponsored by SignalWire.

Transcript

Transcript brought to you by IEEE Software
This transcript was automatically generated. To suggest improvements in the text, please contact [email protected].

SE Radio 00:00:00 This is Software Engineering Radio, the podcast for professional developers on the [email protected]. SE Radio is brought to you by the IEEE Computer Society and by IEEE Software magazine, online at computer.org/software.

SE Radio 00:00:19 SignalWire’s real-time video technology allows you to create interactive video experiences that were previously impossible. SignalWire gives developers access to broadcast-quality, ultra-low-latency video for everything from video collaboration tools for film and TV studios and Fortune 500 enterprises to engaging virtual events. They can even assist with one-of-a-kind, fully interactive virtual concerts. See why the future of video communication is being built on SignalWire. Their easy-to-deploy APIs and SDKs are available in most popular programming languages. SignalWire is a complete unified platform for integrating video as well as voice and messaging capabilities into any application. Try it [email protected] and use code SE Radio for $25 in developer credit. Go to signalwire.com, that’s signalwire.com, and use code SE Radio to receive $25 in developer credit today.

Jeremy Jung 00:01:06 This is Jeremy Jung for Software Engineering Radio. Today I’m talking to Josef Strzibny. He’s the author of the book Deployment from Scratch, a Fedora contributor, and he previously worked on the developer experience team at Red Hat. Josef, welcome to Software Engineering Radio.

Josef Strzibny 00:01:22 Thanks for having me. Happy to be here.

Jeremy Jung 00:01:25 There are a lot of commercial services for hosting applications these days. One that’s been around for quite a while is Heroku, but there are also services like Render and Netlify. Why should a developer learn how to deploy from scratch? And why would a developer choose to self-host an application?

Josef Strzibny 00:01:46 I think that as engineers, and as backend engineers, we should know a little bit more about how we run our own applications. That’s one reason, but there’s also a business case, right? For a lot of people, this could be about saving money on hosting, especially with managed databases, which can go high in price very quickly. And people like me, apart from the daily job, also have some side project, some little project they want to start that may turn into a successful startup, you know, but it’s at the beginning, so they don’t want to spend too much money on it. I can deploy and serve my little projects from $5 virtual private servers in the cloud. So I think that’s another reason to look into it. And business-wise, if you are, let’s say, a big team and you have the money,

Josef Strzibny 00:02:43 of course you can afford all these services. But then, what happened to me when I was leading a startup: we were at some fair and people were coming and asking us, “We need to self-host the application. We don’t trust the cloud.” And if you want to prepare this environment for them to host your application, then you also need to know how to do it, right? I completely get the point of not knowing it, because backend development alone can already be huge. You can learn so many different databases, languages, whatever, and learning operations and servers on top of that can be overwhelming. I want to say you don’t have to do it all at once. Just learn a little bit and you can improve as you go. You will not learn everything in a day.

Jeremy Jung 00:03:33 So it sounds like the very first reason might be to just have a better understanding of how your applications are running. Because even if you are using a service, ultimately that is going to be running on a bare machine somewhere or on a virtual machine somewhere. So it could be helpful maybe for just troubleshooting or better understanding how your application works. And then there’s what you were talking about with some companies wanting to self-host, and just the costs as well.

Josef Strzibny 00:04:07 Yeah, for me, really, the primary reason would be to understand it. When I was starting programming, first off there was PHP, and I would upload things to some shared hosting, right, and they would host it for me, and it was fine. Then I switched to Ruby on Rails, and at the time people were struggling with deploying it. I was asking myself: okay, so you run “rails s”, like for server, right? It starts in development, but can you just do that on the server for production? Can you just run “rails server” and is that it, or is there more to it? Or when people were talking about Linux hardening, I was like, okay, but your Linux distribution has some good defaults, right? So why do you need some further hardening, and what does it mean? What should you change? So for me, I really wanted to know. The reason I wrote this book is that I wanted to double down on my understanding and check that I got it right.

Jeremy Jung 00:05:09 I can definitely relate, in the sense that I’ve also used Ruby and Ruby on Rails. And there’s this huge gap between just learning how to run it in a development environment on your computer versus deploying it onto a server. And it’s pretty overwhelming. So I think it’s really great that you’re putting together a book that goes into a lot of these things that usually aren’t talked about when people are just talking about learning a language.

Josef Strzibny 00:05:40 You can imagine that there are a lot of components you can have in an application, right? You have one database, maybe you have more databases. Maybe you have a Redis key-value store. Then you might have load balancers, all that jazz. And there’s one thing I also say in the book: try to keep it simple. If you can deploy just one server, if you don’t need to fulfill some SLA uptime, just do the simplest thing first, because you will really understand it. And then when there is an error, you will know how to fix it. If you make things complex for yourself, you will get lost very quickly. So I really try to make things as simple as possible to stay on top of them.

Jeremy Jung 00:06:25 One of the first decisions you have to make when you’re going to self-host an application is which distribution you’re going to use. There are things like Red Hat, and Ubuntu, and Debian, and all these different distributions. And I’m wondering, for somebody who just wants to deploy their application, whether that’s Rails, Django, or anything else, what are the key differences between them and how should they choose a distribution?

Josef Strzibny 00:06:54 If you already know one particular distribution, there’s no need to constantly be on the hunt for a more shiny thing. It’s more important that you know it well and you are not lost. That said, there are differences. They range from goals and philosophy, to whether it’s backed by a community or a company, whether it’s a rolling distribution or not, length of support, especially for security updates, the kind of init system that is used, the kind of C library that is used, the packaging format, the package manager, and, what I think most people will care about, the number of packages and their quality and versions, right? Because essentially a distribution is a distribution of software, so you care about the software. If you are putting your own stuff on top of it, maybe you don’t care. You just care about it being a distribution and that’s it, and that’s fine.

Josef Strzibny 00:07:51 But if you are using more things from the distribution, you might start caring a little bit more about other things. So maybe support for some mandatory access control, or, in the world of Docker, maybe the most minimal image you can start with, because you will be building the Docker image from the Dockerfile a lot of times. And I would say there are two main families of systems that people probably know: those based on Fedora and those based on Debian, right? On the Fedora side, you have RHEL and related enterprise Linux distributions, and on the Debian side you have Ubuntu, which is maybe the most popular cloud distribution right now. And of course, as a Fedora packager, I’m kind of in the Fedora world, right? But if I can mention two things that I think are an advantage of Fedora-based systems, I would say one is modular packages. Traditional systems, for a long time, offered only one version of a particular component, like Ruby, for one big version of the distribution.

Josef Strzibny 00:09:03 So that means either it worked for you or it didn’t. With databases, maybe you can make it work. With Ruby and Python versions, you usually start looking at some version manager to compile your own version, because the packaged version was too old or simply not the same as the one your application uses. With modular packages, this changed, and now in Fedora and RHEL and all those, we have several options to install. There are like four different versions of a database, for instance, but also different versions of Ruby and Python. Of course, you still don’t get all of the versions you want, so for some people it still might not work, but I think it’s a big step forward. Even when I was working at Red Hat, we were working on a product called Software Collections. This was trying to solve the same thing for enterprise customers, but I don’t think it was a particularly good solution.

Josef Strzibny 00:10:04 I’m quite happy about this modularity effort, and the modular packages I looked into recently are much better. But I will say one thing: don’t expect to use them the way you use your regular version manager for development. If you want to be switching between versions for different projects, that’s not the use case for them, at least as I understand it, not for now. But for a server that’s fine. And the second advantage of Fedora-based systems, I think, is good initial SELinux profile settings. SELinux, Security-Enhanced Linux, is really mandatory access control. On a usual distribution, you have discretionary permissions that users set themselves on their directories and files. Mandatory access control means there is a kind of profile that is there beforehand, which the administrator prepares, and it’s orthogonal to the other security measures you have there.

Josef Strzibny 00:11:10 So that will help you protect your most vulnerable processes. With SELinux there are several modes. There is an MLS mode, which maybe an army would use, but what we use, the default, is something called the targeted policy. That means you are targeting the vulnerable processes, meaning the services that you are exposing to the external world, whether it’s SSH, PostgreSQL, NGINX, all of those things. So you have a special profile for them. And if some attacker takes over one component, one process, they still cannot do much more than what that component was prepared to do. I think it’s good that you have these high-quality settings already made, because other distributions might be able to run with SELinux, but they don’t necessarily provide you any starting point. You would have to write all your policies yourself. And SELinux is actually a complex system. It’s difficult. It’s even difficult to use as a user. If you look at some tutorials for CentOS, you will see a lot of people mention SELinux, maybe even turning it off. There’s this struggle, and that’s why I also devote one big chapter to SELinux, to get people more familiar with it and less scared about using it and running with it.

Jeremy Jung 00:12:42 So SELinux, it sounds like it’s basically something where you have these different profiles for different types of applications. You mentioned SSH, for example; maybe there could be one for NGINX or one for Postgres. And they’re basically collections of permissions that a process should have access to, whether that’s network ports or file system permissions, things like that. And they’re all pre-packaged for you. So you’re saying that if you are using a Fedora-based distribution, you could say, “I want SSH to be allowed, so I’m going to turn on this profile,” or “I want NGINX to be used on this system, so I’m going to turn on this profile.” And those permissions are just going to be applied to the process that needs it. Is that correct?

Josef Strzibny 00:13:32 Well, actually, in the base system there will already be a set of basic things that are loaded, and you can make your own SELinux modules that you can load. But essentially it works in a way that whatever is not explicitly permitted is disallowed. That’s why it can be a pain in the ass. And as you said, you are completely correct. You can imagine NGINX as a reverse proxy communicating with the Puma application server via a Unix socket, right? NGINX will need to have access to that socket to even be able to write to it, and so on. Things like that. But luckily you don’t have to know all these things, which is difficult, especially if you’re starting out. There is a set of tools and utilities that will help you use SELinux in a very convenient way.

Josef Strzibny 00:14:27 What they allow you to do is run SELinux in permissive mode, which means that it logs any violations that your application commits against your base system policies, right? So you will have them in the log, but everything will work. Your application works; you don’t have to worry about it. And after some time running your application, you run these utilities to analyze the logs and the violations, and they can even generate a profile for you. So you will know: okay, this is the profile I need, this is the access, these are the things I need to add. After you do that, if there is some problem with your process, if some attacker tries to do something else, they will be denied that action. And because of these utilities, you can almost automate how you make a profile, and it’s much easier.
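
A rough illustration of the permissive-mode workflow described above: denials end up in the audit log, and a tool turns them into a list of accesses to allow. On real systems that job is done by utilities such as audit2allow; the Python below is only a simplified sketch of the idea, not a replacement for them. The sample log lines, the socket name, and the function names are made up for the example, and the line format is a simplified version of what an AVC denial looks like.

```python
import re

# Simplified pattern for an SELinux AVC denial line from the audit log.
AVC_PATTERN = re.compile(
    r"avc:\s+denied\s+\{\s*(?P<perms>[^}]+?)\s*\}.*?"
    r"scontext=(?P<scontext>\S+)\s+tcontext=(?P<tcontext>\S+)\s+tclass=(?P<tclass>\S+)"
)


def selinux_type(context):
    """Return the type field of a context such as system_u:system_r:httpd_t:s0."""
    return context.split(":")[2]


def suggest_allow_rules(log_lines):
    """Group denied permissions by (source type, target type, object class)."""
    grouped = {}
    for line in log_lines:
        match = AVC_PATTERN.search(line)
        if match is None:
            continue  # not a denial line
        key = (
            selinux_type(match["scontext"]),
            selinux_type(match["tcontext"]),
            match["tclass"],
        )
        grouped.setdefault(key, set()).update(match["perms"].split())
    return [
        f"allow {src} {tgt}:{cls} {{ {' '.join(sorted(perms))} }};"
        for (src, tgt, cls), perms in sorted(grouped.items())
    ]


# Hypothetical denials: a reverse proxy writing to an application socket.
SAMPLE_LOG = [
    'type=AVC msg=audit(1600000000.123:456): avc:  denied  { write } for  pid=1234 '
    'comm="nginx" name="app.sock" scontext=system_u:system_r:httpd_t:s0 '
    'tcontext=system_u:object_r:var_t:s0 tclass=sock_file permissive=1',
    'type=AVC msg=audit(1600000000.456:457): avc:  denied  { connectto } for  pid=1234 '
    'comm="nginx" scontext=system_u:system_r:httpd_t:s0 '
    'tcontext=system_u:system_r:unconfined_service_t:s0 tclass=unix_stream_socket permissive=1',
]

if __name__ == "__main__":
    for rule in suggest_allow_rules(SAMPLE_LOG):
        print(rule)
```

Because the policy is deny-by-default, each printed rule is one explicit permission the application turned out to need while running in permissive mode.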

Jeremy Jung 00:15:24 So basically the operating system comes with all these defaults of things that you’re allowed to do and not allowed to do. You turn on this permissive flag, and it logs all the things that it would have blocked if you were enforcing SELinux. And then you can basically go in and add the things that are missing.

Josef Strzibny 00:15:45 The

Jeremy Jung 00:15:45 The next thing I’d like to go into is one of the things you talk about in the book: how your services, your application, run as daemons. And I wonder if you could define what a daemon is.

Josef Strzibny 00:16:00 You can think about them as background processes, something that continuously runs in the background. Even if the virtual machine goes down and you reboot, you just want them to be started again and to run at all times while the system is running.

Jeremy Jung 00:16:19 And for things like an application you write, or for a database, should the application itself know how to run itself in the background, or is that the responsibility of some operating-system-level process manager?

Josef Strzibny 00:16:35 Linux operating systems have a so-called init system. It’s the first process started on the system after the Linux kernel, and it has the process ID of one. It’s essentially the parent of all your processes, because on Linux you have all these parents and children, and you use forking to make new processes. So this is your system process manager. And obviously, if systemd is your system process manager and you already trust it with all the system services, you can also trust it with your application, right? I mean, who else would you trust? Even if you choose some other process manager, because there are many, you will essentially have to wrap that process manager as a systemd service, because otherwise you wouldn’t have this connection of systemd being the supreme supervisor of your application, right? When one of your services fails, you want it to be restarted and to continue.

Josef Strzibny 00:17:36 So that’s what systemd can do for you if you design everything as a systemd service. Base packages, like PostgreSQL, already come with a service, so they are very easy to use. You simply start it and it’s running. And then for your application, you would write a systemd service, which is a little file with some directives. It’s very simple and straightforward. Before systemd, people were writing services with Bash and it was kind of error-prone, but now with systemd it’s quite simple. There is just a set of directives that you learn. You tell systemd under what user it should run, what working directory you want it to be running in, whether there is an environment file, whether there is a PID file, and a few other things, the most important being a directive called ExecStart, which tells systemd what process to start. It will start the process and simply oversee it, watch for errors, and so on.
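
To make the directives mentioned above concrete, here is a minimal sketch that renders the text of a small service file. It is written in Python only so that it can be run and checked; the resulting text is what would go into a unit file. The directive names (User, WorkingDirectory, EnvironmentFile, ExecStart, Restart) are real systemd directives, but the application name, user, paths, and the helper function itself are assumptions made for illustration, not taken from the book.

```python
def render_service_unit(description, user, working_directory, exec_start,
                        environment_file=None, restart="on-failure"):
    """Return the text of a minimal systemd service unit."""
    service = [
        ("User", user),                           # which Linux user runs the process
        ("WorkingDirectory", working_directory),  # where the process is started
    ]
    if environment_file:
        service.append(("EnvironmentFile", environment_file))
    service += [
        ("ExecStart", exec_start),  # the foreground process systemd supervises
        ("Restart", restart),       # restart policy when the process exits
    ]
    lines = ["[Unit]", f"Description={description}", "After=network.target", "", "[Service]"]
    lines += [f"{key}={value}" for key, value in service]
    lines += ["", "[Install]", "WantedBy=multi-user.target", ""]
    return "\n".join(lines)


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    print(render_service_unit(
        description="Example web application",
        user="app",
        working_directory="/srv/myapp",
        exec_start="/srv/myapp/bin/server",
        environment_file="/srv/myapp/.env",
    ))
```

The process named in ExecStart is expected to stay in the foreground; systemd handles backgrounding, supervision, and restarts.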

Jeremy Jung 00:18:44 So in the past, I know there used to be applications that were written where the application itself would background itself, and that would allow you to run it in the background without something like systemd. It sounds like what you should do now instead is have your application be built to just run in the foreground, and your process manager, like systemd, can be configured to handle restarting it, which user is running it, environment variables, all sorts of different things that in the past you might have had to write in your own Bash script or write into the application itself.

Josef Strzibny 00:19:25 And there are also some other niceties about systemd. For example, you can define how reloading should work. If you just changed some configuration and you want to achieve some kind of zero-downtime deploy, you can tell systemd how this can be achieved with your process. And what if it cannot be achieved that way? For instance, the Puma application server can fork processes, and it can restart those processes in a way that gives zero downtime, but when you want to replace the whole Puma process, what do you do, right? systemd has this nice thing called socket activation. With systemd socket activation, you make another unit. It’s not a service unit, it’s a socket unit; there are many kinds of units in systemd. You would basically make a socket unit that listens for those connections and then passes them to the application. So while the application is just starting, and it could be a completely normal restart, which means stopping and starting, systemd will keep the connections open, keep the sockets open, and then pass them on when the application is ready to process them.
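
From the application’s side, socket activation means inheriting an already-open listening socket instead of creating one. The sketch below shows the receiving end in Python, following systemd’s documented convention that inherited descriptors start at number 3 and are announced through the LISTEN_PID and LISTEN_FDS environment variables. The function names and the fallback address are assumptions for the example; a real application server such as Puma has its own implementation.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first inherited descriptor in the socket-activation protocol


def inherited_fds(environ, pid):
    """Return the descriptor numbers passed to this process, or [] if none were."""
    if environ.get("LISTEN_PID") != str(pid):
        return []  # the variables were meant for another process
    count = int(environ.get("LISTEN_FDS", "0"))
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


def listening_socket(fallback_address=("127.0.0.1", 8080)):
    """Use the socket handed over by systemd, or open one when run by hand."""
    fds = inherited_fds(os.environ, os.getpid())
    if fds:
        # The socket was opened by systemd and kept open across restarts.
        return socket.socket(fileno=fds[0])
    return socket.create_server(fallback_address)
```

Because the listening socket lives in systemd rather than in the application, connections that arrive during a restart wait in the socket’s queue until the new process calls accept on it.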

Jeremy Jung 00:20:45 So it sounds like the sockets you’re referring to would be TCP sockets,

Josef Strzibny 00:20:51 for example, for someone trying to access a website? Yes. But it actually works with Unix sockets as well.

Jeremy Jung 00:20:58 So in that example, let’s say a user is trying to go to a website and your service is currently down. You can actually configure systemd to let the user connect, wait for the application to come back up, and then hand that connection off to the application once it’s back up. You’re basically able to remove some of the complexity from the applications themselves for some of these special cases and offload it to systemd.

Josef Strzibny 00:21:31 Yeah, because otherwise you actually need a second server, right? You would have to start a second server, move traffic there, upgrade or update your first server, and switch them back. With socket activation, you can avoid doing that and still have the final effect of a zero-downtime deployment.

Jeremy Jung 00:21:52 So this introduction of systemd as the process manager, I think this happened a few years ago, when a lot of Linux distributions moved to using systemd, and there was some, I suppose, controversy around that. I’m wondering if you have any perspective on why there were some people who really didn’t want that to happen, and whether that’s something people should worry about or not.

Josef Strzibny 00:22:20 Yeah. I think there were a few things. One was, for instance, the system logging, which suddenly became a binary format, and you need a special utility to read it. I mean, it’s more efficient, it’s in a way better, but it’s not plain text, which all administrators prefer or are used to. So I understand the concern, but at least to me, it’s fine. The second thing is that people considered it some kind of scope creep, because systemd is trying to do more and more every year. Some people say that’s not the Unix way: systemd should be a very minimal init system and not do anything else. That’s partially true. But at the same time, the things that systemd went into, I think they are essentially easier and nicer to use. And as for systemd services, I can say I certainly prefer how it’s done now.

Jeremy Jung 00:23:22 Yeah. So it sounds like we’ve been talking about systemd as being this process manager: when the operating system first boots, systemd starts, and then it’s responsible for starting your applications or other applications running on the same machine. But then it’s also doing all sorts of other things. You talked about that socket activation use case, there’s logging, I think there are also scheduled jobs. There are all sorts of other things that are part of systemd, and that’s where some people disagree on whether it should be one application that’s handling all these things.

Josef Strzibny 00:24:02 Yeah. With scheduled jobs, like replacing cron, you now have two ways to do it. You can still pretty much choose what you use. I mean, I still use cron, so I don’t see a problem there. We’ll see how it goes.

Jeremy Jung 00:24:19 One of the things I remember I struggled with a little bit when I was learning to deploy applications is that when you’re working locally on your development machine, you have to install a language runtime in a lot of cases, whether that’s for Ruby or Python, Java, anything like that. And when someone is installing on their own machine, they often use something like a version manager. For example, for Ruby there’s rbenv, and for Node there’s NVM. There are all sorts of ways of installing language runtimes and managing the versions. How should someone set up their language runtime on a server? Would they use the same tools they use on their development machine, or is it something else?

Josef Strzibny 00:25:08 Yeah. So there are several ways you can do it. As I mentioned before with the modular packages, if you find the version there, I would actually recommend doing it with the modular package, because it’s so easy to install and it’s kind of instant. It takes no time on your server; you just install the package. The same goes for building a Docker image, because again, it will be really fast. So if you can use it, I would just use that, because it’s convenient. But a lot of people will use some kind of version manager. Technically speaking, they could use only the installer part, for instance with chruby you would use ruby-install to install a new version, right? But then you would have to reference the full paths to your Ruby, which is very tedious. So what I personally do is set it up as if I were on a developer workstation, because for me the mental model of that is very simple.

Josef Strzibny 00:26:11 I use the same thing. And this matters, for instance, when you are referencing what to start in the ExecStart directive in systemd, because you have several choices. For instance, if you need to start Puma, you could be referencing a path that is in your user’s home: dot gem, ruby, version number, bin, puma. Or you can use the version manager; it might have something like chruby-exec to run with a given version of Ruby, and then you pass it the actual Puma part and it will start it for you. But what you can also do, and I think it’s kind of beautiful, is start Bash as a login shell and give it the “bundle exec puma” command that you would normally use after logging in. Because if you installed everything normally, you have something in your Bash profile; the environment gets loaded, which puts the right version of Ruby on the path, and suddenly it works. And I find it nice because later, when you log in to your box, you log in as that application user and you have the whole environment there, and you can just start things as you are used to. No problem there.
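
The login-shell option described above can be written as a single ExecStart line. The small Python sketch below builds that line so the quoting is explicit; the helper name is an assumption, /bin/bash is assumed to be the shell’s location, and “bundle exec puma” is the command quoted in the conversation.

```python
import shlex


def login_shell_exec_start(command, shell="/bin/bash"):
    """Build an ExecStart line that runs `command` inside a login shell.

    -l makes Bash read the user's profile, so a version manager set up
    there puts the right runtime on PATH; -c runs the given command.
    """
    return "ExecStart=" + shlex.join([shell, "-lc", command])


if __name__ == "__main__":
    print(login_shell_exec_start("bundle exec puma"))
    # prints: ExecStart=/bin/bash -lc 'bundle exec puma'
```

The quoting here follows shell rules, which match systemd’s for a simple command like this one; a command that itself contains quotes would need checking against systemd’s own quoting rules.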

Jeremy Jung 00:27:29 Yeah. Something I’ve run into in the past is when I would install a language runtime, and like you were describing, I would have to type in the full path to get to the Ruby runtime or the Python runtime. It sounds like what you’re saying is: just install it like you would on your development machine, and then in the systemd configuration file, you actually log into a Bash shell and run your application from the Bash shell. So it has access to all the same things you would have in an interactive login environment. Is that right?

Josef Strzibny 00:28:05 Yeah, that’s exactly right. It will be basically the same thing, and it’s kind of easy to reason about. You can start with that while being able to change it later to something else, but it’s a nice way to do it.

Jeremy Jung 00:28:19 So you mentioned having a user to run your application. And so I’m wondering how you decide what Linux users should run your applications. Are you creating a separate user for each application you run? Like, how are you making those decisions?

Josef Strzibny 00:28:40 Yes, I am actually making a new user for my application. Well, at least for the part of the application that is the application server and workers. NGINX might have its own user, the database might have its own user. I’m not trying to consolidate that onto one user. But in terms of the application, whether I run Puma or whether I run Sidekiq, that will be part of one user, the application user. And I will appropriately set the access rights on the directories, so it’s isolated from everything else.

Jeremy Jung 00:29:21 Something that I’ve seen also: when you are installing Ruby or some other language runtime, you have the libraries; in the case of Ruby there are gems. And when you’re on your development machine and install these gems, these packages, they go into the user’s home directory. So you’re able to install and use them without having, let’s say, sudo or root access. Is that something that you carry over to your deployments as well? Or do you store your libraries and your gems in some place that’s accessible outside of that user? I’m just wondering how you approach that.

Josef Strzibny 00:30:07 I would actually keep them next to my application. This touches on the question of where to put your application files on the system. There’s something called the FHS, the Filesystem Hierarchy Standard, which distributions use. Of course, there are some little modifications here and there, and the standard is specifically followed by packagers and enforced in packages, but other than that it’s kind of random; it could be a different path. It says where certain files should go, like home, or usr, or bin for executables, a place for logs, and so on. And when you want to put your application files somewhere, you are thinking about where to put them, right? You have essentially, I think, three options. One, you can put it in home, because, as we talked about, I set up a dedicated user for that application. So it could make sense to put it in home.

Josef Strzibny 00:31:12 Why I don’t like putting it in home is that there is certain labeling in SELinux that makes your life more difficult. It’s essentially not meant to be there. On some other system, without SELinux, I think it works fine. I also did it before. It’s not like you cannot do it; you can. Then you have your web server’s default locations, like /usr/share/nginx/html or /var/www. These locations will be prepared for you with all the SELinux labeling, so when you put files there, things will mostly work. I also saw a lot of people do that for this particular reason. What I don’t like about it is that NGINX is just my reverse proxy; it’s not that I am serving the files from there. So I don’t like the location for this reason.

Josef Strzibny 00:32:13 If it were just a static website, absolutely put it there; that’s the best location. Then, you can put it in some arbitrary location, some new one that’s not conflicting with anything else. If you want to follow the Filesystem Hierarchy Standard, you put it in /srv, and then maybe slash the name of the application, or your domain name, or host name. You can choose what you like. So this is what I do now. I simply deploy from scratch to this location. And as for SELinux, I simply make a module, make a profile, and all these paths will work. And so, to answer your question, I would put the gems in this directory as well, something like app/gems, for instance.

Jeremy Jung 00:33:03 There are a few different places people could put their application. They could put it in the user’s home folder, but you were saying that because of the built-in SELinux rules, SELinux is going to basically fight you on that and prevent you from doing a lot of things in that folder. What you’ve chosen to do is create your own folder, which I guess you described as being somewhat arbitrary, just a folder that you are consistently going to use in all your projects. And then you’re going to configure SELinux to allow you to run whatever you want to run from this custom folder that you’ve decided on.

Josef Strzibny 00:33:41 Yeah. You could say that you do almost the same amount of work for home or for a dedicated location, but I simply find it cleaner to do it this way, and in a way you fulfill the FHS suggestion to put it in /srv. But yeah, it’s completely arbitrary. You can choose anything else. Sysadmins choose www or whatever they like, and it’s fine. It will work. There’s no problem there. And the gems could actually be in home, but I just instruct Bundler to put them in that location next to my application.
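
The layout described in this exchange can be sketched as a small path helper: one directory per application under /srv, with the gems kept next to the code. BUNDLE_PATH is the environment variable Bundler reads for its install location; the application name, the subdirectory names, and the helper functions are assumptions for the example, not the book’s exact layout.

```python
from pathlib import PurePosixPath


def app_layout(app_name, root="/srv"):
    """Return the directories for one application kept under a single root."""
    base = PurePosixPath(root) / app_name
    return {
        "base": base,
        "app": base / "app",    # application code (hypothetical subdirectory)
        "gems": base / "gems",  # libraries installed next to the application
    }


def bundler_environment(layout):
    """Environment that tells Bundler to install gems into the app's own directory."""
    return {"BUNDLE_PATH": str(layout["gems"])}


if __name__ == "__main__":
    layout = app_layout("myapp")
    for name, path in layout.items():
        print(f"{name}: {path}")
    print(bundler_environment(layout))
```

Keeping everything under one root also means a single SELinux module or labeling rule can cover the application and its dependencies together.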

Jeremy Jung 00:34:20 Okay. Rather than having a common folder for multiple applications to pull your libraries or your gems from, you have them installed in the same place as the application. And that just keeps all your dependencies in the same place. And in the example you’re giving, you’re putting everything in /srv/ and then maybe the name of your application. Is that right? Yeah. Okay. Yeah, because I’ve noticed that just looking at different systems, I’ve seen people install things into /opt, install things into /srv. And it can just be kind of tricky as somebody who’s starting out to know, where am I supposed to put this stuff? So basically it sounds like just pick a place, and at least if it’s in /srv, then sysadmins who are familiar with the standard file system hierarchy will know to look in there.

Josef Strzibny 00:35:12 Yeah. Yeah. /opt is also a common location, as you say, or, you know, if it’s actually a packaged application, right, it can even be in /usr/share, you know. It might not necessarily be the locations we talked about before.

SE Radio 00:35:29 SE Radio listeners, we want to hear from you. Please visit se-radio.net/survey to share a little information about your professional interests and listening habits. It takes less than two minutes to help us continue to make SE Radio even better. Your responses to the survey are completely confidential. That’s se-radio.net/survey. Thanks for your support of the show. We look forward to hearing from you soon.

Jeremy Jung 00:35:55 One of the things you cover in the book is setting up a deployment system, and you’re using shell scripts in the case of the book. And I was wondering how you decide when shell scripts are sufficient and when you should consider more specialized tools like Ansible, Chef, Puppet, things like that.

Josef Strzibny 00:36:16 Yeah, I chose Bash in the book because you get to see things without abstractions. You know, if I were using, let’s say, Ansible, then suddenly we are writing some YAML files and you are using a lot of Python modules that Ansible uses, and we don’t really know what’s going on at all times. So you learn to do things with Ansible 2.0, let’s say, and then a new Ansible comes out and you have to relearn, you know, and I’ve got to rewrite the book. But the thing is that with just Bash, I can show literally just Bash commands, like, okay, you run this and this happens. And another reason I use it is that you realize how simple something can be. Like, you can have a cluster with SSH and whatever in maybe 20 Bash commands or around that. So it’s not necessarily that difficult, and it’s much easier to actually understand it if it’s just those 20 Bash commands.

Josef Strzibny 00:37:22 I also think that learning a little bit more about Bash is actually quite beneficial, because you encounter it in various places. I mean, RPM spec files, the way packages are built, that’s Bash. Language version managers like pyenv and rbenv, that’s Bash. If you want to tweak it, if you have a bug there, you might look into the source and try to fix it, you know, and it will be Bash. Then Dockerfiles are essentially Bash, you know, their entry point scripts might be Bash. So it’s not like you can really avoid Bash. So maybe learning a little bit, just a little bit more than you know, and being able to be more comfortable, I think it can get you a long way, because even I am not a Bash programmer, you know, I would never call myself that. But also consider this: you can have a full-featured Rails application, maybe in 200 lines of code, up and running somewhere.

Josef Strzibny 00:38:14 You can understand it in an afternoon. So for a small deployment, I think it’s quite refreshing to use Bash. And some people miss out by not just doing the first simple thing possible that they can do. But obviously when you have more team members, more complex applications, or a suite of applications, things get difficult very fast with Bash. So obviously most people will end up with some high-level tool. It can be Ansible. It can be Chef, maybe Kubernetes, you know? So my philosophy, again, is just to keep it simple. If I can do something with Bash and it’s like 100 lines, I will do it in Bash, because when I come back to it after three years, it will work and I can dive in and see what I have to fix. You know, if there’s a PostgreSQL update with a new location or whatever, I know where to look and what to change. And with high-level tooling, you kind of have to stay on top of it, the new versions and updates. So Bash is very limited, but it’s kind of refreshing for a very small deployment you want to do for your project.

Jeremy Jung 00:39:25 Yeah. So it sounds like from a learning perspective, it’s beneficial because you can see line by line and it’s code you wrote and you know exactly what each thing does. But also it sounds like when you have a project that’s relatively small. Maybe there, there aren’t a lot of different servers or the deployment process isn’t too complicated. You actually choose to start with bash and then only move to something more complicated like Ansible or even Kubernetes. Once your project has gotten to a certain size,

Josef Strzibny 00:39:59 You’ll see it in the book. I even explain a multiple-server deployment using Bash, where you can actually keep your components kind of separate. So your database has its own life cycle, has its own deploy script, and your load balancer the same, and you have application servers, maybe you have more of them. So the nice thing is that when you first write your first script to provision one server, configure one server, then you simply write another supervising script that calls this single script just in a loop, and you just change the server variable to change the IP address or something. And suddenly you can deploy more. Of course it’s very basic and, you know, it doesn’t have some kind of parallelization to it or whatever, but if you have like three application servers, you can do it and you understand it almost immediately. You know, if you are already a software engineer, there’s almost nothing to understand. You can just start and keep going.
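
A minimal sketch of that supervising loop, written in Python rather than the Bash the book uses. The script name `./provision.sh`, the `SERVER` variable, and the addresses are all hypothetical placeholders, and the servers are handled one after another with no parallelization, as described above.

```python
import os
import subprocess

# Hypothetical names: the single-server script and the documentation-range addresses.
PROVISION_SCRIPT = "./provision.sh"
APP_SERVERS = ["203.0.113.10", "203.0.113.11", "203.0.113.12"]


def deploy_all(servers, script=PROVISION_SCRIPT, dry_run=True):
    """Call the single-server script once per server, changing only SERVER."""
    planned = []
    for server in servers:
        planned.append(f"SERVER={server} {script}")
        if not dry_run:
            # Pass the target through the environment; check=True stops the
            # loop at the first server whose script exits with an error.
            env = {**os.environ, "SERVER": server}
            subprocess.run([script], env=env, check=True)
    return planned


if __name__ == "__main__":
    for line in deploy_all(APP_SERVERS):
        print(line)
```

With `dry_run=True` the function only reports what it would run, which makes it easy to review the plan before touching any server.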

Jeremy Jung 00:41:04 When you’re deploying to servers, a lot of times you’re dealing with credentials, whether that’s private keys, passwords, or keys to third-party APIs. And when you’re working with this self-hosted environment, working with Bash scripts, I was wondering what you use to store your credentials and how those are managed.

Josef Strzibny 00:41:27 I have this application called Password Safe that can save my passwords and whatever, and you can also put SSH keys there and so on. And then I simply do a backup of these keys and passwords to some other secure physical location. But basically I don’t use any online service for that. I mean, there are services for that, especially for teams and in clouds, because the cloud providers might have their own services for that. But for me personally, again, I just keep it as simple as I can. It’s just on my computer, maybe my disk, and that’s it, nowhere else.

Jeremy Jung 00:42:10 So would this be a case where, on your local machine, for example, you might have a file that defines all the environment variables for each server? You don’t check that into your source code repository, but when you run your Bash scripts, maybe they read from that file and use that in deploying to the server?

Josef Strzibny 00:42:31 Generally speaking, yes. But I think for Rails, there’s a nice option to use its encrypted credentials. So basically then you can commit all these keys together with your app. And the only thing you need to keep to yourself is just one variable. So it’s much easier to store it and keep it safe, because it’s just one thing, and everything else you keep inside your repository. I know for sure there are other programs that behave the same way and can be used with different stacks that don’t have this, because Rails has it baked in. But if you are using Django, if you are using Elixir or whatever, then they don’t have it. But I know that there are some programs, I don’t remember the names right now, that essentially let you do exactly the same thing, to just commit it to source control, but in a secure way, because it’s encrypted.
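
The simpler variant from the question, a local file of environment variables that stays out of the repository and is read at deploy time, could look roughly like this. It is a sketch under assumptions: the `KEY=VALUE` format and the file name `production.env` are illustrative, and this is not how Rails encrypted credentials work.

```python
from pathlib import Path


def parse_env_file(text):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    variables = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, separator, value = line.partition("=")
        if not separator:
            raise ValueError(f"expected KEY=VALUE, got: {raw_line!r}")
        variables[key.strip()] = value.strip().strip('"')
    return variables


def load_server_env(path):
    """Read one server's variables from a file kept out of version control."""
    return parse_env_file(Path(path).read_text(encoding="utf-8"))
```

A deploy script would call `load_server_env("production.env")` (a hypothetical file listed in `.gitignore`) and pass the result along as the environment of the commands it runs.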

Jeremy Jung 00:43:26 Yeah, that’s an interesting solution, because you always hear about people checking passwords and keys into their source code repository, and then, you know, it gets exposed online somehow. But in this case, like you said, it’s encrypted and only your machine has the key. So that actually allows you to use the source code repository to store all that.

Josef Strzibny 00:43:49 Yeah. I think for teams, you know, for more complex deployments, there are other tools, from HashiCorp Vault, you know, to some cloud providers’ things, but you can really start and keep it very, very simple.

Jeremy Jung 00:44:01 For logging in an application that you’re self-hosting, there’s a lot of different managed services that exist, but I was wondering what you use in a self-hosted environment and whether your applications are logging to standard out or whether they’re writing to files themselves. So I was wondering how you typically approach that.

Josef Strzibny 00:44:22 There are a lot of logs you can have, right, from system logs to web server logs, application logs, whatever, and you somehow need to stay on top of them. When you have one server, it’s fine to just log in and look around. But when there are more servers involved, it’s kind of a pain. And so people will start to look into some centralized logging system. I think when you are more mature, you will look to things like Datadog, right? Or you will build something of your own on . That’s what we do on the project I’m working on right now. But I kind of think that there is some upfront cost in setting it all up, you know, and with something like that, you are essentially building your own logging application, you could even say. You know, that’s a lot of work. I also want to say that you don’t look into your logs all that often, especially if you set up proper error and performance monitoring, which is what I do.

Josef Strzibny 00:45:24 In my projects, it’s one of the first things I do: I set up a service like Rollbar and Skylight. And there are some that you can self-host. So if people want to self-host them, they can. But I find it kind of easier, even though I’m self-hosting my application, to just rely on these hosted solutions, like Rollbar, Skylight, AppSignal. And I have to say, I especially started to like AppSignal recently, because they kind of bundle everything together. When you have trouble with your self-hosting, the last thing you want is to find yourself in a situation where your self-hosted logs and error reporting also went down and don’t work, you know? So although I like self-hosting my application, I kind of like to offload this responsibility to some hosted providers.

Jeremy Jung 00:46:18 Yeah. So I think that in and of itself is an interesting topic to cover, because we’ve mostly been talking about self-hosting your applications, and you were just saying how logging might be something where it’s actually better to use a managed service. And I was wondering if there’s other services, for example CDNs or other things, where it actually makes more sense for you to let somebody else host it rather than yourself.

Josef Strzibny 00:46:46 I think that depends. Logging, for me, it’s obvious. And then I think a lot of developers kind of fear databases. So they would rather have some kind of managed database, you know, with replication and automatic backups. So I think a lot of people would go for a managed database. Although it may be one of those pricey services, it’s also one that actually gives you peace of mind. You know, maybe I would just like to point out that even though you get all these automatic backups and so on, maybe you should still try to make your own backup, just to be sure, even if someone promised something. Your data is usually the most valuable thing you have in your application, so you should not lose it. And some people will go maybe for load balancers, because it’s maybe easy to start.

Josef Strzibny 00:47:35 Like, let’s say on DigitalOcean, you know, you just click it and it’s there. But if you go the opposite direction, if you, for instance, decide to self-host your load balancer, it can also give you more options for what to do with it, right? Because you can configure it differently. You can even configure it to be a backup server if all of your application servers go down, which maybe could be an interesting use case, right? If you mess up and your application servers are not running because you are just messing with them, suddenly it’s okay, because your load balancer just takes some traffic, right? And you can do that. The hosted ones are sometimes limited. So I think it comes down to that, even with databases, you know. Maybe you use some kind of extension that is simply not available, and that kind of makes you self-host or something. But if they offer exactly what you want and it’s really easy, you know, then maybe you just do it. And that’s why I think I kind of like virtual machines in the cloud, because you can mix and match all the services, do what you want, and change the configurations to meet your needs. And I find that quite nice.

Jeremy Jung 00:48:54 One of the things you talk about near the end of your book is how you start with a single server. You have the database, the application, the web server, everything on the same machine. And I wonder if you could talk a little bit about how far you can take that one server and why people should consider starting with that approach.

Josef Strzibny 00:49:15 It depends a lot on your application. For instance, I write applications that are quite simple in nature. I don’t have so many SQL calls in one page and so on. But the applications I worked on before, sometimes they were quite heavy and, you know, even with little traffic, they suddenly need a more beefy server. You know, so it’s a lot about the application, but there are certainly a lot of good examples out there. For instance, the team from the X-Plane flight simulator, they just deployed to one server, you know, the whole backend for all those flying players, because it’s essentially simple. And they even use Elixir, which is based on the BEAM VM, which means it’s great for concurrency, for distributed systems, it’s great for multiple servers, but they still deploy to one because it’s simple. And they use the second only when they do updates to the service, and otherwise they go back to one. Another one would be maybe Pieter Levels.

Josef Strzibny 00:50:15 He’s like a maker that already has like a $1 million business. And he hosts all of his projects on one server, because it’s enough. Why do you need to make it complicated? You can run a very profitable service and you might never leave one server. It’s not a problem. Another good example, I think, is, they have, I think they have some page where they show you exactly what servers they are running. They have multiple servers, but the thing is they have only a few servers. So those are the examples that go against maybe the trend of spinning up hundreds of servers in the cloud, which you can do. Maybe it’s easier when you have to do auto scaling, because you can just go little by little. You know, I don’t see the point of having more servers. To me, it means more work if I can do it with one. But I would mention one thing to pay attention to when you are on one server: you don’t want your background workers to suddenly exhaust all the CPU so that your database cannot serve your queries anymore.

Josef Strzibny 00:51:22 Right? So for that, I recommend looking into control groups, or cgroups, on Linux. You create a simple slice where you define how much CPU power and how much memory can be used for that service, and then you attach it to some processes. And when we are talking about systemd services, they actually have this one directive where you specify your cgroup slice. And then when you have this worker service, and maybe it even forks because it runs some utilities, right, to process images or whatnot, then it will all be contained within that cgroup. So it will not influence the other services you have. And you can say, okay, you know, I give the worker service only 20% of my CPU power, because I don’t care if they make it fast or not. It’s not important. What’s important is that every visitor still gets their page, you know, and isn’t waiting for some big background jobs, and your service is not going down.
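
As a rough illustration of the slice idea, the sketch below renders the text of a systemd slice unit and a service drop-in. `CPUQuota=`, `MemoryMax=`, and `Slice=` are real systemd directives; the unit names `workers.slice` and `worker.service`, the description, and the chosen limits are assumptions made only for this example.

```python
def render_slice(cpu_quota_percent, memory_max):
    """Render a systemd slice unit that caps CPU time and memory."""
    return (
        "[Unit]\n"
        "Description=Resource limits for background workers\n"
        "\n"
        "[Slice]\n"
        f"CPUQuota={cpu_quota_percent}%\n"
        f"MemoryMax={memory_max}\n"
    )


def render_service_dropin(slice_name):
    """Render a drop-in that places a service (and its forks) into a slice."""
    if not slice_name.endswith(".slice"):
        raise ValueError("systemd slice unit names end in .slice")
    return f"[Service]\nSlice={slice_name}\n"


if __name__ == "__main__":
    # Hypothetical targets: /etc/systemd/system/workers.slice and
    # /etc/systemd/system/worker.service.d/limits.conf
    print(render_slice(20, "512M"))
    print(render_service_dropin("workers.slice"))
```

Note that systemd interprets `CPUQuota=` relative to a single CPU, so `20%` means a fifth of one core rather than a fifth of the whole machine, and `MemoryMax=` applies on the unified (v2) cgroup hierarchy.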

Jeremy Jung 00:52:33 So it sort of sounds like the difference is, if you have a whole bunch of servers, then you have to have some way of managing all those servers, whether that’s Kubernetes or something else. Whereas an alternative to that is having one server or just a few servers, but going a little bit deeper into the capabilities of the operating system, like the cgroups you were referring to, where you could specify how much CPU, how much RAM, and things for each service on that same machine to use. So it’s kind of changing, I don’t know if it’s removing work, but it’s changing the type of work you’re doing.

Josef Strzibny 00:53:11 Yeah. You essentially maybe have to think about it more in terms of disk space, memory, or CPU power, but it also enables you to use, for instance, Unix sockets instead of TCP sockets, and they are faster, you know. So in a way it can also be an advantage for you in some cases to actually keep it on one server. And of course you don’t have a network trip, so that’s another saving. So altogether that service will be faster. As long as it’s running and there’s no problem, it will be faster. And for high availability, yeah, it’s obviously a problem if you have just one server. But you also have to think about it in a more complex way: to be highly available with all your components, from load balancers to databases, you suddenly have a lot of servers to take care of. And that setup might be complex, might be fragile. And maybe you are better off with just one server that you can quickly spin up again. So for instance, if there’s any problem with your server, you simply make a new one, you know, and if you can configure it within 20, 40 minutes, maybe it’s not a problem. And maybe you are even still fulfilling your service level contract for uptime. So I think if I can go this way, I prefer it, simply because it’s somewhat easy to think about it like that.
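
To make the Unix socket point concrete, here is a small self-contained sketch of two endpoints on the same machine talking over a socket file instead of a TCP port. The socket name `app.sock` and the upper-casing "service" are purely illustrative, and the example needs a Unix-like system.

```python
import os
import socket
import tempfile
import threading


def serve_once(server):
    """Accept one connection and echo back what it sent, upper-cased."""
    connection, _ = server.accept()
    with connection:
        connection.sendall(connection.recv(1024).upper())


def roundtrip(message):
    """Send one message over a Unix domain socket and return the reply."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "app.sock")  # hypothetical socket file
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(path)  # binds to a filesystem path, not host:port
            server.listen(1)
            worker = threading.Thread(target=serve_once, args=(server,))
            worker.start()
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                client.connect(path)
                client.sendall(message)
                reply = client.recv(1024)
            worker.join()
    return reply
```

The only difference from a TCP version is the address family and the address itself, which is why many servers let you switch between the two in configuration when everything runs on one host.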

Jeremy Jung 00:54:34 This might be a little difficult to answer, but when you look at the projects where you’ve self hosted them, versus the projects where you’ve gone all in on say AWS, and when you’re trying to troubleshoot a problem, do you find that it’s easier when you’re troubleshooting things on a VM that you set up or do you find it easier to troubleshoot when you’re working with something that’s connecting a bunch of managed services?

Josef Strzibny 00:55:05 Absolutely. I find it much easier to debug anything I set up myself, and with fewer servers it’s even easier. But simply the fact that you built it yourself means that you know how it works, and at any time you can go and fix your problem. You know, this is what I find troubling with services like DigitalOcean Marketplace, I don’t know how they call these self-hosted apps that you can one-click and have your Rails or Django app up and running. I actually used, when I wasn’t that skilled, a similar solution called TurnKey Linux. It’s the same idea. You know, they prepare the profile for you and then you can just easily run it as if it’s a completely hosted thing like Heroku. But actually it’s your server and you have to pay attention. I actually don’t like it, because you didn’t set it up. You don’t know how it’s set up. You don’t know if it has some problems, some security issues. And especially the people that come to these services end up running something and they don’t know. I believe they don’t know, because when I was running it, I didn’t know, right? So they don’t even know what they are running. So if you really don’t want to care about it, I think it’s completely fine. There’s nothing wrong with that. But just go for Render or Heroku and make your life easier, you know?

Jeremy Jung 00:56:33 Yeah. It sounds like the solutions where it’s like a one-click install on your own infrastructure and you get the bad parts of both. Like you get the bad parts of having this machine that you need to manage, but you didn’t set it up. So you’re not really sure how to manage it. You don’t have that team at Amazon who can fix something for you because ultimately it’s still your machine. So that could have some issues there.

Josef Strzibny 00:56:58 Yeah, yeah, exactly. I would not recommend it, or if you really decide to do it, at least really look inside, you know, try to understand it, try to learn it, then it’s fine. But just to spin it up and hope for the best is not the way to go.

Jeremy Jung 00:57:13 In the book, you cover a few different things that you use, such as Ruby on Rails, NGINX, Redis, PostgreSQL. I’m assuming that for the things you would choose for applications you build and self-host, you want them to have as little maintenance as possible, because you’re the one who’s responsible for all of it. I’m wondering if there’s any other applications that you consider a part of your default stack that you can depend on and where the maintenance burden is low.

Josef Strzibny 00:57:46 Yeah. So exactly right. If I can, I would try to minimize the amount of dependencies I have. So for instance, I would think twice about using, let’s say, Elasticsearch, even though I used it before and it’s great for what it can do. If I can avoid it, maybe I will try to avoid it. You know, you can have decent full-text search with Postgres today. So as long as that’s viable, I would personally avoid it. I think a relational database, let’s say, is kind of necessary, you know. I’ve worked a lot with Elixir recently, so we don’t use Redis, for instance. So it’s quite nice that you can limit the number of dependencies just by choosing a different stack, although then you have to write the application in a little different way. So sometimes, in some circumstances, Redis could be useful, you know. I think it’s not difficult to run it.

Josef Strzibny 00:58:47 So I don’t see a problem there. I would just say that services like Elasticsearch, they might not come with a good authentication option. For instance, I think Elasticsearch offers it, but not in the free version, you know? So I would just like to say that if you are deploying a component like that, you cannot just keep it completely open to the world, you know. And maybe if you don’t want to pay for a version that has it, or maybe you are using something that doesn’t have it at all, you can maybe build just a tiny little proxy that would just do authentication and pass the requests back and forth. That’s what you could do. But just don’t forget that you might be running something
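
One possible shape for such a tiny proxy, using only the Python standard library. Everything specific here is an assumption for illustration: the upstream address `http://127.0.0.1:9200`, the bearer-token scheme, the placeholder token, and the listening port. It forwards only GET and POST, and it expects TLS to be terminated in front of it (for example by the reverse proxy), since the token would otherwise travel in clear text.

```python
import hmac
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

UPSTREAM = "http://127.0.0.1:9200"  # assumed: the service listens on localhost only
API_TOKEN = "change-me"             # placeholder; load from the environment in practice


def is_authorized(header, token=API_TOKEN):
    """Accept only 'Authorization: Bearer <token>', compared in constant time."""
    if header is None:
        return False
    return hmac.compare_digest(header.encode(), f"Bearer {token}".encode())


class AuthProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        self._forward()

    def do_POST(self):
        self._forward()

    def _forward(self):
        if not is_authorized(self.headers.get("Authorization")):
            self.send_error(401, "Unauthorized")
            return
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length) if length else None
        request = urllib.request.Request(
            UPSTREAM + self.path, data=body, method=self.command
        )
        if self.headers.get("Content-Type"):
            request.add_header("Content-Type", self.headers["Content-Type"])
        try:
            with urllib.request.urlopen(request, timeout=10) as upstream:
                status, payload = upstream.status, upstream.read()
                reply_type = upstream.headers.get("Content-Type")
        except urllib.error.HTTPError as error:
            # Upstream answered with an error status: relay it unchanged.
            status, payload = error.code, error.read()
            reply_type = error.headers.get("Content-Type")
        except urllib.error.URLError:
            self.send_error(502, "Upstream unavailable")
            return
        self.send_response(status)
        if reply_type:
            self.send_header("Content-Type", reply_type)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def run(host="127.0.0.1", port=8080):
    """Start the proxy; call this from a service unit or a small launcher."""
    ThreadingHTTPServer((host, port), AuthProxy).serve_forever()
```

The important part is the combination: the protected service binds to localhost, and only the proxy, which checks the token before forwarding anything, is reachable from outside.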

Jeremy Jung 00:59:37 I’m wondering if there are any other applications or capabilities that you would typically hand off to a managed service rather than trying to deal with them yourself.

Josef Strzibny 00:59:48 Oh, sending emails. Not because it’s hard. It’s actually surprisingly easy to start sending out emails. But the problem is the deliverability part, right? You want your emails to be delivered. And I think because of the amount of spam everybody’s sending, it’s very difficult to get into people’s inboxes. You know, you will simply be flagged, because you have some unknown address. So actually building up some history for some IP address, it could take a while. It could be very annoying and you don’t even know how to debug it. You cannot really write to Google: hey, I’m just this nice little server, please just consider me. You cannot do that. So I think it’s kind of a trouble. So I would say for email, that’s another thing where you just go with a hosted option. You might still configure your server to be sending out emails.

Josef Strzibny 01:00:43 That could be useful. For instance, if you want to do some little thing, like scanning your system log, and then you see some troublesome login that shouldn’t happen or something, and maybe you just want an email to be sent to you saying that something fishy is going on. And so you can still set up even your server, not just your main application, which might have a nice library for that, you know, to send that email. But you will still need a so-called relay server to just pass your email further. Yeah, because building this trust in the email world, that’s not something I would do. And I don’t think that as an independent maker or developer, you can really have the resources to do something like that. So that would be a perfect example for that. Yeah.
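
A sketch of that log-scanning idea with a hosted relay, using Python’s standard `smtplib` and `email` modules. The relay host, port, addresses, and the choice to look for sshd’s "Failed password" lines are assumptions for the example; a real setup would use the relay details of whichever email provider is chosen.

```python
import smtplib
from email.message import EmailMessage

RELAY_HOST = "smtp.example.com"   # hypothetical relay of a hosted email provider
RELAY_PORT = 587
ALERT_FROM = "server@example.com"
ALERT_TO = "admin@example.com"


def find_failed_logins(log_text):
    """Return log lines that look like failed SSH password attempts."""
    return [line for line in log_text.splitlines() if "Failed password" in line]


def build_alert(lines):
    """Build the notification email listing the suspicious lines."""
    message = EmailMessage()
    message["Subject"] = f"{len(lines)} suspicious login attempt(s)"
    message["From"] = ALERT_FROM
    message["To"] = ALERT_TO
    message.set_content("\n".join(lines))
    return message


def send_via_relay(message, user, password):
    """Hand the message to the relay, which takes care of actual delivery."""
    with smtplib.SMTP(RELAY_HOST, RELAY_PORT) as relay:
        relay.starttls()
        relay.login(user, password)
        relay.send_message(message)
```

The server only has to authenticate against the relay; the sender reputation that makes the message land in an inbox belongs to the hosted provider.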

Jeremy Jung 01:01:29 Yeah. I think that’s probably a good place to start wrapping up, but is there anything we missed that you think we should have talked about?

Josef Strzibny 01:01:37 I think we kind of covered it. Maybe we didn’t talk much about containers, which a lot of people use nowadays. Maybe I would just like to point out one thing with containers, which is that you can again take just a very minimal approach. With containers, you know, you don’t need to go full-on containers at all. You can just run a little service, maybe your workers, in a container. For example, if I want to run something as part of my application, and the upstream, the developers that develop this one component, already provide a Dockerfile, it’s a very easy way to start, right? Because you just deploy their image and you run it, that’s it. And I don’t have to learn what kind of different stack it is. Is it Java? Is it Python? How would I run it? So maybe you care about that for your own application, but when you have to just take something that’s already made and it has a Docker image, it’s a nice way to start.

Josef Strzibny 01:02:35 And one more thing I would like to mention is that you also don’t really need to use services like Docker Hub, you know. Most people would use it to host their artifacts, the built images, so they can quickly pull them and start them on many, many servers and blah, blah. But if you have just one server like me, but you want to use containers, a nice thing is to just, you know, push the container directly. Essentially, it’s just an archive. And in that archive, there are a few folders that represent the layers, the steps as you build them in the Dockerfile, and that’s it. You can just move it around like that. And you don’t need any external services to run your containerized little service.
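
One common way to move an image without a registry is to stream `docker save` on the build machine into `docker load` on the server over SSH. The sketch below wires those two real commands together from Python; the image name and the SSH target are hypothetical, and it assumes Docker is installed on both ends and that SSH access already works.

```python
import subprocess


def build_commands(image, host):
    """Return the local 'save' command and the remote 'load' command."""
    save = ["docker", "save", image]         # writes the image archive to stdout
    load = ["ssh", host, "docker", "load"]   # reads the archive from stdin remotely
    return save, load


def push_image(image, host):
    """Stream the image archive straight to the server, with no registry involved."""
    save_cmd, load_cmd = build_commands(image, host)
    save = subprocess.Popen(save_cmd, stdout=subprocess.PIPE)
    try:
        load = subprocess.run(load_cmd, stdin=save.stdout)
    finally:
        save.stdout.close()
    if save.wait() != 0 or load.returncode != 0:
        raise RuntimeError(f"transferring {image} to {host} failed")
```

Calling `push_image("myapp:latest", "deploy@203.0.113.10")` (both values are placeholders) has the same effect as the shell pipeline `docker save myapp:latest | ssh deploy@203.0.113.10 docker load`.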

Jeremy Jung 01:03:18 Yeah. I think that’s a good point because a lot of times when you hear people talking about containers, it’s within the context of Kubernetes and you know, that’s a whole other thing you have to learn. You have to learn not only how containers work, but you have to learn how to deploy Kubernetes, how to work with that. And I think it’s good to remind people that it is possible to just choose a few things, run them as containers. You don’t need to, like you said, even run everything as containers. You can just try a few things. Where can people check out the book and where can they follow you and see what you’re up to.

Josef Strzibny 01:04:00 So they can just go to deploymentfromscratch.com. That’s like the home page for the book. And if they want to follow me, they can find me on Twitter. That would be /strzibnyj, with a J at the end. And I try to post updates there, but also some news from the Ruby, Elixir, and Linux world. So they can follow.

Jeremy Jung 01:04:33 I had a chance to read through the alpha version of the book. And there’s a lot of really good information in there. I think it’s something that I wish I had had when I was first starting out, because there’s so much, that’s not really talked about, like when you go look online for how to learn Django or how to learn Ruby on rails or things like that, they teach you how to build the application, how to run it on your laptop. But there’s this very large gap between what you’re doing on your laptop and what you need to do to get it running on a server. So I think anybody who’s interested in learning more about how to deploy their own application or even how it’s done in general. I think they’ll find the book really valuable. Yeah.

Josef Strzibny 01:05:25 Thank you. Thank you for saying that, it makes me feel happy. And as you say, that’s the idea. I really put kind of everything you need in that book. And I just use Bash so it’s easier to follow, and keep it without the abstractions, and then maybe you will learn some other tools and you will apply the concepts, but you can do whatever you want.

Jeremy Jung 01:05:49 All right. Well, Josef, thank you so much for talking to me today.

Josef Strzibny 01:05:52 Thanks, Jeremy.

Jeremy Jung 01:05:53 This has been Jeremy Jung for Software Engineering Radio. Thanks for listening.

SE Radio 01:05:58 SE Radio listeners, we want to hear from you. Please visit se-radio.net/survey to share a little information about your professional interests and listening habits. It takes less than two minutes to help us continue to make SE Radio even better. Responses to the survey are completely confidential. That’s se-radio.net/survey. Thanks for your support of the show. We look forward to hearing from you soon. Thanks for listening to SE Radio, an educational program brought to you by IEEE Software magazine. For more about the podcast, including other episodes, visit our [email protected]. To provide feedback, you can comment on each episode on the website or reach us on LinkedIn, Facebook, Twitter, or through our Slack [email protected]. You can also email [email protected]. This and all other episodes of SE Radio are licensed under Creative Commons license 2.5. Thanks for listening.

[End of Audio]

SE Radio theme: “Broken Reality” by Kevin MacLeod (incompetech.com — Licensed under Creative Commons: By Attribution 3.0)
