
SE Radio 649: Lukas Gentele on Kubernetes vClusters

Lukas Gentele, CEO of Loft Labs, joins host Robert Blumen for a discussion of Kubernetes vClusters (virtual clusters). A vCluster is a Kubernetes cluster that runs as an application on a host Kubernetes cluster. The conversation covers: vCluster basics; sharing models; what is owned by the vCluster and what is shared with the host; attached nodes versus shared nodes; the primary use case of multi-tenancy with a vCluster per tenant; the alternatives of a namespace per tenant and a full cluster per tenant; trade-offs in isolation, resource use, spin-up time, and scalability; how many clusters and how many vClusters an organization should have; deployment models for vClusters, including a Helm chart with standard resources and a vCluster operator; persistent storage models for vClusters; vCluster snapshotting, recovery, and migration; how many vClusters can run on a cluster; and ingress, TLS, and DNS. Brought to you by IEEE Computer Society and IEEE Software magazine.



Transcript

Transcript brought to you by IEEE Software magazine and IEEE Computer Society. This transcript was automatically generated. To suggest improvements in the text, please contact [email protected] and include the episode number.

Robert Blumen 00:00:19 For Software Engineering Radio, this is Robert Blumen. I have with me today Lukas Gentele, the CEO of Loft Labs. Lukas is a maintainer of the open-source projects vCluster.com, DevPod.sh, and DevSpace.sh, and he is a speaker at KubeCon and other cloud computing conferences. Lukas, welcome to Software Engineering Radio.

Lukas Gentele 00:00:45 Great to be here, Robert. Thanks for inviting me on the show.

Robert Blumen 00:00:48 Would you like to tell the listeners anything else about your background that I didn’t cover?

Lukas Gentele 00:00:53 Well, you mentioned all the open-source projects and that I’m a startup founder. Yeah, I’m very deeply connected to the Kubernetes ecosystem, to the open-source world. Maybe one thing that you haven’t mentioned yet: I didn’t grow up in the States. I grew up in Germany and moved here about six years ago or so, and yeah, I’m very excited to talk a little bit more specifically about the vCluster project today.

Robert Blumen 00:01:16 Yeah, and I will mention that we have an international audience of listeners. The show was founded in Germany, and Germany is one of our top listener countries by percentage. So I’m sure many Germans will be listening to this podcast. Today, Lukas and I will be talking about vClusters. We have quite a lot of content in the archives about Kubernetes clusters that listeners could listen to get up to speed on that, including Episode 590 on How to Set-up a Cluster. Let’s not review that. Let’s dive into Kubernetes vClusters. What is a vCluster, and how does it differ from, what do we call it, a ‘base cluster’ or a ‘normal cluster’? What’s the term you use that’s not a vCluster?

Lukas Gentele 00:02:06 I typically refer to it as a traditional Kubernetes cluster. And then the virtual cluster is something that runs on top of this traditional cluster. We also use the term host cluster: when you have multiple virtual clusters running on the same cluster, that underlying cluster is what we refer to as the host cluster. The difference between the two ultimately is, a Kubernetes cluster is made out of machines. Whether that is bare metal machines or virtual machines, ultimately it’s about how do we schedule containers across a set of machines. And each Kubernetes cluster has these machines attached to it as nodes. And some cloud providers allow you to auto-scale your nodes, to ultimately add and remove nodes dynamically depending on how many containers you have running. But you can’t have a dynamic allocation of nodes to multiple Kubernetes clusters.

Lukas Gentele 00:03:01 So when you have two Kubernetes clusters and you have one node, you’ve got to put it in either one of these clusters; you can’t share that node across two Kubernetes clusters. A virtual cluster uses the nodes of the underlying cluster. So typically the virtual cluster itself doesn’t have any compute nodes. You can obviously attach dedicated compute nodes to it if you wish to do so. But the big benefit of it is, it uses the nodes and the infrastructure of the underlying cluster. So it’s a really great solution for multi-tenancy. If I’m looking at a Kubernetes cluster and I want to share this cluster, that’s really hard to do, actually. And that is actually not obvious, because when you’re thinking of Kubernetes, there’s obviously role-based access control, there are users and groups. So you’d think it’s possible to share it. There are Namespaces in Kubernetes as a unit to separate things a little bit. But again, I usually tell people: when you think of a physical server, you also have users and groups and permissions and folders, but it’s still very hard to share a Linux host if you don’t have virtualization. And in the same way it’s really hard to share a Kubernetes cluster if you don’t have virtualization for Kubernetes. And that’s ultimately what vCluster adds on top of a Kubernetes cluster. It adds that virtual layer to give everybody their dedicated isolated space while still sharing the underlying cluster and its nodes.

Robert Blumen 00:04:31 If I could summarize what you said, the key point about a virtual cluster to understand what is it, it is a Kubernetes cluster that runs inside of a host Kubernetes cluster and it does have some of its own services and then it shares the nodes with the host. Was there anything about that you’d like to correct?

Lukas Gentele 00:04:55 No, that’s an accurate summary. That’s exactly the idea. Some things are shared, certain things are completely isolated and that’s the beauty of the virtual cluster. You can mix and match, ultimately.

Robert Blumen 00:05:05 Does each vCluster have its own isolated control plane?

Lukas Gentele 00:05:10 That is correct, yes. The virtual cluster ultimately is a container. So in this container you have a fully-fledged Kubernetes control plane. That means you have an API server, you have a controller manager, you have state, and that state can live in a SQLite database or in a full-blown etcd cluster, right? Like a real Kubernetes cluster would be using. The only thing that it doesn’t have, which a regular Kubernetes cluster has in its control plane, is a scheduler, because the underlying cluster has a scheduler and that scheduler distributes the pods to the different nodes. The virtual cluster typically doesn’t have any nodes, or may just have some optional nodes attached to it, but usually it uses the underlying cluster’s scheduler to actually get the containers launched.

Robert Blumen 00:05:58 If I heard you correctly, you said the vCluster control plane is a container. Did you mean a single container or does each piece of the control plane have its own container?

Lukas Gentele 00:06:11 Yeah, it’s actually one pod, and a couple of containers in the pod. That’s correct.

Robert Blumen 00:06:16 In a control plane, if you’re on a large enough traditional cluster, you might scale out different parts of the control plane differently. Like you might have three or five instances of etcd for high availability, and you might scale your API server horizontally. If you are putting your entire control plane in one pod, do you have to decide upfront what size of resource you give to each piece, and that’s pretty much fixed for the duration of the vCluster?

Lukas Gentele 00:06:53 Yeah, so we actually do something really interesting. So when you look at the multiple parts of the control plane, there’s one part that contains pretty much all the core components, so controller manager and API server; we actually bake them into a single container. But then you have things, for example for the DNS, like CoreDNS, where we have two options: to have it baked in, or to run it separately as a separate container, even as a separate pod to be launched. And that way you have the flexibility of what you want to run baked in and what you want to run separately. Typically, running things separately makes it a little bit more heavyweight, and running them embedded, which is our default, typically makes it much more lightweight. The same goes for the data store. So one thing we do, for example, is when you spin up a vCluster, you just go to vCluster.com now and run the quick start.

Lukas Gentele 00:07:43 You download the CLI; we have this command called vcluster create in the CLI that helps you spin up a vCluster. It ultimately just sets a couple of config options and then runs a helm install. It’s nothing more than that, but it spins up a virtual cluster in the most lightweight form, because we know people who just want to get started want to see that wow effect, right? And there are a lot of things that can go wrong if you want to spin up a fully-fledged etcd cluster. What we do instead is we even bake in the data store. So for example, the data store is just SQLite in a persistent volume. And you could even disable the persistent volume and it would be entirely ephemeral: you restart the container, the cluster is completely reset. And so the vCluster is pretty dynamic. It can be as ephemeral and as lightweight as you want it to be, or at the other extreme as heavyweight as you want it. And you can horizontally scale pretty much each component of the vCluster pretty easily. So it really depends on your use case and how much resilience and separate scalability for each component you actually need for that particular scenario that you’re running the vCluster in.
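For listeners who want to follow along, a minimal quick start along the lines Lukas describes might look like this (a sketch assuming a current vcluster CLI; flags and output can differ between versions):

    # install the vcluster CLI from vcluster.com, then:
    vcluster create my-vcluster --namespace team-a
    # under the hood this runs a helm install and, by default, connects your kubeconfig
    kubectl get namespaces      # now talking to the virtual cluster
    vcluster disconnect         # switch back to the host cluster context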

Robert Blumen 00:08:54 I want to highlight one aspect of what you just said. If you want the vCluster to be durable, you need to make some arrangement for it to access a persistent volume or the persistence piece. Anything you’d like to expand on that?

Lukas Gentele 00:09:12 No, absolutely. If your Kubernetes cluster has persistent volume claim provisioning enabled, like dynamic provisioning of persistent volumes, that is what we’re using by default. So we look into the cluster when you run vcluster create and actually see, hey, is that possible? And then we provision the PV that way, which is obviously super straightforward. Most Kubernetes clusters, even like Docker Desktop and minikube, have that either enabled by default or let you enable it with just a single CLI command or a click in the Docker Desktop UI. But obviously, when you are in cloud environments, sometimes regulated industries don’t want dynamic provisioning; every PV needs to be provisioned manually. Hopefully you don’t have to be in that strict of an environment. But if you are, then you can also specify these things via the values YAML. We call it the vcluster.yaml, which is essentially the central file where you have all the config options available to you, and then you can apply that with your vcluster create command or with the helm install command.
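As a rough illustration of the kind of vcluster.yaml settings being described here, persistence might be configured along these lines (the keys are illustrative and version-dependent; check the vCluster docs before relying on them):

    # vcluster.yaml (sketch)
    controlPlane:
      statefulSet:
        persistence:
          volumeClaim:
            enabled: true     # dynamically provision a PV for the data store
            size: 5Gi

    # apply it, for example, with:
    #   vcluster create my-vcluster -f vcluster.yaml
    # or an equivalent helm install that passes the same values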

Robert Blumen 00:10:14 You have up to this point mentioned most of the pieces of the control plane, but not the networking. Does the vCluster share the host cluster’s networking, or does it run its own network?

Lukas Gentele 00:10:28 So with regards to the networking, the virtual cluster uses the underlying cluster’s network. That’s because the underlying cluster’s nodes are being used, and that’s obviously where most of the networking is happening, other than DNS, right? DNS is actually something that we run, as I mentioned earlier, as part of the control plane of the virtual cluster. So there’s a separate DNS for each vCluster, which makes sense because things in DNS are based on the names of your Namespaces. And if you have virtual clusters and they all have a Namespace called database, you would have conflicts if you tried to map that to the underlying DNS. That’s why every vCluster gets its own DNS. But when you actually look at the IP addresses and the network traffic between containers, or from the internet to a container, or from a container to the internet, or within your VPC, all of that runs on the nodes and on the network that your actual Kubernetes cluster, your host cluster, is a part of.

Robert Blumen 00:11:32 Are there any best practices about how you launch these in the Namespaces of the host cluster? Does each vCluster get its own Namespace or how do you do that?

Lukas Gentele 00:11:43 Yeah, so you can have multiple vClusters in the same Namespace, but we typically encourage people to have one vCluster per Namespace. That’s definitely the best practice we encourage. And mostly the reason for this is what happens to every pod that gets launched inside the vCluster. Let’s say the vCluster has 20 Namespaces. What typically happens is we’ve got to translate the names of these pods down to the host cluster, because the pods actually get launched by the host cluster. And the way that works is we copy them into the Namespace where the virtual cluster runs. So if I have a hundred pods in these 20 Namespaces and I look into the host cluster, I see one Namespace where the vCluster is running, with the vCluster pod for the control plane, plus I see these hundred pods that come from the vCluster all in that very same Namespace.

Lukas Gentele 00:12:34 But if you have multiple vClusters in there, it’s much more difficult to get an overview of what belongs to which vCluster. We obviously have some prefixes and suffixes, et cetera, to make it clearer and more understandable, and we set labels as well in this process to make it filterable, et cetera. But it’s just so much easier if you split it up by Namespace. And the added benefit is also if you are introducing things like network policies, or you’re using things like Kyverno or Open Policy Agent. I know you had a session with Jim the other day about Kyverno and policies and Kubernetes. It’s very important to set these policies at a Namespace level, because a lot of these constructs in Kubernetes are designed to be used at the Namespace level. You can use them with labels as well, but the chance that you’re going to make a mistake is going to be much, much higher. So we typically recommend doing all of this on a Namespace level, and one vCluster per Namespace.
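A quick sketch of the pattern described here, one vCluster per host Namespace with the tenant’s pods synced down into it (all names are made up for illustration):

    # on the host cluster: one namespace per virtual cluster
    vcluster create tenant-a --namespace vcluster-tenant-a

    # inside the vCluster, the tenant creates whatever namespaces they need
    kubectl create namespace database
    kubectl -n database create deployment db --image=postgres

    # back on the host cluster, the synced pods appear in the vCluster's namespace,
    # with translated names and labels identifying the originating vCluster
    kubectl get pods -n vcluster-tenant-a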

Robert Blumen 00:13:33 Okay. Now you did mention that this is a great solution for multi-tenancy. I can think of at least two other ways you might handle that. One would be put each tenant in its own Namespace. Another would be to give each tenant their own Kubernetes cluster. Could you give us some pros and cons of these different approaches?

Lukas Gentele 00:13:57 Yeah, absolutely. Let’s start with the Namespaces. If you give every tenant a Namespace, well, in a way with a vCluster, as I just said, every vCluster should run in a separate Namespace, so in a way you’re kind of doing that already with your tenants. The benefit of having the vCluster layer on top, rather than giving tenants the Namespace directly, lies in giving the tenants the autonomy that they actually need. If you are restricted to a single Namespace in Kubernetes, it’s kind of like if I gave you just a single folder on a shared Linux host and you had minimal permissions to do anything. If you don’t have root access to that machine and you can’t install anything on this Debian system or something like that, then you’re going to be very, very limited. And every tenant will now need to agree on certain things.

Lukas Gentele 00:14:48 We actually had a lot of this going on in the late nineties and the first kind of internet wave, where people had these folders, web space sharing, where hosters were essentially giving you these limited capabilities to host your little website. But everybody had to agree on what PHP version or something was running on these servers. With virtualization, everybody can roll their own and they have a lot more freedom. You feel like you’ve got a full-blown server, and that autonomy is really important for engineers to do their work and to move fast. When you give somebody a Namespace, you’re going to be super limited. One of the most pressing limitations is cluster-wide objects. In Kubernetes, for example, CRDs, Custom Resource Definitions, are things that allow you to extend Kubernetes. And when you look at most helm charts out there and most tools designed for Kubernetes, take some of the popular ones like Istio or Argo CD, all of these tools introduce their custom CRDs, and many companies are building their own custom CRDs.

Lukas Gentele 00:15:53 They may have a CRD for a backend and a frontend, and these CRDs operate at a cluster-wide level. So there’s no way for two tenants to work on the same CRD in the same cluster and not get in each other’s way. It’s not possible to constrain them to just a Namespace. Another scenario where you are really limited: maybe you are architecting your application to run across three Namespaces. If you just get a single one, or I give you three isolated ones that can’t communicate with each other because you set up network policies, as you should as a cluster admin who wants to keep tenants apart, well, now you as an engineer can’t architect your application the way you wish to architect it. It’s very, very limiting.

Lukas Gentele 00:16:40 So that’s the benefit of the vCluster. You really are still in a single Namespace, but it doesn’t feel like it for you. You actually have multiple Namespaces, you have cluster-wide objects. You could even choose a different Kubernetes version; each of these vClusters can have a separate version. They don’t all need to be the same, and they don’t need to be the same as the host cluster. And that gives tenants a lot of freedom. Compare that to having separate clusters, which obviously gives you the ultimate freedom, but it comes at a very hefty price. If you were to provision a thousand clusters for a thousand engineers in your organization, that’s a pretty hefty cloud provider bill. And a lot of our customers, the commercial customers that work with us, large enterprises, Fortune 500s of the world, they have hundreds or even thousands of clusters.

Lukas Gentele 00:17:30 And that’s a really big burden on these operational teams. It’s not just the cost of the compute, it’s also the cost of upgrading things, keeping things in sync, and suddenly you need fleet management for that large fleet of clusters. It’s a really complicated operation to maintain 500 or 1,000-plus Kubernetes clusters, and with virtual clusters it is much, much cheaper and becomes much easier. Because when you think of it, you can now run an Istio in the host cluster and you can share it across 500 virtual clusters. So instead of maintaining 500 Istios, you have to maintain one. And you may not even need automation for that one. I would still recommend automating it, and using things like GitOps and infrastructure as code, et cetera. But the burden becomes much lower: the amount of code and plumbing you need to write around things, and the inconsistencies you have between systems.

Lukas Gentele 00:18:28 All of that becomes much, much smaller when you have fewer host clusters and have virtual clusters instead. And then, as I said, the cost is much lower as well compared to running 500 separate clusters. Just think about 500 clusters with three nodes each: we have 1,500 nodes running. Most of them are going to be idle, especially if you’re thinking of pre-production clusters; most of them are going to run idle most of the time. With virtual clusters you may get away with having 500 nodes for everybody, because they’re much more highly utilized and much more dynamically allocated across your tenants.

Robert Blumen 00:19:06 You raised a lot of points there. There is one thing that I ran across in some of the research that you haven’t mentioned. I’ll ask you about this: the time to spin up. What is the advantage of vClusters in that area?

Lukas Gentele 00:19:22 Oh yeah, that’s something I sometimes overlook, but yeah, it’s a big one. A vCluster spins up in like six, seven seconds, super, super quick. It depends a little bit on your configuration; maybe the heavier ones take 10 seconds, but it’s in that range. Versus if you were to spin up one of the easiest-to-spin-up clusters today, which are like EKS or GKE or AKS in the public cloud, where they’ve streamlined everything for us, still these clusters take about 30, 40 minutes to start. That’s obviously a big difference. And that short start time also allows us to dynamically turn virtual clusters on and off when they’re not being used. If it takes 40 minutes to launch a real cluster, that is not something you do three times a day, up and down, right?

Lukas Gentele 00:20:10 But if it takes six seconds and you’re going to go for your lunch break for 45 minutes or an hour or so, we can turn the virtual cluster off while you’re not using it. And that’s actually part of our commercial offering; we call that sleep mode. It’s something that monitors the network traffic to your virtual cluster and turns it off when you’re not using it. And the cool thing is it also turns it on again when you start using it again. So let’s say you run kubectl get pods: that request comes in and hits the network of the host cluster first, the load balancer there, and we intercept it there because we see, oh, the virtual cluster is asleep, and we wake it up real quick, which is just launching the control plane, which is just a container. Starting a container is super quick, and then we let the request through. That means that first request after you’ve got your lunch, or on Monday morning after the weekend, may take like five or six seconds instead of 500 milliseconds. But the company saved a lot of money in the time you actually didn’t use this cluster. And that is very, very beneficial for a lot of companies.

Robert Blumen 00:21:12 I understand the merits of the turn on/off. Can you think of any other examples where the fast spin-up time enables you, or a client or customer, to do something that they could not do if it was 40 minutes?

Lukas Gentele 00:21:32 Yeah, think of a scenario, and we’ve had a couple of startups do this: a cluster per customer. So let’s say your application launches pods, let’s say you have something like a batch job framework that spins up a pod every time a customer clicks a button in the UI or hits an API request. You kind of need to give everybody their own cluster to keep them separate. But spinning up a cluster would take 45 minutes. So you are automatically going to default to having a product where you go to the website and it’s a ‘get a demo’ button, and it’s going to take a while to get access to the product, because somebody’s got to spin up an EKS cluster behind the scenes and launch the product in there. We have some customers, and we actually gave a demo with a smaller startup at KubeCon last year about this, where they were demonstrating: hey, we wanted to have a demo environment on our website, launchable for customers immediately.

Lukas Gentele 00:22:31 So when you go to their website and you hit the sign-up-now link and you type in your email address, it’s going to tell you it’s spinning up your environment, and that spinner is going to go on for 10 seconds and then it’s going to drop you into the product. What happens behind the scenes is a virtual cluster gets launched, then the application gets deployed to the virtual cluster, and once that is ready you get dropped into the UI. And that’s a beautiful experience for a customer: you get your hands on the product immediately. And then the other benefit they’re using is for these trial customers that just sign up and try things out in the free tier, they’re actually also using the sleep mode to turn their product off when you’re not using it. These are just things that are unimaginable with real Kubernetes clusters, because, no, it’s got to be provisioned and deprovisioned and a whole bunch of other things need to happen. With the virtual cluster and its kind of dynamic nature, spinning up so quickly and turning off so quickly, these scenarios become possible.

Robert Blumen 00:23:32 I could definitely see that you could give a demo in a minute rather than an hour as being a big product feature. I want to reflect on another point you raised about a large company that has a thousand Kubernetes clusters. I recognize that this number, a thousand, is a number you picked for the purpose of example. Let’s say that this organization now learns about vClusters: are they going to have one Kubernetes cluster with a thousand vClusters? What’s the condensation factor that you would get, and what is now the criterion for deciding how many traditional clusters you need, and then the multiplier of vClusters per traditional cluster?

Lukas Gentele 00:24:18 Yeah, that’s an excellent question. I remember the early days of the container wars, and I think Mesosphere had this goal with DC/OS where they wanted to build this ‘giant machine’, I think is what they called it. So essentially wiring everything up to be one giant machine that you throw things at, which sounds amazing. I think with Kubernetes and the way things are currently set up, especially in the most sophisticated systems, even the public clouds, you still pretty much have regional Kubernetes clusters because of latency reasons. Even though it’s a distributed system, it’s really tough to run the reconciliation loop in Kubernetes, and there’s just a lot of networking going on, if you were to split that up across the entire world and build one giant cluster. I mean, again, that’s not really possible in any of the cloud providers today.

Lukas Gentele 00:25:09 But I’m not sure if that is even a desirable path, to be honest. I think what we see most people do is not have one giant cluster but have a handful of very large clusters. So instead of having 500 clusters across four cloud provider regions, you may get by with having four clusters in four cloud provider regions. Or maybe you’re saying, okay, we do want to keep prod and pre-prod completely separate, so you may have eight clusters, but not 500. And I think that reduction by a factor of like 10, 20, 30, that is what we’re looking for; we’re not looking to go from 500 to one. I think that would be very extreme for most enterprises out there, but it is certainly feasible for smaller companies. If you’re a startup and you’re currently running 10 Kubernetes clusters, I bet you will get away with one or maybe two.

Robert Blumen 00:26:08 Are the vClusters assigned a fixed amount of resources, such as memory, cores, and storage space? Or are they somewhat elastic, very elastic, or can they grow as needed based on workload?

Lukas Gentele 00:26:24 Yeah, that’s actually interesting, because thinking of the analogy to virtual machines: virtual machines you typically pre-provision, you assign a certain amount of memory to a specific virtual machine. With a virtual cluster it’s much more elastic and dynamic by default. So by default we’re just using the underlying cluster’s nodes and whatever resources are available on these nodes. And if you launch a thousand pods in your virtual cluster and your underlying cluster only has two nodes but has autoscaling enabled, you’ll see that number of nodes in the host cluster go up. And that’s kind of the beauty of Kubernetes and the elasticity that you get in the cloud. But what you can also do in the vCluster is obviously restrict the amount of resources that is allowed to be consumed by that virtual cluster. And you can also reserve certain things for a virtual cluster, but again, by default it starts completely dynamic.

Lukas Gentele 00:27:19 That’s typically where we’re coming from. And then you are optimizing towards, like, okay, let me set limits, and we do obviously recommend setting limits for certain things to ensure that one of your tenants is not going completely rogue and putting a lot of strain on that cluster or taking all the resources away. Especially if you don’t have a cluster that auto-scales, you’re in the private cloud, or you don’t have an autoscaler enabled, you obviously need to manage who consumes these resources to ensure a certain fairness amongst your tenants.
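One standard way to enforce that kind of fairness is an ordinary Kubernetes ResourceQuota on the host Namespace that the vCluster and all of its synced pods run in; a minimal sketch with made-up names and numbers:

    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: tenant-a-quota
      namespace: vcluster-tenant-a   # the host namespace hosting this vCluster
    spec:
      hard:
        requests.cpu: "8"
        requests.memory: 16Gi
        limits.cpu: "16"
        limits.memory: 32Gi
        persistentvolumeclaims: "20"

Because every pod from the vCluster lands in that one host Namespace, the quota effectively caps the whole tenant.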

Robert Blumen 00:27:48 Do you have any stories that revolve around a multi-tenant system where the tenants either did or did not have resource constraints on the vClusters, and what happened?

Lukas Gentele 00:27:59 Certainly. That was the case when you’re thinking about SQLite, for example, as the backing store for vCluster; we had a whole bunch of those. When we started, K3s was the default. In the vCluster you can run multiple distros as well: vanilla Kubernetes, k0s, K3s. We started with K3s and with SQLite as the default, which is a very, very lightweight setup. And then we saw a couple of people really excited about vCluster who put us in production right away, within the first year of launching the open-source project. Really brave pioneers, right? I probably wouldn’t have had the guts to put vCluster in production at that point. Now, obviously, for sure it’s possible, but three years ago that was a little bit of a scary thought. And we saw a couple of KubeCon talks with people going out there and saying, we have these 80 virtual clusters in production and our customers are running on these virtual clusters.

Lukas Gentele 00:28:52 And we were always seeking out these people, obviously, for their experiences, for their stories, but also to make sure that they know us and know who to call, because we appreciated them pioneering putting vCluster in production. And there’s one particular incident; I’m not going to disclose the name obviously, but they became a paying customer later on, and they are running over 400 Kubernetes virtual clusters today. But back then they had maybe 40, 50, and they were running them with SQLite, and then they hit us up at one point and they’re like, our biggest customer, which obviously has the biggest load on their virtual cluster, is really seeing degraded performance in their virtual cluster. And then we asked them, how is your setup, can you explain a little bit more? And they were like, yeah, it’s all the standard, the default.

Lukas Gentele 00:29:41 And we’re like, oh wow. So it’s SQLite. It’s a single-file database, so no wonder the performance is going to get degraded as more API requests come into that Kubernetes API. And then we helped them. Actually, we have a feature called embedded etcd, which converts the SQLite into an etcd cluster that runs inside the pod of the vCluster and is horizontally scalable with the number of pods that you give to this vCluster. And that’s a beautiful way for them to go from SQLite, which is not scalable at all, to a one-node etcd cluster, and then to scale it to a multi-node etcd cluster. And they started rolling that out to actually solve these problems. It’s pretty fascinating, but we’ve seen these kinds of war stories.
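The embedded etcd option mentioned here is a configuration switch in vcluster.yaml; roughly along these lines (key names are illustrative and vary by version, so treat this as a sketch and check the docs):

    # vcluster.yaml (sketch): move from the default SQLite to embedded etcd,
    # then scale the control plane to several replicas for a multi-node etcd
    controlPlane:
      backingStore:
        etcd:
          embedded:
            enabled: true
      statefulSet:
        highAvailability:
          replicas: 3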

Robert Blumen 00:30:27 We’ve talked quite a lot about the architecture of the vCluster and what parts are shared and what parts are not. And then you also mentioned that it’s deployed with a Helm chart. A part we haven’t covered is that vCluster is a Kubernetes operator, and operators are not something we’ve covered yet on SE Radio. Maybe we could start with the background: what is a Kubernetes operator?

Lukas Gentele 00:30:53 Yeah, so Kubernetes operators essentially control the objects that get created from custom resource definitions in Kubernetes. In Kubernetes you really describe your desired state, and then the cluster figures out how to get there, what changes need to be made to your infrastructure, to your configuration, to your containers, in order to achieve that desired state. There are a lot of controllers in Kubernetes; that’s a central piece of an operator. Like the replica controller, for example: you say I want three replicas of this pod, which is a statement, and then the replica controller has to create three pods, and that achieves your desired state. Operators and controllers in Kubernetes do the same thing. Typically when you write them, you add custom resources to Kubernetes, then you describe what the desired state of these resources should be, and then a particular controller essentially achieves that state.

Lukas Gentele 00:31:49 For vClusters, actually, when you look at our simplest path, completely in the open-source project, there’s actually no operator involved. It’s really just a Helm chart that creates a stateful set or deployment with the vCluster pod definition inside of it, and the regular Kubernetes replica controller, et cetera, takes care of it. But in our commercial product we obviously have an operator, and we have a CRD for virtual clusters, et cetera, which makes it easier to describe that desired state: do you want a virtual cluster with etcd backing, or connected to an RDS database to store its state there so you don’t have to worry about backing an etcd cluster up, those kinds of things. It makes it much easier if you have a large fleet of virtual clusters. And one interesting thing we do as well, because we acknowledge a lot of people start with virtual clusters in the open-source and they basically do the Helm install, no CRDs involved, just in a single Namespace, here’s a deployment, here’s a stateful set, et cetera. We have a way, we call it externally deployed virtual clusters; we just announced that last week. It’s a very new feature, and it allows you to add these open-source virtual clusters to the control plane and create the CRDs. So a little bit the other way around: I have a state, and then I create the desired state, and then obviously they immediately match once that happens.

Robert Blumen 00:33:08 The last part I did not understand. So could you go through that again a bit slower and I’ll try to ask questions along the way if I don’t get it?

Lukas Gentele 00:33:17 Yeah, absolutely. So if you have 100 virtual clusters today and you just deployed them with the Helm chart, you just have a hundred stateful sets that create a hundred virtual cluster pods. You have no CRD called virtual cluster involved yet, but you want it, because you want to make that move to the commercial product for all the benefits you get there, like sleep mode, the fleet management, you want to use the UI to manage the virtual clusters. What you can do now with this feature called externally deployed virtual clusters, which we added in the newest version of our platform, is essentially point the platform at these hundred virtual clusters, and they kind of get imported. And what we do is we create the custom resources based on what’s already in your cluster. Typically in Kubernetes, the loop works the other way around. It’s kind of like, here’s my desired state, and then turn it into what actually happens in the cluster. We do it the other way around just to make it easier for people that come entirely from the open-source to explore the commercial option as well.

Robert Blumen 00:34:21 I got it. Then the migration path is from the non-operator version to the operator version, sort of like doing a Terraform import, something like that.

Lukas Gentele 00:34:33 Exactly. That’s exactly, a perfect analogy. Yes.

Robert Blumen 00:34:36 Since it’s possible to do this, I would guess there are other people who figured out the same principle as Loft Labs. Are there other vendors in this space, or other open-source projects, in the general space of running Kubernetes on Kubernetes?

Lukas Gentele 00:34:54 Yes, definitely. So when we actually got started, one of the inspirations for vCluster was a project called k3v, which is what Darren Shepherd, the CTO and founder of Rancher, put out as, I think, a weekend project. He put it on GitHub and said, this is how you could put K3s into a pod and run it on top of another Kubernetes cluster. And then we saw that and we were like, hmm, how about if we took this all the way? I think he went 1% of the way, and we were like, what if we went all the way and programmed all the rest that is necessary. This project at that time was about a year old; I think it wasn’t really working anymore, unfortunately. But the idea was fascinating, and I think other people got started around the same time too.

Lukas Gentele 00:35:38 There was an effort as part of the multi-tenancy working group in Kubernetes: they built something called hierarchical Namespaces in Kubernetes, and they also built something that they called Cluster API Nested as part of Cluster API. And both of these efforts show exactly the same need: you have multiple tenants, they need multiple Namespaces, how do you arrange that? I think neither of these projects is super active anymore at this point, but there have been other efforts coming up. For example, Red Hat launched a project that they call hosted control planes, where they run a control plane inside a container, so it’s essentially Kubernetes in Kubernetes as well. Then there’s Kamaji, which is pretty similar to hosted control planes; it’s another open-source project that also launches a Kubernetes control plane in a container. But what none of them do is reuse the underlying host cluster’s nodes.

Lukas Gentele 00:36:34 For any of these control planes you’re launching, you will have to attach dedicated nodes to them, versus with the virtual cluster you do not have to do that, because we swap out the scheduler with our own component. We call it the syncer. And what the syncer does is it doesn’t schedule to nodes like a regular scheduler would do, so it doesn’t need to know about nodes. Instead, it synchronizes the state of a pod from the virtual cluster into the host cluster, and the status, for example ImagePullBackOff and any kind of events, et cetera, it syncs those pieces of information back into the virtual cluster. That’s what the syncer does, and that’s really, I guess, the magic sauce and the really great idea about the vCluster that nobody else is pursuing. Everybody else runs a control plane inside a pod and then has you attach dedicated nodes to that control plane.

Lukas Gentele 00:37:31 Which is still a great benefit, because you don’t have to run so many control planes. You save a lot of nodes just for control planes, or you save the fees attached to spinning up clusters in a public cloud; you save a lot of that headache with this approach, which is great. And especially in the private cloud it is a huge step forward, where you don’t have so many dedicated control plane nodes available; you have one control plane cluster with a couple of nodes, and then you have all your worker nodes separate. But with the vCluster we take it one step further. So I would say, use-case wise, we’re really optimized for that multi-tenancy inside a truly shared cluster rather than just hosting control planes.

Robert Blumen 00:38:14 In the model where you attach the nodes to the Kubernetes vCluster or cluster within a cluster, then did I understand you would run a scheduler within the nested cluster and it would schedule on the attached nodes?

Lukas Gentele 00:38:32 Yes, that is true. So what we do is the syncer essentially takes care of that, so you can essentially tell us: should this pod go down the regular sync flow, where it goes down to your host cluster and your host cluster’s scheduler takes care of scheduling it to a node, or you can say, hey, this particular pod, I want it to be on one of my dedicated nodes, and then a scheduler would schedule it. So in that case you actually have a scheduler and a syncer running in the same virtual cluster.

Robert Blumen 00:39:04 Okay. You did say earlier that you optionally may attach nodes to the vCluster in your model, but it’s not required as it is in some other product.

Lukas Gentele 00:39:17 Yeah, that’s correct. It’s for that scenario. I think there are three modes. It’s either completely separate nodes for each control plane, which is what Kamaji, hosted control planes, et cetera allow you, and frankly a regular Kubernetes cluster as well: dedicated nodes. Then you have the other extreme, completely shared nodes, which is what the vCluster does by default with the syncer, where the pods just get synced and the underlying cluster takes care of them. And then the vCluster allows you this beautiful thing in the middle as well, which is a little bit of a hybrid approach, to say, hey, this is what I want dedicated and this is what I want shared.
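A rough sketch of what that hybrid configuration can look like in vcluster.yaml; the keys below are illustrative placeholders rather than the authoritative option names, so consult the vCluster docs for the exact syntax:

    # vcluster.yaml (illustrative keys only)
    sync:
      fromHost:
        nodes:
          enabled: true          # expose a subset of host nodes inside the vCluster
          selector:
            labels:
              tenant: team-a     # only nodes carrying this label
    controlPlane:
      advanced:
        virtualScheduler:
          enabled: true          # let the vCluster schedule onto those nodes itself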

Robert Blumen 00:39:54 Okay. And then the syncer, if I understand it, observes the scheduling that is taking place on the host cluster and subscribes or receives events or in some other way understands where it’s been scheduled, so then it can provide that information back to the control plane of the nested cluster, which needs to know where the service is running.

Lukas Gentele 00:40:18 That is correct, yes. So the virtual cluster essentially monitors the underlying pod, and in Kubernetes there are typically two parts of an object, or maybe three. There’s the metadata, which is the thing I first forgot. And then there’s the spec, where you describe your desired state, and then there’s a status, which typically the controllers use to write down things like which node this pod got scheduled on and whether the container started successfully. That’s where you also see the relation to events in Kubernetes. Every time a container starts or crashes or something gets rescheduled, Kubernetes and a lot of other controllers emit events; we do that too in our commercial product. So we also subscribe to the events that get emitted for that particular pod. And that way we can essentially import these things into the virtual cluster again so that you get the observability. Because let’s think of the scenario: I’m launching a pod and you have the classical scenario, ImagePullBackOff, because you didn’t give it the credentials for your private registry or something like that.

Lukas Gentele 00:41:26 Or you mistyped them, or anything went wrong in that process. Then you need to see that that image couldn’t be pulled. And you also need to see when that container maybe started but then crashes five minutes later because, I don’t know, it could be an OOM kill or something like that, because it has a memory leak. There are so many things that could happen. You just need to be able to see that in the virtual cluster the same way as you would see it if you had your own cluster. That’s overall our goal. We essentially want to make sure that when an organization or a platform team hands out virtual clusters to their tenants, these tenants don’t even realize that they’ve gotten a virtual cluster. The same way as if I give you an EC2 instance, you may not even realize you don’t have a bare metal server.

Lukas Gentele 00:42:12 That transition should be relatively seamless unless you do something very, very specific deep down with the hardware. In that 1% edge case you definitely notice, but in 99% of cases you won’t notice whether you are in a VM or on an actual physical server. The same goes for the virtual cluster, and that’s why, with every release we’re doing, we’re passing the CNCF’s conformance tests for Kubernetes. That means we’re a certified Kubernetes distro, and that’s very important for us to be able to communicate to our user base, because it creates a lot of trust: hey, I can take my CI pipeline that previously deployed to a real cluster, point it at a virtual cluster instead, and nothing breaks. And that is the goal. You shouldn’t have to re-architect your application because you’re now using a virtual cluster.

Robert Blumen 00:43:02 I have read that the thing which constrains how large a Kubernetes cluster can get is etcd; everything else is pretty much stateless. But etcd has a single leader, and all writes have to go through the leader. You can only write so fast onto a storage device, so that ends up being the thing which constrains how big a cluster can get. I read something while I was researching this episode that said this may be helped by vClusters, because you may be able to offload some of the writes onto the storage of the vCluster. But now, in light of our past few minutes’ discussion about how some of the activity is actually done on the host cluster, what is your view on whether these vClusters can truly increase the scale of a host cluster?

Lukas Gentele 00:44:00 That’s an awesome question. Yeah. When you talk about a large Kubernetes cluster, definitely: the API server, the etcd, there are so many things that are under heavy load at a certain point. And when you think of the virtual cluster, the load is now first on the virtual cluster instead of the underlying host cluster. The beauty of that is, despite the syncing that happens, the syncing happens only where it’s necessary. So let’s say you have a lot of controllers and a lot of CRDs. Think of a controller like cert-manager that provisions certificates. Does that launch pods? Does that need networking? Not really. It is a lot of objects, a lot of certificate objects and certificate request objects, and there are so many requests going to the Kubernetes API by that controller’s watchers.

Lukas Gentele 00:44:57 Every time a certificate gets created or deleted, something happens in the Kubernetes API. But we don’t sync any of that; that stays within your virtual cluster. Only when you launch a pod, where we actually say a container needs to be launched on a node, that’s when we need to use the syncer. And the beauty is most of the requests in a Kubernetes cluster are not pod-related requests. Even when you think of what created that pod: what created that pod is, let’s say, a deployment with a replica number of four. What happens is first you have a kubectl request to create that deployment, then you have the controller manager creating four pods, which is four more requests, and then you actually have the launching of these pods. For us, we only have to sync the pods. So you’re saving the creation of the deployment and a lot of other things that are higher-level resources, and all the CRDs; a lot of CRDs ultimately create a pod, but what happens beforehand, that CRD interaction, et cetera, doesn’t need to be synced. Just the pod needs to be synced and kept in sync. That means the underlying cluster is going to see a lot fewer requests. And de facto, that actually leads to an effective sharding of the Kubernetes cluster and makes the cluster more scalable than it would be without that layer of the virtual cluster on top.

Robert Blumen 00:46:19 Would the vCluster typically take as much trouble to be highly available as the host cluster? And if you lose a vCluster, you have your persistence, can you recover it straightforwardly?

Lukas Gentele 00:46:34 Yeah, I think we’ve got to differentiate between the public and private cloud here. If you are in a private cloud, you obviously have a lot more responsibilities on your own. But if you’re in the public cloud, you have it much, much easier with a vCluster, where you can essentially say, hey, let me make this vCluster highly available by offloading its state to something like RDS in AWS, or any kind of hosted MySQL or Postgres database. So you don’t even need to use an etcd cluster that you spin up yourself, or SQLite or embedded etcd, which is also something you need to back up. But if you put it in RDS, well, AWS is going to maintain that for you and make it super highly scalable. And that relates to your point earlier about the giant machine across the world.

Lukas Gentele 00:47:22 In terms of having just one Kubernetes cluster, the beauty of using things like global RDS is that global databases are pretty much a thing already in the cloud providers. If you’re using global RDS as the storage backend for a vCluster, you can move a vCluster from one cluster to another cluster, which is actually really interesting for things like failover scenarios. The only thing, admittedly, I have to mention here is the persistent volumes: they still live in that cluster. If the persistent volumes just get created for caching or other reasons, then that state can be recreated; sometimes persistent volumes are just there to survive container restarts for a certain amount of state. But if you have really important state attached that cannot be recreated, then obviously that’s a separate migration process.

Lukas Gentele 00:48:14 However, we are actually working on something we call snapshotting of vClusters, kind of like snapshotting a VM, which will even allow you to snapshot the persistent volumes that are attached to a vCluster. So it’s a pretty interesting model. If we’re talking about the private cloud, then HA and resilience for real clusters is obviously a huge burden for people, and vClusters may make it a little bit easier because you have fewer of these real clusters to manage. But you do have to worry about the state of these vClusters, and we help you with our commercial solutions. And if you have ways to use managed databases, or you’re much more comfortable with running MySQL and Postgres databases, et cetera, relational databases, than you are with maintaining and running an etcd cluster (which admittedly a lot of people are, because we’ve been doing relational databases since forever, I feel like, so a lot of IT teams have really resilient frameworks for running relational databases), then it actually becomes a lot easier than running a real cluster.
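A sketch of pointing a vCluster’s state at an external managed database, as described above (the endpoint is a made-up placeholder and the keys are version-dependent; verify against the docs before using):

    # vcluster.yaml (sketch)
    controlPlane:
      backingStore:
        database:
          external:
            enabled: true
            # hypothetical RDS endpoint; any MySQL- or Postgres-compatible DSN
            dataSource: "mysql://vcluster:PASSWORD@tcp(my-db.example.us-east-1.rds.amazonaws.com:3306)/vcluster"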

Robert Blumen 00:49:20 One other area I was unclear on while researching this, before we reach the end of our time: suppose in a vCluster you want to run a workload and put it on the public internet. I understand from the earlier discussion that the vCluster shares the host cluster’s network. So would you have to add an ingress object and whatever other networking resources to get from the host cluster to the public internet? What are the building blocks to get a vCluster service onto the public internet?

Lukas Gentele 00:49:57 Yeah, there are roughly two routes, and it’s the same two routes as with pretty much any other CRD and controller in Kubernetes: you can always say, I want this entirely separate for this one vCluster, or I want this shared with other vClusters. So let’s say you want it completely inside the vCluster and you want to create an ingress: then you can essentially launch an ingress controller inside the virtual cluster. Now, I do have to mention that would mean you have to allow the vCluster to provision a load balancer, which is sometimes not desired, because each load balancer has a certain fee attached to it. But then you essentially have your Nginx running for that vCluster separately, and now you can create ingresses in the vCluster. However, the more popular approach is actually what we call the shared platform stack.

Lukas Gentele 00:50:51 Certain components you want to run in the host cluster: for example, an ingress controller, maybe an Istio service mesh, a very popular one, maybe something like Open Policy Agent as well, a lot of security monitoring, logging, Prometheus for example, where you’re saying, hey, this should be used across all virtual clusters. That means you can run it in the host cluster, and what you can enable in the vCluster config, similar to pod syncing, is syncing for that particular resource as well. So let’s say you want to sync ingresses because you want to have the Nginx ingress controller and maybe cert-manager run in the underlying cluster, so you get automatic certificate provisioning and you automatically have ingress in the underlying cluster. The only thing you need to do in the vCluster, to make it easier for your tenants and let them self-serve, is this:

Lukas Gentele 00:51:44 You allow them to sync ingresses. That means when a tenant creates an ingress, that ingress gets synced down, similar to a pod, and then the underlying cluster’s Nginx ingress controller is going to handle that ingress and cert-manager is going to add a certificate for it, et cetera. And you’ll see all of that status back in the virtual cluster. And that’s how you would expose services running inside the vCluster to the public internet. Obviously that saves a lot of resources, a lot of load balancer resources, et cetera, compared to running separate clusters and having 500 separate load balancers and Nginx ingress controllers running. And you can, again, mix and match. So let’s say for 499 of your vClusters it’s totally fine to use the shared one, but then one of your teams wants to test the bleeding-edge version of Nginx: you allow them to provision one load balancer and they run their own ingress controller. That’s really the beauty of it. You can mix and match these approaches.
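The shared-ingress setup walked through here corresponds roughly to enabling ingress syncing in vcluster.yaml (a sketch; the exact keys depend on the vCluster version):

    # vcluster.yaml (sketch): tenants create Ingress objects inside the vCluster;
    # they are synced to the host cluster, where a shared NGINX ingress controller
    # and cert-manager handle routing and certificates
    sync:
      toHost:
        ingresses:
          enabled: true

A tenant then creates an ordinary Ingress inside their vCluster, and the resulting status, such as the load balancer address, is synced back up for them to see.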

Robert Blumen 00:52:44 I get that you have a lot of flexibility, and the more that you share things across 500 vClusters, the more resource savings you have. In fairness, I suggest that would reduce the benefit of isolation, which is one of the benefits of vClusters. The more that you share, the less isolation you have. Is that fair?

Lukas Gentele 00:53:06 Yeah, I would say that’s fair. Yeah. If you start with a vCluster and, for example, all of the nodes are completely shared, there are things like container breakouts that you have to think about. And that may be totally okay for pre-production and test and dev environments. It’s very unlikely that there’s a malicious actor; everything is behind your VPC, within your company’s network, and it may be totally okay to share a node. But when you’re thinking about production instances, you may want to actually give your customers dedicated nodes or specific dedicated controllers. There’s obviously an evaluation that has to be done for each individual use case and for each individual controller of what you want to share and what you don’t. With regards to the nodes, that’s a question we get a lot.

Lukas Gentele 00:53:54 For these production scenarios, there are things you can do, though. There are technologies like Kata Containers and Firecracker, et cetera, technologies that let you provision micro VMs or essentially more hardened containers, to make it much easier to go with this shared model. Or even things like GKE Autopilot, where you don’t even see any nodes anymore. Or you can run EKS on top of Fargate, which is an option that people don’t even know about sometimes. And you could say this Fargate path is great for hosting our application, but then we’re going to launch pods from there in another Kubernetes cluster that essentially has regular nodes. So there’s a lot of flexibility, and it really depends on your particular use case, but you’re definitely right: the more you share, the blurrier the lines get in terms of isolation, of course.

Robert Blumen 00:54:46 We’re close to end of time. Were there any key takeaways about vClusters we haven’t covered that you want the listeners to know about?

Lukas Gentele 00:54:54 I would say just try it out. It’s really easy to get started with. I think sometimes we talk about very complex things in this show, Robert. I think you asked some really good questions and we went pretty deep on a lot of topics, but anyone who’s listening to this, don’t let that scare you off. Getting started with it is really easy. You download the CLI, you run vcluster create, and it spins up a virtual cluster in a Namespace of your Docker Desktop cluster or your minikube or whatever you have running locally. And suddenly you can spin up 20 virtual clusters; you have 20 clusters running now in your local home lab, something that probably wasn’t possible beforehand. You can group things by project, you can run one per pull request as a preview environment.

Lukas Gentele 00:55:38 There are so many interesting things you can do with vCluster, and the barrier to entry is really low. It’s completely open-source. It takes one command to spin one up, and it takes, as we said earlier, like six, seven seconds for it to be ready. So yeah, you can obviously dive very, very deep into the architecture and the underlying infrastructure, and I encourage everybody to look at the docs if they want to know the specifics of any of these topics. But getting started is super easy. And maybe one more shout-out: if you’re interested in joining the community, we have a Slack community with about 3,500 members. So just head to vCluster.com and hit the join-us-on-Slack button, and I hope to see a lot of you there.

Robert Blumen 00:56:19 Lukas, would you like to point listeners anywhere on the internet? Either you or Loft Labs?

Lukas Gentele 00:56:27 Yeah, you can definitely find me on LinkedIn, on X, and the usual social media channels. Feel free to connect; I’m pretty approachable. I try to get back to everybody, and yeah, just reach out. Other than that, you’ll obviously find the vCluster open-source project and also our other open-source projects, which are definitely worth checking out. DevPod, for example, a project that we launched last year, is very popular. It’s like a GitHub Codespaces alternative: if you want to run something like Codespaces, but you want to maybe run it with GitLab, or run it in your private cloud, or run it in AWS, those are things that are possible with DevPod. It’s a very exciting project as well. You’ll find all of that on our GitHub. So just check out the loft-sh GitHub organization and you’ll see all the repositories there; there are quite a few more than the couple I just mentioned, DevPod and vCluster.

Robert Blumen 00:57:18 We’ll put that all in the show notes. Now we are at the end of our time. Lukas, I want to thank you for joining Software Engineering Radio.

Lukas Gentele 00:57:26 Thank you so much for having me. This was fun.

Robert Blumen 00:57:29 This has been Robert Blumen and thank you for listening.

[End of Audio]
