John Purrier talks with Jeff Meyerson about OpenStack, an open-source cloud operating system for managing compute resources. They explore infrastructure-as-a-service, platform-as-a-service, virtualization, containers, and the future of systems development and management. Cloud service providers like Amazon, Google, and Microsoft provide both infrastructure-as-a-service and platform-as-a-service. Infrastructure-as-a-service gives developers access to virtual machines, servers, and network infrastructure. Platform-as-a-service is the software that runs on top of that infrastructure, such as Amazon DynamoDB, Microsoft Azure Machine Learning, and Google App Engine.
John’s Twitter account: @johnpur
OpenStack homepage, https://www.openstack.org/
Wikipedia article on virtualization, https://en.wikipedia.org/wiki/Virtualization
Wikipedia article on hypervisor, https://en.wikipedia.org/wiki/Hypervisor
Wikipedia article on job schedulers, https://en.wikipedia.org/wiki/Job_scheduler
Article on the Message Bus pattern, http://www.enterpriseintegrationpatterns.com/patterns/messaging/MessageBus.html
Transcript brought to you by innoQ
This is Software Engineering Radio, the podcast for professional developers, on the web at SE-Radio.net. SE-Radio brings you relevant and detailed discussions of software engineering topics at least once a month. SE-Radio is brought to you by IEEE Software Magazine, online at computer.org/software.
* * *
Jeff Meyerson: [00:00:38.03] OpenStack is an open source cloud operating system. John Purrier is a founder of OpenStack and the CTO of Automic Software. As Automic’s Chief Technology Officer, John is responsible for driving the organization’s automation strategy.
With more than 20 years of IT industry leadership experience, John was most recently CTO of CenturyLink Innovations Lab, focusing on new and emerging cloud technologies, including multi-cloud interoperability and DevOps. Previously, John was strategic and technology leader at AppFog, and he led the development and delivery of the first three releases of Microsoft Exchange Server. He also led R&D for the Rackspace cloud.
John Purrier is a founder and board member of OpenStack. John, welcome to Software Engineering Radio!
John Purrier: [00:01:23.26] Thanks for having me, Jeff.
Jeff Meyerson: [00:01:25.00] What is OpenStack?
John Purrier: [00:01:27.07] As you said, OpenStack is an open source project; it’s a cloud operating system. Originally, the idea behind OpenStack was to create an open source Infrastructure as a Service (IaaS) platform. The original founders were Rackspace and NASA, and the two original projects were a compute project and an object storage project.
Since then, in the last six years, the number of projects and their scope have grown well beyond that original vision.
Jeff Meyerson: [00:02:07.27] To level set for some of our listeners who may not know the term, what is Infrastructure as a Service?
John Purrier: [00:02:11.27] The way to think about Infrastructure as a Service is instead of going out and buying your own computers, finding a building, getting a lease, hooking up power and air conditioning and running a data center, you’re essentially leasing that from somebody else.
While you can have projects that are private Infrastructure as a Service, when we talk about IaaS, most people are referring to public clouds like Amazon, Azure or Google Cloud. The simple way of thinking about it is I’m essentially renting servers from somebody else.
Jeff Meyerson: [00:02:53.20] We have these public clouds (Amazon, Google, Microsoft Azure), why do we need an open source cloud operating system?
John Purrier: [00:03:08.00] That’s a great question. I guess the question could be, “Why do we need any open source versions of proprietary implementations?” I would argue that, first of all, it’s just good for the industry and the ecosystem, but in particular around cloud. Cloud is very collaborative. It’s a coming together of a lot of different companies and organizations, and the ability to collaboratively build a large project like this is very powerful.
[00:03:47.02] Very few companies in the world could launch and sustain a project that’s as large as OpenStack is today. The last I saw, there were over 30,000 members of the OpenStack Foundation contributing toward the betterment of the software, and that’s a great thing.
Jeff Meyerson: [00:04:08.28] So it is this open source Infrastructure as a Service operating system. Could you talk about an example of a company that uses OpenStack today, so we have a prototypical use case that we can refer back to during this show if we need to?
John Purrier: [00:04:27.15] Yes, there are several. When I say several, I mean hundreds of companies that are using OpenStack today in one form or another. A couple that are poster children – one on the commercial side is Wal-Mart. Wal-Mart is using OpenStack inside their data centers as they build out their e-commerce systems.
[00:04:53.21] The other one is CERN, the research laboratory that runs the Large Hadron Collider in Switzerland. For anybody who’s been to any of the recent OpenStack design summits, both of these companies have come forward and talked about their use cases and demonstrated how they’re doing large scale deployments of OpenStack.
Jeff Meyerson: [00:05:21.09] Do these organizations have their own servers and they run OpenStack on top of it, or are they leasing cloud hosting that they put OpenStack on?
John Purrier: [00:05:34.11] It’s a great question. The answer is, depending upon the organization, they may do either of those or both. The typical installation when you think of OpenStack is an on-premises cloud, a private cloud. I have my own data center, and I’m probably virtualized. Over the last ten years or so I virtualized using VMware or other virtualization technologies, and I really want to provide more service, more automation and more responsiveness to my internal customers. So cloud really is an automation layer over the top of virtualization. It allows me to control on-demand spin up of resources, whether they be virtual machines, storage systems, networks etc.
[00:06:33.08] When you think of OpenStack, most people are looking at the use case where it’s used inside of a data center. Also, with no change in model, it can be managed by somebody else. If I was an enterprise and didn’t want to run my own data centers but I contracted one of the managed hosting companies and said, “Please run my infrastructure on my hardware networks”, OpenStack is also appropriate in that case.
[00:07:07.02] Then there are some cloud service providers that run OpenStack and are competing in the public cloud space against Amazon, Microsoft, Google etc.
Jeff Meyerson: [00:07:24.05] We should get into how OpenStack works. We have a higher level perspective for some of the use cases and how it’s deployed at a high level. You mentioned the term virtualization – what is virtualization?
John Purrier: [00:07:39.10] Back in the old days you used to buy a computer, you would put an operating system on it, and then you would run one or many applications on that operating system. About 15 years ago, this idea of virtualizing the servers – which really grew out of techniques that had been around for a long time in the mainframe, in the minicomputer world – was brought to PC-level servers. The idea behind virtualization is I can take a single computer, I can put a hypervisor on it, and what the hypervisor does is it abstracts the underlying hardware, and then it presents the ability to run multiple stacks of software, from the operating system up, independently.
[00:08:38.23] Essentially, I can take my one physical computer and make it look like two, four or eight virtual computers. If I’m a user of those computers, they don’t look any different than if I was talking to a box that I could physically touch.
Jeff Meyerson: [00:08:58.09] What is the relationship between the hypervisor and the hardware that it sits on top of?
John Purrier: [00:09:05.01] The hypervisor is a layer of software that knows what the physical hardware underneath looks like, whether it be the CPU, the storage system or the network. Then it creates driver-level endpoints that can be accessed by a variety of different virtual machines.
Jeff Meyerson: [00:09:50.10] Just to drive home what the hypervisor is a little more — you just referred to the relationship between the hypervisor and the hardware below it… Talk a little more about the relationship between the hypervisor and the virtual machines that it manages.
John Purrier: [00:10:07.22] The hypervisor allows you to spin up virtual machines. Those virtual machines look and act just like physical machines. You have virtualized drivers that provide the same services that a physical machine would in terms of access to the CPUs, data storage, networks etc. It really is a translation layer between this virtualized environment and the physical hardware, and it arbitrates requests from a variety of different virtual machines, down to the physical hardware.
Jeff Meyerson: [00:10:50.20] Perfect. Now that we have an idea of hypervisors, virtual machines, the hardware that all of this is sitting on top of, OpenStack takes a collection of hypervisors that are spread across a data center (or across multiple data centers) and it turns this collection of hypervisors into a shared pool of resources. Why is this useful?
John Purrier: [00:11:13.27] The way you described it was very good. A cloud really is an automation layer over virtual machines. If you take a look at the history of data centers in the enterprise, we started off 20 years ago with client-server architectures. Then about 15 years ago we had the rise of virtualization, and pretty much over the last ten years or so, most serious computation shops have virtualized. That didn’t solve the problem in terms of delivery to the organization.
[00:11:57.02] The process that enterprises had for a developer requesting a virtual machine – or a developer requesting a group of machines – was still back in the “Submit a ticket. Have somebody go into a console/command line. Create the resources. Update the ticket. Send e-mail back” world. This could take several days.
About ten years ago, Amazon started Amazon Web Services. The key service at the time was EC2. EC2 essentially was an automated way of getting virtual machines instantly.
[00:12:44.10] This was very revolutionary. I could take my credit card, I could swipe it and they could charge me, and immediately I could say, “I want ten virtual machines, I want them to have this much memory, this much disk… Go!” Within three minutes or so, those machines would be up and running and I could use them.
Inside the enterprise it’s taking longer to get to that, but what OpenStack and what cloud software does is it provides that level of automation over the top of virtualization. It’s really a manager and orchestrator of virtual machines and resources.
Jeff Meyerson: [00:13:28.11] That’s a great description. To reiterate what you just said, with this shared pool of resources you get this big pool, and there’s an interface on top of it that allows the developer to not have to worry about what’s going on. The developer says, “I want to spin up some amount of compute, and I don’t want to have to worry about a lot of this configuration beyond the minimum that I should have to worry about as a developer. I can just specify what I want, and this huge pool of resources will figure out under the hood what to give me in actuality. I will get a virtualized, perfect plate of resources that I need.”
John Purrier: [00:14:24.03] Yes, that’s a really good way of looking at it. As a developer – if you go back two generations – I used to have to go buy a physical server, unbox it, plug it in, put the operating system on it, put my tools on it and find a place under my desk. When we moved into the virtual world, all that got taken care of for me. I still had a one-to-one relationship with the virtual machine. With cloud, I just ask for the resources that I want, and I don’t care if it’s on machine 1 or machine 1,000 inside of the data center. As a developer, I just want access to the resources. We’ve really abstracted away the developer having to care about a lot of the provisioning pieces.
Jeff Meyerson: [00:15:20.21] To put a final point on this, the shared pool of hypervisors that is managed by OpenStack – these hypervisors can be of different types. They can be Xen, KVM, VMware or Windows Server, but the user interacts with those hypervisors through a consistent interface of OpenStack. Why is this advantageous?
John Purrier: [00:15:46.28] Because in the real world you don’t necessarily have a homogeneous set of anything. You may have physical computers from a variety of different sources, the original provisioning may be different, you may actually have them on different types of networks… The real world is messy, and what OpenStack does is it provides an abstracted interface over the top of the messiness and then manages the complexity underneath it, so that as a developer, an operator or sysadmin of the system, I don’t really need to worry about that, the system will take care of it for me.
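The idea of one consistent interface over different hypervisors can be sketched in a few lines of Python. This is only an illustration, not OpenStack code: the class names, the `spawn` method and the returned strings are all hypothetical, and real drivers would talk to the hypervisor instead of formatting a string.

```python
from abc import ABC, abstractmethod


class ComputeDriver(ABC):
    """Hypothetical driver interface (illustrative names, not OpenStack's API)."""

    @abstractmethod
    def spawn(self, name: str, memory_mb: int) -> str:
        """Start a machine and return an identifier for it."""


class KvmDriver(ComputeDriver):
    def spawn(self, name: str, memory_mb: int) -> str:
        # A real driver would call the hypervisor here.
        return f"kvm:{name}:{memory_mb}MB"


class VmwareDriver(ComputeDriver):
    def spawn(self, name: str, memory_mb: int) -> str:
        return f"vmware:{name}:{memory_mb}MB"


def boot_everywhere(drivers, name, memory_mb):
    """The caller uses one interface and never sees which hypervisor is underneath."""
    return [driver.spawn(name, memory_mb) for driver in drivers]
```

The caller only ever sees `spawn`, so adding another hypervisor type means adding another driver class, not changing the code that requests machines.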
Jeff Meyerson: [00:16:35.10] Do OpenStack users in practice tend to have a heterogeneity of these different server types, like KVM, VMware and Windows Server, all within the same cluster?
[00:16:48.12] When a hypervisor is sitting under this consistent layer of OpenStack along with other hypervisors, are there any particular specs that the hypervisor has to adhere to? In terms of storage, CPU or other characteristics.
John Purrier: [00:17:07.27] It’s a good question. I don’t know if it’s really relevant. Yes, there’s probably some minimum amount of things that you have to have, but if you take a look at any modern hypervisor or virtualization system, whether it be VMware, Hyper-V, or whatever it is, they are all good enough to be driven by OpenStack. OpenStack can also drive bare metal boxes, as well. A bare metal box is a box without a hypervisor.
Jeff Meyerson: [00:17:53.26] To ask an absurdist question, what would keep me from running OpenStack across a huge cluster of Raspberry Pi’s?
John Purrier: [00:18:07.22] You absolutely could do that. If I was gonna set that system up, I would not put a hypervisor on the Pi device, because it’s so small. You essentially run those as bare metal boxes, but each of those would be individually addressable compute endpoints to the OpenStack scheduler.
Jeff Meyerson: [00:18:31.25] That’s really interesting. Not to get too far off course, but do you think we could have a future where I have a data center in my closet and it’s just things that are the size of a Raspberry Pi and I can run huge MapReduce jobs on them?
John Purrier: [00:18:46.09] Absolutely. If you’re paying attention to the hotness around Docker and containers, that’s really helping to drive us towards that future. Containers are much more lightweight than virtual machines, and you can get a lot higher density inside any particular configuration.
For instance, if we took a Raspberry Pi device and we consider that a bare metal box, and then we stacked a whole bunch of Docker containers inside of it, think about the compute power you could have there, and then cluster them out underneath an orchestrator like OpenStack.
Jeff Meyerson: [00:19:34.05] This is getting pretty far removed from the nature of OpenStack discussion, but I’ve done a number of interviews with people where I’ve ended up asking the following question. When you really start to containerize your infrastructure and you’re using Docker – what is the effect of the economies of scale when we can containerize our architecture with more modern technologies, whereas maybe five or ten years ago we were doing virtualization? Do you have any statistics or numerical projections for how economical it actually is?
John Purrier: [00:20:20.02] I can answer that anecdotally. I ran the engineering team at Rackspace, and that’s where we founded OpenStack with NASA. I then went to Hewlett-Packard and stood up a public cloud based on OpenStack. My next gig was a company called AppFog, which was a Platform as a Service company based on Cloud Foundry. This is yet another level of abstraction; for a developer it’s a wonderful environment, because you just worry about the logic and the components that you want in your application, and then you push it into the system and all the operational pieces get taken care of for you.
[00:21:11.09] Under the covers of any PaaS system, they’re all containerized. If you take a look at OpenShift, CloudFoundry etc., all of these systems are running containerized systems. At AppFog, we were taking virtual machines that we were running in a variety of different public cloud infrastructures like Amazon, Rackspace, HP etc. and we were slicing those virtual machines up to run containers.
You could take an 8-gig virtual machine and run 8 to 10 different containers inside of that. So you get much better economics, you get much better density. If you have an application that’s 512 MB, you don’t need to take an entire 8-gig virtual machine just to run that; you can stuff more applications into the same space.
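The density argument is simple arithmetic, and a rough sketch makes it concrete. The function name and the 3 GB reserve below are assumptions made for illustration; they are not figures from AppFog.

```python
def max_containers(vm_memory_mb: int, container_memory_mb: int, reserved_mb: int = 0) -> int:
    """Memory-only estimate of how many containers fit in one virtual machine.

    reserved_mb is a hypothetical allowance for the guest OS and headroom.
    """
    if container_memory_mb <= 0:
        raise ValueError("container_memory_mb must be positive")
    usable_mb = max(0, vm_memory_mb - reserved_mb)
    return usable_mb // container_memory_mb


VM_MB = 8 * 1024  # the 8-gig virtual machine from the example

print(max_containers(VM_MB, 512))        # 16: upper bound if memory were the only limit
print(max_containers(VM_MB, 512, 3072))  # 10: with an assumed 3 GB held back
```

The memory-only bound is higher than the 8 to 10 containers quoted above, which is what you would expect once operating system overhead, headroom and applications larger than 512 MB are taken into account.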
[00:22:11.11] What we’re seeing is that with Docker and containerized systems it’s really an inside-out Platform as a Service. Docker Inc. basically took the containerized system inside their PaaS and they separated it out, made it standalone, put an API on top of it, and they’ve done a lot of really good things in terms of collaborative development for developers and containers.
With these containerized systems we’re now having to rebuild all the stuff that made PaaS’s operational – monitoring systems, health checks and things like that. It’s a really interesting world, but the short answer to your question is that you’re going to get 5x-10x better utilization. That means and requires that our schedulers are more intelligent, as well.
Jeff Meyerson: [00:23:14.21] Great. We’ll get into the discussion of schedulers a little bit later, but let’s zoom out to a higher level. OpenStack has a set of design tenets. The two main goals of OpenStack are scalability and elasticity. To somebody who doesn’t have a lot of experience with building Infrastructure as a Service software, scalability and elasticity may sound like the same thing. What is the difference between scalability and elasticity?
John Purrier: [00:23:53.08] That’s a great question. If you take the scaling question first, it really is the ability to get resources on demand. Let’s say I build an application and I’m selling something; I’m running and I have a certain amount of traffic to it. Suddenly, it’s Christmas time. The traffic to my website goes up 10x. I either have to have over-provisioned earlier, to account for the fact that I now have ten times the amount of traffic, or when that traffic shows up, I want to scale out – I want to add resources on demand.
The goal, the tenet of OpenStack is to be massively scalable. An OpenStack system can run hundreds of thousands of virtual machines in a cluster.
[00:24:56.06] Elasticity is very similar, in that you want to be able to grow your resource pools, but you also want to be able to shrink them. This is actually a very key part of cloud. If it was just “Add, add, add!” that would be one thing, but the fact that I can increase my resource pool on demand, I can shrink it on demand, I can do scaling both up and down is really what we mean when we say it’s elastic.
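A small sketch of the difference, using the Christmas traffic example: the function below is a hypothetical sizing rule, not an OpenStack API, and the capacity of 100 requests per second per instance is an assumed number. Scaling is the ability to reach the larger count; elasticity is that the count also comes back down.

```python
import math


def desired_instances(requests_per_sec: float, capacity_per_instance: float,
                      min_instances: int = 1) -> int:
    """Return how many instances the current load needs (hypothetical sizing rule)."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(requests_per_sec / capacity_per_instance)
    return max(min_instances, needed)


# Normal traffic, a 10x holiday spike, then back to normal.
traffic = [100, 1000, 100]
plan = [desired_instances(load, capacity_per_instance=100) for load in traffic]
print(plan)  # [1, 10, 1]: the pool grows on demand and shrinks again
```

Without elasticity, the last entry would stay at 10 and you would keep paying for the over-provisioned capacity after the spike.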
Jeff Meyerson: [00:25:33.01] Another design tenet of OpenStack is that any feature that limits the main goals of the service (elasticity and scalability) must be optional. It’s actually really important to have these principles laid out, because it’s such a massive open source project and you need to have some sense of alignment among the distributed team. What is an example of a feature that had to be made optional within OpenStack because it potentially limited scalability or elasticity?
John Purrier: [00:26:17.29] If you take a look at the original networking options that we had – flat networking, and a variety of different modes of networking that you could choose, that was interesting. But where networking got really interesting was when this software-defined networking came into being, really spearheaded within OpenStack by Nicira, who were later acquired by VMware. Suddenly, we had two different networking systems, and you as an operator or as somebody who’s going to deploy OpenStack, you had to choose between the two of them. We as a group could have had a policy of saying, “Okay, you must now use software-defined networking instead of the built-in networking modules.” That would have been a really restrictive option, but the fact that you could choose and they were optional components made it much easier for adoption. People could choose the architecture, the mode and the style of deployment that they were comfortable with.
Jeff Meyerson: [00:27:41.23] Everything in OpenStack should be asynchronous, according to the design tenets. Why is asynchronicity so important?
John Purrier: [00:27:51.25] If you make things synchronous, you have contention, you get blockage and your performance goes down. It’s all about throughput, it’s all about being able to create computing systems, data centers etc. that you can put a service-level agreement on. That’s particularly important for enterprise adoption, where the enterprise IT department is making an SLA promise to the business that there will only be so much downtime, that there are certain performance levels that will be maintained etc.
[00:28:36.28] If you have synchronous places in your architecture, your workflows or your systems, that’s where things are going to bog down and that’s where you lose control. It really is all about maintaining performance, uptime and scale.
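The throughput point can be illustrated with Python's standard `asyncio` module. This is a toy sketch: `provision` is a hypothetical stand-in for a slow operation such as booting a virtual machine, and the delay is simulated with a sleep.

```python
import asyncio


async def provision(name: str, delay: float = 0.05) -> str:
    """Hypothetical slow operation; the sleep stands in for waiting on I/O."""
    await asyncio.sleep(delay)
    return f"{name}: ready"


async def provision_all(names):
    # All requests are in flight at once, so nobody waits behind a slow one.
    return await asyncio.gather(*(provision(name) for name in names))


def run_demo():
    return asyncio.run(provision_all(["vm-1", "vm-2", "vm-3"]))
```

Run one after another, the three calls would take roughly three times the delay; issued asynchronously, the total is close to the delay of the slowest single call.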
Jeff Meyerson: [00:28:58.17] OpenStack also emphasized the importance of a shared-nothing architecture. To somebody who doesn’t really understand this term, it might sound strange. If you have an architecture where nothing can be shared, how do you convey information from one piece of the architecture to another? Maybe you could define the term “shared-nothing architecture.”
John Purrier: [00:29:22.23] If I have a shared architecture, I may be using a chunk of memory or a chunk of disk to maintain state. This gets very problematic when you have virtual machines that are ephemeral. A virtual machine can disappear on you, it can go away at any time, so you have to architect your system such that it’s okay if virtual machines go away. If you’re using shared state and you’re dependent upon this virtual machine to maintain its view of the world and it goes away, that’s a problem.
[00:30:08.29] The way you get around this is you use things like message buses. Everybody’s communicating on a message bus, you can subscribe to certain events, so you see state changes and things like that. Each virtual machine has its own view of the world and it operates on that view, independent of everybody else’s view.
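As a minimal in-process sketch of that pattern (the `MessageBus` and `Worker` classes and the topic name are hypothetical, and a real deployment would use a networked message broker), each worker keeps a private view and updates it only from events it receives:

```python
from collections import defaultdict


class MessageBus:
    """Tiny in-process publish/subscribe bus (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        for handler in self._subscribers[topic]:
            handler(dict(event))  # every subscriber gets its own copy


class Worker:
    """Keeps its own view of the world; nothing is shared with other workers."""

    def __init__(self, name, bus):
        self.name = name
        self.view = {}
        bus.subscribe("instance.state", self.on_state)

    def on_state(self, event):
        self.view[event["instance"]] = event["state"]
```

If one worker disappears, the others still hold their own views, and a replacement can rebuild its view from subsequent events rather than from a shared chunk of memory or disk.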
Jeff Meyerson: [00:30:36.21] Hypothetically, if OpenStack would have been foolishly designed in a way that shared state was easy to do, why would shared state be dangerous? What is an example of maybe an application-level bug that could propagate from a situation with shared state?
John Purrier: [00:30:55.18] If we’re sharing state, I am dependent upon the thing that’s maintaining the state to be correct. I’ll give you a good example. When I first went to Rackspace to do the cloud engineering work, they had a compute infrastructure based upon a company that they had acquired called Slicehost, and they had built an object storage system internally. This was before my time. When they went to launch the object system, in the first week it took a lot of traffic and it went down, and it took them several days to bring it back.
[00:31:44.28] The problem in the design of that object storage system was that they had put in a centralized database. In your storage system you have many disk drives, and if you’re routing all your traffic and it’s a requirement that you do a synchronous call to a database for each transaction, you can see how your scale is going to be impacted; the more people that are trying to talk to the system, the more contention you’re going to have on that shared resource, and the worse off you’re going to be, which we found to our chagrin.
[00:32:28.28] The good news is that we learned our lesson from that, rebuilt from scratch the object storage system, and that’s the system called Swift that’s in OpenStack today.
Jeff Meyerson: [00:32:40.28] Great. Let’s talk about eventual consistency. OpenStack is designed with eventual consistency in mind. Could you define the term eventual consistency and describe why that term is relevant to OpenStack?
John Purrier: [00:32:56.25] Yes, it’s absolutely a key concept for large-scale distributed systems. The idea behind this is that if you had a lot of systems… I’ll take the storage system again, as an example. Let’s say I build out my storage system and I’ve got ten thousand disk drives; the way object storage systems work in some instances is by doing redundant copies. I write a piece of data and it actually gets splattered across the ten thousand drives, and there are at least three copies. If at any time a drive goes away, I know I’ve got two other good copies that I can recover that from.
[00:33:57.22] The idea though is that we have latencies in systems, and we have delays in systems. If I’ve got a program that’s looking at a particular bit of storage and I’ve got another program that might be geographically dispersed, but on the other side of a replication topology in the storage system, at any point in time they both think they’re looking at file Foo, but Foo, for some amount of time before the replication occurs, is not the same on both sides. And that’s okay. That’s the whole idea behind the eventual consistency.
Eventually, everything will come back and be consistent, as the replication works, but we have to accept the fact that in some time windows, program A and program B will get different results querying what they think is the same storage object.
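The time window described here can be simulated in a few lines. This is a toy model, not how Swift is implemented: the class names are hypothetical, and replication is triggered by hand so the stale window is visible.

```python
class Replica:
    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data.get(key)


class ReplicatedStore:
    """Writes land on one replica first; the others catch up when replicate() runs."""

    def __init__(self, copies=3):
        self.replicas = [Replica() for _ in range(copies)]
        self.pending = []  # queued (replica_index, key, value) updates

    def write(self, key, value):
        self.replicas[0].data[key] = value
        for index in range(1, len(self.replicas)):
            self.pending.append((index, key, value))

    def replicate(self):
        while self.pending:
            index, key, value = self.pending.pop(0)
            self.replicas[index].data[key] = value


store = ReplicatedStore()
store.write("Foo", "v1")
store.replicate()
store.write("Foo", "v2")

# Program A reads replica 0, program B reads replica 2, before replication runs.
before = (store.replicas[0].read("Foo"), store.replicas[2].read("Foo"))
store.replicate()
after = (store.replicas[0].read("Foo"), store.replicas[2].read("Foo"))
print(before, after)  # ('v2', 'v1') ('v2', 'v2')
```

Both programs believe they are reading Foo, yet they disagree until replication completes, after which every copy returns the same value.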
Jeff Meyerson: [00:34:57.01] OpenStack has some degree of tunable consistency also. Could you explain that term and explain how the level of consistency within OpenStack can be tuned?
John Purrier: [00:35:10.25] This was something that was added after my time. The tighter that you bring the consistency requirements… You’re trading off resiliency. You could make a system that is entirely consistent. If we take eventual consistency on one end and always consistent on the other, you can build systems that operate in either of those modes. With OpenStack, the original design parameters for some people were not adequate.
[00:35:52.21] This is the great thing about open source. As part of the project, the folks that said, “You know what? I don’t necessarily need this many copies, but I need them to be in tighter sync. I need those two programs to think they’re looking at the same storage object, to always get the same answer if they ask the same question.” That was added in as a normal part of the open source development process.
Jeff Meyerson: [00:36:22.15] Now that we’re talking about open source, could you give me an idea of how the open source community of OpenStack works and how people collaborate with each other, how the open source process progresses?
John Purrier: [00:36:39.03] Yes, I love this topic. Open source is a tremendous movement. A lot of the innovation in how we collaboratively build software has come from projects like OpenStack. Large communities of folks that are spread across many countries and cultures, across many different time zones, all collaborating on a single project has forced a lot of innovation to happen. I’m seeing the results of that innovation come back into things like how enterprises are building software, how they’re building their continuous delivery pipelines, how they integrate automation into that continuous integration, continuous delivery pipeline.
[00:37:40.10] With OpenStack, we started with a base set of tooling around version control, code reviews and test automation. Over the years, this has gotten very sophisticated. You have your continuous integration piece, you’ve got your continuous testing piece, you’ve got your continuous deployment pieces, and the tooling around that has actually grown up out of other open source projects like Jenkins, as well.
[00:38:21.04] The original team that was building the tools for how we managed our source code, how we built it, how we did continuous delivery worked for me at Rackspace, and also worked for me at Hewlett-Packard. Those guys did a tremendous job for the open source community. Through that type of collaborative effort, a project like OpenStack can be as effective as it is. It’s got a good set of tooling and a good set of processes. Everybody knows what the rules are, and if you don’t play by the rules, you just can’t affect the community; the community says, “No, thank you. We don’t want people that don’t play by the rules.”
[00:39:06.17] Then there is a very high level of communication and the ability to get your voice heard. These are not static, written on the tablets, come down from the mountain processes. This is a living, breathing, organic process that’s always being reviewed. Anybody can make suggestions, anybody can join the conversation, and it makes better software.
Jeff Meyerson: [00:39:39.29] Speaking of processes that make better software, testing is very important to OpenStack development. Tell me about how testing occurs, what the best practices are for some open source committer who wants to commit something and he has to commit tests associated with it, and maybe some of the challenges that are associated with testing a big distributed system like OpenStack.
John Purrier: [00:40:12.09] Yes, it’s a great topic. As you’ve mentioned, you have to start off with a policy and a process, a set of rules that says, “When you check in code, you check in tests with it.” You will not pass a code review if you don’t have tests. How you can actually validate is a much more interesting question. In some ways, the distributed, continuous integration process really helps you.
[00:40:46.05] You can set up a [unintelligible 00:40:47.27] and this was how the original Jenkins system was set up for OpenStack. You have a master, but then you’ve got these slave CI systems that can be tuned for particular scenarios. For instance at Hewlett-Packard, when we were standing up OpenStack we had a particular data center topology. We obviously were using HP gear, HP networks, HP racks etc., so what our topology looked like was not the same as anybody else’s in the world.
[00:41:34.00] It was in our own best interest to set up a slave integration server so that with each check-in that came into trunk for OpenStack, it kicked off the continuous integration process, it would go through the basic unit test, but it would also then farm out the work to the slave CI servers (including ourselves) and we would run through the testing on our particular configuration. We then essentially gave a vote back to the CI server, saying “Yea” or “Nay.”
[00:42:12.13] Then, of course, it’s a policy decision, if you have certain slaves failing, whether that blocks the merge with trunk. That’s very effective, and it also works really well inside enterprises that might have different business units or departments. We used it within Hewlett-Packard to stitch together the networking group, the software group and the storage group. Each of those had their own view of what was necessary to pass in the OpenStack components that they were worried about, so we set up different CI servers that would vote back and pass the information upstream.
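The voting policy can be sketched as a small gate function. The names here (`can_merge`, the voter labels) are hypothetical and this is not the actual OpenStack CI code; it only shows that whether a failing voter blocks the merge is a policy choice.

```python
def can_merge(votes, blocking_voters):
    """Decide whether a change may merge.

    votes: dict mapping a CI system's name to True (pass) or False (fail).
    blocking_voters: names whose failure, or missing vote, blocks the merge.
    Votes from other systems are advisory only.
    """
    return all(votes.get(voter, False) for voter in blocking_voters)


votes = {"unit-tests": True, "hp-topology": False, "storage-group": True}

# Policy 1: only the core unit tests gate the merge.
print(can_merge(votes, {"unit-tests"}))                 # True
# Policy 2: the vendor-specific topology run is also gating.
print(can_merge(votes, {"unit-tests", "hp-topology"}))  # False
```

The same set of votes produces different outcomes depending on which voters the project has agreed to treat as gating.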
Jeff Meyerson: [00:43:05.10] We have been talking about the development of OpenStack itself, and since we’re drawing to the end of the conversation, I do want to zoom out a little bit more and talk about the usage of OpenStack. With OpenStack’s abstraction of a shared pool of resources, the developer gets this set of API’s exposed, that gives the developer access to compute, networking and storage, and I’d love to discuss in more detail what the experience is like for the developer, once this set of API’s is exposed, and how can we leverage these API’s to build our systems?
John Purrier: [00:43:59.04] That’s the whole point of this. The point of OpenStack isn’t to build a system to be building the system, although sometimes we lose sight of that when you’re inside the project. The point of the system is to actually create an environment so that people can write applications.
The API’s that you’re talking about that get exposed, even the platform services that get exposed are all part of the tapestry that a developer will look at. At the end of the day, the point around cloud, cloud deployments and platform as a service is all around making that developer experience more seamless, and really allowing the developer to worry about the logic of his system, rather than, “Gee, have I set up the web server correctly? What are the credentials for the database?”, those sorts of things.
Jeff Meyerson: [00:45:14.15] There’s one more technical topic that we haven’t really dove into. We’ve mentioned earlier the scheduler – what is the scheduler, and why is this relevant to the topic of OpenStack?
John Purrier: [00:45:35.06] The scheduler is the heart of OpenStack. It’s the heart of any kind of distributed system, whether you’re talking about grid, or high-performance computing systems, whether you’re talking about containerized systems with schedulers like Kubernetes or Mesos. A scheduler is an orchestrator that says, “Hey, I’ve been asked to deploy a certain workload. Where should I put it?” To some degree, it’s like “Am I going to deploy one instance? Am I going to deploy multiple instances? If I deploy multiple instances, what are the rules for affinity, non-affinity and things like that?”
[00:46:27.05] It really is the heart of the compute infrastructure. The scheduler is the thing that says, “Here’s the request for running this particular program. Where am I going to put it in the fabric?” Typically, there’ll be a feedback loop, there’ll be a monitoring loop, a health check (or whatever you want to call it) so that if in fact I, as a scheduler, deployed this particular workload, I’ll know if it’s in trouble, if it’s died. And typically, within cloud systems, all of the infrastructure is immutable. So instead of trying to fix something that’s broken, we’ll just shoot it in the head and deploy another one.
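Editor's sketch: the placement decision and the "replace rather than repair" loop can be illustrated with a toy scheduler. This is a simplified assumption and not the OpenStack scheduler's algorithm. `Host`, `Instance`, `schedule`, `replace_if_unhealthy`, the slot counts, and the "most free slots wins" rule are all hypothetical, and the `group` argument models a simple rule that keeps members of the same group on different hosts.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Host:
    name: str
    free_slots: int
    groups: Set[str] = field(default_factory=set)  # groups already placed here


@dataclass
class Instance:
    name: str
    host: str
    group: Optional[str] = None


def schedule(name: str, hosts: List[Host], group: Optional[str] = None,
             avoid: Optional[str] = None) -> Instance:
    """Pick the host with the most free slots that satisfies the rules."""
    candidates = [
        h for h in hosts
        if h.free_slots > 0
        and h.name != avoid
        and (group is None or group not in h.groups)
    ]
    if not candidates:
        raise RuntimeError("no host satisfies the request")
    target = max(candidates, key=lambda h: h.free_slots)
    target.free_slots -= 1
    if group is not None:
        target.groups.add(group)
    return Instance(name, target.name, group)


def replace_if_unhealthy(instance: Instance, healthy: bool,
                         hosts: List[Host]) -> Instance:
    """Never repair a broken instance: release its slot and deploy a new one."""
    if healthy:
        return instance
    old = next(h for h in hosts if h.name == instance.host)
    old.free_slots += 1
    if instance.group is not None:
        old.groups.discard(instance.group)
    return schedule(instance.name, hosts, instance.group, avoid=old.name)


hosts = [Host("a", 2), Host("b", 2), Host("c", 1)]
first = schedule("web-1", hosts, group="web")
second = schedule("web-2", hosts, group="web")
# A health check (not modeled here) reports web-1 as dead.
replacement = replace_if_unhealthy(first, healthy=False, hosts=hosts)
print(first.host, second.host, replacement.host)
```

The health check itself is reduced to a boolean here. In a real system it would be the monitoring feedback loop described above.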
[00:47:19.07] Earlier, we talked about the fact that we can’t count on virtual machines. They are ephemeral, you can’t count on them being there all the time, and part of the reason is this – if we have to take this one out and put another one in its place, hook it up to a load balancer, we’ll do that.
The other scenario that comes into play here is workload motion. For instance, if I have a particular server and I need to bring that server down, the first thing that I have to do is evacuate all the virtual machines and all the applications that are running on that, to other parts of the system. The notification will come back to the scheduler, saying “Hey, I’ve got 27 instances of applications running here that you need to find another home for.”
[00:48:16.26] The scheduler will reschedule those, which allows us then to shut down the virtual machines and applications on the server that’s going down for maintenance.
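Editor's sketch: the maintenance scenario, where every workload on a server must find a new home before the server goes down, might look like this in miniature. It is an illustration under stated assumptions and not how OpenStack implements evacuation. `evacuate`, the node and application names, the capacity numbers, and the "least loaded host" rule are hypothetical.

```python
from typing import Dict, List


def evacuate(host: str, placement: Dict[str, List[str]],
             capacity: Dict[str, int]) -> Dict[str, str]:
    """Move every instance off `host`; return a map of instance -> new host."""
    moves: Dict[str, str] = {}
    for instance in list(placement[host]):
        # Any other host with spare capacity is a candidate.
        others = [
            h for h in placement
            if h != host and len(placement[h]) < capacity[h]
        ]
        if not others:
            raise RuntimeError(f"no spare capacity for {instance}")
        # Hypothetical rule: prefer the least loaded candidate.
        target = min(others, key=lambda h: len(placement[h]))
        placement[target].append(instance)
        placement[host].remove(instance)
        moves[instance] = target
    return moves


placement = {
    "node-1": ["app-1", "app-2", "app-3"],
    "node-2": ["app-4"],
    "node-3": [],
}
capacity = {"node-1": 3, "node-2": 3, "node-3": 3}

moves = evacuate("node-1", placement, capacity)
print(moves)
print(placement["node-1"])  # empty once the host is safe to shut down
```

Only after `node-1` holds no instances would the server be taken down for maintenance, which mirrors the order of operations in the conversation.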
Jeff Meyerson: [00:48:27.27] I’d love to close by zooming out further and projecting towards the future. We’ve got all these different ways that we can do cloud computing at this point. We’ve got AWS, Azure, DigitalOcean, Rackspace – anything using OpenStack. How are these different platforms going to evolve over time, and how should developers assess this situation?
John Purrier: [00:49:03.18] That’s a really great question. What we’ve seen over the last few years is obviously a great expansion in some of the market leaders (Amazon, Microsoft Azure). We’ve seen the guys at Google do some really great work. They’re starting to turn their attention toward enterprise, which is new for them, and over time what we’re going to see is a couple of things.
[00:49:33.20] We will see continued competition amongst the big players. The players that have cash in the bank, they want to control, they want to build data centers, they want to be in the real estate, power and cooling business, so we’ll see continued expansion from Microsoft, Amazon, Apple and others.
The second thing that we’re going to see is competition in the vertical cloud space. As a cloud service provider, it’s going to be very difficult for me to really compete in this highly competitive, price sensitive commodity business. But if I bring value, if I build a cloud that is targeted at a particular vertical, whether it be government-oriented, whether it be healthcare-oriented – whatever allows me to provide unique value for particular segments of the market, I think we will start seeing those verticals, as well.
[00:50:45.29] Ultimately, with all this commoditization, the API’s over the top of them – the Amazon API, the OpenStack API over Rackspace’s cloud – I believe that it’s going to be subsumed. Just like today when I put my workload into a cloud I don’t know where it’s running, I think in the future I’ll be able to put a workload into the cloud fabric, and the fabric will be able – through heuristics, algorithms, the evaluation of a lot of data – to place a workload into an Amazon, Microsoft or private data center, depending upon the business policies that have been defined. My personal view – this is what we’re working on at Automic – is to create that world.
Jeff Meyerson: [00:51:50.25] It sounds really interesting. We’re up against time and I don’t want to ask a big question… But I’ll just ask a big question: do you think it’s going to be a polyglot cloud world where I’m a big company and I’ve got my media cloud that runs on maybe Netflix data, and I’ve got my machine learning running on Google Data Center (because Google is great at machine learning Platform as a Service), and maybe I’ve got storage on Amazon because Amazon is best at that – are we going towards a polyglot cloud world?
John Purrier: [00:52:33.00] I believe that we’re going towards that world, and it will be driven by business policies and by the features and configuration of the various clouds. When I say, “I’ve got this particular workload and it has these characteristics – maybe one of the characteristics is machine learning”, the system will say, “Oh, I know that this should go toward Google, rather than Amazon.” Or “This is my dev test environment. Find me the cheapest place out there.” Maybe it will end up on a DigitalOcean server someplace, right?
[00:53:09.22] I do believe that that is where the world is going. I’m very excited about heuristic and algorithmic automation. It’s really the third leg on the stool. The three legs are traditional automation that we have today – application release automation and data center automation. That’s been going on for several years. It started as runbook automation and workload automation; now we’re doing a lot of DevOps things, and Automic plays in that space.
[00:53:45.24] The second leg is the cloud-native automation really coming out of the PaaS world. If you take a look at Cloud Foundry, OpenShift, Heroku – they have a really rich ecosystem around the developers, and they have opinionated automation for the operators. I see that those two worlds are going to be coming together sooner rather than later.
[00:54:14.15] Of course, the third leg is that heuristic or algorithmic automation, where it’s non-deterministic, it’s driven by policies – you basically tell the system what you want, and the system is getting a load of data (call it internet of things) from sensors, telemetry data from servers and networks, and is then able to say “Hey, for this particular workload let’s put it here (maybe Google), or let’s put it in the Cleveland, Ohio private data center that we have”, depending upon the characteristics of the workload and the availability of the resources.
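Editor's sketch: policy-driven placement across several clouds can be reduced to a small matching problem. The code below is a deliberately simple, deterministic stand-in for the heuristic automation described above, and it is not any vendor's product. `Target`, `Workload`, `choose_target`, the target names, the feature labels, and the cost figures are all hypothetical placeholders, not real prices or capabilities.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Target:
    name: str
    features: FrozenSet[str]
    cost_per_hour: float  # placeholder value, not a real price


@dataclass(frozen=True)
class Workload:
    name: str
    needs: FrozenSet[str]


def choose_target(workload: Workload, targets: List[Target]) -> Target:
    """Cheapest target that offers every feature the workload needs."""
    eligible = [t for t in targets if workload.needs <= t.features]
    if not eligible:
        raise LookupError(f"no target satisfies {workload.name}")
    return min(eligible, key=lambda t: t.cost_per_hour)


targets = [
    Target("cloud-a", frozenset({"machine-learning"}), 0.50),
    Target("cloud-b", frozenset(), 0.10),
    Target("private-dc", frozenset({"data-residency"}), 0.30),
]

training = Workload("training", frozenset({"machine-learning"}))
dev_test = Workload("dev-test", frozenset())
records = Workload("records", frozenset({"data-residency"}))

for w in (training, dev_test, records):
    print(w.name, "->", choose_target(w, targets).name)
```

A dev/test workload with no special needs lands on the cheapest target, while the machine learning workload goes to the only target that advertises that feature. A real system would also weigh telemetry and resource availability, which this sketch omits.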
Jeff Meyerson: [00:54:55.10] That’s a great place to close off. Where can people find out more about you, John?
John Purrier: [00:55:00.22] I would go to www.automic.com. That’s our website. I’m also on Twitter, @johnpur.
Jeff Meyerson: [00:55:10.19] Great. We’ll put both of those in the show notes. John, thanks for coming on Software Engineering Radio, and giving up some of your time to talk about OpenStack. This has been great!
John Purrier: [00:55:20.01] I’ve had a great time, thanks for having me on.