Recording venue: ThoughtWorks North Europe Away Day, Sherwood Forest, Newark, Nottinghamshire, UK
Johannes Thönes talks to James Lewis, principal consultant at ThoughtWorks, about microservices. They discuss microservices’ recent popularity, architectural styles, deployment, size, technical decisions, and consumer-driven contracts. They also compare microservices to service-oriented architecture and wrap up the episode by talking about key figures in the microservice community and standing on the shoulders of giants.
- Adrian Cockcroft (Battery Ventures, formerly Netflix), Migrating to Microservices, QCon London 2014: http://www.infoq.com/presentations/migration-cloud-native
- Martin Fowler and Jim Webber (ThoughtWorks), Does My Bus Look Big in This?, QCon London 2008: http://www.infoq.com/presentations/soa-without-esb
- Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation by Jez Humble and David Farley
- Release It: Design and Deploy Production-Ready Software by Michael Nygard
- Hystrix (circuit breaker in Java): https://github.com/Netflix/Hystrix
- Conway's Law paper: "How Do Committees Invent?"
- Inverse Conway Maneuver
- Pact: https://github.com/realestate-com-au/pact
- Pacto: http://thoughtworks.github.io/pacto/
- Microservices by James Lewis and Martin Fowler
- James Lewis Blog
- Building Microservices by Sam Newman
- The Netflix Tech Blog
- James Lewis on Twitter: @boicy
- jalewis <at> thoughtworks <dot> com
- GOTO Berlin 2014 Conference
Male: This is Software Engineering Radio, the podcast for professional developers, on the web at SE-Radio.net. SE-Radio brings you relevant and detailed discussions of software engineering topics at least once a month. SE-Radio is brought to you by IEEE Software Magazine, online at Computer.org/software.
Johannes Thönes: Welcome to another show of SE-Radio. It’s Johannes Thönes this time and I’m sitting here with James Lewis and I’m starting by reading his biography.
James Lewis is a principal consultant at ThoughtWorks and calls himself a coding architect. He is part of the Technology Advisory Board, which meets quarterly to produce the ThoughtWorks Technology Radar.
James’ interest in building applications out of small collaborating services stems from a background in integrating enterprise systems at scale.
James studied astrophysics in the 90s, but got sick of programming in Fortran. Fifteen years of DBA, software engineering, design and architecture later, he believes that writing the software is the easy part of the problem. Most of the time it’s about getting people’s thinking right.
So, welcome, James on the show.
James Lewis: Thank you very much.
Johannes Thönes: So is there anything you would like to add to your biography?
James Lewis: I don’t think so. I think that pretty much covered it. I’ve been at ThoughtWorks about eight and a half years now, so I’ve seen a lot of changes in the software industry in that time. Everything’s moving very, very fast.
So I guess what we’re gonna talk about today is one of those latest changes.
Johannes Thönes: Yeah. So let’s just dive into the topic. We wanna talk about microservices. Can you maybe start by just giving us an idea of what a microservice is?
James Lewis: Yes, of course. So a microservice in my mind is a small application that can be deployed independently, scaled independently and tested independently, and which has a single responsibility.
So this works on multiple axes. A single responsibility in the original sense that it’s got a single reason to change and/or a single reason to be replaced. That’s one axis.
But the other axis is a single responsibility in the sense that it only does one thing and one thing alone, and can be easily understood by the people building and developing that service.
Johannes Thönes: What would such a single thing be?
James Lewis: That’s a good question. A single thing might be a single responsibility in terms of a functional requirement, or it might be in terms of a non-functional requirement, or, as we’ve started talking about them, a cross-functional requirement.
An example might be a queue processor. So something that’s reading a message off a queue, performing a small piece of business logic and then passing it on. That might be something that’s more of a cross-functional or non-functional thing, or it might be something that has the responsibility, say, for serving a particular resource or resource representation.
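The queue processor James describes, read a message, apply one small piece of business logic, pass it on, can be sketched in a few lines. This is a minimal in-process illustration (the message fields and queue names are invented for the example), not a real messaging integration:

```python
import queue

def process(incoming, outgoing):
    """Drain a queue: read each message, apply one small piece of
    business logic, then pass the result on downstream."""
    while True:
        try:
            message = incoming.get_nowait()
        except queue.Empty:
            break
        # The service's single responsibility: one small calculation.
        message["total"] = round(message["net"] * (1 + message["tax_rate"]), 2)
        outgoing.put(message)

incoming, outgoing = queue.Queue(), queue.Queue()
incoming.put({"net": 100.0, "tax_rate": 0.2})
process(incoming, outgoing)
```

In a real system the queues would live in a broker, and the processor would run as its own independently deployable and scalable service.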
Johannes Thönes: Like a user.
James Lewis: Like a user or say an article or it might be a risk in insurance or something like this, but something that’s very, very focused, very, very small and performs a single task on its own.
Johannes Thönes: So I have the impression that microservices became quite popular in recent times. You have been talking about it. Other people have been talking about it. Why do you think that is?
James Lewis: That’s a good question. Why have microservices suddenly seemed to capture everyone’s imagination?
For me the journey starts a little while ago, around four years ago, when I spent a lot of time at some workshops with people from various different communities in the software industry. Some of the people involved in the REST community, the messaging community and so on.
At these workshops a lot of the questions that kept recurring, if you like, were around the size of applications, or seemed to me to be around the size of applications.
Examples of that would be we’ve got this big application. It’s been growing for two and a half years or five years or ten years, but we can’t maintain it anymore. It’s just too difficult to actually make any functional changes to it. Or we need to take this application we’ve got and somehow be able to deploy this into the cloud. Software as a service say, but at the moment it’s just impossible to do that.
So as a result of that, there’s been this community of practice, almost, that’s been growing around the idea of splitting up applications into smaller cooperating components that run out of process and talk to one another, which can be maintained separately, scaled separately, thrown away if you need to as things evolve. Some of it is based in London, with people who, incidentally, I’ve worked with a lot. So Dan North talks about replaceable component architecture. He stresses the idea of replaceability in these systems.
Fred George has also been talking about microservices for a couple of years now. There’s almost a London community of practice around this.
At the same time, of course, you’ve got the Netflixes on the West Coast of the U.S., who’ve been pioneering what they talked about in terms of fine-grained service-oriented architecture. In fact, Netflix are now talking in the same terms and using the same term, microservices.
There’s been a number of these different communities that have been growing over time and demonstrating that this approach to building software is a viable one for production, and when you start to look at companies at the scale of Netflix, almost a necessity as they grow. Adrian Cockcroft is on record as saying the reason they build systems in this way is because they wanna be able to go as fast as possible. They wanna be able to make changes as fast as possible.
Bringing it back to the original question about why it is so popular now: well, I think there’s a lot of organizations that have built up some kind of debt over the last, I don’t know, five years. They’ve realized that in order to scale more, in order to be more effective when it comes to delivering software into production, in order to take advantage of things like continuous delivery, they need some approach which allows them to scale along different axes independently.
I think it’s about the right time for an idea like microservices to take off because there’s a lot of companies facing the same problems.
Johannes Thönes: You said something interesting about people coming in with their large monolithic applications and splitting them up into microservices. Is there a typical way of introducing microservices?
James Lewis: That’s a great question and one I’ve actually been struggling with. It goes right to the heart of the question: do you start with microservices or do you refactor towards them later?
Empirically, most of the organizations that are building what I would call a microservice-style implementation have actually started with something big and have split the big thing up over time. So you think about Netflix. The canonical example of a lot of the patterns would probably be Amazon. So you think about Amazon, where they started with a big database and then moved towards service-oriented architecture.
Netflix started from a very similar position. They had a fairly big monolithic application and they split it out, moving towards this fine-grained service-oriented architecture.
Johannes Thönes: Do you know an example you can talk about a little bit more, to visualize how such a split towards a microservice architecture happens?
James Lewis: Absolutely. I’m a good consultant so I won’t name clients, but there’s an organization in the insurance world who I got involved with about three and a half years ago now. They were in the classic situation where they had built this really big application. It was a typical n-tier style architecture where they had a big database with lots of logic in the database. They had some data services; it’s the .NET world, so a kind of data access service layer over the top. Then above that they had another big application which contained a lot of the business logic.
The problem they had was this big application sitting at the top in the service tier. It included a lot of different products. Say they sold six kinds of insurance: these are very distinct product lines. Each of them has distinct change cycles. Each of them is more or less mature, and as the organization was growing, they wanted to be able to invest in growing different parts of these products, or different products, independently.
Unfortunately, because they had this very big application, everything was tied together. So if they wanted to make changes, say, to their home insurance product, they had to wait on the change cycles for the motor insurance product. Or if they wanted to scale the motor insurance product, and typically motor insurance is very, very high volume compared to others, say life insurance, they had to scale everything to the level of the motor insurance product.
They didn’t really have much in the way of flexibility. They had also ended up in a situation where they had to introduce byzantine, or overly complex, branching and merging strategies to enable the teams of developers they had working on this application to work as effectively as they could without standing on each other’s toes, because, of course, if you’ve got one thing, every time you make a change it potentially impacts lots of different areas –
Johannes Thönes: Everything else.
James Lewis: So they were in a pretty tricky position. Over the last three years they’ve been moving towards splitting out this application by product. So this is an example of how you might split applications out.
Johannes Thönes: Splitting out one small part, which then still sat on top of the monolithic application?
James Lewis: In this case, actually, they didn’t. Rather than do that and strangle it off gradually, they are strangling it, but by building a new product off to the side independently. They’re not refactoring towards it per se. They started with: okay, we now need to develop the home insurance part, and we’re gonna develop that as a separate web app, a separate microservice.
In fact, internally it was a CQRS implementation.
Johannes Thönes: Which is?
James Lewis: In the .NET world, it’s the Command Query Responsibility Segregation pattern.
Johannes Thönes: Can you shortly explain what that means?
James Lewis: I can shortly explain that, yes. So in this world it’s the idea of separating read from write, but also implementing data access and storage via either the execution of commands or queries to a separate read store.
Now typically what you have is an event store that you write your commands to; events get generated, which populate the read model, and then your applications read from the read models. You write into one place and you read from somewhere else.
These components were developed independently. They’re running as separate services, as microservices. Then you have the application again, which is another one. So that would be a specific example of doing that.
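For readers unfamiliar with the pattern, the command/query split can be sketched in a few lines. This is a generic illustration of the CQRS-plus-event-store shape James describes (the names and insurance domain details are invented for the example), not the client’s implementation:

```python
# Write side: commands append events to an append-only event store.
# Read side: a projection applies each event to a separate read model,
# and queries only ever touch the read model.

event_store = []   # append-only log of events
read_model = {}    # denormalised view, populated from the events

def handle_command(policy_id, event_type, amount):
    event = {"policy": policy_id, "type": event_type, "amount": amount}
    event_store.append(event)   # write into one place...
    apply_event(event)          # ...and project it for the read side

def apply_event(event):
    policy = event["policy"]
    read_model[policy] = read_model.get(policy, 0) + event["amount"]

def query_balance(policy_id):
    return read_model.get(policy_id, 0)   # ...read from somewhere else

handle_command("home-42", "PremiumCharged", 350)
handle_command("home-42", "DiscountApplied", -50)
```

In the setup described above, the write side and the read side would be separate microservices, with the events travelling between them over messaging rather than an in-process function call.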
I know Groupon did a great talk. I can’t remember the guy’s name, but there was a great talk by one of the architects at Groupon at QCon London in March 2014.
Johannes Thönes: We can find that out and put it in the show notes.
James Lewis: About how they’d approached it. They’ve followed a different approach. They’re actually splitting up, and I presume this is still true, an application on a page-by-page basis. They have a microservice that is responsible for serving parts of a single site rather than –
Johannes Thönes: Then connecting it to the old site.
James Lewis: Yes. In their case they are talking back to their Ruby, I believe Rails-based, API.
Johannes Thönes: Let’s talk a little bit more about how you technically build a microservice. When I’m gonna build a microservice for, let’s say, user authentication, what languages would I use? What standards do I build upon and what do I need to make happen?
James Lewis: One of the whole guiding principles behind this, I guess, is that you actually get the freedom to choose a lot of the tooling on a case-by-case basis. So rather than choosing a particular language or a particular backend data store, say, for your entire product stack, you get the flexibility to make informed decisions based on the right tool for the situation at hand.
The key thing, I think, is to make the stack lightweight. So rather than looking to use really, really heavy, traditional stacks involving deploying into application containers, ______ and Tomcat and these big containers, think about whether you can use lightweight alternatives. So things like embedded Jetty, embedded Tomcat, Simple Web, or Webbit, which is another tool in Java land.
.NET land is an interesting place at the moment, because obviously traditionally what we’ve done in the .NET world is deploy into IIS. We’ve deployed all of our applications into this managed environment, but even in the .NET world there’s been a movement to bring in some of the learnings from the UNIX and Java communities around using embedded servers.
So, for example, we’re seeing more projects going on there that are using NancyFX as an alternative to some of the Web API or MVC frameworks, and then using things like OWIN. So there are alternatives out there in the .NET world where you can start to use more lightweight tooling for standing these applications up.
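The embedded-server idea, where the application owns and starts its own lightweight HTTP listener instead of being deployed into a container, looks roughly like this. The sketch uses Python’s standard library for brevity; in Java land the equivalent would be embedded Jetty or embedded Tomcat:

```python
import http.server
import threading
import urllib.request

class HealthHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        body = b'{"status": "ok"}'
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet

# The service embeds its own server: no application container required.
server = http.server.HTTPServer(("127.0.0.1", 0), HealthHandler)  # port 0 = any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

response = urllib.request.urlopen(
    f"http://127.0.0.1:{server.server_port}/health").read()
server.shutdown()
```

Because the process is self-contained, it can be started, stopped, and scaled by ordinary infrastructure automation rather than by container-specific deployment tooling.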
Johannes Thönes: But when looking at the communication layer, is everything HTTP and REST?
James Lewis: I’m a self-confessed fan of the REST architectural style. I think it solves a lot of the problems that we’ve come to recognize when we’ve been doing integration over the last couple of decades –
Johannes Thönes: For instance?
James Lewis: Well, for instance, I’d say things like versioning of resources are things you can address more easily if you’re using resource representations and so forth.
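As an illustration of what versioning via resource representations can look like: clients ask for a versioned media type in the Accept header, and the endpoint picks the matching serializer, so old and new representations of the same resource can coexist at one URI. The media type names and user fields here are invented for the sketch:

```python
# Two representations of the same user resource, selected by media type.
SERIALIZERS = {
    "application/vnd.example.user.v1+json":
        lambda u: {"name": u["first"] + " " + u["last"]},
    "application/vnd.example.user.v2+json":
        lambda u: {"first": u["first"], "last": u["last"]},
}

def represent(user, accept_header):
    serializer = SERIALIZERS.get(accept_header)
    if serializer is None:
        return 406, None   # Not Acceptable: no matching representation
    return 200, serializer(user)

user = {"first": "Ada", "last": "Lovelace"}
status_v1, body_v1 = represent(user, "application/vnd.example.user.v1+json")
status_v2, body_v2 = represent(user, "application/vnd.example.user.v2+json")
```

Old clients keep asking for v1 while new clients move to v2, which is one way resource representations ease the versioning problem mentioned above.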
So saying that, there is nothing in the microservices world that says thou shalt be a RESTafarian and thou shalt always use REST. What I normally say to people is you should choose the right tooling for the job. If the right tooling for the job is to use message passing via a bus, because you’ve got specific cross-functional requirements around that, then that’s what you should do.
Martin and myself, in the article we wrote, called this out as a focus on smart endpoints and dumb pipes over a focus on having a big service bus, a traditional enterprise service bus where you would hang services off the bus.
This is more about having these microservices with the smarts, the business logic, inside them, talking to other microservices via an application protocol, or a domain application protocol, but the smarts stay in the microservices themselves. That’s where the business logic is.
Johannes Thönes: So this smart endpoints and dumb pipes is a reference to the UNIX model, I think.
James Lewis: I think it could be read like that. I think the reason we chose that name was more around the enterprise service bus model. Certainly inside ThoughtWorks, for as long as I can remember, there’s been a tendency to mistrust, or distrust rather, big iron when it comes to doing integration.
These sorts of things, they offer to solve all your problems. They’re big enterprise service bus products. They make lots of promises about solving all your problems. Personally I’ve not seen an implementation of a “service oriented architecture” with everything hanging off a central, big enterprise service bus. I’ve not seen one of these things succeed.
So for us I think it’s trying to recognize that kind of centralization model, where you’re putting all your logic in one place and potentially it ends up being the enterprise service bus which takes care of everything for you, or certainly takes care of all the routing and data transformation and all these sorts of things you have to do to get disparate applications talking.
Relying on one of these things to solve all your problems is, in my mind, not the right approach. There’s a great talk by Jim Webber and Martin Fowler called Does My Bus Look Big in This?, which they did as a keynote at QCon some years ago. In that talk Jim talks about the idea of the egregious spaghetti box: the enterprise service bus as the panacea for all your ills.
His line on that is it just makes your diagrams look nice because, of course, you have these diagrams. You look at your enterprise architecture and they’ve got all these crossing ugly lines and it’s really tempting to put this ESB box in the middle ‘cause suddenly all your lines are straight. That’s a great thing if you’re an architect.
But of course all the lines are still there. They’re just hidden inside the box. It just looks like a box of spaghetti.
Johannes Thönes: But when all the routing is not done by the enterprise service bus, who does the routing? So do I need to do the routing?
James Lewis: You certainly need to understand more about how your applications communicate with one another, yes. Of course, if you’re building more services, you end up with many more integration-level problems, because in the past we might have been unlucky and been talking to two or three external systems, say.
Now talking to one of your own systems is also an integration problem. So certainly you have to be much more cognizant of that. You have to understand much more about how applications communicate.
Saying that, there are definitely approaches to doing that. Things like event sourcing, or event-driven applications where you’re doing pub/sub, either using messaging or using HTTP and resource representations, say, allow you to decouple so you’re not just doing point-to-point RPC the whole time.
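The decoupling that pub/sub buys can be seen even in a toy sketch: the publisher knows only an event name, never which services consume it, so new subscribers can be added without touching the publisher. This is a minimal in-process illustration of the pattern, not a real broker or HTTP event feed:

```python
subscribers = {}

def subscribe(event_name, handler):
    subscribers.setdefault(event_name, []).append(handler)

def publish(event_name, payload):
    # The publisher has no idea who, if anyone, is listening.
    for handler in subscribers.get(event_name, []):
        handler(payload)

# Two independent "services" react to the same event.
shipped, billed = [], []
subscribe("order.placed", lambda order: shipped.append(order["id"]))
subscribe("order.placed", lambda order: billed.append(order["id"]))
publish("order.placed", {"id": 7})
```

With point-to-point RPC the order service would have to call shipping and billing explicitly; here it publishes one event and remains ignorant of both.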
Johannes Thönes: Isn’t that a bit like moving the complexity from the monolith and how it works together into the networking layer?
James Lewis: The short answer to that is yes. Actually, when I originally talked to people about this, one of the great comments I got back, again from Martin Fowler, was that we are shifting the accidental complexity, in Fred Brooks’ sense of the term, from inside our application, from the glue code that our components and modules use to talk to one another and stitch themselves together, out into the infrastructure.
I think that’s probably why this is, going back to one of the first questions, this is probably one of the reasons now is a good time for this because we’ve got many more ways to manage that complexity now. If you look at programmable infrastructure, infrastructure automation, the movement to the cloud, the cloud being ubiquitous, those sorts of problems, the problems of understanding how many applications we have, are they talking to one another, we’ve got better tools to address these things now.
Johannes Thönes: One of the things I noted from listening to one of your talks about microservices is that you said microservices need to be independently deployable without a container. What does that mean, and especially what does it mean for infrastructure automation?
James Lewis: I have said that in the past. As with a lot of these things, it’s more nuanced than you tend to be able to convey in a 45-minute talk.
The way I think about this is: you should be able to independently deploy the things that you need to independently deploy, for the reasons I was talking about earlier. Do you need to scale them independently? Do you need to deploy them independently, either for availability or for latency reasons and so forth? If you have those requirements, then you should be able to deploy them independently so that you can scale them.
However, I do know organizations that deploy groups of these things in lockstep in some cases. So maybe there are three or four things that quite often share the same change cycles, and they actually deploy them at the same time –
Johannes Thönes: Like the user service and the authentication service –
James Lewis: Potentially that might be one, yeah. So in the case of the insurance company I was talking about, the read and write services were actually pushed out as part of the same release of that product.
In terms of the impact of that on our operations teams and on things like building deployment infrastructure: we need to be getting good at building software. We need to be better at building software. We need to be focusing on the ideas around continuous delivery from Jez and Dave’s book, thinking about how we put our build pipelines together. This is how we move our software from developer workstation all the way through into production.
Also, in a lot of ways, this brings in the shift towards thinking about mean time to recovery in production rather than mean time to failure, which is the traditional approach to managing systems.
Johannes Thönes: So the book you mentioned is the continuous delivery book by Jez and by Dave?
James Lewis: Yes, Jez Humble and Dave Farley.
Johannes Thönes: So what do you mean by recovery from failure?
James Lewis: So moving on towards this mean time to recovery.
Johannes Thönes: Yeah.
James Lewis: There’s some really interesting examples out there. I might keep coming back to Netflix, I’m afraid, but a great example of this is how Netflix use circuit breakers in production. So the circuit breaker pattern’s been around for quite a while. It was either invented or certainly very much popularized by Michael Nygard in his book Release It!, which, incidentally, is fantastic. One of the best books out there on operations.
What a circuit breaker in your software does is provide a level of safety when you’re talking to external systems. So in my code I’ll implement a circuit breaker that reacts if the downstream system that I’m integrating with becomes slow or goes away, and, of course, as we all know, there is essentially no difference between being slow and not being there at all.
These pieces of software, the circuit breakers, can trip open and then alert people that the downstream systems aren’t available anymore. They also provide additional protection depending on the implementation, things like avoiding thread pool starvation and not blocking incoming requests and so forth.
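A minimal version of the pattern, in the spirit of Nygard’s Release It!, can be sketched as follows. This is a bare-bones illustration, not Hystrix, which layers thread pool isolation, timeouts, fallbacks and metrics on top: after a threshold of consecutive failures the breaker trips open and fails fast rather than waiting on a slow downstream system.

```python
class CircuitOpenError(Exception):
    pass

class CircuitBreaker:
    def __init__(self, threshold=3):
        self.threshold = threshold
        self.failures = 0

    @property
    def open(self):
        return self.failures >= self.threshold

    def call(self, downstream, *args):
        if self.open:
            # Fail fast; monitoring can alert on the tripped breaker.
            raise CircuitOpenError("circuit open: downstream unavailable")
        try:
            result = downstream(*args)
        except Exception:
            self.failures += 1   # another failure counts towards tripping
            raise
        self.failures = 0        # a success closes the circuit again
        return result

breaker = CircuitBreaker(threshold=2)

def flaky_service():
    raise TimeoutError("downstream too slow")

for _ in range(2):          # two timeouts trip the breaker open
    try:
        breaker.call(flaky_service)
    except TimeoutError:
        pass
```

A production implementation would also move to a half-open state after a delay, letting a probe request through to see whether the downstream system has recovered.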
Johannes Thönes: So if I got it right: when Gmail is offline and says it will try again in 5 seconds, and then it goes up to 10 seconds, then up to 20 seconds, that’s an implementation of a circuit breaker?
James Lewis: Exactly. That’s a perfect example, yeah. So what that gives you, when you start to monitor and report on those, is the ability to see exactly what’s happening in your system in production at any given point.
Of course, if you’ve got, say, ten collaborating services, that starts to be much more important, because with ten collaborating services it’s very difficult to know, if a slowdown occurs, if response times for your users start to slow, where that problem is.
So doing things like instrumenting your code with circuit breakers, using the circuit breaker pattern and then reporting on it, allows us to focus our attention on problems immediately as they happen and then to focus on fixing those as quickly as we possibly can.
This idea of mean time to recovery, whether that’s restarting services, whether it’s bouncing boxes, bringing new boxes up or whatever that may be, is starting to become much more important.
I don’t think it’s a coincidence that probably the most popular implementation now is something called Hystrix, also by Netflix, and Netflix have got 600-plus individual services in production. When you’ve got a landscape that’s that operationally complex, then you need to do things that diagnose your problems in production and make life easier for the people who are operating that environment.
Johannes Thönes: Maybe changing topic a little bit, but building on what you said: when they have a couple of hundred services, how big are these services? How big is a microservice?
James Lewis: That’s something we’ve been talking about internally for quite a while. I’ve seen them ranging from a couple of hundred lines of code up to a couple of thousand lines of code.
For me I think –
Johannes Thönes: That’s certainly not a million.
James Lewis: It’s certainly not a million lines of code, no. The guidance I’ve been giving people, and the way I think about it personally, is: does it do one thing and one thing only? It’s difficult to imagine something with a million lines of code doing one thing and one thing only, unless all it was doing was printing out its own million lines of code or something. I don’t know. Which might be possible. Is that a quine? I think it is.
So in terms of the size of these things, the guidance is you should be able to understand them. They should have a single reason to change and they probably shouldn’t be more than a couple thousand lines of code.
Saying that, when you get to that point, the number of them becomes important. It’s probably more important to think about how many of them you’re capable of supporting operationally than about how small they actually are, because it’s better to have slightly bigger ones and fewer of them if you’re operationally a bit more immature, in the sense that you don’t have fully automated, push-button deployment into production, you don’t have all the monitoring in place and so on.
If you’re still learning and acquiring the skills necessary to work in the cloud and these sorts of things, in that sort of situation it’s probably better to focus on the number of them you have rather than on their individual sizes, if that makes sense.
Johannes Thönes: I’m wondering a bit, because you’re talking a lot about single responsibility. You mentioned domain-driven design in the beginning. Is microservices a little bit of domain-driven design at the service level?
James Lewis: If you’ve seen my talks online, and I know you have, you’ll know I often end the talks I give with this idea of standing on the shoulders of giants. For me, microservices is a coming together of a bunch of better practices from a number of different communities.
So there’s great stuff from the domain-driven design community around strategic design: bounded contexts, subdomains, how to separate out your domains, how to partition a very big problem domain into smaller domains so that you can manage them.
It’s also taking a bunch of the better practices from the operational automation and programmable infrastructure communities, the DevOps communities, the cloud communities, the integration communities, and, to lump together a bunch of people, those who’ve been working with messaging and building RESTful services, who’ve been working very hard to make people aware that you can solve integration problems using just the tooling available for free that drives the web, without having to invest in these big iron pieces of middleware.
It’s a bringing together of all of these different ideas into one place. So, from the domain-driven design community, if I may roll back a few steps, because we’re talking at the moment about microservices in isolation: for me, the way you do “architecture” has to be driven from the business, in the business context.
It has to be a top-down thing, where you’re understanding what the business problems are, what the business landscape looks like, what the business processes are, and then driving fit-for-purpose software products underneath that.
For me that’s at the heart of domain-driven design: understanding what the business contexts are, understanding what the business domains are.
One of my colleagues, Ian Cartwright, uses a great phrase, although it potentially comes across as very technical: the idea of business and architecture isomorphism. So this is the idea that if you look at your business and you look at the design of your systems, they should be very similar.
You should be able to look at the business and see your IT systems, your architecture, and you should be able to look at your architecture and see your business. Whether you’re a technologist or a business person, there should be a recognition both ways that this is going on.
There is a rather more controversial extension to this, which is that potentially we shouldn’t have any IT at all and everyone should sit in with the business, but I won’t go into that. That’s probably another hour-long conversation.
Johannes Thönes: When you’re talking about business, I remember you mentioned something about the neighborhood of microservices. I wonder if that’s a good way to group them.
James Lewis: The neighborhood?
Johannes Thönes: You said something like: if you want to group microservices, then you should group them either around business or around non-functional requirements, and you called that a neighborhood of microservices.
James Lewis: Maybe I’ve coined a new word today. This is so exciting that I’ve almost forgotten I’ve said that. Apologies.
Yes. I think so. We’ve been talking a lot internally and, again, standing on the shoulders of giants, liberally borrowing from colleagues, building on their ideas.
There’s this idea of town planning as the metaphor; you’ve probably come across this. This is the idea that the design of your system, rather than being the blueprint for a building, is a bit more like a town plan. It’s a bit more about the zones in a town. You have light industrial, you have commercial, you have heavy industrial and so on; residential.
Then the connections between these zones are the kind of shared domain application protocols. The shared utilities. They all use the same water pipes. They all use the same electricity. They use the same street lighting and so on.
So the trick, effectively, is how to create the map of the zones. That’s the trick. I actually favor a fairly light approach to doing that, based on things like understanding business processes and then using lightweight techniques – whiteboarding, index cards with capability names written on them – and grouping this sort of stuff together, seeing if it makes sense.
But fundamentally yeah, understanding what those groupings are is crucially important. That has to start at the top. This is where there’s massive overlap and we’re learning from the traditional, more established service oriented architecture community who’ve been doing great work in this space for a long period of time.
So using all those lessons: how to subdivide things, using the domain-driven design ideas of bounded contexts and subdomains, putting that together and then decomposing them into these individual applications.
Johannes Thönes: Maybe going a little bit into that direction, I remember a colleague of mine saying, “Yeah, we are talking about micro services because the term service oriented architecture is burned.” Can you contrast what we mean by micro services and what actually burned the service oriented architecture so that we are now talking about micro services?
James Lewis: So, first off, I’d like to make a distinction between the service oriented architecture as a community as a whole and the implementation as I have experienced it in lots of different organizations because I think there is a distinct difference.
If you actually look back at the work people have been doing in SOA – there’s a service oriented architecture manifesto, for example – it’s actually pretty clear, pretty sensible. I don’t think anyone can take much issue with it. But if you contrast that with the implementations – certainly in my personal experience, and also the experience of many of my colleagues and of industry – I think in general there’s a massive difference between the ideal of service oriented architecture and the reality on the ground.
The reality on the ground is the ten-year, $25 million program that runs for two and a half years, delivers no value, and gets canned. The reality is spending millions of dollars on pieces of kit that will supposedly solve all your integration problems, and then those implementations never delivering.
So I think it’s really important to make that distinction between what the SOA community stands for and actually what the reality is.
So I think the micro services idea is really about that reality, which is why I think it’s no surprise it’s been a community-led, bottom-up approach to building systems. A lot of the people involved have been, or still are, heavily involved as developers on teams who see the real problems day-to-day – either the amount of time it takes to make a change or how difficult it is to scale things – and have been looking for an approach that addresses some of those problems.
I also don’t think it’s any coincidence that a lot of these individuals have been heavily involved in the evolutionary architecture, emergent design, XP and Agile movements, because what we’re really talking about with the micro services model is a more iterative, incremental approach to building out service oriented architectures – to building out systems.
Even that isn’t new, because you can look back to Jim Webber talking, several years ago now, about Guerrilla SOA as an approach to building services in your organization: you build out incrementally and release value incrementally as you go, rather than designing your entire service landscape up front, spending eight months on a design document, and throwing it over the wall to an implementation team that’ll take five years and actually not deliver anything.
So I think, for me, the micro services style of building applications is intrinsically related to service oriented architecture, because the service oriented way of building systems is an eminently sensible one.
I think the main differences for me are in the implementation details. So the decentralized governance models that these imply –
Johannes Thönes: So no enterprise service bus.
James Lewis: Yes. So there’s the technology choices. So favoring the lightweight solutions over heavy, but also in terms of things like governance. So favoring lightweight mechanisms for governance, both technical and human. So rather than favoring many, many standards and standards bodies, favoring building up these standards internally over a period of time based on real use and real experience or rather than have a top down approach to these sorts of things, thinking about it from a team level and building it up from the bottom.
Johannes Thönes: One thing I also heard is that, in contrast to SOA services, micro services have a GUI. Is that true?
James Lewis: Can I just say no?
Johannes Thönes: Yes.
James Lewis: No. I don’t think that’s the case at all. I don’t think there is anything that implies that you should have a GUI. I think actually it’s quite useful to have a GUI, even if it’s just telling you how your application is performing, but in terms of a user customer facing –
Johannes Thönes: So from an operations perspective –
James Lewis: Yeah, exactly. But in terms of a customer facing GUI, no. There’s nothing that says thou shalt have a GUI in a micro service.
Johannes Thönes: Let’s talk a little bit about team. I guess you know Conway’s law. Can you talk a little bit about it and what it means for micro service architectures?
James Lewis: So this is another example where a practice that’s been evident for quite a while has come together, been laid on top of other practices, to become something greater than the sum of its parts.
So Conway’s law – I think it was 1968 – Melvin Conway, on the West Coast of the U.S., submitted a paper about his experiences with software and, I think, communication theory.
What he described in the paper that he wrote was that the communication pathways of the software that an organization builds exactly matches or closely matches the structure of the communication pathways within the organization itself.
So if you’ve got a team of database administrators, a team of middleware people and a team of UI people, what you end up building is a UI, some middleware and a database.
I thought I could attribute this quote to Dan North, but I asked him and he said it wasn’t him; it was someone else. So I can’t quite claim it was him. He said, “If you ask nine people to write a compiler, you end up with a nine-pass compiler,” which is another way of looking at it.
Interestingly, that paper was originally turned down for publication, I think by Harvard Business Review. Then, over a period of time, other organizations repeated some of the research – I think Microsoft did some internal research that agreed with it. Then HBR, Harvard Business Review, finally repeated the research themselves and discovered that this was something that really did seem to happen in software.
So it was eventually validated and has become known as Conway’s Law. The interesting thing, of course – and what we’ve been calling, at our technical advisory board meetings, the inverse Conway maneuver – is that it has implications not just for how you structure your teams and how you design your systems, but for how you reinforce the boundaries between these things.
So you can actually use team structures to impose boundaries in your software or reinforce architectural boundaries.
Johannes Thönes: So to put it clearer, what you’re saying is one micro service should be full time developed by exactly one team.
James Lewis: I’m glad you asked for the clarification. No, I’m not exactly saying that. I’m not suggesting there should be a one-to-one mapping between micro service and team. It might be a mapping between three micro services and a team, if those three micro services all live within the same bounded context, within the same domain, because my preference is for team boundaries to be the more stable business capability boundaries. Those tend to be fairly stable in organizations. This goes back to the business and architecture isomorphism idea.
We should be looking at how the business works and structuring our software so it looks like that. Similarly, if you’re structuring teams around that, they can be cross-functional teams that mirror the business model.
It’s a really interesting area that I think people are quite excited about exploring at the moment. The best example I’ve seen of it recently was a client. I’ve been asked in to do some architecture consulting. They have a big systems estate. They have a big portion of their software that’s been written in India. They have a portion of their software that’s been written in London.
If you actually look at the way the software in London and the software in India communicate, it’s probably the most decoupled of any of their systems. And this is a fairly old system now.
So over time those individual bits have started to become maybe more temporally coupled, but the actual communication between them is still incredibly decoupled. Basically they’re passing messages back and forth.
That’s a perfect example. The teams were on different continents. Actually maybe that’s something we can think about when we design our system.
Johannes Thönes: So if you wanna decouple your modules or your services, decouple your teams.
James Lewis: Yeah.
Johannes Thönes: Good.
James Lewis: Although the contrary example to that was at another client, in the U.S. It was the fairly typical cube model on a campus out of town, where they had six people working on the same thing – probably what would be called some kind of service, the same application. Six people sitting in different parts of the office, all in their cubes, who worked for six months and then came together and integrated towards the end.
I think in that case it was probably not the right thing to do to isolate them ‘cause when they got together it took them another six months to integrate their software. I’m not even sure if it ever ended up in production. Their problems were so horrendous.
Johannes Thönes: So you should still see that your system works as a whole, which is a perfect lead over to the next question I have is how do you test micro services. We know how to do unit testing. We know how to do integration testing on a single service level, but when you have a whole application of a couple of micro services, you want to ensure that they work together as well. So how do I test it? Isn’t the test setup very complex?
James Lewis: Yeah. This is probably one of the more difficult aspects. When we’re building systems you’re always making tradeoffs. Anyone involved in software knows that you make tradeoffs in lots of different directions.
This is one of the tradeoffs we’re making when we decide to focus on things like scalability and maintainability: testing becomes more difficult.
A lot of these systems also tend to end up using event-based or event-driven models on top of the micro services, which, again, adds another layer of complexity, because suddenly you’re not just having to test lots more things in isolation, you’re also having to worry about asynchronicity.
That said, you get a lot of benefits as well, ‘cause you can test the small things in isolation. You can wrap a lot of tests around that small bit of functionality and actually verify it’s working effectively – that it’s doing what it’s supposed to be doing according to its specification.
The problems come when you’ve got behavior that spans multiple systems. So how do you test business processes that cross multiple micro services?
An example of that might be: I wanna create a user, say, but I also wanna create a bank account. I’ve thought about this example before, ‘cause in my system – it’s a banking system – I’ve got a bank account, I’ve got some kind of user, and there’s a business process which says when I create the user they might wanna have a bank account associated with them, and at the end of the business process I wanna be able to send out some emails or some bank cards to them.
Now, using the micro service approach, potentially there are several different business capabilities there. You might have something that’s about your customer – a customer capability. You might have your transactional, your accounting capability – your account capability. You might then have fulfillment capabilities.
Then you’ve got business processes which cross these boundaries. So how do you actually put tests around these things? There are a number of different ways of doing it. An example would be to have some kind of product-level testing: a product-level flow across these different things which I can execute automatically, which involves having an environment I can deploy all of them into and actually test against.
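A product-level flow like this can be sketched as an automated test with in-process stand-ins for each capability – a minimal sketch only, with hypothetical customer, account, and fulfillment services rather than any real implementation:

```python
# A minimal sketch of product-level testing across capabilities.
# CustomerService, AccountService and FulfillmentService are hypothetical
# in-process stand-ins; a real test would deploy the actual services into
# an environment and drive them over the network.

class CustomerService:
    def __init__(self):
        self.customers = {}

    def create(self, name):
        customer_id = len(self.customers) + 1
        self.customers[customer_id] = name
        return customer_id

class AccountService:
    def __init__(self):
        self.accounts = {}

    def open_for(self, customer_id):
        self.accounts[customer_id] = {"balance": 0}

class FulfillmentService:
    def __init__(self):
        self.sent_cards = []

    def send_card(self, customer_id):
        self.sent_cards.append(customer_id)

def onboard(name, customers, accounts, fulfillment):
    """The business process under test: create a user, open an
    account for them, then send out a bank card."""
    customer_id = customers.create(name)
    accounts.open_for(customer_id)
    fulfillment.send_card(customer_id)
    return customer_id

# The product-level test: execute the whole flow, then check that every
# capability saw its part of the process.
customers, accounts, fulfillment = CustomerService(), AccountService(), FulfillmentService()
cid = onboard("Ada", customers, accounts, fulfillment)
assert cid in accounts.accounts
assert fulfillment.sent_cards == [cid]
```

The point of the sketch is only the shape of the assertion: the test crosses every capability boundary the business process crosses.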
There are also the more advanced organizations, the ones further down the road that are really inventing the new ways of building software – I’m thinking of the Twitters and the Netflixes again, and so on. They’re actually moving towards a really interesting new model where things like performance testing before you go live are becoming a thing of the past.
Now it’s: we get stuff out there and then we test in production. Our tests are actually performed by a small proportion of our users in production – obviously after the actual functionality has been tested by, say, the product team that runs those things.
Johannes Thönes: And hopefully supported by good monitoring.
James Lewis: And supported by excellent monitoring and aggregated logging and all the other things you need to put in place. But there’s this movement towards, as I said earlier, mean time to recovery; the idea of semantic monitoring in production – actually running through user journeys automatically after you’ve deployed into production, and so on. A user journey being the path a user might take, a happy path through, say, our website.
The boundaries just keep being pushed by the big organizations.
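Semantic monitoring can be sketched as replaying a happy-path user journey against production after each deploy. This is a minimal sketch: the paths are made up, and a fake client stands in for real HTTP requests, whose failures would feed alerting:

```python
# A minimal sketch of semantic monitoring: automatically replaying a
# happy-path user journey after deploying to production. The paths and
# the fake client are illustrative only; a real check would issue actual
# HTTP requests and route failures into alerting.

def run_user_journey(client):
    """Walk the happy path a customer would take and report each step."""
    steps = [
        ("home page loads",    lambda: client("GET", "/") == 200),
        ("search returns",     lambda: client("GET", "/search?q=socks") == 200),
        ("checkout reachable", lambda: client("GET", "/checkout") == 200),
    ]
    return {name: check() for name, check in steps}

def fake_client(method, path):
    """Stand-in for real HTTP calls; pretends the checkout service is down."""
    return 503 if path == "/checkout" else 200

results = run_user_journey(fake_client)
failing = [name for name, ok in results.items() if not ok]
assert failing == ["checkout reachable"]  # this is the step you'd alert on
```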
You didn’t ask me about versioning then and maybe we should talk about that as well.
Johannes Thönes: Yeah? Alright. Then talk about versioning.
James Lewis: Because I think it’s closely related to testing. Of course, if you’ve got lots more of these things talking to one another, then making sure they can still talk to one another as you keep deploying them is obviously more difficult than a method call. Similarly, refactoring across these things is more difficult.
Things like versioning become a much bigger problem than they have been in the past. How do you version the contracts you’re exposing to your clients?
Again, there’s a number of different approaches people are taking. So in some cases, people are using the idea of consumer driven contracts and taking that to the next level.
Johannes Thönes: So what is a consumer driven contract?
James Lewis: Again, it’s an old idea. If I’ve got a downstream system I’m integrating with, what I would normally do as a tech lead on a team, as the dude with technical responsibility, is make sure I’ve got a suite of tests exercising that remote service so that I know it’s still meeting its contract with me – so I’m not gonna break.
That’s the kind of thing I’d like to do. I’d like to run them automatically so I know if any changes to that contract happen then I’m aware of it and I can make changes to my own code.
The idea of consumer driven contracts is flipping that on its head. Rather than me as the client of the downstream service running the test, I still write the tests, but I hand them off to the service owners themselves. They run them as part of their own build –
Johannes Thönes: So you basically give them the possibility to run your tests against their system in their pipeline?
James Lewis: Exactly that, yeah. So you give them an executable specification of how you expect as the client their system to behave.
Johannes Thönes: Oh, okay.
James Lewis: Then on the other side, as the maintainers, the developers of the service that’s being used, potentially by multiple clients, I can take the set of these specifications and, as you say, run them as part of my build. Almost, my contract becomes whether I break any of the contracts of my upstream systems, which is a really nice idea. This has been taken further – I think there are at least two Ruby implementations now. One’s called Pact, P-A-C-T – I can’t remember what it stands for – and another one, brilliantly, is called Pacto, because why not.
One was built in Australia and one in Brazil, so I’m not sure what’s going on there.
But people are starting to take this idea a bit further.
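The consumer-driven contract idea can be sketched in a few lines. This is a hand-rolled illustration, not how Pact or Pacto actually work; the "accounts" service, its path and its fields are all hypothetical:

```python
# A hand-rolled sketch of a consumer-driven contract. The consumer writes
# down what it relies on; the provider replays that expectation as part of
# its own build. Real projects would reach for Pact or Pacto instead.

consumer_contract = {
    "request": {"method": "GET", "path": "/accounts/42"},
    # The fields (and types) this particular consumer depends on.
    "response_must_include": {"id": int, "balance": float, "currency": str},
}

def accounts_handler(method, path):
    """Stand-in for the provider's real endpoint."""
    account_id = int(path.rsplit("/", 1)[-1])
    return {"id": account_id, "balance": 100.0, "currency": "GBP", "owner": "sara"}

def verify_contract(contract, handler):
    """Run on the provider's build: replay the consumer's request and check
    that every field the consumer relies on is present with the right type.
    Extra fields (like "owner" above) are fine - the consumer only pins
    down what it actually uses."""
    request = contract["request"]
    response = handler(request["method"], request["path"])
    return [
        field
        for field, expected_type in contract["response_must_include"].items()
        if not isinstance(response.get(field), expected_type)
    ]

assert verify_contract(consumer_contract, accounts_handler) == []  # contract holds
```

If the provider later drops or retypes a field a consumer depends on, this check fails in the provider's pipeline, before anything is deployed.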
Another approach to dealing with versioning – and this is a pretty crazy idea – is that you deal with it by traffic shaping. Now, this isn’t necessarily at the public API boundary – I can talk about that maybe a bit later – but within your organization you’ve got teams structured to look after these micro services, certainly in the example of, say, Netflix or some of the other companies I’ve been working with.
As you’re looking after these services, when you’re making changes within your team boundary, then actually you don’t have to worry so much about versioning because the change horizon is much closer.
I wanna be able to make a change to the contract on a service that is only called from within the team really, really fast because all it involves is me saying, “Hey Dave, hey Sara, I’m gonna make this change. Is that cool?” And they say yes.
So you wanna be able to make really, really quick changes there, but when you start to push out to services that have contracts with other teams, you need to have some kind of level of stability.
Now what, as I say, the Netflixes and so on have been looking at doing, or are doing, is using traffic shaping rather than explicit versioning.
So in this model you have one version of your service that’s running and is “in production” – the current production version. Then you deploy the next version of it into production as well. You have two services in production –
Johannes Thönes: Two different versions.
James Lewis: Two different versions of the services, but actually the newer version is not live really ‘cause it’s not got any traffic. So then what you can do is you do all your testing against it. You do your product level tests. You make sure the product owner’s happy with what’s going on.
Johannes Thönes: Could you maybe try it out with a few early customers?
James Lewis: Exactly – get them involved and so on. Then what you do is start using traffic shaping to actually switch people over, or the teams upstream can choose when to switch.
Of course, now that we’re moving to an age where cloud is ubiquitous, the idea that you can have many versions of the same thing running at the same time is much easier to comprehend and to actually execute on, because we’re not limited by the amount of tin we’ve got in a data center. Eventually we’re limited by the amount that Rackspace or Amazon have got in their data centers, but still – I forget where I was going with that.
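The two-versions-in-production idea can be sketched as a weighted router. This is an in-process toy under assumed names; real systems shape traffic at the load balancer or edge layer:

```python
import random

# A sketch of traffic shaping between two versions of a service running in
# production at once. The router and handlers are hypothetical stand-ins.

def service_v1(request):
    return {"version": "v1", "greeting": "hello"}

def service_v2(request):
    return {"version": "v2", "greeting": "hello", "locale": "en-GB"}

class WeightedRouter:
    """Routes a configurable fraction of traffic to the new version."""

    def __init__(self, old, new, new_weight=0.0):
        self.old, self.new = old, new
        self.new_weight = new_weight  # 0.0 = dark launch, 1.0 = fully switched

    def route(self, request):
        # random.random() returns a float in [0.0, 1.0), so a weight of 0.0
        # never picks the new version and a weight of 1.0 always does.
        handler = self.new if random.random() < self.new_weight else self.old
        return handler(request)

router = WeightedRouter(service_v1, service_v2, new_weight=0.0)
assert router.route({})["version"] == "v1"  # v2 is deployed but dark

router.new_weight = 1.0  # ramp up once product-level tests pass
assert router.route({})["version"] == "v2"
```

In between the two extremes you can run at, say, 0.05 to expose the new version to a small proportion of users, which is the "testing in production" model described above.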
Johannes Thönes: Coming to an end is there any question you would have liked me to ask you or anything you want to just generally say about micro services?
James Lewis: Yeah, I think so. I guess in closing, my comment would be this idea of standing on the shoulders of giants. This is not a completely new phenomenon – it definitely isn’t. I don’t think anyone in the community is claiming that all of these ideas are new.
It’s building on a lot of existing ideas and putting them together in such a way that they act like a force multiplier. Adding all this stuff together into one place as a holistic whole – the ideas of Lean, which we’ve already talked about, and product teams and those kinds of things, ideas from the service oriented world, ideas from the dev ops world, programmable infrastructure. Putting all that together makes, I think, a pretty compelling vision for how you actually start building systems now.
There are a lot of people who’ve been involved in this, and maybe some shout-outs would be in order, if that’s appropriate.
So I mentioned earlier Dan North, who has been very involved with the idea of replaceable component architecture. Fred George. Adrian Cockcroft at Netflix has been talking extensively around this.
There are certainly some big organizations in Australia – my colleague, Evan Bottcher, has been leading some efforts over there.
Here in the UK we’ve been doing a really interesting job transitioning a client from one thing to another. It’s a pretty exciting time, I think, in how we’re learning to build software in this new world where the cloud is ubiquitous.
I think that’s the key game changer when it comes to building applications these days. We haven’t even scratched the surface of how we can build systems.
I think the craziest thing I’ve heard in a long time is that, for example, Netflix don’t do disaster recovery in the traditional sense. How do you do business continuity when the number of machines you’re running is in the tens of thousands? Where’s your backup plan?
Something that would have been anathema five years ago, and is certainly still anathema to most DBAs at the moment, I would imagine. We’re still learning how to do this stuff, but I think some of these ideas have got legs, and hopefully we’ll see some exciting stuff going forward.
Johannes Thönes: That’s great. I wonder where our listeners could find out more about it. I heard you are involved in writing a book.
James Lewis: Yes, that’s right, I am. Well, first off, obviously there’s Martin’s and my article, which is probably worth a look. It’s on martinfowler.com.
Johannes Thönes: A link to everything you say.
James Lewis: Okay. My blog, which I very rarely post to I’m afraid, but does have details of things like talks and tutorials and so on, is at Bovon.org. There’s a slightly long and involved story about nuclear physics there, which I won’t go into just at this time.
As you say, there is a book in the works that I’m looking at at the moment, but in the meantime my colleague Sam Newman has also been heavily involved – in fact, you could almost argue the two of us have been the two-in-a-box behind some of this stuff. He’s got a book; I think it’s in early access now at O’Reilly, and I think he’s hoping to get it out before Christmas. So obviously take a look at that – take a look at the early access version of it.
Apart from that, the Netflix blog is a fantastic resource about understanding the journey they’ve been on and some of the things that actually become possible when you remove the boundaries and really start to focus on building software at scale. That’s pretty exciting stuff.
So there’s probably a bunch more.
Johannes Thönes: Where can people find out more from you? Are you on Twitter?
James Lewis: Yes, of course I’m on Twitter. My handle is @boicy, B-O-I-C-Y. You can get hold of me there.
You can also email me. The address is on Martin’s site at the bottom of the article. We would really like to collaborate with people as we understand a bit more about how to build this way. If you’re interested, please get in touch. I’ll do my best to answer any emails I get.
Johannes Thönes: Any talks, post-October, which is the probable release date of this show?
James Lewis: I think I’ll be – in fact, I am speaking; the reason I say “I think” is because I can’t remember the exact dates – but I’m speaking at GOTO Berlin on this topic. There’s a great track there – a face-off of another model versus micro services – which I think is going to be really exciting. So I’ll be at GOTO Berlin.
Post that, I’m not entirely sure. I think I’ll probably be focusing on the book leading up to Christmas.
Johannes Thönes: Then we look for a micro service happy Christmas. Thank you very much for your participation in the show. It was very interesting.
As James mentioned, you don’t have to type any of this in while listening – I will prepare the show notes. We at SE-Radio would really appreciate feedback on this show, so maybe write a comment on the blog, or in iTunes, for instance, about what you liked most and least about this episode. It remains for me to say goodbye. This was Johannes Thönes.
Male: Thanks for listening to SE-Radio, an educational program brought to you by IEEE Software Magazine.
For more information about the podcast including other episodes, visit our website at SE-Radio.net.
To support us you can advertise SE-Radio by clicking the Digg, Reddit, Delicious or Slashdot buttons on the site, or by talking about us on Facebook, Twitter or your own blog.
If you have feedback specific to an episode, please use the commenting feature on this site so that other listeners can respond to your comments as well.
This and all other episodes of SE-Radio are licensed under the Creative Commons 2.5 license. Please see the website for details.
Thanks again for your support.
[End of Audio]