Transcript 98: Stefan Tilkov on REST

Host(s)

Markus

Guest(s)

Recording venue


In this episode we discuss REST (Representational State Transfer) with Stefan Tilkov. We started out by discussing the 5 steps to REST: IDs, links, standard methods, multiple representations and stateless communication. We then looked at how to use HTTP for REST, and discussed how to use it for Web Services. We then discussed whether and how to use REST for enterprise applications, and not just for apps on the internet. We concluded the discussion with a couple of recommendations.

Transcripts are brought to you by itemis

Transcript

Okay, welcome, listeners to another episode of Software Engineering Radio. We are here at OOP, another conference where we interview people, because doing live interviews with the new gear is much better than doing the remote stuff. This time we have Stefan Tilkov as a guest, and we are going to talk about REST and SOA and also REST for Internet applications I guess. So, Stefan. Welcome to the show.

Thanks for having me here.

So to get us started, why don't you give us a little bit of an overview of what REST is, where it is used, and how it fits into the picture of SOA and all that stuff that everybody talks about?

Okay, to start very briefly, REST first of all is an acronym; it stands for Representational State Transfer, which is not the most accessible of terms. This is mostly due to the fact that the term was coined as part of a dissertation. Roy Fielding, a very well-known guy in the Web community and actually one of the inventors of many of the Web standards, invented the term in his doctoral thesis to describe one of many architectural styles. I don't think we should dive too deeply into all of this because it leads us down, you know, a track where many people will criticize me for using the wrong terms to express that. What's important is that REST is a set of core principles that are independent of any particular technology, so you could apply those principles to any architecture that you build. If you build something purely in Java, you could build it in a RESTful way according to those REST principles. But there is one instantiation of the REST principles that is the best known, which is the Web itself. So HTTP and the associated protocols like URIs are essentially based on those principles. Roy actually wrote the dissertation after the fact, so some people accuse him of doing that, of inventing something a posteriori in effect.

Yeah, of stealing --

Yeah, actually he has been involved from the start in building those protocols and he makes a very convincing story that actually those principles have been there and have been available without that name as the core principles underlying the Web.

Okay, so I just notice that I completely forgot to ask you who you are and to introduce yourself, but I can do this very elegantly; so what do you do in real life? What is your job? What do you do?

So my job, and the reason why I actually deal with this stuff or am interested in this stuff, is that I work with a consulting company and we focus on software architecture. While in the past we focused mostly on single systems, on trying to find the best way to architect mission-critical individual software systems, the focus has now mostly moved towards integration. So you no longer build a single system, you always build a system of systems and every system has interfaces to others. So I am naturally interested in different ways of doing that. I spent a lot of time using CORBA and other technologies before we founded innoQ, which is the company I work for. And when we actually founded innoQ in 1999 we were doing CORBA and Java, and then we did J2EE, and then we did a lot of web services stuff. So for a long time, I used to be a strong believer in web services.

And with web services you mean more or less the XML, SOAP, UDDI, WSDL.

Yes, mainstream SOA technologies. What most people use for building SOAs these days. And I was intrigued by this REST thing which I had heard of, and then I started to look into it. There were very, very few people who actually talked about REST when I started to get to know it. One of the most vocal ones is Mark Baker, who did a tremendous service to the industry by very patiently and very, very consistently advocating the use of REST wherever possible. He tried to educate people, and when he started that, basically everybody thought this guy is nuts, he keeps on ranting about stuff that nobody wants to use in the real world. And now we find that it generates so much interest because it turns out to be a very great thing.

I think Mark recently wrote some blog entry or said something; I read this on Steve Vinoski's blog, who recently also converted to REST, which is a good sign for REST. Anyway, I read something that Mark has somehow gotten demotivated or fed up with all this war of REST vs. the official web service stuff. Before we look at REST as an architectural style and at its principles, let's briefly visit the war and what it is all about and then not mention it any more. Just to get this theological, religious discussion out of the way. What is it that creates this conflict?

There was criticism from the people who built the Web when Web services were invented. People started inventing Web services as a way to do distributed computing over Internet technologies: let's use HTTP because it is deployed everywhere, because the firewalls allow us to put everything through HTTP, let's use that to build the next generation CORBA stacks. Nobody called it CORBA for obvious reasons, the COM (inaudible) CORBA was and stuff like that. People wanted to have the next Distributed Object Technology based on those things. And the Web folks said, look, you haven't really understood the principles behind all of this; listen to us for a while, and first of all understand what it is that you are building upon and build on it in a way that actually conforms to its architecture. This was basically ignored; many people in the Web Standards Groups and many people in the committees that actually standardize Web services did not have a good understanding of what the Web was about. It sounds very arrogant for me to say this, but I fully admit to having belonged to that community myself. I was totally convinced that we need a lot more beyond basic HTTP to build distributed systems. I mean, how could you possibly build a distributed system on just HTTP? You need lots of other stuff on top of that; that is what I believed. So this war was essentially between two groups of people. One of them said, look, we know how to build distributed systems, because we have been doing that for years with CORBA and with DCOM and with other distributed technologies; learn from us and we will help you bring the Web to its next phase, and then we can use the Web for machine-to-machine integration. That was one group. The other group said, "Look, you folks have failed, you failed to build something like the Web on CORBA, all your great theories just don't work. Distributed objects are a failure, they are not a success. When I use my Web browser to connect to some web server I am not using IIOP, I am using a simple text-based, simple Internet-style protocol to do that. So listen to us, we know how to build really scalable Internet-scale systems." So that was the war and that still is the war that is going on. Much of it is based on misunderstandings. Many people see advantages and disadvantages on both sides simply because they are ignorant of the respective features.

As in any war.

As in any war, yes, I fully admit that I am on the REST side now, very, very firmly; so I strongly believe that in most cases REST is the better option and I am hard pressed to find good scenarios for the Web service standards these days.

So then before we discuss the REST stuff in detail, it is important to just say that it is, of course, intended for Internet-style or business systems. I mean if you want to build -- just take the typical Doug Schmidt example -- if you want to build a networked system for a warship or whatever he is always doing, then that's, obviously, not something you would do with REST; but probably also not with web services. So it is really not about building all kinds of distributed real-time systems, it is really about doing the typical distributed business/enterprise application or Internet application.

So I would definitely agree that it is not intended to be used in real time scenarios.

Whenever quality of service assurances are important and that stuff, where you need those, maybe, more sophisticated protocols.

Maybe yes. So I would definitely say both are applicable in scenarios where one of your goals is loose coupling; so where you want to have systems connected, but the world doesn't stop and nobody dies if the system doesn't respond within a second or a millisecond. Obviously, it is not real-time capable. But both styles, I think, are applicable in a lot of scenarios, and I think especially the REST style is mistakenly thought to be only usable in a very few scenarios, and I don't think it is.

Okay, so let's look at the details of REST; you have this nice list of five steps or five building blocks. So why don't we step through those.

So the first one, item zero as I say, is always -- I should actually be careful not to confuse REST and HTTP because those are two things; as I mentioned before, HTTP is just one example and you could build another system that is even more RESTful than HTTP, so one should not confuse the two terms. But for our practical purposes, if we talk about REST in common usage, people are talking about RESTful HTTP, about using HTTP as intended and as described in the REST principles.

And we can maybe later briefly revisit the question what other system could be even more RESTful or what that means?

Right. So of the five things that I think are important, the first is essentially that you give everything an ID: you decide to adopt a single consistent naming convention or identification convention for all of your important things, or resources as they are called in the REST world. On the web this ID is the URI, so essentially you identify things with a URI and everything that merits being identifiable gets its own URI, and that is a very key concept. This doesn't necessarily mean that those things have to be static or that they have to be entities in a database or persistent in any sense; you could have a process step or you could have a salary increase or you could have whatever virtual thing it is that you want to identify later on.
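As a minimal sketch of this first principle, the following Python snippet gives a persistent entity and two "virtual" things a URI under one consistent scheme. The base URI, the resource kinds and the `uri_for` helper are assumptions made up for illustration; nothing in the conversation prescribes this layout.

```python
from urllib.parse import quote, urljoin

# Hypothetical base URI; not taken from the conversation.
BASE = "http://example.com/"


def uri_for(kind: str, ident: str) -> str:
    """Build the URI that identifies one resource of the given kind."""
    return urljoin(BASE, f"{quote(kind)}/{quote(ident, safe='')}")


# Persistent entities and "virtual" things are identified the same way.
customer = uri_for("customers", "1234")
process_step = uri_for("process-steps", "approval")
salary_increase = uri_for("salary-increases", "2008-01")

for uri in (customer, process_step, salary_increase):
    print(uri)
```

The point of the sketch is only that a process step or a salary increase gets an identifier of exactly the same kind as a customer.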

That reminds me a little bit of the early days of object orientation, where people initially started with, like, underline all the nouns in the requirements and you have your objects, and then over time people noticed that there are additional classes and objects that you need to make the system work, and you also include those things as identifiable things.

Exactly, I agree; you have to think about what the right resources are. As you will see, REST is constrained in other respects, so you really have to have more nouns, more things than with other styles. The second principle is that you use those IDs to link things together. That is something that we all know from working with the Web. If I put a URI in my browser and hit enter, I get back some representation of a resource, that is the terminology; I get a web page, for example, that contains links to other web pages, to other resources. So a representation of a resource contains links to others, and linking, you know, the concept of hypermedia, is probably the most important concept. Of course it builds on the fact that you have IDs for things, so that you can link them together, but this link is a very important thing and it is not only important for human consumption. It is not only that a user clicks on a link; you can easily imagine that a program can also follow a link, and you can connect things together in new and specifically unforeseen ways. So I can aggregate things from a number of different locations together. I don't even have to see or be aware of where that link points to. I can aggregate links to resources from different systems, which is only possible because I have adopted a single consistent identification strategy; otherwise, it would not be possible.

Which is the URI

Right

And I guess one of the most, let's say, widespread representations is probably XML, and then you can define a schema where one of the attributes or one of the elements contains a link to one of these other things. So that is the way we would create links.
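To illustrate how a program, rather than a person, might follow such links, here is a small Python sketch that pulls the link targets out of an XML representation. The element names, the `href` attribute and all URIs are invented for this example; they are not a format discussed in the episode.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML representation of an order. Note that the links point
# into two different systems, which works because both use URIs as IDs.
ORDER_XML = """
<order id="http://example.com/orders/4711">
  <customer href="http://example.com/customers/1234"/>
  <item href="http://shop.example.org/products/42" quantity="2"/>
</order>
"""


def extract_links(document: str) -> list[str]:
    """Return every href found in the representation, in document order."""
    root = ET.fromstring(document.strip())
    return [el.get("href") for el in root.iter() if el.get("href") is not None]


links = extract_links(ORDER_XML)
for link in links:
    print(link)
```

A client that receives this representation does not need to know which system hosts the customer or the product; it can simply dereference the URIs it finds.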

Absolutely, I would say that even the most widespread representation form is actually HTML because that is just a --

Really, I mean, okay, well, I mean, of course, for classical end user --

Yeah those end user applications and those machine-to-machine things are not as different as you think they are.

Maybe that is the point for us.

One of the key takeaways of this whole thing is that essentially if you have built a web UI according to REST principles then you are 75-85% of the way; you have already made 85% of the way towards machine-to-machine (inaudible).

But you still need the information probably to be a bit more structured, so your program can reasonably work with it. I mean, if you have just crappy HTML that is completely polluted with layout-specific HTML crap, that is not what you would use. So typically, I guess, if you build it dedicated for a machine-to-machine conversation, XML is probably more suitable.

Yes. Not to drift off too far from this particular principle we are talking about, I think the sweet spot is probably XHTML, where you actually have a good choice: it is XML, so it is parsable by a real XML processor, by a real XML parser, and you get the benefits of having it both human readable and structured enough to put in information.

And you make sure that the layout crap is factored out into CSS, so it is not completely polluted.

Exactly.

So which is good style anyway.

Yes I agree.

Okay number three.

Now you have those links, you have those IDs; now the question is, what can you do with an ID? I mentioned one thing in passing, which is that I can put the URI into the address bar of my browser and hit enter, and this only works because all of those resources, all of those things or objects, if you want, support a standard set of methods. So I can do a GET, an 'HTTP GET', on any resource that is identified by a URI. That is an extremely powerful concept because this 'GET' method, this 'GET' verb, as HTTP calls it, has a certain set of semantics defined for it in the HTTP specification. If the person who implemented this stuff on the server side has followed the specification, I can rely on the 'GET' not having any negative effects for me. An 'HTTP GET' will never oblige me to anything; I will not have to pay for it, I won't get an invoice two weeks later telling me that I did a 'GET' there. 'GET' should be safe, that is the terminology. There are other methods, namely PUT, POST, and DELETE; those are the four core methods in HTTP, and they also have specific semantics assigned to them in the specification. This is maybe one case where we can draw the line between REST and HTTP just to illustrate the point. REST says you have to have a uniform interface, which is the same set of methods for all resources, and HTTP makes that specific and says, the four methods (there are actually a few more, but the core methods) are GET, PUT, POST, and DELETE. So this particular instantiation of REST has these four methods, but it would be absolutely possible to have 10 verbs if you wanted.
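The idea of a uniform interface can be sketched as a tiny in-memory store in Python where every resource, whatever it represents, is handled through the same four operations. The `ResourceStore` class and its URIs are hypothetical, and the sketch deliberately ignores everything else HTTP does (headers, status codes, the network).

```python
class ResourceStore:
    """Toy in-memory store exposing the same four methods for all resources."""

    def __init__(self):
        self._resources = {}
        self._next_id = 1

    def get(self, uri):
        """Safe: returns a representation and changes nothing."""
        return self._resources.get(uri)

    def put(self, uri, representation):
        """Create or replace the resource at a URI chosen by the client."""
        self._resources[uri] = representation

    def post(self, collection_uri, representation):
        """Create a new resource under a collection; the store picks the URI."""
        uri = f"{collection_uri}/{self._next_id}"
        self._next_id += 1
        self._resources[uri] = representation
        return uri

    def delete(self, uri):
        """Remove the resource; deleting something absent is not an error."""
        self._resources.pop(uri, None)


store = ResourceStore()
order_uri = store.post("/orders", {"item": "book"})
store.put("/customers/1234", {"name": "Example Corp"})
print(order_uri, store.get(order_uri))
store.delete(order_uri)
print(store.get(order_uri))
```

Orders and customers are very different things, yet a generic client only needs to know these four operations to interact with either of them.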

And, of course, if you think about URIs as being more or less noun kinds of things, then you probably need things that create, delete, and get those things. So it is a natural fit to have those four, and every other REST interface will probably be more or less similar.

Yes, I think you are right, I have never thought about that as much, but you are probably right, you end up with something like that. There were some things, for example, in an earlier HTTP draft spec there was a verb called PATCH which was actually a partial update, which is a very reasonable thing to have, and a guy called James Snell is at the moment working on reawakening an Internet draft that standardizes PATCH again, so that could be added. The nice thing about those four verbs, or about any sort of constrained verbs, is that you are able to build generic tools, generic clients, and generic services that can do something reasonable with that, or intermediaries that can do something reasonable.

This thing with its limited set of verbs that is then generically applied and semantically reinterpreted for all those different things you have underlined before; that is something I was, or still am, a little skeptical about, whether this is the right approach; but maybe at the end of this discussion I may be convinced.

We'll see.

Right.

So we have three of the five things that we've mentioned. We have the IDs, so we have a URI for everything; we have links to link stuff together; we have the standard set of methods to apply to everything, to every object. The next important point is that you only interact with resources through their representations. So a resource is a concept that you never really see in itself, it is not really platonic. You do a GET and you get back one representation of the resource, and in HTTP you have the concept of content negotiation that enables you to actually ask for different representations. So, for example, if you have a customer that is identified by some URI, I can do a GET and say that I would like to have the text/html representation of the customer and I'll get back an HTML page, which is exactly what my browser does when I put the URI into the address bar; or I can issue a GET using an Accept header that says I want to have application/xml, and I will get back an XML representation. This concept is very important because it allows me to interact with the same resource in different ways using different representations.
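A rough Python sketch of content negotiation for one customer resource follows. The customer data, the two renderers and the `negotiate` function are illustrative assumptions, and the Accept parsing is deliberately simplified: it takes the first listed media type the server supports and ignores q-values and wildcards, which real HTTP content negotiation would honor.

```python
from xml.sax.saxutils import escape

# Hypothetical customer data behind a single URI.
CUSTOMER = {"name": "Example Corp", "city": "Springfield"}


def render_html(c):
    return f"<html><body><h1>{escape(c['name'])}</h1><p>{escape(c['city'])}</p></body></html>"


def render_xml(c):
    return f"<customer><name>{escape(c['name'])}</name><city>{escape(c['city'])}</city></customer>"


RENDERERS = {"text/html": render_html, "application/xml": render_xml}


def negotiate(accept_header: str):
    """Return (media_type, body) for the first supported type, else None."""
    for part in accept_header.split(","):
        media_type = part.split(";")[0].strip().lower()
        if media_type in RENDERERS:
            return media_type, RENDERERS[media_type](CUSTOMER)
    return None  # a real server would answer 406 Not Acceptable here


print(negotiate("text/html"))
print(negotiate("application/xml"))
```

Both calls address the same resource; only the requested representation differs.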

And that probably doesn't just extend to using different, let's say, technical data formats; you could probably also say I want, let's say, the short summary information about something as opposed to the more extensive stuff, or is that kind of going too far?

I would suggest to use different resources for that. We are really entering the realm of design decisions here --

Yes, sure.

...you can model stuff different ways.

What is the best practice?

Yeah, that is maybe one of the things that are only just emerging, and I don't think there has been any agreement on best practices; there is a great book by Sam Ruby and Leonard Richardson called 'RESTful Web Services', that is the one I recommend to everybody.

Oh, we will put it in the show notes.

Some guy made a nice joke at one of the conferences I attended lately, which was that the Roy Fielding dissertation is the Old Testament and 'RESTful Web Services' is the New Testament. I think it was Sanjiva Weerawarana; he was one of the most entertaining critics. So actually the New Testament, or the 'RESTful Web Services' book, has a set of well-known best practices. Whether they are that well known I don't know, but best practices; I am not sure what it says on this aspect. My personal recommendation would be to use something like a sub-resource. This is not really a concept from REST, but URLs can be, they don't have to be, but they can be human readable; they are hackable, as people say. So I would just append an identifier at the end to access the sub-resource.

So you would probably still use this, so assuming, you use like integers to identify the various whatever customers then you would use the same ID for all those variations but you would append something like summary InfoBasic.

Right.

Okay, so that makes sense, because I would not like to change the actual identity identifier, you know, the number, that kind of feels wrong to me.

Well you have to get over your technical knowledge and forget that the ID is just the integer. I mean the ID is the URI and actually if I look at a customer that has an address and a history of purchases and I don't know what, a region he is assigned to as a sales representative the customer itself is a reasonable resource, so I have a URI to identify the customer and I get all of the information, but the customer's address is also a meaningful resource.

And that has a different ID.

Right, and that ID does not necessarily have to have a hierarchical relationship to the first one; it can, because it makes it easier to understand the system, but it is not as if that is any part of REST.
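The sub-resource idea might look like the following Python sketch, in which the customer, its summary and its address are each resources with their own URI. The paths and data are invented for illustration. The hierarchical look of the URIs is only a readability convention here, which is why the customer representation carries explicit links instead of expecting clients to guess the structure.

```python
# Hypothetical URIs and data, for illustration only.
RESOURCES = {
    "/customers/1234": {
        "name": "Example Corp",
        "region": "North",
        "summary": "/customers/1234/summary",
        "address": "/customers/1234/address",
    },
    "/customers/1234/summary": {"name": "Example Corp"},
    "/customers/1234/address": {"street": "1 Main Street", "city": "Springfield"},
}


def get(uri):
    """Look up the representation of whatever resource the URI identifies."""
    return RESOURCES.get(uri)


customer = get("/customers/1234")
# Follow the links from the customer instead of building URIs by hand.
print(get(customer["summary"]))
print(get(customer["address"]))
```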

Sure and there is certainly no technical requirement.

Right, it is a matter of design, yeah, your stomach is probably right, I have got a gut feeling of that.

It is also hungry.

Yeah, so maybe the last thing that is important as a core principle is that you are supposed to communicate statelessly; it's a restriction on statefulness. This is often misunderstood because people think, "Well, if I cannot have state, how can I ever have a database that persists something?" Obviously, it does not mean that you cannot have state stored somewhere; there is, obviously, resource state. It just means that you should not keep any client-specific information in a session. So sessions are totally opposed to REST's core beliefs; you should not do sessions. You should turn session state into resource state.

Okay isn't that in some way a little bit, I mean if you have a highly scalable web application the session state is persisted through database anyway, so what is the difference?

Which exactly proves my point, the thing is, we start --

That is what I am saying, there is state that acts as session state from its meaning, so it does not matter how you implement that.

No, there is a huge difference, because if, for example, I shop at a website and I have my shopping cart which I add things to, in one implementation I can send you a link to that shopping cart because it is resource state, and I can say, look Markus, this is what I am planning to purchase --

It is identifiable again.

Yes, it is identifiable; I can link to it, I can bookmark it, I can send that link to somebody else. In the other implementation it is just unnamed client-specific state. I think the original motivation for having session state was that database actions were perceived to be too expensive. So we ended up putting that stuff in memory, and then we noticed, "Well, okay, now I have it in memory, but now I have this clustered system and I have to make sure that if the client hits another server, it has the state." So I persist it, which is absurd; it shows that the core idea was the right one.
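Turning session state into resource state can be sketched in Python as a shopping cart that gets its own URI when it is created. The `CartStore` class and the URI layout are assumptions for illustration; a real system would keep the carts in shared storage rather than in one process's memory.

```python
import uuid


class CartStore:
    """Carts are named resources, not anonymous per-client session data."""

    def __init__(self):
        self._carts = {}

    def create_cart(self) -> str:
        """Create an empty cart and return the URI that identifies it."""
        uri = f"/carts/{uuid.uuid4()}"
        self._carts[uri] = []
        return uri

    def add_item(self, cart_uri: str, product_uri: str) -> None:
        self._carts[cart_uri].append(product_uri)

    def get_cart(self, cart_uri: str) -> list:
        """Anyone holding the URI (and the right to see it) can read the cart."""
        return list(self._carts[cart_uri])


store = CartStore()
cart_uri = store.create_cart()
store.add_item(cart_uri, "/products/42")

# The URI can be bookmarked or mailed to someone else; no session is needed
# to find the cart again.
print(cart_uri, store.get_cart(cart_uri))
```

Because the cart is identified by a URI instead of a session cookie, any server in a cluster can serve it, and access control can be applied to it like to any other resource.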

That is actually a good point and I have built web applications using this database metaphor myself.

So you may be doing REST, or maybe 60% or 80% of REST; you may have been doing REST for a long time just without being aware of it. If you apply those principles, I think they are just good principles of design. You also see many, many web applications, even web applications intended for humans, that violate these constraints. One example is the 'give things an ID' constraint, the identifiable resources constraint. Especially in the Java space, if you look at many frameworks, they just give you a single URI and tunnel everything through POST. So you never see a change in your address bar, and if you send a link to somebody he will get the homepage because he needs to build the session, which is bad, I mean that --

Well, it depends on what you want to do; as an application designer, you might not want this temporary data to be available to others.

Well okay, if it is not supposed to be available to others then you protect it accordingly using authentication, and that is something that you should do anyway. So I don't think that is a reason for doing that. Another thing that is very important is that those principles are just the core guidelines, and nobody says you have to conform to all of them all of the time. I mean, we are all engineers, and if we are building something it makes sense to relax those constraints if necessary.

So, I guess we successfully stepped through those five building blocks; let's look at why -- well, of course HTTP and the Internet is a good basis for doing REST stuff simply because it has proven that it works. But is there other stuff about the HTTP protocol that's useful in that context? In other words, what are the features in HTTP that you use to build scalable, high-performance applications?

So, first of all you have to be aware that HTTP is a very powerful protocol, it's a very big protocol, it's a complicated one, it's one that you and I would not design over a weekend. So, it took a lot of effort of many people to come up with that thing and it has a lot of features that many people are only superficially aware of. My favorite feature of HTTP is caching. So, caching is very sophisticated in HTTP and it is actually what makes the web so scalable, it actually enables you to retrieve the representation of a resource and in many different ways ensure that you don't retrieve the same information again unless it's necessary. So, for example, you can do a conditional GET, which means if you do a GET on a resource you get back a representation which might feature an ETag, which is a hash value over some internal resource state and you send back the next GET with that ETag in the header. Which says, send me the representation if it's different from this one and the server will reply with an appropriate response code that says "no it hasn't been modified".
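The conditional GET just described can be sketched in Python as follows. In HTTP the client sends the remembered ETag back in the If-None-Match request header and the "not modified" answer is status 304; the function names and the choice of SHA-256 for the tag are assumptions of this sketch, and it compares a single ETag only, whereas the real header may carry a list.

```python
import hashlib
from typing import Optional


def make_etag(representation: bytes) -> str:
    """Derive an ETag as a hash over the representation (one possible choice)."""
    return '"' + hashlib.sha256(representation).hexdigest() + '"'


def conditional_get(representation: bytes, if_none_match: Optional[str] = None):
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    etag = make_etag(representation)
    if if_none_match == etag:
        # The client's copy is still current: send no body at all.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, representation


# First request: full response, the client remembers the ETag.
status, headers, body = conditional_get(b"<customer>v1</customer>")
print(status, len(body))

# Second request with the remembered ETag: nothing has changed.
status2, _, body2 = conditional_get(b"<customer>v1</customer>", headers["ETag"])
print(status2, len(body2))
```

The saving is that an unchanged representation never travels over the wire a second time, which matters when the vast majority of requests are reads.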

How does this relate to the 'Expires' header?

That's also a part of the caching process. We have different ways to control caching in HTTP: you can have time stamps, you can have Expires headers that say for how long it is okay for a cache to hold the representation before it needs to ask again. That's a very, very sophisticated protocol, and on the web maybe 95-98% of all actions are actually read actions, are GETs. So it makes sense to optimize the GET. Maybe in an internal business application that number is not 98%, but it's definitely way over 50. So it's very important to do that. You also have a very rich set of response codes in HTTP, status codes that actually make sense from an application perspective, because they say, for example, this resource has moved, it's now somewhere else; or this resource is gone, which means it was there before, but now it is no longer; or this is temporarily unavailable, this is permanently moved, this is not found, this is in theory available but right now I have an internal server error, or this is okay, or I have accepted this request, I can't answer it right now but I may be able to do so later. So you have a lot of application-level response codes in there. You also have those methods, those application-level methods GET, PUT, POST, and DELETE. Probably the most controversial discussion is whether those are really application methods. They are application methods on a certain level. You could say that the most uniform interface, the most generic interface that you could apply everywhere, would be 'process this'; it is almost like 'do it'. The 'process this' has actually been popularized by Jim Webber and Savas Parastatidis, two very smart guys from the Web services community. They said, we have this style called MEST; it's not REST, it is MEST, I forgot the acronym, message-driven something. It is inspired by REST and says, "I only have this 'process this' and everything is in the message."
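For reference, the response codes described in words above can be looked up with Python's standard `http.HTTPStatus` enumeration. Which numeric code corresponds to which spoken phrase is an editorial reading of the conversation, not something stated in it.

```python
from http import HTTPStatus

# Editorial pairing of the situations mentioned above with status codes.
SITUATIONS = {
    "this is okay": HTTPStatus.OK,
    "accepted, may be answered later": HTTPStatus.ACCEPTED,
    "permanently moved": HTTPStatus.MOVED_PERMANENTLY,
    "not found": HTTPStatus.NOT_FOUND,
    "gone": HTTPStatus.GONE,
    "internal server error": HTTPStatus.INTERNAL_SERVER_ERROR,
    "temporarily unavailable": HTTPStatus.SERVICE_UNAVAILABLE,
}


def status_line(status: HTTPStatus) -> str:
    """Format a status the way it appears in an HTTP response line."""
    return f"{status.value} {status.phrase}"


for situation, status in SITUATIONS.items():
    print(f"{situation}: {status_line(status)}")
```

The distinction between "not found" and "gone" is a small example of application-level meaning: the second tells a client that the resource did exist and was deliberately removed.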

Basically you have a mail box into which you shove messages and then the message says what it should do.

So, that's one option, but the problem with this kind of interface is that it's so generic that it is meaningless; I mean it doesn't say anything; you have essentially made it so generic that all of the meaning is now in the message. REST takes a middle ground. It says, I have some level that is generic across all resources in all applications: there are certain things that are merely reads, those map to a GET; there are certain things that update the resource they are applied to, you map them to PUT; there are certain things that lead to the resource being gone, that is a DELETE. And there are things that rely on POST, which on the one side means create a new resource, so it doesn't necessarily affect the resource it is applied to, it might create a new resource; and POST is also the generic cop-out method: if you can't find anything else you can always (inaudible) POST.

It is a limitation to have only those four verbs, and that means that you either have to misinterpret them in some cases or you have to come up with artificial nouns, artificial things to be able to address them, I mean how do you (inaudible) --

Yes, that's the point we had before. I disagree with that; I would actually try to convince you that you have to think of it in terms of object orientation and inheritance. What you actually do is you derive your application-specific interface from the REST application interface, which means that you have to take care not to violate the semantics associated with those four things. It is the Liskov substitution --

Right, you can specialize.

So you specialize. And it is perfectly fine as long as your resource conforms to the HTTP resource protocol. It is a valid HTTP resource; it may also be a valid "Markus" filter resource, a filter ".org" (inaudible) specific application resource. If I am a client, I can actually be a client on two levels. I can be a generic client, cURL or Wget or a web browser; then I can interact with your resource based on my knowledge of the HTTP specification, and that is also true if I am a cache, for example, or if I am a proxy or if I am the Apache Web Server or any of the other tons of reasonable and very well tested and efficient Web software pieces. If I am a specific client to your application, then I know what I actually do if I invoke one of those methods. So I agree there is a limitation. You have to design differently, you end up with more nouns instead of more verbs, but you gain something -- obviously, you have to have a good reason to accept limitations -- which is that you gain this widespread generic applicability; you actually become part of the web instead of tunneling the stuff through it.
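The inheritance analogy can be made concrete with a short Python sketch: an application-specific resource specializes a generic one without breaking the generic semantics, so a generic intermediary keeps working. All class and function names here are hypothetical, and the "cache" is reduced to a dictionary.

```python
class Resource:
    """Generic uniform interface: all that a generic client may rely on."""

    def __init__(self, representation):
        self._representation = representation

    def get(self):
        """Safe read: must not change the resource."""
        return self._representation

    def put(self, representation):
        """Replace the state of this resource."""
        self._representation = representation


class OrderResource(Resource):
    """Application-specific resource that keeps GET and PUT semantics intact."""

    def get(self):
        # Still a safe read; returns a copy so callers cannot mutate state.
        return dict(self._representation)

    def total(self):
        """Extra meaning that only an order-aware client would use."""
        return sum(l["price"] * l["quantity"] for l in self._representation["lines"])


def generic_cache(resource: Resource, cache: dict, key: str):
    """Generic intermediary: knows the uniform interface and nothing else."""
    if key not in cache:
        cache[key] = resource.get()
    return cache[key]


order = OrderResource({"lines": [{"price": 10, "quantity": 2}]})
cache = {}
print(generic_cache(order, cache, "/orders/4711"))  # works without order knowledge
print(order.total())                                # needs order knowledge
```

The cache can be written once and reused for every resource type, which is the payoff for accepting the constraint.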

Right.

Let me just add to that. That is the one major point that the Web folks have criticized in the Web services approach, because the Web services approach tunnels everything through POST. I mean, that is not true for SOAP 1.2 in theory, although in practice everybody still does it in SOAP 1.2. In practice you also have the concept of an endpoint, which is that you identify your application with an ID and then you have a set of methods in your service that sits at that particular endpoint. So essentially you ignore many of the aspects of HTTP -- I mean, you have to, because Web services are supposed to be protocol-agnostic, so you cannot exploit HTTP; there is simply no way to do that. If they are supposed to be able to run over TCP --

-- or a messaging middleware.

Right, you cannot use any of the features of HTTP to do that. So there is simply no way to consolidate those two worlds.

Anything else about the HTTP protocol that is worth mentioning?

Well, I have to say that it is supported everywhere, absolutely everywhere. Every client, every library, every programming language has support for HTTP. It is extremely well tested. There is some great software; for example, I would say that the Apache HTTP server and the Squid proxy cache are probably light years ahead of any ESB you can buy today in terms of quality, efficiency and maturity. That is software that runs half of the Internet; half the web runs on the Apache HTTP server. So there are many features, probably lots more than I can list. Just to mention some: you have built-in compression capability, so the client and server can negotiate whether or not to compress the stuff. You can chunk data transfers. If you have the bandwidth, there is no problem downloading a 1.2 gigabyte Oracle installation image over HTTP. You do not do that over Web services and you do not do it over CORBA. So there are a lot of things that just work and that have been established over the course of time for HTTP.
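
The compression negotiation mentioned above can be sketched roughly as follows. This is a simplified illustration, not how any particular server implements it: the function name is an assumption, and quality values such as `gzip;q=0` are ignored.

```python
import gzip


def negotiate_and_encode(body: bytes, accept_encoding: str):
    """Gzip the body only if the client's Accept-Encoding header offers gzip.

    Returns the (possibly compressed) payload and the extra response headers.
    Simplification: q-values in the header are stripped and not evaluated.
    """
    offered = {
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
        if token.strip()
    }
    if "gzip" in offered:
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}


body = b"order line item " * 100

# A client that advertises gzip support gets a compressed payload.
compressed, headers = negotiate_and_encode(body, "gzip, deflate")
# A client that advertises nothing gets the body unchanged.
plain, no_headers = negotiate_and_encode(body, "")

print(len(body), len(compressed), headers)
```

The point is that the negotiation happens in standard headers, so any HTTP client and server can take part without knowing anything about the application.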

So let us look at a little sample application or a sample case study and contrast the Web services approach with the REST approach, just to make things clear. I guess the example is the usual order and customer stuff. So why don't you outline the solution that you would build with CORBA, Web services, WS-I, whatever.

Let us just imagine that you want to have a system that handles orders, so you'll typically have an order management interface. It is the order management service, and if you do a CORBA-style interface or the typical mainstream Web services style interface, you'll have multiple operations in this interface, and these operations are specific to the task at hand. You want to build an Order Management System. So you have one method that accepts an order, let us call it 'submitOrder'; as a parameter it gets an order object and it gives you back an ID, which is of course a proprietary ID in the scheme of your Order Management System. It will be 4711; that is the ID you get back if you submit an order. And then you have to pass in that ID if you want to inquire about an order. So you can have a 'getOrderDetails' where you pass in the ID of the order and get back an order object or an order XML block, whatever, it does not really matter. So you have those operations, let us say 'submitOrder', 'getOrderDetails', 'listAllOrders', and 'delete' or 'cancelOrder'; you probably do not want to delete, you want to cancel it, so mark it as deleted or whatever. So that is a typical approach. What it means is that you end up with such an interface for every specific application that you build, which is exactly what it is designed for, and you also typically have some sort of description language, which may be code, in the case of RMI, or it may be an IDL, as in the case of CORBA or DCOM, or it may be some WSDL file that says this is the way I describe it with XML Schema and with the stuff that WSDL adds on top. And if I map that same scenario to a RESTful HTTP approach, I actually end up with more nouns than verbs. So I will have orders, and every particular order will have its own ID, so the ID concept that is proprietary in what I just described will become the mainstream ID concept, which is URIs, in the RESTful HTTP approach. 
I will also find nouns for the collections of orders so that is one of the typical best practices, it is not a part of HTTP or part of REST but it is something that has been established over the course of time. So I will have a collection resource that has all my orders and I have another collection resource maybe for my customers and for other stuff.
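
As a small illustration of "more nouns", the sketch below turns the proprietary order ID into a URI and gives each collection its own URI. The host name and the path layout are assumptions made for this example only.

```python
from urllib.parse import urljoin

BASE = "http://example.com/"  # hypothetical host, not from the interview


def collection_uri(kind: str) -> str:
    """URI of a collection resource, e.g. all orders or all customers."""
    return urljoin(BASE, f"{kind}/")


def item_uri(kind: str, item_id: int) -> str:
    """URI of a single resource inside a collection."""
    return urljoin(collection_uri(kind), str(item_id))


orders = collection_uri("orders")
customers = collection_uri("customers")
order_4711 = item_uri("orders", 4711)

print(orders)      # http://example.com/orders/
print(customers)   # http://example.com/customers/
print(order_4711)  # http://example.com/orders/4711
```

The ID 4711 that only the order management service understood is now part of an identifier that any HTTP client can dereference or link to.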

So this is like the Eric Evans stuff; he has a repository for stuff.

Who does the Domain-Driven Design?

Right and you have the things themselves; you have entities and repositories and both are things, objects, and here in this case they are both resources. You can interact with the collection or with the single thing.

Right, it is a composite pattern in this case, I think. So every collection resource is also a resource that contains resources that might be probably there. But I don't know the DDD stuff that well so I can't really comment. So you have more nouns now. In the case of the first service, the Order Management Service, you would have a single URI if you deploy it as a web service. In this case we now have an unlimited number of URIs because every order and every customer and every other object in your system has its own ID. And now we have the methods. So we have something like 'submitOrder'; how do I 'submitOrder'? If I try to find the matching method in the HTTP application protocol, 'submitOrder' would be a POST, because typically I use a POST to create new resources. The target of that POST, the URI that I post to, should be the containing resource, which in this case is pretty obviously the orders collection. So I post an order to the orders collection, and I get back, in the Location header, the ID of the newly created order.
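
The POST-to-collection interaction described here could look roughly like the following self-contained sketch, built only on the Python standard library. The `/orders` path, the in-memory store, the JSON payload and the starting ID are assumptions for illustration; a real service would differ.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

ORDERS = {}        # in-memory store, illustrative only
NEXT_ID = [4711]   # first ID handed out, echoing the example ID in the interview


class OrderHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Creating an order means POSTing to the containing collection.
        if self.path != "/orders":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        order = json.loads(self.rfile.read(length))
        order_id = NEXT_ID[0]
        NEXT_ID[0] += 1
        ORDERS[order_id] = order
        self.send_response(201)  # 201 Created
        self.send_header("Location", f"/orders/{order_id}")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        # Each order is its own resource with its own URI.
        prefix = "/orders/"
        key = self.path[len(prefix):] if self.path.startswith(prefix) else ""
        if not key.isdigit() or int(key) not in ORDERS:
            self.send_error(404)
            return
        body = json.dumps(ORDERS[int(key)]).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def call(port, method, path, body=None):
    """Send one request on a fresh connection; return (status, Location, body)."""
    conn = HTTPConnection("127.0.0.1", port)
    conn.request(method, path, body=body)
    response = conn.getresponse()
    result = (response.status, response.getheader("Location"), response.read())
    conn.close()
    return result


server = HTTPServer(("127.0.0.1", 0), OrderHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

status, location, _ = call(server.server_port, "POST", "/orders",
                           json.dumps({"item": "book"}))
get_status, _, raw = call(server.server_port, "GET", location)
order = json.loads(raw)

server.shutdown()
server.server_close()
print(status, location, get_status, order)
```

The client never constructs the order URI itself; it simply follows whatever the server returned in the Location header.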

In the example you have here, we are actually reading this from a bunch of slides, cheating here; you also have this hierarchical URI where you express that there is a bunch of orders for a given customer. So the URI would be '/customers/customerID/orders' so that is exactly this way of hierarchically providing more details about things.

Although I have to warn against -- one of the things that many people do when they adopt REST is they become fanatic about URI design, and they spend like 10-hour design sessions coming up with the right way to structure their URIs. Now I love readable URIs, I think that is a great thing, and I love hackable URIs, so if I look at the URI in the browser it is great if I can cut off the last segment and get something meaningful back. They are hackable and that is good; but that is not in any way as important as many people think. It is good to have readable URIs just as it is good to have readable class names and readable variable names, but you shouldn't get too fancy with that. So there is a huge difference between those two approaches. In one scenario I have a service deployed at a single endpoint, and the value that I add to the web is essentially one URI that I can post to, and I can only post to that if I know exactly what application-specific interface is behind that. There is no way for a generic client to make any sense of it. You could build a generic client if you have the WSDL, you could generate some UI that could -- but it is like I try to become part of this great big Web and I just hide everything in this great room behind a door that I can only go through if I have the key for it; that is the model. If I have the RESTful HTTP approach, I add a gazillion new resources to the web that can be linked to; for example, I can send you an e-mail and include a link to an order, I can build a web page, an HTML page that has that

I guess here is my point: if you build a system that is intended to be used on the Internet and to integrate with the Internet, so you can send links to resources using email, then, obviously, the RESTful approach is better for the reasons you outlined. There is no arguing about that. However, say you build a generic distributed system that is maybe not on the Internet, that is used within an organization, maybe not even an enterprise-wide thing; maybe it is just, I don't know, a flight management system which connects info terminals and their data backend or something. Why is it an advantage that you are integrating in some way, shape or form with the web? I don't see an advantage there.

It is a very good question. I would argue that organizations internally more and more resemble the characteristics of the Internet, so they combine in new ways, they restructure. You have mergers and acquisitions, things change all the time, and the business becomes loosely coupled because you cannot rely on the same people being there on the next floor; they might work for a different company, or their work might be offshored to India tomorrow. You have no idea what happens to your company, so it is a good thing to build for loose coupling, as the Internet does. Another thing, and that was a very good point made by Pete Lacey, another well-known REST guy, in a presentation I recently attended: just as with the PC revolution, people become used to certain things in private or external use, and then they carry that into the enterprise. So the fact that I can do a Google search and find information on the Internet might make people wonder why it is so hard for them to find information in their own company. So if everything were exposed as a resource within the company, never mind the Internet, just within the company, and you had a single Google Appliance that can work because everything is exposed as a resource, everything supports a GET and everything returns a readable format, then you could actually do a Google search across all of your company's information systems. You could use Microsoft Excel to pull in an HTML table from a resource and do calculations in Excel; there are lots of benefits that you can gain from that.

So your point is basically that there is no point in distinguishing between an enterprise system and the Internet because both are actually solving the same problem, so it is good to consider the enterprise a web like environment.

Right; to a large degree, I think that is true.

There's security and stuff like that--

There are issues; some things are even easier if you are within a company. You don't have to defend yourself against those denial-of-service attacks. Then again, who knows, maybe that is going to be a problem within the company as well; maybe not as intentional, but you might end up with the same kinds of attempts.

So I think we discussed the building blocks of REST, let's summarize why people should care. Why is it that people should maybe actively try to learn REST or inform themselves about the differences between the two approaches. How do you become informed and why should you?

So first of all, I think that the RESTful approach has proven to be very scalable, very useful and very successful, which is simply shown by the web's success. If you had talked to somebody 15 years ago about the goals of the web, everybody would have told you that you would be crazy to expect this to work, and it works very well. We are all relying on it, and it has really been a disruptive change in the way we use IT systems. So that alone is a good reason. Another thing is that even if you end up building your system-to-system communication using web services, maybe for political reasons, because you have no other option, or maybe because the tooling and the skills of the people you have are just supportive of this and not as supportive of the REST stuff, even if you do that, being aware of the REST principles will make you build better web applications for human consumption, because many people just don't know about these things. They use a GET to change something, which is a bad idea; if I link to that resource I am not really linking to a resource, and if somebody follows that link they will invoke (inaudible).

That is going to get (inaudible) --

Right, something is wrong there. People just tunnel everything through POST because they are not aware of any others. So even if you only do it for that reason, it is a good thing to do that. I do believe that the main argument is that if you look at the history of distributed computing in the last 15 or 20 years you can see that there has always been this goal of this ubiquitous system of this set of principles and technologies that is available everywhere, so that everything can connect to everything else. I don't know, you probably remember those of Orfali & Harkey - Intergalactic web of objects, those CORBA books, do you remember?

Yeah, I do, of course.

That was the first time -- when I read them I loved them. I thought this is a great vision. I can have any object on the planet talk to any other object on the planet and we all know how that turned out with CORBA, it may be because of political reasons; but it still never worked out the way people thought --

Sure, but I mean there are two different problems. One problem is that the fine-grained object stuff simply doesn't scale. So even if people build a system with CORBA, they don't build distributed objects anymore; they build distributed "facades" or whatever you want to call them, typically stateless. So that is an architectural insight that just about everybody knows. That is not specific to REST or anything.

I agree. What I wanted to say, though, is that the vision that those folks had back then, when they were total fans and I had drunk the CORBA Kool-Aid, is actually much more realized in HTTP than in any of CORBA's successors, because I really can now access any resource on the planet. I can call it an object, and actually HTTP has much similarity to object orientation.

Yeah, the nouns.

The nouns, right the nouns, it has the standard set of verbs that's not common in Object Oriented systems but it also relies on the Internet style way of doing protocols, like text-based and very simple, very easy to understand and I think it has actually proven that this is a very good way to build those things. So, I think it's definitely worth a closer look and we have seen great success in building systems.

So, what do you do if you need transactionality, security, and all that stuff? In WS-* land you have all these fancy standards; how do you compensate for not having those things in a REST environment?

So, first of all, let me say that I am guilty of using the term WS-deathstar, which was invented by David Heinemeier Hansson. I have to admit, though, that since I used that a while back, the universe of web services standards has become much more reasonable. Many standards are simply gone now or have been consolidated, so it is much better than it was two years ago. You can definitely see that, but still I think that many of the standards, although pretty good (the WS-Security standards, for example, are not bad at all), are simply not used in practice as much as people like to believe. So, for example, there is a good case to be made for message-based security, which is supported by WSS; but in practice, it is often way too expensive and not worth the effort. So people use SSL; people build web services and use HTTP and SSL and HTTP basic authentication, which is exactly what you would use in the REST-based approach too. So there is not really a difference there. Transactions are another good example. I think transactions in a loosely coupled system are generally a bad idea, especially, of course, if you are considering the atomic kind, the 2PC (two-phase commit) kind of transactions; there are other transaction protocols, compensating transactions and things like that. I am not a big fan of putting that into infrastructure; I think that is application logic. So I have never run into that transaction problem as much, because I think you have to think in terms of a loosely coupled system.
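
For reference, HTTP basic authentication amounts to a single request header. The sketch below builds it by hand; the function name is an assumption, and the credentials are the well-known sample pair from the HTTP authentication specification. Because the value is only encoded, not encrypted, it needs SSL/TLS underneath, which matches the combination described above.

```python
import base64


def basic_auth_header(user: str, password: str) -> dict:
    """Build the Authorization header for HTTP basic authentication.

    The credentials are joined with a colon and Base64-encoded; this is an
    encoding, not encryption, so it should only travel over SSL/TLS.
    """
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


header = basic_auth_header("Aladdin", "open sesame")
print(header)  # {'Authorization': 'Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=='}
```

The same header works whether the request carries a SOAP envelope or a plain resource representation, which is why there is little difference between the two approaches on this point.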

In loosely coupled systems there is not much infrastructure can do, I agree.

So, some people disagree, and I am open to debate here; I may be overpitching there. If you have, for example, a case where you need to integrate existing systems that support 2PC transactions using X/Open XA, you may be better off using something other than REST. There is also a very important point: there are lots of cases where RESTful HTTP is not the best option, but I am doubtful whether the alternative should then be web services.

Right, absolutely.

Sometimes CORBA might be a better solution because we have an existing CORBA system; sometimes JMS, a messaging system, would --

That's what I meant, I think there is no point in trying to play REST against -- I had the extreme example with real time quality of service in CORBA.

Right, okay, I agree now.

Or if you go back to let's say messaging backbone, application infrastructures in a big enterprise, REST is really an alternative for cases where you would use WS-* stuff; that's what you contrast it to --

Yeah, I, now understand you; that's perfectly reasonable.

So, before we wrap up this thing, I would like to ask one thing about tooling; you mentioned it before in passing, there are all those web services tools where you pass in a WSDL file and it generates all those stubs and stuff. So, what kind of tool support is available for REST and maybe the question I should ask before, how much tool support is actually necessary considering that it's Internet protocol based?

One of the nice things that has happened in the last few years in the web services space is that two concerns with regard to tooling that were merged together before have become separated, namely the actual invocation and the data binding. Earlier versions, especially in Java land, had a single approach for that. Now, with JAX-WS, the successor to JAX-RPC, you have the choice of using your own data binding; basically any Java web services 2.0 stack allows you to do the same thing. You can use any data binding and, of course, you can use exactly the same data binding tools with REST.

Yeah, assuming you do XML because most of these data binding tools are XML.

Yes, but you also have data binding tools that allow you to do JSON to objects and -- if you are a fan of data binding and I am not, if you are a fan of data binding you can use the same tools in both worlds, the same things. So, the question is do you need something that generates a stub for you that has 'submit' method; the difference -- it's not that big a difference if you --

You don't have those methods anymore. It's all GET, POST --

Yes, in RESTful HTTP you use the same methods all the time.

It is kind of a generic submission with the action thing --

Right. With that said, I am actually a member of the expert group on JSR-311 in the Java space, which is the 'Java API for RESTful Web Services'. That is an API that is, in typical current Java fashion, annotation driven, and it gives you a very easy way to expose Java objects as RESTful resources. In contrast to many other standards in this space, it has been done by people who really, really understand REST. I am not talking about myself; I am talking about other guys there who really understand it, and we are trying to avoid all the typical mistakes. Even Roy Fielding is involved with this, not actively, but he is following all of that and making sure we are not doing anything dumb there. Now, I am not that big a fan of annotations in Java, so I have my problems with that as well, but still I think it is a good way to enable mainstream programmers using their mainstream programming language to easily build systems that conform to those constraints. So I think the tooling is getting better, and even within the web services space, where you have that much tooling, you often end up building worse systems because of it, so I am skeptical of any sort of tooling anyway.

Okay, so let's wrap up this episode, may be with summarizing a couple of important takeaway points; why don't you just do this. I guess you know what they are.

So, one of the things is that many of the promises of the web services or WS-* space are, I think, questionable, and you should really ask yourself whether these are really things that you need and things that you really gain some benefit from. My favorite example would be protocol independence, which is something that is so leaky -- you are not abstracting away your protocol at all. If you have ever tried to build something that works equally well over JMS and HTTP, you will find that this protocol independence is not really that true.

Yeah, I think that you need to distinguish between abstracting away the protocol as an implementation of an interaction paradigm and trying to abstract away the interaction paradigm, you can't do that.

Exactly, that's the point.

Exactly the point, so it has to --

JMS is so totally different than HTTP, maybe --

But you can abstract away messaging infrastructure, the one you choose and say you have -- you just know it's messaging.

Yeah, I know but if I just know it's messaging I can use JMS, I mean that's an API I could use.

I have customers where they have different languages and they can't use JMS. I don't want to spoil your (inaudible) --

Well, the first thing is that there is a promise associated with web services that I think is questionable, because you can either use the stuff that web services are supposed to encapsulate or you can go down the full RESTful HTTP route. After all, web services are supposed to create a single widely deployed set of protocols that everybody can use. Personally, I would rather use a set of widely deployed protocols that is already there than wait for another one to become available that adds no value for me. I think that is also a Mark Baker quotation: protocol independence is a bug and not a feature.

Yeah, I saw that before.

The next one would be, I think it's perfectly fine to standardize on standards. So many people are afraid to standardize on HTTP, which I don't get. I mean you don't become dependent on any single vendor, you pick the standard and stick with it, it's like wanting to abstract away whether you store your data in an XML database or in a relational database or in flat files. Sometimes you have to make decisions and you shouldn't be afraid to make decisions and you just pick one of them, and it is perfectly fine to do XML over your messaging system of choice. If you have decided to use, I don't know, WebSphere MQ everywhere, which is something that many companies do, why would you want to abstract that away to no benefit, it doesn't really give you much. And maybe the last thing is I think you should really understand the web's architecture even if you later decide not to use it, you should understand it and my obvious recommendation would be you should understand it well enough, so that you could exploit its benefits; for example, all of the scalability things, the interoperability, the caching, chunking, compression - all of that stuff, I think, it really leads to you being able to build better systems.

Okay, very good, so is there anything else you want to say - pearls of wisdom you want to leave with our listeners as the guy on the space show always says.

Pearls of wisdom, I don't know, but maybe one of the things is that I am really happy that I think it's now starting that we are moving beyond this versus that war. It's like this has not been very productive. I think it was necessary for some time to make people aware of REST; now they are, and now I am really happy that we can move on and actually try to come up with those best practices that you mentioned with typical solutions, typical problems and I am really looking forward to the next few years.

Very good then I thank you very much for being on the show.

Thanks for your time.


