A Primer on RESTful Web Services

Roy T. Fielding at RailsConf Europe 2007 (Photo by Patrick Lenz)

REST (Representational State Transfer) is a set of architectural principles defining how a system's resources, and the states of those resources, can be managed over a network using only plain HTTP (Hypertext Transfer Protocol). The concepts behind REST were first introduced in 2000 by Roy Fielding in his doctoral dissertation at the University of California, Irvine. Few realized at the time how influential that dissertation, "Architectural Styles and the Design of Network-based Software Architectures," would prove to be.

In retrospect, it's not entirely surprising that the architectural principles Fielding presented in his dissertation were readily adopted. Fielding is one of the principal authors of the HTTP specification, a contributor to numerous other Internet standards and a co-founder of the Apache HTTP Server project. He has serious credentials, and if he claimed in his dissertation that there might be a better way to organize distributed web applications, then companies like Amazon, Yahoo and Google were prepared to listen.

Fielding's dissertation came at a good time, because there was rampant dissatisfaction in the web services arena. There was certainly room for improvement. The whole web services strategy, which had seemed like a simple way to organize distributed systems, had become increasingly complex. Standards like SOAP and the Web Services Description Language (WSDL) didn't really seem to help.

In many ways, REST was a deliberate return to the basics that made the World Wide Web itself successful.

The web arguably became successful because it was conceptually simple, yet powerful. In its purest form, there were web pages that had addresses (the ubiquitous URL), and users could view them using a web browser. A simple protocol like HTTP was all that web browsers needed in order to retrieve web pages. Furthermore, web pages supported links, which allowed users to traverse relationships between web pages.

With RESTful web services, the conceptual basis is resources, rather than web pages. A resource is anything that is important enough to be referenced as a distinct “thing.” For a company like Amazon, resources could be objects like products, publishers, customers, reviews, etc.

Since REST is really about accessing and manipulating resources, a RESTful approach for web services is also sometimes referred to as a resource-oriented architecture. Like web pages, each resource has a distinct URL from which information about the resource can be retrieved. Unlike web pages, it is possible to perform actions on resources, such as creating them, updating them, deleting them and viewing them. All of the actions that can be reasonably performed on a resource can be defined in a consistent manner using the HTTP standard.

HTTP Support for RESTful Web Services

The HTTP standard defines several different “methods” for performing standard operations on resources:

GET     Retrieve a resource.
POST    Create a new resource.
PUT     Update a resource.
DELETE  Delete a resource.

The GET and POST methods are already widely used by web browsers for retrieving web pages and for posting data from forms. REST goes one step further and stipulates that GET requests should be for viewing resources and should not cause any side effects. For example, a GET request should never cause a record to be created in a database. This was already considered to be a "good practice" in the web design field; adhering to this practice ensures that search engines and web crawlers won't cause harmful side effects by traversing links on web pages.
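As a rough illustration of the four methods, the Python sketch below builds (but does not send) one request per method using the standard library's urllib.request. The helper names build_request and is_safe are made up for this example, and the JSON payloads are an assumption; the URL is the same example.com placeholder used elsewhere in this article.

```python
import json
import urllib.request

BASE_URL = "http://example.com/books"  # placeholder service, not a real API


def build_request(method, url, payload=None):
    """Build (but do not send) an HTTP request for the given method."""
    data = None
    headers = {}
    if payload is not None:
        # Assumption: the service accepts JSON request bodies.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# Of the four methods discussed here, only GET is expected to be free of side effects.
SAFE_METHODS = {"GET"}


def is_safe(request):
    """Return True if the request should not change anything on the server."""
    return request.get_method() in SAFE_METHODS


examples = [
    build_request("GET", f"{BASE_URL}/1"),
    build_request("POST", BASE_URL, {"title": "Example"}),
    build_request("PUT", f"{BASE_URL}/1", {"title": "Example, 2nd ed."}),
    build_request("DELETE", f"{BASE_URL}/1"),
]

for req in examples:
    print(req.get_method(), req.full_url, "safe" if is_safe(req) else "changes state")
```

Sending any of these would be a matter of passing the request to urllib.request.urlopen, which is left out here because the service is hypothetical.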

The PUT and DELETE methods have been part of the HTTP standard for more than twelve years. However, they did not achieve widespread use until recently, largely as a result of the spread of RESTful web services.

RESTful Actions

Think of resources as nouns, where each resource is represented by a clearly defined URL. The verbs are the HTTP methods (GET, POST, PUT and DELETE). The nouns and verbs can be combined into sentences that define actions to be performed on a resource.

URL                          HTTP Method  Action
http://example.com/books     GET          Get a list of books.
http://example.com/books     POST         Create a new book.
http://example.com/books/1   GET          Show data for book 1.
http://example.com/books/1   PUT          Update data for book 1.
http://example.com/books/1   DELETE       Delete book 1.

In the table above, the various URLs allow callers to perform actions on a book or, more specifically, the metadata associated with a book. The primary advantage of this architecture is that it offers a regular and predictable interface for managing most types of resources. For developers, using a RESTful web service is as simple as sending a request to a specific URL using the appropriate HTTP method. HTTP even supports the capability to send arguments with a request.
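The noun-plus-verb idea in the table above can be sketched as a small lookup. This is only an illustration: the function name resolve_action and the action labels ("list", "create", and so on) are invented for this example and are not part of any standard.

```python
from urllib.parse import urlparse


def resolve_action(method, url):
    """Map an HTTP method and URL to an action from the books table.

    Returns an (action, book_id) tuple; book_id is None for the collection URL.
    Raises ValueError for combinations the table does not define.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    if not parts or parts[0] != "books" or len(parts) > 2:
        raise ValueError(f"unknown resource: {url}")

    if len(parts) == 1:
        # The collection URL: /books
        actions = {"GET": "list", "POST": "create"}
        book_id = None
    else:
        # A single resource: /books/<id>
        actions = {"GET": "show", "PUT": "update", "DELETE": "delete"}
        book_id = parts[1]

    if method not in actions:
        raise ValueError(f"{method} is not defined for {url}")
    return actions[method], book_id


print(resolve_action("GET", "http://example.com/books"))
print(resolve_action("PUT", "http://example.com/books/1"))
```

The same URL yields different actions depending on the method, which is what keeps the interface regular: a caller who knows the URL of a resource already knows how to show, update or delete it.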

But there's something even more important about how these requests are defined. The requests are stateless.

Each request is fully self-contained. All information needed for the processing of the request is included in the URL, the HTTP method, the HTTP headers or the content of the request. No knowledge of previous requests is required in order to perform an action.

The fact that the requests are stateless is vital for scalability. Run a load balancer in front of a fleet of computers, each running an instance of the web service. It doesn't matter which computer requests get directed to because the requests are completely self-contained.
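To make the load balancer argument concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the ServiceInstance class, the in-memory BOOKS dictionary standing in for a shared database, the "Bearer demo-token" credential, and the round-robin cycle standing in for a real load balancer.

```python
import itertools

# Shared backing store, standing in for a database every instance can reach.
BOOKS = {"1": {"title": "Example Book"}}


class ServiceInstance:
    """One copy of the web service; it keeps no per-client state between calls."""

    def __init__(self, name):
        self.name = name

    def handle(self, request):
        # Everything needed is carried by the request itself: path and headers.
        if request["headers"].get("Authorization") != "Bearer demo-token":
            return 401, None
        book_id = request["path"].rsplit("/", 1)[-1]
        book = BOOKS.get(book_id)
        return (200, book) if book is not None else (404, None)


instances = [ServiceInstance("a"), ServiceInstance("b"), ServiceInstance("c")]
round_robin = itertools.cycle(instances)  # a stand-in for the load balancer

request = {
    "method": "GET",
    "path": "/books/1",
    "headers": {"Authorization": "Bearer demo-token"},
}

# Send the same request six times; each call lands on a different instance.
responses = [next(round_robin).handle(request) for _ in range(6)]
print(responses)
```

Because no instance remembers anything about earlier calls, every instance gives the same answer to the same request, and instances can be added or removed without coordinating client sessions.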

HTTP Status Codes and HTTP Headers

Each web service request sent via HTTP results in a response being sent back to the caller. At a minimum, the caller receives an HTTP status code that defines the status of the request. The HTTP standard defines numerous status codes that can easily be leveraged by a web service to provide meaningful information to a caller. A few of the most useful status codes are shown below:

Code  Name                   Description
200   OK                     An action was successfully completed.
201   Created                A new resource was successfully created.
400   Bad Request            The server did not understand the request, which may have been malformed.
401   Unauthorized           The request lacked valid authentication credentials.
403   Forbidden              The server understood the request but refuses to carry it out.
404   Not Found              The specified resource was not found.
405   Method Not Allowed     The request was made using an HTTP method not supported by the resource.
422   Unprocessable Entity   Originally defined by the WebDAV standard. Indicates that the resource failed validation (typically on Create or Update).
500   Internal Server Error  A generic error that is returned by the server when something goes wrong.

The status code generally provides enough information for a caller to determine whether a RESTful web service request succeeded or, if it failed, to gain some idea of why the request failed. In most cases, that's all the caller needs.
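A caller can act on the status code alone, as the following Python sketch suggests. It relies on the standard library's http.HTTPStatus enumeration for the official names; the classify and describe helpers are hypothetical names chosen for this example.

```python
from http import HTTPStatus


def classify(code):
    """Group a status code by its range, which is often all a caller needs."""
    if 200 <= code < 300:
        return "success"
    if 400 <= code < 500:
        return "client error"
    if 500 <= code < 600:
        return "server error"
    return "other"


def describe(code):
    """Combine the code, its standard name and its class into one line."""
    return f"{code} {HTTPStatus(code).phrase}: {classify(code)}"


for code in (200, 201, 400, 404, 405, 500):
    print(describe(code))
```

A client would typically retry or report a server error, but fix its own request before resending after a client error.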

The Create action is the exception. When a Create action is performed, the caller has no way to know the unique URL of the newly created resource. Accordingly, it has become commonly accepted practice in RESTful web services to return the URL of the new resource to the caller via the "Location" HTTP header. For consistency, some web services do this for both the Create and Update actions.
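The Create convention might look like the following on the server side. This is a sketch under assumptions, not a real framework: create_book, the in-memory store, the "title is required" validation rule and the tuple-shaped response are all invented for illustration.

```python
import itertools

BASE_URL = "http://example.com/books"  # placeholder service URL

_books = {}                    # in-memory store standing in for a database
_next_id = itertools.count(1)  # simple ID generator for new books


def create_book(data):
    """Handle a Create action and return (status, headers, body)."""
    if not data.get("title"):
        # Validation failed: no resource was created, so no Location header.
        return 422, {}, {"errors": ["title is required"]}

    book_id = next(_next_id)
    _books[book_id] = dict(data)
    # Tell the caller where the new resource lives.
    headers = {"Location": f"{BASE_URL}/{book_id}"}
    return 201, headers, _books[book_id]


status, headers, body = create_book({"title": "RESTful Design"})
print(status, headers["Location"])
```

The caller reads the Location header from the 201 response and uses that URL for any later GET, PUT or DELETE on the new book.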

Representation Formats

Both the request and the response may optionally include content. For example, when creating a new book resource, the request will need to include information about the book. Most web services accept such content in easy-to-parse formats such as form-encoded parameters (the format for data that is POSTed from a traditional HTML form), JSON and XML. That data format is referred to as the representation format of the request.

Likewise, most web services respond to a request by sending back content in easy-to-parse formats such as JSON, plain XML or XML-based formats such as Atom, RSS or XHTML. That data format is referred to as the representation format of the response. Callers can parse the content provided in the response and reuse it for their own purposes.

Many web services allow users to select from a choice of representation formats for the request and the response. For example, a caller might send a request that included JSON content and receive a response that included RSS content.
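One common way to offer that choice is to key the parsing and rendering on the Content-Type and Accept headers. The Python sketch below assumes exactly that; parse_body and render_body are hypothetical helpers, the flat book structure is made up, and plain XML is used for the response in place of RSS to keep the example short.

```python
import json
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs


def parse_body(content_type, body):
    """Decode a request body into a flat dict based on its Content-Type."""
    if content_type == "application/json":
        return json.loads(body)
    if content_type == "application/x-www-form-urlencoded":
        # parse_qs returns a list per key; keep the first value of each.
        return {key: values[0] for key, values in parse_qs(body).items()}
    if content_type == "application/xml":
        return {child.tag: child.text for child in ET.fromstring(body)}
    raise ValueError(f"unsupported request format: {content_type}")


def render_body(accept, book):
    """Encode a flat dict in the representation format the caller asked for."""
    if accept == "application/json":
        return json.dumps(book)
    if accept == "application/xml":
        root = ET.Element("book")
        for key, value in book.items():
            ET.SubElement(root, key).text = str(value)
        return ET.tostring(root, encoding="unicode")
    raise ValueError(f"unsupported response format: {accept}")


# JSON goes in, XML comes out.
book = parse_body("application/json", '{"title": "REST Basics", "year": "2010"}')
print(render_body("application/xml", book))
```

The resource itself is the same in every case; only its representation on the wire changes.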

REST vs. SOAP

REST and SOAP represent two very different strategies for implementing web services. A RESTful architecture makes web services as conceptually simple as possible, with a standard and consistent approach for representing actions on resources. REST also makes no assumptions about the representation formats, such as XML or RSS, used for content that is sent back to callers.

SOAP is more elaborate than REST in many ways. There are numerous official standards layered on top of each other, providing features for remote procedure calls, security, etc. There is no common way of representing actions on resources. Instead, there is a standard approach for representing any type of content in XML in a generic fashion. With SOAP interfaces, the strategy is to have libraries that automatically parse complex SOAP-related XML and create objects based on the content, which can then be interrogated to retrieve data elements as needed.

In simplistic terms, REST focuses on providing a standard interface for resources, while SOAP focuses on providing a standard and generic representation for content. With REST, developers may have to create custom parse logic for content received from a RESTful web service. With SOAP, content can be parsed automatically and made available to the caller. In practice, REST representation formats tend to be much simpler than SOAP-related XML, so custom parsing is not typically a problem.
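The difference in payload complexity can be seen by reading the same field from each style of response. The sketch below is illustrative only: both payloads are invented, the GetBookResponse element and its http://example.com/books namespace are assumptions, and in practice a SOAP library would normally handle the envelope rather than hand-written code.

```python
import json
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
BOOK_NS = "http://example.com/books"                   # hypothetical service namespace

soap_response = f"""<soap:Envelope xmlns:soap="{SOAP_NS}">
  <soap:Body>
    <GetBookResponse xmlns="{BOOK_NS}">
      <Title>REST Basics</Title>
    </GetBookResponse>
  </soap:Body>
</soap:Envelope>"""

rest_response = '{"title": "REST Basics"}'


def title_from_soap(xml_text):
    """Dig the title out of the SOAP envelope, resolving both namespaces."""
    namespaces = {"soap": SOAP_NS, "b": BOOK_NS}
    root = ET.fromstring(xml_text)
    return root.find("soap:Body/b:GetBookResponse/b:Title", namespaces).text


def title_from_rest(json_text):
    """The custom parse logic for the RESTful response is a single lookup."""
    return json.loads(json_text)["title"]


print(title_from_soap(soap_response))
print(title_from_rest(rest_response))
```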

In general, RESTful web services are usually simpler and easier to use than SOAP interfaces. Conceptually, RESTful web services typically seem more object-oriented to end users, which enhances their perceived ease-of-use.

Additionally, SOAP interfaces often do not work well in cross-language environments. For example, SOAP clients written in languages such as Perl, Ruby and Python may not work properly with SOAP web services. RESTful interfaces are usually less complex than SOAP interfaces, so language incompatibilities are less of a problem.

Most importantly, implementing SOAP interfaces can be much more time-consuming than implementing RESTful interfaces, particularly in languages like Java.

Comments

David Keener By dkeener on Friday, September 10, 2010 at 07:43 AM EST

It's amazing to me the number of times that I've ended up incorporating parts of this article into government proposals.

