EPIC Web Service API Definition

Copyright: © 2009-2011 SARA Computing and Networking Services
  License: Creative Commons Attribution - Share Alike 3.0 Unported
   Status: Draft
  Version: 0.7
  Authors: Pieter van Beek, SARA
           Eric Auer, MPI Nijmegen
           Hennie Brugman, Meertens Instituut

Abstract

This document proposes a common interface for RESTful web service implementations (simply called "the interface" or "the API" hereafter) built around the Handle System. TODO @Hennie: complete this abstract. -PieterB

Introduction

TODO: @Hennie: introduction -PieterB

The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in RFC2119.

This document describes a RESTful web service, using the HTTP/1.1 application protocol. The API strictly adheres to the rules of safety and idempotence laid out in §9.1 of RFC2616: GET is guaranteed to be both safe and idempotent; PUT and DELETE are idempotent, but not safe; POST is neither safe nor idempotent. Extension "POE" offers a method to circumvent the non-idempotence of the POST method.

Glossary

For clarity and brevity, some terms in this document have a very specific meaning:

the API: the Application Programming Interface laid out in this document
server, implementation: an implementation of the API
implementor: an entity that builds an implementation of the API
(service) provider: an organisation or person that operates a server as a service
client: a piece of software that interacts with a server using the API
user: an organisation or person that operates a client

Representation

JSON was chosen as the primary representation of resources; all servers MUST be able to produce and consume it. Servers MAY be able to produce and consume additional representations, like XML or DER-encoded ASN.1. In order to promote interoperability, implementors SHOULD publish additional representations in extension documents as explained below. This document defines one additional representation: XHTML.

In order to promote consistency between representations and implementations, this paragraph contains three sections:

  1. Atomic Types describes a set of simple atomic types that can be composed into more complex data structures.

  2. JSON representations describes the way in which the atomic types are to be represented in JSON.

  3. Abstract Data Model describes the complete data model of the service, in terms of atomic types.

Abstract Types

The following abstract types are used in the remainder of this document:

object: an unordered set of string→value-pairs with unique strings.
value: an object, list, string, blob or number.
list: an unordered list of values.
string: a sequence of zero or more Unicode characters.
blob: a sequence of octets.
number: a signed integer. Servers SHOULD support arbitrarily large integers. Servers MUST support 64-bit signed integers from -2^63 to 2^63-1 inclusive.

Extensions defining additional abstract types MUST specify how these types are represented in the various representational formats defined in this document and in registered extensions. Or should extensions just restrict themselves to these abstract types? --PieterB 2012/02/20

JSON Representation

This API uses JSON as the primary exchange format. All implementations MUST be able to produce and consume JSON.

Representation of the atomic types is as follows:

Atomic type: Representation in JSON

string: JSON string:
"¡Holá!"

blob: JSON string, containing the octet stream in base64-encoded form:
"SGVsbG8gd29ybGQh"

number: JSON number:
-42

list: JSON array:
[ "¡Holá!", "SGVsbG8gd29ybGQh", -42 ]

object: JSON object:
{
  "Grüße" : "¡Holá!",
  "data?" : "SGVsbG8gd29ybGQh",
  "number": -42,
  "list"  : [ "¡Holá!", "SGVsbG8gd29ybGQh", -42 ]
}

collection: JSON object. The index keys are pct-encoded if required, so they conform to the form segment-nz as defined in section 3.3 of RFC3986:
{
  "Gr%C3%BC%C3%9Fe": "¡Holá!",
  "data%3F"        : "SGVsbG8gd29ybGQh"
}
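The mapping above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names to_jsonable and collection_to_jsonable and the constant SEGMENT_SAFE are hypothetical and not part of the API, and Python's bytes, str, int, list and dict stand in for blob, string, number, list and object.

```python
import base64
from urllib.parse import quote

# Characters allowed unescaped in a path segment besides letters, digits
# and "-._~" (which quote() never escapes); see section 3.3 of RFC3986.
SEGMENT_SAFE = "!$&'()*+,;=:@"


def to_jsonable(value):
    """Map an abstract value onto a structure that json.dumps() accepts."""
    if isinstance(value, (bytes, bytearray)):
        # blob: the octet stream, base64-encoded, inside a JSON string
        return base64.b64encode(bytes(value)).decode("ascii")
    if isinstance(value, bool):
        # bool is a subclass of int in Python, but is not an abstract type here
        raise TypeError("booleans are not among the abstract types")
    if isinstance(value, (str, int)):
        return value  # string and number map directly
    if isinstance(value, (list, tuple)):
        return [to_jsonable(item) for item in value]
    if isinstance(value, dict):
        return {key: to_jsonable(item) for key, item in value.items()}
    raise TypeError(f"unsupported type: {type(value).__name__}")


def collection_to_jsonable(collection):
    """Like an object, but with pct-encoded index keys (hypothetical helper)."""
    return {
        quote(key, safe=SEGMENT_SAFE): to_jsonable(item)
        for key, item in collection.items()
    }
```

Passing the result to json.dumps(..., ensure_ascii=False) would yield output in the style of the examples above.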

For compatibility with a broad range of clients, implementations are encouraged to support the unofficial MIME-type application/x-json as an equivalent alternative to the officially IANA-registered MIME-type application/json.

XHTML Representation

Implementations are encouraged to produce XHTML 1.0.

Representation of the atomic types is as follows:

Atomic type: Representation in XHTML

string: XML PCDATA or CDATA:
Dr. Jekyll &amp; Mr. Hyde
<![CDATA[ if (a < b && a < 0) a = b; ]]>

blob: XML PCDATA or CDATA, containing the octet stream in base64-encoded form:
SGVsbG8gd29ybGQh

number: XML PCDATA or CDATA with a text representation of the number:
-42

list: An XHTML unordered list:
<ul>
  <li>¡Holá!</li>
  <li>SGVsbG8gd29ybGQh</li>
  <li>-42</li>
</ul>

list of objects with equal key-sets: XHTML table:
TODO

object: TODO

collection of objects: TODO

collection of collections: TODO

Abstract Data Model

collection: an unordered set of string→resource-pairs, with unique strings. See also the section on Resource Collections below.
resource: a collection or a value.

Everything in this paragraph follows directly from RFC3651. It's only mentioned here for clarity, consistency and brevity in the remainder of this document.

From §2 of RFC3651

naming-authority: a string consisting of dot-separated substrings of any Unicode character except dot '.' or slash '/'.

local-name: a string of Unicode characters.

handle: a string consisting of a naming-authority, a slash “/”, and a local-name. See §2 of RFC3651 for the syntax and semantics of a handle and its parts.

Recent versions of the Handle System provide the possibility of Template Handles. Naming Authorities that use Template Handles must define a Template Delimiter Character for their namespace, which divides handles into a base part and an extension part. CNRI suggests the use of the at-sign "@" as Template Delimiter Character. TODO This may no longer be accurate --Pieter van Beek 2012-02-09

From §3.1 of RFC3651

handle-value-set: a collection with the following members:

Extensions may define additional members.

value reference: a pointer to a Handle Value: a string consisting of the string representation of a non-negative number, a ':' character, and a handle. See §3 of RFC3651 for more information about value references.
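As an illustration of the definitions of handle and value reference, the following Python sketch splits both into their parts. The function names are hypothetical, and the sketch assumes that the first slash separates the naming-authority from the local-name (the naming-authority cannot contain a slash) and that the first colon ends the number. It does not perform the full validation of RFC3651.

```python
def parse_handle(handle):
    """Split a handle into (naming-authority, local-name)."""
    naming_authority, slash, local_name = handle.partition("/")
    if not slash or not naming_authority:
        raise ValueError(f"not a handle: {handle!r}")
    return naming_authority, local_name


def parse_value_reference(reference):
    """Split a value reference into (number, (naming-authority, local-name))."""
    number, colon, handle = reference.partition(":")
    # Restrict to ASCII digits so that int() below cannot fail.
    if not colon or not (number.isascii() and number.isdigit()):
        raise ValueError(f"not a value reference: {reference!r}")
    return int(number), parse_handle(handle)
```

For example, parse_value_reference("300:10.1000/abc/def") yields the number 300 and the handle parts "10.1000" and "abc/def".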

handle value

a collection with at least the following members:

Extensions may define additional members.

The Handle System comes with its own authorization scheme. Services which do not respect this scheme SHOULD NOT relay any Handle Values or other information related to this scheme, such as HS_ADMIN values or the bit-mask described in §3.1 of RFC3651. Services which do respect the native Handle System authorization scheme SHOULD implement Extension 1.

Multistatus

Something about multipart/mixed for returning multiple statuses. -Pieter van Beek 10/25/11 11:30 AM

Something about batch operations, what they are and are not intended for, alternatives, use cases -Pieter van Beek 2/3/11 4:28 PM

This data type is only of importance to clients and servers that wish to use batch-wise operations which may affect multiple resources. In response to an HTTP-request that triggers such an operation, the server MUST respond with HTTP/1.1 207 Multi-Status, and return a representation of a multistatus object, explaining which resources were affected and/or which errors occurred. If an operation is defined as being atomic, and errors occur for some URIs targeted by the request, then the operation must fail entirely. Resources which failed to be affected because other resources failed to be affected within the same atomic request MUST fail with status HTTP/1.1 424 Failed Dependency as defined in RFC4918.

A multistatus is a list of collections with the following members:

Those familiar with WebDAV will recognise the structure of an XML -element as described in §13 of RFC4918.

Example

The client submits the following batch request, which should affect multiple resources:

POST /NAs/10/handles/ HTTP/1.1
Host: example.com
Content-Type: application/json
...

[ { "handle" : "handleOne",
    "values/": { "1": { "type": "URL",
                        "data": "http://www.example.com" } } },
  { "handle" : "handleTwo",
    "values/": { "1": { "type": "URL",
                        "data": "http://mail.example.com" } } } ]

The client wants to affect two handles, with local names "handleOne" and "handleTwo" respectively. The server supports atomic batch operations, and responds as follows:

HTTP/1.1 207 Multi-Status
Content-Type: application/json
...

[ { "href"  : "handleOne",
    "status": 403 },       /* HTTP/1.1 403 Forbidden */
  { "href"  : "handleTwo",
    "status": 424 } ]      /* HTTP/1.1 424 Failed Dependency */

For some reason, the client didn't have permission to create/update the value set at /NAs/10/handles/handleOne. As a result, the resource at /NAs/10/handles/handleTwo was unaffected as well, because the operation was defined as atomic.
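The atomic behaviour in this example can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the function name atomic_multistatus is hypothetical, and it assumes the server has already determined, for each targeted resource, the status that the sub-operation would have produced on its own.

```python
def atomic_multistatus(results):
    """Build a multistatus list for an atomic batch operation.

    `results` is a list of (href, status) pairs, one per targeted resource,
    holding the status each sub-operation would have produced in isolation.
    """
    any_failed = any(status >= 400 for _, status in results)
    multistatus = []
    for href, status in results:
        if any_failed and status < 400:
            # This resource could have been affected, but another one
            # failed, so the whole atomic operation fails.
            status = 424  # Failed Dependency (RFC4918)
        multistatus.append({"href": href, "status": status})
    return multistatus
```

Called with [("handleOne", 403), ("handleTwo", 201)], the sketch produces the two entries shown in the response above.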

Core API, Extensions and Representations

This document aims at defining an API that can be widely adopted. This implies fulfilling some conflicting demands. The API has to be complete enough to be usable by a wide variety of users, but also simple to implement, so that multiple implementations can and will coexist. The API must be extensible, allowing features to vary between implementations, serving particular user groups. Still, the API must guarantee a level of uniformity which allows easy migration between implementations.

To fulfill these demands, this document merely defines a Core API, which explicitly leaves room for functional extensions and extra representations.

The Core API is the part of the API that all providers must implement. In other words, clients which restrict themselves to the Core API are guaranteed to be interoperable between implementations.

Extensions allow for extra, optional functionality. Our hope is that implementors will collaboratively define such API Extensions whenever they —otherwise independently— implement similar bits of additional functionality, thus enhancing compatibility between service providers.

Representations are extra resource representations supported by the service, in addition to the obligatory JSON representation. Servers may support these representations in HTTP/1.1 request bodies, response bodies, or both. For example, while JSON is commonly used in both request bodies and response bodies, the application/xhtml+xml data format is normally only used in response bodies, while the application/x-www-form-urlencoded format is normally only used in request bodies.

Core API

URI Space Overview

The service roughly consists of the following URI space:

«root»/
┣━discovery/
┗━NAs/
  ┗━«NAsegment»/
    ┣━handles/
    ┃ ┗━«LNsegment»/
    ┃   ┗━...
    ┣━profiles/
    ┃ ┗━«profile»
    ┣━status/
    ┃ ┗━«id»
    ┗━templator

The text below specifies the GET, PUT, POST and DELETE methods available per URI.

Resource Collections

Within the service's hierarchical URI space, many resources are collections of other resources, like a directory or folder. Such a container resource is represented by an unordered set of URIref→value-pairs called a collection.

More explanation, more examples -Pieter van Beek 2/3/11 4:41 PM
Mention the slash as separator -Pieter van Beek 10/25/11 11:00 AM

If the URIref in such a URIref→value-pair is a relative-ref as defined in §4.2 of [RFC3986], then it is relative (with decreasing precedence as per §14.14 of [RFC2616] and §5.1 of [RFC3986]) to:

Request Depth

When performing a GET request on a resource collection, the user may specify an optional Depth: request header, specified in §10.2 of [RFC4918]. The value of this header is interpreted as follows:

  * "0": unused
  * "1": return the collection, with the value in each URIref→value-pair reduced to a "display name"-string, for example an unescaped version of the referring URIref.
  * "infinity": return the collection, including all child collections, recursively, ad infinitum.

For all URIs, the default request depth is "1", unless otherwise specified. Servers NEED NOT support all request depths on all URIs. In particular, servers need not support Depth: infinity on high-level collections, as this may generate a very large response.
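The effect of the Depth: header on a collection can be sketched as follows. The function name render_collection is hypothetical, and the sketch makes two assumptions beyond the text: collections are held in memory as nested Python dicts in which every nested dict is a child collection, and the display name is the unescaped URIref, which is only one of the options the text allows.

```python
from urllib.parse import unquote


def render_collection(collection, depth="1"):
    """Return the view of `collection` for the given Depth: header value."""
    if depth == "1":
        # Reduce every value to a display name, here the unescaped URIref.
        return {uriref: unquote(uriref) for uriref in collection}
    if depth == "infinity":
        # Include all child collections, recursively.
        return {
            uriref: render_collection(value, "infinity")
            if isinstance(value, dict)
            else value
            for uriref, value in collection.items()
        }
    # "0" is unused; a real server would answer with an HTTP error here.
    raise ValueError(f"unsupported Depth: {depth!r}")
```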

Trailing slashes

Container resources have a trailing slash at the end of their (canonical) URIs. If the client accidentally omits this trailing slash, the server MUST do one of the following:

NamingAuthorities and Suffixes in Path Segments

The URI space of this web service uses Handle NamingAuthorities and suffixes as URI path segments. According to [RFC3651], these NamingAuthorities and suffixes consist of UTF-8 encoded strings of printable Unicode characters, while [RFC2396] and its successor [RFC3986] allow only the following ASCII characters in a path segment:

"A"-"Z" | "a"-"z" | "0"-"9" |
"-" | "." | "_" | "~" | "!" | "$" | "'" | "*" | "&" |
"(" | ")" | ":" | "+" | "=" | "," | ";" | "@"

Therefore, all other octets must be percent-encoded as explained in §2.1 of [RFC3986]. For example, a space character (ASCII character 32 in decimal notation, or 20 in hexadecimal notation) must be encoded as "%20" in path segments.

For maximum compatibility, clients SHOULD also escape the ";" character, because it has a special meaning in the (now obsolete) [RFC2396]. Servers SHOULD allow unescaped ";" characters.
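A client could build path segments along these lines. This is a sketch using Python's standard urllib.parse.quote; the names encode_segment and handle_uri are hypothetical. It assumes that a whole naming authority and a whole local name each map to a single path segment, so any slash inside them is percent-encoded. The ";" character is deliberately left out of the safe set, following the recommendation above.

```python
from urllib.parse import quote

# Allowed unescaped in addition to letters, digits and "-._~", which quote()
# never escapes. ";" is omitted on purpose so that it gets escaped.
SAFE_IN_SEGMENT = "!$&'()*+,=:@"


def encode_segment(text):
    """Percent-encode the UTF-8 octets of `text` for use as one path segment."""
    return quote(text, safe=SAFE_IN_SEGMENT)


def handle_uri(root, naming_authority, local_name):
    """Build the URI of a handle below `root` (which ends in a slash)."""
    return (
        root
        + "NAs/" + encode_segment(naming_authority)
        + "/handles/" + encode_segment(local_name) + "/"
    )
```

For example, encode_segment("my handle;v1") gives "my%20handle%3Bv1".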

«root»/

All URIs share some root, determined by the service provider, e.g. https://example.com/epic_web_service/

  * GET SHOULD return a collection.

Explanation, example. Perhaps all collections -Pieter van Beek 2/3/11 4:41 PM

«root»/NAs/

«root»/NAs/«NAsegment»/

«root»/NAs/«NAsegment»/handles/

«root»/NAs/«NAsegment»/handles/«LNsegment»/

This URI points to a Handle, which is of type collection. Each member of this collection has its own URIref, and therefore its own URI within the service's namespace. This document doesn't describe these URIs in further detail. In general, all these URIs SHOULD support the GET, PUT and DELETE methods.

«root»/NAs/«NAsegment»/status/

The Core API itself doesn't include any asynchronous operations, but it does provide a framework for such operations, which can be used by extensions.

Whenever a request cannot be handled synchronously, the server MUST respond with an HTTP/1.1 202 Accepted status response, create a new "status resource", and return the URI of this resource in the Location: response header. The .../status/ resource is intended as the container for such status resources.

Since a new status resource is created for each asynchronous request, this request is neither safe nor idempotent. This means that methods GET, PUT and DELETE are excluded from asynchronous handling. Therefore, only POST requests MAY be handled asynchronously.
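The framework can be sketched on the server side as follows. This is a minimal sketch under stated assumptions: the class StatusRegistry, the use of a counter for «id», and the contents of a status resource ("state" and "request") are all hypothetical, since this document does not define them.

```python
import itertools


class StatusRegistry:
    """Keeps the status resources below one .../status/ container."""

    def __init__(self, status_root):
        self.status_root = status_root  # URI of the status/ container
        self._ids = itertools.count(1)  # hypothetical id scheme
        self.resources = {}

    def accept(self, method, description):
        """Register an asynchronous request; return (status, headers)."""
        if method != "POST":
            # GET, PUT and DELETE are excluded from asynchronous handling.
            raise ValueError("only POST requests may be handled asynchronously")
        status_id = str(next(self._ids))
        # Hypothetical representation of a status resource.
        self.resources[status_id] = {"state": "pending", "request": description}
        return 202, {"Location": self.status_root + status_id}
```

A client receiving the 202 Accepted response would then retrieve the URI from the Location: header to learn the outcome of the operation.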

«root»/NAs/«NAsegment»/status/«id»

A status resource, resulting from an asynchronous operation.

Extensions

Method Spoofing

The service allows users to use the HTTP/1.1 POST method instead of all other HTTP/1.1 methods, by specifying a _method query parameter in the request URI. Method spoofing is commonly used in the following cases:

  1. To perform HTTP/1.1 GET requests where the total length of all query parameters is too long to fit into a URI. Although HTTP/1.1 itself imposes no limit on the length of a URI, in practice many clients and servers do, often as small as 64k bytes.
  2. To perform any method, other than GET or POST, from within a browser. Unfortunately, most modern browsers only support the HTTP/1.1 GET and POST methods. So in order to DELETE a resource from within a browser (which is a perfectly reasonable use case), the request will have to be spoofed.
  3. To perform any method, other than GET or POST, from behind a firewall that only allows GET and POST requests.

Examples

The following two HTTP/1.1 requests are semantically identical:

DELETE /some_resource HTTP/1.1
Host: handle.sara.nl
Date: Tue, 09 Sep 2008 08:17:35 GMT

POST /some_resource?_method=DELETE HTTP/1.1
Host: handle.sara.nl
Date: Tue, 09 Sep 2008 08:17:35 GMT
Content-Length: 0

In XHTML, this request could be interfaced with a “delete button”, like this:

<form action="/some_resource?_method=DELETE" method="post">
    <input type="submit" value="Delete some_resource"/> 
</form>

If you spoof an HTTP/1.1 GET method, and the MIME type of the request body is application/x-www-form-urlencoded, then query parameters of the request body are treated as if they are “GET parameters”. For example, the following two HTTP/1.1 requests are semantically identical:

GET /some_resource?param=value HTTP/1.1
Host: topos.grid.sara.nl
Date: Tue, 09 Sep 2008 08:17:35 GMT

POST /some_resource?_method=GET HTTP/1.1
Host: topos.grid.sara.nl
Date: Tue, 09 Sep 2008 08:17:35 GMT
Content-Type: application/x-www-form-urlencoded
Content-Length: 11

param=value
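On the server side, method spoofing might be resolved as in the following sketch. The function name effective_request and its return shape are hypothetical. The sketch assumes that the _method parameter is honoured only on POST requests and is removed from the parameters before further processing.

```python
from urllib.parse import parse_qsl, urlsplit

FORM_TYPE = "application/x-www-form-urlencoded"


def effective_request(method, target, content_type="", body=""):
    """Return (method, path, parameters) after undoing method spoofing."""
    parts = urlsplit(target)
    params = parse_qsl(parts.query, keep_blank_values=True)
    spoofed = [value for key, value in params if key == "_method"]
    if method != "POST" or not spoofed:
        return method, parts.path, params  # nothing to undo
    method = spoofed[-1].upper()
    params = [(key, value) for key, value in params if key != "_method"]
    media_type = content_type.split(";")[0].strip().lower()
    if method == "GET" and media_type == FORM_TYPE:
        # Parameters in the body are treated as if they were GET parameters.
        params += parse_qsl(body, keep_blank_values=True)
    return method, parts.path, params
```

With this sketch, both requests in the last example resolve to the method GET, the path /some_resource and the single parameter param=value.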

Header Spoofing

The service allows the user to pass HTTP/1.1 headers as query parameters. This is done to allow any kind of request from within a browser. This feature is provided strictly as a workaround for current web-browser limitations.
To specify an HTTP/1.1 header as a query parameter:

  1. replace all dashes "-" in the header name by underscores "_";
  2. convert all characters in the header name to lowercase;
  3. prepend the header name with "_http_".

Examples

The following two HTTP/1.1 requests are semantically identical:

PUT /some_resource HTTP/1.1
Host: handle.sara.nl
Date: Tue, 09 Sep 2008 08:17:35 GMT
If-None-Match: *
...

PUT /some_resource?_http_if_none_match=* HTTP/1.1
Host: handle.sara.nl
Date: Tue, 09 Sep 2008 08:17:35 GMT
...

Note how the If-None-Match header is specified as a query parameter in the second case.
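The three conversion steps, and their reversal on the server, can be sketched as follows. The function names are hypothetical. Note that the reverse mapping is an assumption: it turns every underscore back into a dash, so a header whose name contains an underscore cannot be round-tripped, and the original capitalisation is lost (which is harmless, because HTTP header names are case-insensitive).

```python
PREFIX = "_http_"


def header_to_param(header_name):
    """Convert an HTTP header name to its query parameter name."""
    return PREFIX + header_name.replace("-", "_").lower()


def spoofed_headers(params):
    """Extract spoofed headers from a list of (name, value) query parameters."""
    headers = {}
    for name, value in params:
        if name.startswith(PREFIX):
            # Lowercase header name, with underscores mapped back to dashes.
            headers[name[len(PREFIX):].replace("_", "-")] = value
    return headers
```

For example, header_to_param("If-None-Match") returns "_http_if_none_match".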

POST-Once-Exactly

The POST method is neither safe nor idempotent. This poses a problem to the client: if an HTTP-request from the client is not answered by a (correct) HTTP-response from the server, the client has no way to determine whether the request was processed successfully or not. To circumvent this limitation, server implementations are encouraged to implement POST-Once-Exactly (POE).