This book is pre-release and is an evolving work in progress. It is published here to gather feedback and to provide early value to those who have an interest in resource-oriented computing.
Please send any comments or feedback to: email@example.com
© 2018 Tony Butterfield.
All rights reserved.
An architectural style is defined by a set of constraints over an anarchic, anything-goes architecture. Roy Fielding famously used this approach to define the architectural style of the World Wide Web in his doctoral dissertation1. The unconstrained architecture is called the Null Architecture, and Fielding used it as the starting point, applying six constraints that result in the Representational State Transfer (REST) architectural style.
The REST architectural style in turn forms the basis for the resource-oriented style, so it makes sense for us to look briefly at its six constraints first.
Client-Server - This first constraint separates an architecture into two communicating parts - the client and the server. This separation, usually over the network, allows both parts to evolve separately. The client initiates any interaction, usually acting as an agent for some user. As such it may contain user interface elements and technologies. The server contains the state that the client interacts with.
Stateless - Statelessness in this constraint applies to the conversation rather than absolutely. Stateless communication requires that the server retain no knowledge of any client or of its previous interactions. To implement a meaningful, multi-stage conversation, all relevant state must be passed to the server on each request. Stateless interaction makes the server more robust to failure and able to scale to more clients.
Cache - The caching constraint requires that the response contains the necessary information to determine if and how that response can be cached.
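Uniform Interface - This fourth constraint is perhaps the most distinguishing feature of the REST architectural style. It can be broken down into four sub-constraints: identification of resources, manipulation of resources through representations, self-descriptive messages, and hypermedia as the engine of application state.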
Layered System - This constraint allows layers of intermediaries to be placed between client and server. These intermediaries block direct communication and provide capabilities such as security control, caching, and load balancing.
Code-On-Demand - When a client receives a representation of unknown type it can obtain code from the server that allows it to interpret that representation. This allows a minimal base functionality in the client that can be augmented with new feature sets dynamically over time.
Now that we have established a foundation from the REST architectural style, we shall move on to the twelve constraints of ROC.
Recursive Decomposition - REST treats the client and server components as the end of the road for the architecture - the point where the abstraction ends. At the bottom, the abstraction stops in a server that is implemented in code. At the top, the abstraction terminates where a user agent interfaces with a human or, within an application, with code again. Code is considered the end of the abstraction because it is opaque. Even if a server is implemented to issue HTTP requests, as is often the case with microservices3, these requests are logically decoupled from the originating request from the perspective of the architectural style.
In ROC things are different: a server can act as a client, issuing sub-requests. Conversely, a client may have been invoked by receiving a request rather than merely acting as a user agent, in which case it can be considered a server. In essence, there are just endpoints, which may act in the role of client, server, or both. Typically there is a root request that is issued by a transport endpoint in response to a stimulus external to the abstraction. This initiates what emerges, recursively, as a request call tree: each request is handled by an endpoint, which then issues zero or more sub-requests.
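The request call tree described above can be sketched in a few lines. This is a toy model rather than NetKernel's API: the `Kernel` class, the `issue` method, and the `res:/...` identifiers are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical endpoint signature: it receives the requested identifier and a
# function for issuing sub-requests, and returns a representation.
Endpoint = Callable[[str, Callable[[str], object]], object]


class Kernel:
    """Toy stand-in for an ROC kernel that resolves identifiers to endpoints."""

    def __init__(self) -> None:
        self.endpoints: Dict[str, Endpoint] = {}
        self.trace: List[str] = []  # records the call tree as it emerges

    def register(self, identifier: str, endpoint: Endpoint) -> None:
        self.endpoints[identifier] = endpoint

    def issue(self, identifier: str, depth: int = 0) -> object:
        self.trace.append("  " * depth + identifier)
        endpoint = self.endpoints[identifier]
        # The endpoint is a server for this request and a client for any
        # sub-requests it chooses to issue.
        return endpoint(identifier, lambda sub: self.issue(sub, depth + 1))


kernel = Kernel()
kernel.register("res:/price", lambda _id, sub: 100)
kernel.register("res:/tax", lambda _id, sub: 20)
kernel.register("res:/total", lambda _id, sub: sub("res:/price") + sub("res:/tax"))

total = kernel.issue("res:/total")  # the root request
print(total)                        # 120
print("\n".join(kernel.trace))      # indented view of the call tree
```

Here `res:/total` plays both roles: it serves the root request and acts as a client for two sub-requests, so the trace shows one parent with two children.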
General Purpose Representations - The REST architecture uses the TCP/IP network as the messaging middleware between endpoints. As a result, every representation is passed as a stream of bytes with a Content-Type header that defines how to interpret those bytes. In ROC, when endpoints co-exist on the same physical machine, a representation can be any immutable data structure or object. However, not all representations are equal, and it is best to minimise the number of representation types. For example, in a human resources system we might have resources that represent an employee, a role, and a location. In an object-oriented system, each of these would typically be represented by an object with getter methods. In ROC we could better represent them with a general purpose data type such as XML or JSON. The advantage is that less code is needed. Firstly, we don’t need to write the datatype in the first place. Secondly, datatypes such as these are extensible: we can add new fields without modifying the type itself. Thirdly, we can use standard technologies to process them, and these toolchains of accessors and transreptors eliminate further code. Other examples of general purpose data types include RDF graphs, arrays, maps, and even strings - look how far the Unix toolchain got with mainly line and tab-delimited text. Of course, there is occasionally a reason for a custom representation, for example to provide optimised queries into the contained data. In general, however, custom representations should be avoided without a compelling reason.
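As a rough illustration of the trade-off, the sketch below represents an employee and a location as read-only mappings and processes both with one generic accessor. The field names and the `freeze` and `pick` helpers are hypothetical, and a read-only mapping is only one possible stand-in for an immutable representation.

```python
from types import MappingProxyType
from typing import Mapping


def freeze(data: dict) -> Mapping:
    """Return a read-only view over a private copy of `data`.

    Only the top level is protected; nested mutable values are not frozen.
    """
    return MappingProxyType(dict(data))


def pick(representation: Mapping, *fields: str) -> dict:
    """Generic accessor that works on any mapping-shaped representation."""
    return {f: representation[f] for f in fields if f in representation}


employee = freeze({"name": "Ada", "role": "engineer", "location": "London"})
location = freeze({"name": "London", "country": "UK"})

# The same accessor serves both resources; no Employee or Location class exists.
print(pick(employee, "name", "role"))
print(pick(location, "country"))

# Extending the data needs no change to any type definition or to `pick`.
extended = freeze({**employee, "start_year": 2018})
print(pick(extended, "name", "start_year"))
```

An object-oriented version would need a class per resource plus a change to that class for every new field, which is the code this constraint aims to avoid.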
Scope Outside Programming Language - Scope has traditionally been a programming language construct that allows modularity while balancing the need to share state from different locations within a program in a controlled way. The World Wide Web has a flat address space which makes all resources, exposed by servers, available to all clients. (Real world networks impose necessary limits on this through the use of private networks and firewalls.)
By specifying scope when a request is issued, ROC frees the scope concept from programming languages and lifts it to the architectural level. Clients can define a scope in which to resolve resources, and that scope determines which resources are available to a service if it issues sub-requests. On top of this, more sophisticated patterns can emerge, such as the dynamic insertion of ephemeral address spaces into scope to model pass-by-value parameters.
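A minimal sketch of request-time scope follows, assuming a scope is simply an ordered list of address spaces searched front to back. Actual resolution in an ROC kernel is more involved; the `resolve` and `with_argument` helpers and the `arg:` naming are used here only for illustration.

```python
from typing import Dict, List, Optional

# Assumption: a space is modelled as a plain mapping from identifier to representation.
Space = Dict[str, object]


def resolve(identifier: str, scope: List[Space]) -> Optional[object]:
    """Search the spaces in scope in order; the first match wins."""
    for space in scope:
        if identifier in space:
            return space[identifier]
    return None


def with_argument(scope: List[Space], name: str, value: object) -> List[Space]:
    """Model pass-by-value by placing an ephemeral space at the front of the scope."""
    return [{f"arg:{name}": value}] + scope


application_space: Space = {"res:/config": "production"}
library_space: Space = {"res:/config": "default", "res:/util": "helper"}
scope = [application_space, library_space]

print(resolve("res:/config", scope))   # production (the nearer space wins)
print(resolve("res:/util", scope))     # helper
print(resolve("arg:operand", scope))   # None (not in scope)

request_scope = with_argument(scope, "operand", 42)
print(resolve("arg:operand", request_scope))  # 42
```

The original `scope` list is left untouched, so the ephemeral space exists only for the request that carries `request_scope`.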
Identifiers as Function Invocation - Resource identifiers structured in such a way as to resolve to an endpoint implementing a computable function, and with arguments as nested resource identifiers, are isomorphic to functional invocation. Using a finitely nested resource identifier, a whole functional program can be defined4.
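To make the isomorphism concrete, here is a small evaluator for nested identifiers. The `fn:name(arg,arg)` and `lit:n` syntax is invented for this sketch and is not the syntax of the Active URI draft; the `FUNCTIONS` table stands in for endpoints that implement computable functions.

```python
import operator
from typing import List

# Hypothetical endpoints, each implementing a computable function.
FUNCTIONS = {"add": operator.add, "mul": operator.mul}


def split_args(text: str) -> List[str]:
    """Split on commas that are not nested inside parentheses."""
    args, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            args.append(text[start:i])
            start = i + 1
    args.append(text[start:])
    return args


def evaluate(identifier: str) -> int:
    """Resolve an identifier by resolving its nested argument identifiers first."""
    if identifier.startswith("lit:"):
        return int(identifier[len("lit:"):])
    if identifier.startswith("fn:") and identifier.endswith(")"):
        name, _, rest = identifier[len("fn:"):].partition("(")
        args = [evaluate(arg) for arg in split_args(rest[:-1])]
        return FUNCTIONS[name](*args)
    raise ValueError(f"unresolvable identifier: {identifier}")


# The identifier alone encodes the program (2 * 3) + 4.
print(evaluate("fn:add(fn:mul(lit:2,lit:3),lit:4)"))  # 10
```

Because the whole computation is named by one identifier, the result of any sub-expression can in principle be cached under its own identifier.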
Hide the API - In NetKernel there are many modules providing integration with other technologies. All of these technologies are exposed as endpoints (accessors, transreptors, or transports) which work with resources. The author of each module has usually adapted the code API to present its functionality through a uniform ROC interface, so that it can receive requests and return general purpose representations. Of course, this takes more code, so why bother? That code only needs to be written once, and once written the technology becomes plug and play: we can use it with much less knowledge of the intricate details of its operation. That point may not be obvious, so why should the ROC interface be simpler? APIs vary wildly in design style, some better than others, and bring threading issues and object models with lots of partially documented methods. A uniform interface, which ideally covers a pragmatic 80% of use cases, can be used with minimal knowledge and without writing code. If a module doesn’t expose an obscure feature you really need, that module can be enhanced while still preserving the benefits of the approach.
Kernel as Middleware - The ROC kernel takes the central role in a system, acting as the intermediary for all requests by resolving and scheduling them. This is somewhat similar to an operating system. Software frameworks can sometimes act as a central coordination point in a software system, but the ROC kernel is not a framework. It does not impose any data architecture patterns or dictate the choice of technologies. The kernel and the ROC abstraction it embodies are uniform. Endpoints' roles and capabilities are defined by their place in a data architecture, not by their place within the abstraction.
Language Runtime - Programming languages are used to implement algorithms and to orchestrate data-flows within systems. Usually, a programming language takes the central role in controlling the operation of a system. In ROC, however, the kernel has central control, and programming languages are used to implement the internals of endpoints. As in UNIX, where C is the native programming language, within NetKernel Java and languages that generate Java bytecode are native. However, just as languages such as Perl and Python can be used in UNIX, in ROC any programming language with a suitable implementation can be encapsulated as an endpoint called a language runtime. A language runtime is always passed a program to run as an argument when it is invoked. It is stateless because it is passed the program and any other inputs each time it is invoked, in just the same way that /usr/bin/perl is, for example.
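The idea of a stateless language runtime can be sketched with Python's built-in `exec`. The `python_runtime` function and the convention that a program leaves its answer in a variable named `result` are assumptions of this sketch, not NetKernel behaviour.

```python
def python_runtime(program: str, inputs: dict) -> object:
    """Hypothetical language-runtime endpoint: run `program` against `inputs`.

    Each invocation gets a fresh namespace, so nothing survives between calls.
    Note that exec runs arbitrary code and must not be given untrusted programs.
    """
    namespace = dict(inputs)
    exec(program, namespace)
    # Convention assumed for this sketch: the program assigns to `result`.
    return namespace.get("result")


program = "result = sum(values) / len(values)"
mean = python_runtime(program, {"values": [2, 4, 6]})
print(mean)  # 4.0

# A later invocation sees none of the earlier inputs.
leaked = python_runtime("result = 'values' in globals()", {})
print(leaked)  # False
```

Both the program and its inputs arrive with every call, which is what lets the runtime itself remain an ordinary, interchangeable endpoint.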
Composition of Architecture - Taking the Layered System constraint further, ROC allows the composition of data architecture. Usually, this is done with the category of endpoint called overlays. Approaches such as Enterprise Integration Patterns provide a similar capability but are more limited, for example by only providing asynchronous uni-directional messaging. In ROC all the use cases of the Layered System constraint, such as load balancing, auditing, caching, and access control, can be layered into an architecture at arbitrary places. Non-functional constraints such as timeouts, flow control, and tunnelling can also be added to architectures with no code and minimal reconfiguration. Configuration-driven architecture can define routing and interface adaptation, which becomes dynamic when resources that change are used as configuration.
Introspection - Static structure, dynamic structure, and runtime state should all be exposed as resources, making a system fully introspectable. Through this constraint, architecture can be rendered visible to tooling, and the system can be designed to be adaptive where needed.
Discovery Through Probing - Functionality distributed in a system can be discovered by issuing probing requests into spaces and observing the responses. Discovered functionality can be rolled up into a registry resource which can be cached. The registry is refreshed when functionality changes because dependencies are captured through the response expiry dependency model.
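The layering described above can be approximated with wrapper functions. This is only an analogy for overlays: the `audit_overlay` and `access_overlay` names, the single-argument endpoint signature, and the `res:/public/` prefix are all hypothetical.

```python
from typing import Callable, List

# Assumption: an endpoint is modelled as a function from identifier to representation.
Endpoint = Callable[[str], str]


def audit_overlay(inner: Endpoint, log: List[str]) -> Endpoint:
    """Record every request before passing it to the wrapped endpoint."""
    def handle(identifier: str) -> str:
        log.append(identifier)
        return inner(identifier)
    return handle


def access_overlay(inner: Endpoint, allowed_prefix: str) -> Endpoint:
    """Reject requests for identifiers outside the allowed prefix."""
    def handle(identifier: str) -> str:
        if not identifier.startswith(allowed_prefix):
            raise PermissionError(identifier)
        return inner(identifier)
    return handle


def service(identifier: str) -> str:
    return f"representation of {identifier}"


# The service is unchanged; auditing and access control are composed around it.
log: List[str] = []
composed = audit_overlay(access_overlay(service, "res:/public/"), log)

print(composed("res:/public/a"))
try:
    composed("res:/private/x")
except PermissionError as denied:
    print("denied:", denied)
print(log)  # both requests were audited, including the rejected one
```

Swapping the order of the two wrappers would audit only permitted requests, which shows how placement within the composition changes behaviour without touching the service.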
Trans-representation - To decouple representation typing between client and server it is important to allow both to use whatever form is appropriate and natural. As part of the request evaluation process, the ROC kernel will adapt representations from what is offered to what is desired using a process called trans-representation. Trans-representation is performed by transreptors - endpoints which resolve requests with the TRANSREPT verb. The kernel orchestrates TRANSREPT sub-requests transparently.
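A simplified picture of this adaptation step is sketched below. The `TRANSREPTORS` table keyed on (offered type, desired type) and the `issue_source` function are assumptions for illustration; how a real kernel locates and invokes transreptors is not shown in this text.

```python
import json
from typing import Callable, Dict, Tuple

# Hypothetical registry of transreptors, keyed by (offered type, desired type).
TRANSREPTORS: Dict[Tuple[type, type], Callable] = {
    (str, dict): json.loads,
    (dict, str): lambda value: json.dumps(value, sort_keys=True),
}


def issue_source(endpoint: Callable[[str], object], identifier: str, desired: type):
    """Request a resource and adapt the offered representation if necessary."""
    offered = endpoint(identifier)
    if isinstance(offered, desired):
        return offered  # already in the desired form, so no transreption
    transreptor = TRANSREPTORS.get((type(offered), desired))
    if transreptor is None:
        raise TypeError(f"no transreptor from {type(offered)} to {desired}")
    return transreptor(offered)


def employee_endpoint(identifier: str) -> str:
    # This endpoint only knows how to offer JSON text.
    return '{"name": "Ada", "role": "engineer"}'


as_text = issue_source(employee_endpoint, "res:/employee", str)
as_map = issue_source(employee_endpoint, "res:/employee", dict)
print(as_text)
print(as_map["name"])  # Ada
```

Neither the endpoint nor the caller contains conversion code; each works in the form natural to it and the lookup in the middle bridges the two.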
Pull not Push - When recursively decomposing functionality into resources, a developer is often presented with a choice between obtaining state to pass to a sub-request and providing a reference to that state. In essence, the choice is to pass a representation or to pass a resource identifier. When a choice is available, a developer should pass a resource identifier, as this has multiple benefits. To understand why, we must look at how a representation is passed by value in ROC: the representation must be stored as a resource in a transient space and injected into the request scope. This creates a minimal overhead but has some implications. Firstly, when passing by value, if the receiving endpoint does not use the representation, then the work to reify and pass it is wasted. Secondly, when passed an identifier, the receiving endpoint can request the actual type of representation it wants to receive, avoiding the need for the client to second-guess. This avoids any double transreption. Thirdly, the argument keeps its identity, so any work to transrept it can be cached.
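The first benefit, avoiding wasted work, can be shown with a counter. Everything named here (`RESOURCES`, `source`, `summarise_push`, `summarise_pull`) is hypothetical and the sketch ignores the transient-space mechanics described above.

```python
calls = {"count": 0}


def expensive_report() -> str:
    """Stand-in for a resource that is costly to reify."""
    calls["count"] += 1
    return "large report"


# Hypothetical address space mapping identifiers to the code that computes them.
RESOURCES = {"res:/report": expensive_report}


def source(identifier: str) -> str:
    return RESOURCES[identifier]()


def summarise_push(report: str, enabled: bool) -> str:
    """Push style: the caller has already computed and passed the representation."""
    return report.upper() if enabled else "skipped"


def summarise_pull(report_id: str, enabled: bool) -> str:
    """Pull style: the endpoint requests the resource only if it needs it."""
    return source(report_id).upper() if enabled else "skipped"


summarise_push(source("res:/report"), enabled=False)
after_push = calls["count"]   # the report was computed even though it went unused

summarise_pull("res:/report", enabled=False)
after_pull = calls["count"]   # unchanged, because the endpoint never pulled

print(after_push, after_pull)  # 1 1
```

With the pull style the decision about whether and in what form to obtain the state stays with the endpoint that knows whether it is needed.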
It is worth noting that true self-describing messages are impossible. Any message requires a suitable interpreter. See Peirce's theory of signs.
Active URI IETF draft.