Capability-based programming is designed to meet the unique requirements of device control scenarios, such as the Internet of Things (IoT)[1], where complex behaviors are best modeled as collections of small, concise behaviors that reflect the "atomic" features of the "thing". The basic principles of Capability-based programming have been around since the inception of the interface and of technologies such as COM, DCOM, and CORBA. However, the scope of identity for each interface was previously limited to the development project, organization, and/or development platform.
The Origins of Capability-based Programming
Capability-based programming was originally developed to support the unique requirements presented by device control problems, such as the control of media and devices over Internet Protocol networks and the Internet of Things[1]. It has its roots in the work of the Society of Motion Picture and Television Engineers (SMPTE) Ad hoc Group on Media & Device Control over IP Networks. Now you might be thinking "SMPTE? Really?", but it makes perfect sense when you look at the unique challenges faced by the Professional Media industry and its need to orchestrate complex workflows with heterogeneous systems. Back in 2010, before the Internet of Things[1] became a widely accepted term, the SMPTE began an effort to develop a new standard for the vendor-agnostic control of media-centric devices over Internet Protocol (IP) networks. Prior to this, standards did exist for the control of media-centric devices using serial communications protocols, such as RS-422 and RS-232, but efforts to transpose these serial protocols to new IP-based systems had failed.
The first task of the newly created "SMPTE 34CS - Adhoc Group on Media & Device Control over IP Networks" (34CS MDCoIP) was to determine why previous attempts at transposing historically successful serial protocol standards into new IP-based standards had failed. What 34CS MDCoIP found was that the traditional, rigid object models used to create the new IP-based protocols were their Achilles' heel. Unlike their serial counterparts, these new IP-based protocol standards were designed using Object Oriented Programming (OOP) and Object Oriented Design (OOD) techniques, resulting in protocols that vendors could not extend without breaking interoperability. This is because object extensibility requires that an object's consumer has access to the object's base definition plus the definitions of the extensions in use.
This works well for object consumers within the same memory space, but when serialization is involved, the object's consumer might not have access to both the base object definition and its extensions. In fact, this is almost always the case with standardized protocols, and therefore the object's consumer will likely NOT be able to deserialize objects that have been extended. This left vendors with two choices: create their own proprietary protocol, or implement a protocol that makes them look and act just like everyone else. Since the latter choice would eliminate the vendor's ability to add functionality and would hurt their bottom line, they went with the obvious choice and implemented their own proprietary protocols.
Armed with this knowledge, the 34CS MDCoIP ad hoc group took one of the object models designed to represent professional media devices and inverted it. What they discovered was that, at its core, every object is composed of groups of attributes, operations, and signals (a.k.a. events) that work together to define concise features for that object. These features, defined as "capabilities", can be described as distinct interface definitions and can in turn be assembled, like building blocks, to represent complex behaviors that can be changed dynamically, allowing objects to change their behavior, or mode of operation, on the fly, at runtime. The object's consumer is then able to work with the capabilities that it understands, while ignoring the capabilities that it does not, providing the protocol extensibility that the rigid object models lacked. Capability-based programming was born.
The Internet of Things (IoT)[1]
The Internet of Things (IoT)[1] refers to an Internet-like structure, or network, where each device and its virtual representation are uniquely identifiable and accessible. As more and more devices become network-attached and are made "smarter", the desire to manipulate those devices increases, making the IoT the next logical step for the Internet and similar network platforms. Smart Homes are a superb example of this phenomenon. Who wouldn't want their house to unlock the door for them when their hands are full, notify them if something goes wrong, or even set the temperature and mood music upon their entry? But Smart Homes are just the tip of the iceberg; there are many applications for the IoT in business and professional settings. For example, the Professional Media industry requires precise coordination between many disparate devices, at every stage of the media production process, in order to implement automation and rights management, all the way down the chain to the consumer viewing the media on a television or a mobile device.
Decommissioning the Object
In the days before Object Oriented Programming, program code and data were stored separately, with the code having full access to the data. Programmers had to be ever-mindful of where and how the data was manipulated. Then Object Oriented Programming was invented, and the data and the code that manipulates it could be encapsulated within the same programmatic unit, called an Object. For the first time, programmers could control how and from where the data was manipulated. Object Oriented Programming simplified development by allowing programmers to break programs into components, with each component behaving in a defined fashion and implementing a specific set of operations, attributes, and signals. These behavioral definitions became known as interfaces, and applications could be designed in a modular fashion, allowing separate teams to develop different parts of a single application. Components became services when programs began to interact with one another over the network and/or through channels within common hardware, and eventually the term Service Oriented Architecture (SOA) was coined.
Interfaces are used to define the public-facing behavior of a service. Traditionally, each interface fully describes its service, aggregating all of the service's capabilities into a single interface. This capability aggregation works well with services in homogeneous environments, but in heterogeneous environments such as the Internet of Things[1], it not only becomes unmanageable, but becomes impossible to implement. The solution: don't aggregate the service's capabilities into a single interface. Instead, make the service implement many interfaces, each representing a single capability, and allow clients to access each capability independently of the others. This allows services to change their capabilities dynamically, as needed.
When clients are written to work with sets of capabilities, as opposed to service interfaces, those clients will not require change as new capabilities are added to services. In addition, those clients will begin to understand and work with new services, as those services implement capabilities they understand. For example, with the newly invented Television Coffee Maker, television clients can tune the channel, while coffee maker clients can brew coffee. If the manufacturer later decides to add an "Iced Coffee" feature, they simply add the "Add Ice" capability; clients that know how to "Add Ice" can add ice, while other clients are not impacted, all without a single change to any of the clients within the network.
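The Television Coffee Maker scenario can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the interface names (`TunerSupport`, `BrewSupport`, `AddIceSupport`) and the client classes are hypothetical and are not taken from any standard or from the example later in this article.

```java
// Hypothetical capability interfaces, one per concise feature.
interface TunerSupport { void tune(int channel); }
interface BrewSupport { String brew(); }
interface AddIceSupport { String addIce(String drink); }

// A device that happens to implement several unrelated capabilities.
class TelevisionCoffeeMaker implements TunerSupport, BrewSupport, AddIceSupport {
    private int channel = 1;

    @Override public void tune(int channel) { this.channel = channel; }
    @Override public String brew() { return "coffee"; }
    @Override public String addIce(String drink) { return "iced " + drink; }
    int currentChannel() { return channel; }
}

class TelevisionClient {
    // A television client: it knows nothing about coffee.
    static boolean tune(Object device, int channel) {
        if (device instanceof TunerSupport) {
            ((TunerSupport) device).tune(channel);
            return true;
        }
        return false;
    }
}

class CoffeeClient {
    // A coffee maker client: requires BrewSupport, uses AddIceSupport only if offered.
    static String makeDrink(Object device, boolean iced) {
        if (!(device instanceof BrewSupport)) {
            return "unsupported";
        }
        String drink = ((BrewSupport) device).brew();
        if (iced && device instanceof AddIceSupport) {
            drink = ((AddIceSupport) device).addIce(drink);
        }
        return drink;
    }
}

class TelevisionCoffeeMakerDemo {
    public static void main(String[] args) {
        TelevisionCoffeeMaker device = new TelevisionCoffeeMaker();
        System.out.println(TelevisionClient.tune(device, 7));
        System.out.println(CoffeeClient.makeDrink(device, true));
    }
}
```

Neither client refers to the `TelevisionCoffeeMaker` type itself, so the same clients would keep working against a plain television or a plain coffee maker.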
The Requirements of Capability-based Programming
The following lists enumerate the requirements of Capability-based programming:
Interface Requirements
- A “practically” global unique identifier shall identify each interface.
- Each interface shall specify the authoritative source that maintains the definition of the interface. The authoritative source may be indicated by the interface’s identifier.
- Each interface endpoint shall be independently accessible from all other interface endpoints exposed by the service, device, or object.
- Each interface should define a contract of behavior for a concise feature, function, or capability, implementing the minimal set of operations, attributes, and signals required for that feature, function, or capability; that is, the smallest unit of control.
- The documentation and programmatic artifacts for each interface should be obtainable, over the Internet, using the interface’s unique identity.
- Interfaces meeting these requirements shall be known as capability interfaces.
Service, Device, and Object Requirements
- A “practically” global unique identifier shall identify each service, device, and object.
- Services, devices, and objects shall implement one or more capability interfaces.
- Services, devices, and objects shall only be accessed via their capability interfaces.
- Services, devices, and objects shall provide a means by which the set of exposed capability interfaces can be identified, iterated, and/or listed.
- Services, devices, and objects may change the capability interfaces they expose, on the fly, at runtime, to support different "modes of operation."
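The service, device, and object requirements above might be sketched in Java as follows. This is a minimal, in-memory illustration, not a reference implementation: `CapableDevice`, `PlaySupport`, and `RecordSupport` are hypothetical names, the fully qualified interface name stands in for the interface identifier, and a random UUID stands in for the device identifier.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.UUID;

// Hypothetical capability interfaces.
interface PlaySupport { String play(); }
interface RecordSupport { String record(); }

class CapableDevice {
    // Assumption: a random UUID is used as the device's unique identifier.
    private final String id = UUID.randomUUID().toString();
    // Capability identifier -> endpoint object implementing that capability.
    private final Map<String, Object> capabilities = new LinkedHashMap<>();

    String getId() { return id; }

    // Lets clients list the capability interfaces currently exposed.
    Set<String> getCapabilities() {
        return Collections.unmodifiableSet(capabilities.keySet());
    }

    // Each capability is reachable independently of all the others.
    <T> Optional<T> getCapability(Class<T> type) {
        return Optional.ofNullable(type.cast(capabilities.get(type.getName())));
    }

    // Capabilities may be exposed or withdrawn at runtime ("modes of operation").
    <T> void expose(Class<T> type, T endpoint) { capabilities.put(type.getName(), endpoint); }
    void withdraw(Class<?> type) { capabilities.remove(type.getName()); }
}

class ModeDemo {
    public static void main(String[] args) {
        CapableDevice deck = new CapableDevice();
        deck.expose(PlaySupport.class, () -> "playing");
        System.out.println(deck.getCapabilities());

        // Switch from playback mode to recording mode.
        deck.withdraw(PlaySupport.class);
        deck.expose(RecordSupport.class, () -> "recording");
        System.out.println(deck.getCapabilities());
    }
}
```

Clients never see the device's concrete type; they ask for a capability by interface and act only when it is present.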
What is a "Practically" Global Unique Identifier?
A "practically" global unique identifier consists of two parts, the Authority and the Value. The Authority is the organization, algorithm, and/or registry by which the uniqueness of the Value is guaranteed. For example, an identifier's Authority may be an organization such as the IANA, an algorithm such as the one specified in the UUID specification [IETF RFC 4122], or a registry such as those maintained by the registrars of the Domain Name System (DNS). The Value may consist of any byte pattern designated by the Authority.
"Practically" global unique identifiers may also be constructed from multiple parts. For instance, a namespace combined with a name that is unique to that namespace may be "practically" globally unique if the namespace is prefixed with a legitimate, registered DNS domain name. For example, the interface name "net.posick.SomeInterface" is "practically" globally unique if the registrant of the "net.posick" domain name guarantees that the interface name shall be, and shall remain, unique. This type of identifier provides the least guarantee of uniqueness, unless an appropriate Trust Framework[3] is established that defines a governing organization, registry, and/or algorithm, e.g., the ISO, the IANA, the SMPTE, or IETF RFC 4122.
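One way to picture the Authority and Value pairing is the small Java value type below. It is a sketch only: the `PracticalId` class, its factory methods, and the `authority:value` text form are assumptions made for illustration, not part of any specification mentioned in this article.

```java
import java.util.Objects;
import java.util.UUID;

// Hypothetical value type: an identifier is an Authority plus a Value.
final class PracticalId {
    private final String authority;
    private final String value;

    PracticalId(String authority, String value) {
        this.authority = Objects.requireNonNull(authority);
        this.value = Objects.requireNonNull(value);
    }

    // The Authority is an algorithm: a randomly generated (version 4) UUID.
    static PracticalId randomUuid() {
        return new PracticalId("urn:uuid", UUID.randomUUID().toString());
    }

    // The Authority is a domain registrant, who must keep names unique within the domain.
    static PracticalId fromDomain(String reversedDomain, String name) {
        return new PracticalId(reversedDomain, name);
    }

    String authority() { return authority; }
    String value() { return value; }

    @Override public boolean equals(Object other) {
        if (!(other instanceof PracticalId)) return false;
        PracticalId that = (PracticalId) other;
        return authority.equals(that.authority) && value.equals(that.value);
    }

    @Override public int hashCode() { return Objects.hash(authority, value); }

    // Assumed text form for this sketch: "authority:value".
    @Override public String toString() { return authority + ":" + value; }
}

class PracticalIdDemo {
    public static void main(String[] args) {
        System.out.println(PracticalId.fromDomain("net.posick", "SomeInterface"));
        System.out.println(PracticalId.randomUuid());
    }
}
```

Two identifiers compare equal only when both the Authority and the Value match, which is why the same Value issued under different Authorities does not collide.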
How do Unique Identifiers Apply to Interfaces?
Unique identifiers apply to interfaces in every interface-based application ever built. It might not be apparent to the casual observer, but each interface must have a name that is unique to the application, development project, and/or organization responsible for developing that application. The compiler requires this uniqueness in order to link to the proper programmatic artifacts. Namespaces were developed to help mitigate the issue of unique naming and have since been used to guarantee the uniqueness of interface names at a larger scale. It is a little-recognized fact that most developers already make an effort to guarantee the uniqueness of their interface names at a global scale, by using namespaces that are unique to the application, project, and/or organization, built from names that are themselves "practically" globally unique. In fact, most modern programming platforms, such as Java, .NET, and XML, encourage the use of registered domain names as namespace prefixes. This helps guarantee the "practical" global uniqueness of interface names: the namespace prefix is guaranteed to be globally unique by the DNS registrar, while the namespace suffix and interface name are chosen to be unique within that prefix by the organization and/or development team.
Who or What is an "Authoritative Source"?
An Authoritative Source is an entity (e.g., an organization, a person, or a registry) that maintains an interface and guarantees the uniqueness of each interface within its scope. The scope of the Authoritative Source can be identified by a domain name, a namespace, or a combination thereof.
How do Capability Interfaces Differ from Regular Interfaces?
Capability interfaces differ from regular interfaces in two ways:
- Capability interfaces have a "practically" globally unique identity.
- Capability interfaces represent a concise feature or function, aka. a capability.
A concise feature or function can be thought of as "the smallest unit of control" or the minimum number of attributes, operations, and signals needed to implement a specific feature. For example, a temperature sensor might expose a temperature value that can be read by consumers. An implementor might choose to create a "Read Temperature Value" interface or they might decide to be more generic and implement a "Read Float Value" interface. The choice is intentionally left up to the implementor for flexibility.
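The two design choices for the temperature sensor could look like this in Java. The interface names `ReadTemperatureValue` and `ReadFloatValue` are hypothetical renderings of the names used in the text, and nothing prevents a sensor from implementing both.

```java
// Option 1: a feature-specific capability (hypothetical name).
interface ReadTemperatureValue { float getTemperature(); }

// Option 2: a generic capability, reusable by any sensor exposing one float (hypothetical name).
interface ReadFloatValue { float getValue(); }

// A sensor may implement either interface, or both, over the same state.
class TemperatureSensor implements ReadTemperatureValue, ReadFloatValue {
    private final float celsius;

    TemperatureSensor(float celsius) { this.celsius = celsius; }

    @Override public float getTemperature() { return celsius; }
    @Override public float getValue() { return celsius; }
}

class SensorDemo {
    // A generic client needs only the generic capability.
    static float read(ReadFloatValue source) { return source.getValue(); }

    public static void main(String[] args) {
        System.out.println(read(new TemperatureSensor(21.5f)));
    }
}
```

The generic form lets one client read many kinds of sensors, while the specific form carries more meaning for clients that care about temperature in particular.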
Who can define new Capability Interfaces?
Anyone who can define an interface and guarantee the uniqueness of its name or associated identity can create capability interfaces. In other words, anyone who can register their own domain name or has the authority to create unique names within a domain can define capability interfaces. It's not hard to guarantee "practically" unique identity; we do it all the time by using our registered domain names as part of our namespace names.
Does Capability-based Programming Support Traditional, Legacy, Interfaces?
Absolutely, 100%, unequivocally, yes!
To convert a regular interface into a capability interface, all one needs to do is assign a "practically" globally unique identity to that interface, if it does not already have one.
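In Java, one possible way to attach such an identity to an existing interface is a runtime annotation. The sketch below is an assumption for illustration: the `CapabilityId` annotation, the `LegacyPrinter` interface, and the `com.example.legacy.Printer` identifier are all hypothetical, and the identity could just as well be recorded in documentation or in a registry.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical annotation that attaches a unique identity to an interface.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface CapabilityId {
    String value();
}

// An existing interface, unchanged apart from the added identity.
@CapabilityId("com.example.legacy.Printer")
interface LegacyPrinter {
    void print(String text);
}

class CapabilityIds {
    // Uses the annotation when present, otherwise falls back to the fully qualified name.
    static String of(Class<?> type) {
        CapabilityId id = type.getAnnotation(CapabilityId.class);
        return id != null ? id.value() : type.getName();
    }
}

class CapabilityIdDemo {
    public static void main(String[] args) {
        System.out.println(CapabilityIds.of(LegacyPrinter.class));
        System.out.println(CapabilityIds.of(Runnable.class));
    }
}
```

The interface's methods are untouched, so existing implementations and callers continue to compile and run as before.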
Can the Interface's Namespace and Name be used for Unique Identity?
Yes. However, if there are multiple versions of the same interface, each version requires a unique identity of its own. This may be managed with the namespace, or it may be managed by associating a completely independent identifier with the interface, and with a specific version of that interface, via documentation or a registry. The identifier may be a GUID, UUID, URN, Fully Qualified Namespace Name, or any other identifier type or format that is capable of guaranteeing uniqueness.
Can Capability Interfaces Inherit From or Extend other Interfaces?
Yes.
A capability interface is no different from any other interface, except that it should represent a small, concise feature or function and it must be "practically" globally uniquely identified. Please note that this identity may be specified in documentation or by the interface namespace and name.
Can a Capability Interface Depend on other Capability Interfaces?
Yes and No.
A capability interface should define all of the attributes, operations, and signals required to implement a concise feature or function, and therefore should not depend upon the existence of another interface. But in a service, device, or object, each capability interface interacts with the same state, and therefore, for the specific implementation, a capability interface may depend on other capability interfaces to assist in the proper manipulation of state.
For example, a media device that can load and play media may require 4 capability interfaces ("Load", "Eject", "Play", and "Stop") that depend on one another to implement a rudimentary play-out application. For the specific implementation of this service, each interface depends on the others, at runtime, to offer their feature or function, but the interfaces do not have a static dependency upon one another. In other words, the interface definitions do not depend on any other interfaces. The dependencies are contingent on the device, service, or object implementing them and how it requires its internal state to be manipulated.
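The difference between static and runtime dependencies in the media device example can be sketched as follows. The four interface definitions do not reference one another; the coupling exists only inside the hypothetical `MediaPlayer` class, through its shared state. The method names and the use of `IllegalStateException` are assumptions made for this sketch.

```java
// Four independent capability interfaces; none references another.
interface Load { void load(String media); }
interface Eject { void eject(); }
interface Play { void play(); }
interface Stop { void stop(); }

// The runtime dependencies live in the implementation's shared state.
class MediaPlayer implements Load, Eject, Play, Stop {
    private String media;     // null when nothing is loaded
    private boolean playing;

    @Override public void load(String media) {
        if (playing) throw new IllegalStateException("stop before loading");
        this.media = media;
    }

    @Override public void eject() {
        if (playing) throw new IllegalStateException("stop before ejecting");
        media = null;
    }

    @Override public void play() {
        if (media == null) throw new IllegalStateException("nothing loaded");
        playing = true;
    }

    @Override public void stop() { playing = false; }

    boolean isPlaying() { return playing; }
}

class MediaPlayerDemo {
    public static void main(String[] args) {
        MediaPlayer player = new MediaPlayer();
        player.load("clip");
        player.play();
        System.out.println(player.isPlaying());
        player.stop();
        player.eject();
    }
}
```

A different device could implement "Play" alone, for example one with permanently loaded media, without any change to the interface definitions.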
Service Oriented Architectures (SOA) vs Capability-based Programming
Capability-based programming can be thought of as a new way of defining and assembling interfaces to describe service behaviors. Instead of creating a single, monolithic interface that represents the full behavior of a service, the behavior of the service is described using many smaller, well-understood interfaces, known as capability interfaces. These capability interfaces are in turn used by services in a Service Oriented Architecture to construct complex behaviors, much like bricks and mortar are used to build houses.
Isn't Capability-based Programming just SOA repackaged?
No. Capability-based programming can be used to extend the capabilities of a Service Oriented Architecture (SOA) by allowing complex service behaviors to be constructed from many smaller, well-understood, well-defined behaviors. But Capability-based programming can be used with any language, platform, and/or paradigm that supports the notion of an interface. SOA is neither needed nor implied.
In other words, Capability-based programming complements SOA by using smaller, well-known interfaces as "building blocks." Consumers that understand the smaller "building blocks" can interoperate with new services without change, and consumers can even work with services that implement only some of the "building blocks" that the consumer understands. For example, Television Coffee Maker clients can tune the channels of regular televisions or brew coffee on regular coffee makers. Flexibility galore!
Example of Capability-Based Programming in Java
The following example illustrates the flexibility and ease with which Capability-Based Programming methodologies can be introduced into existing programming languages, such as Java. Due to Java's strongly typed nature, type-casting must be used extensively within clients; however, language extensions could be defined to remove this need.
The following interface definition describes a simple thermostat, modeled using traditional object modeling methodologies. Because the interface is accessed through a remote protocol (RMI), many Object Oriented Design principles do not apply, as messages are exchanged across process boundaries to facilitate method execution.
    public interface Thermostat
    {
        public enum MODE
        {
            HEAT, COOL, AUTO;
        }
        
        public float getTemperature();
        
        public void setTemperature(float temperature);
        
        public MODE getMode();
        
        public void setMode(MODE mode);
        
        public int getZones();
        
        public String getZoneName(int zone);
        
        public void setZone(int zone);
        
        public String[] getZoneNames();
    }
So what is the problem?
What if the manufacturer wishes to add new capabilities to the thermostat? Let's say the manufacturer wishes to add the inputs from a remote weather station.
Traditionally, the manufacturer would be required to define a new interface that extends the Thermostat interface or define a new interface altogether. Once this is done and the thermostat is upgraded, legacy clients, and clients that only understand the Thermostat interface, can no longer interoperate with the upgraded thermostat, as the binary signature and/or serialized name of the remote interface has changed. Clients must be upgraded before they can interoperate with the upgraded thermostat, and anyone familiar with upgrading client software can tell you that the more clients you have, the more difficult it is to update them. Worse, we can't upgrade all clients; we can only upgrade the clients that need to interoperate with an updated thermostat. If a client needs to talk to both upgraded and un-upgraded thermostats, it will need two client software installations. This is bad, very bad!
Capability-Based Programming to the rescue. With Capability-Based Programming this is a non-issue, as each capability is defined with a distinct, concise feature interface. As new capabilities are added, the device simply implements these new interfaces and provides the client with a list of the interfaces it implements. The following example uses Capability-Based Programming methodologies to solve the above problem. Note that each capability, including the ability to change a state value, is implemented as a distinct interface. This also simplifies security rules, as these rules can be applied at the interface level.
    public interface Device
    {
        public String getID();
        
        public String getName();
        
        public String[] getCapabilities();
    }
    
    public interface TemperatureSupport
    {
        public float getTemperature();
    }
    
    public interface ConfigTemperatureSupport extends TemperatureSupport
    {
        public void setTemperature(float temperature);
    }
    
    public interface ZoneSupport
    {
        public int getZones();
        
        public String[] getZoneNames();
        
        public String getZoneName(int zone);
        
        public void setZone(int zone);
    }
    
    public interface ConfigureZoneSupport extends ZoneSupport
    {
        public void setZoneName(String name);
    }
    
    public interface ModeSupport
    {
        public enum MODE
        {
            HEAT, COOL, AUTO;
        }
        
        public MODE getMode();
    }
    
    public interface ConfigureModeSupport extends ModeSupport
    {
        public void setMode(MODE mode);
    }
In the above example, each capability of the thermostat is defined as a discrete interface. There is also a new interface defined, named "Device", that is common to all devices and implements the base programmatic requirements of Capability-Based Programming and the Internet of Things (IoT): the device identifier required by the IoT, the list of implemented interfaces required by Capability-Based Programming, and a human-readable name, to act as a Business Key, for proper data modeling. With the Device interface, clients need not interact with any other exposed interfaces to know what device they are working with and what the device is capable of.
The clients in this example may utilize static configuration or some service discovery mechanism, such as DNS-SD, to discover the Device interface endpoint and the means by which to connect to it. After connecting to the Device interface endpoint, the client then uses the list of supported capability interfaces, "Capabilities", to load and construct an Invocation Handler Proxy that can be used to execute the methods implemented by the interfaces exposed by the device. This example assumes that each capability interface is exposed through the same endpoint (URL and/or address and port) as the Device interface, as is the case with Java RMI, and therefore a URL or other connectivity data is not required for each capability interface. If, however, this is not the case, a URL attribute may be added to the Capability class definition to add support for different interface endpoints, as is the case with SOAP-based Web Services.
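The proxy construction step described above might be sketched with `java.lang.reflect.Proxy` as follows. This is a local stand-in under stated assumptions, not an RMI client: the `CapabilityProxies` helper and the `com.example.UnknownSupport` name are hypothetical, and the handler returns a fixed value where a real handler would forward the call to the device.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Redeclared here so the sketch is self-contained.
interface TemperatureSupport { float getTemperature(); }

class CapabilityProxies {
    // Loads only those advertised interfaces that this client has on its classpath;
    // capability names the client does not know are skipped, not treated as errors.
    static Class<?>[] resolve(String[] capabilityNames, ClassLoader loader) {
        List<Class<?>> known = new ArrayList<>();
        for (String name : capabilityNames) {
            try {
                Class<?> type = Class.forName(name, false, loader);
                if (type.isInterface()) {
                    known.add(type);
                }
            } catch (ClassNotFoundException e) {
                // Capability not understood by this client: ignore it.
            }
        }
        return known.toArray(new Class<?>[0]);
    }

    // Builds one proxy object implementing every resolved capability interface.
    static Object connect(String[] capabilityNames, InvocationHandler transport) {
        ClassLoader loader = CapabilityProxies.class.getClassLoader();
        return Proxy.newProxyInstance(loader, resolve(capabilityNames, loader), transport);
    }
}

class ProxyDemo {
    public static void main(String[] args) {
        // Stand-in for the list returned by Device.getCapabilities().
        String[] advertised = { TemperatureSupport.class.getName(), "com.example.UnknownSupport" };
        // Stand-in for a remote call: always answers with a fixed temperature.
        InvocationHandler fakeTransport = (proxy, method, methodArgs) -> 21.5f;

        Object device = CapabilityProxies.connect(advertised, fakeTransport);
        if (device instanceof TemperatureSupport) {
            System.out.println(((TemperatureSupport) device).getTemperature());
        }
    }
}
```

Because unknown capability names are simply skipped, a device can advertise new capabilities without breaking clients that were built before those capabilities existed.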
Using these techniques, new capabilities can be added without impacting the interoperability of client applications. Additionally, any client that understands how to work with a thermostat, a weather station, or both can interoperate with the upgraded thermostat, not just clients specific to that vendor's device with the appropriate software update.
References:
1. Wikipedia.org: Internet of Things. The term, coined by Kevin Ashton in 1999, refers to uniquely identifiable objects in an Internet-like structure.
2. SMPTE.org: SMPTE ST2071 Media & Device Control over Internet Protocol Networks.
3. OpenIdentityExchange.org: What is a Trust Framework?
Special Thanks to Kai Kreuzer, Kenneth Melms, Theodore Szypulski, Rob Hunter, and the SMPTE for their help in the making of this article.
This article is available for modification and reuse under the terms of the Creative Commons Attribution-ShareAlike 3.0 Unported License and the GNU Free Documentation License.
