A Bit of API History

“History is the torch that is meant to illuminate the past, to guard us against the repetition of our mistakes of other days.”  – Claude G. Bowers


In my journey to uncover the essence of a great API, I previously defined “What is an API?” as “a way for one software component to access another for function and/or data sharing through a well-defined agreement”.  The objective of any API is to share – to increase productivity through reuse, manage risk through isolation, or create new value by integrating processes and data.  In this post, I will explore a brief history of APIs, with a specific focus on distributed APIs, to highlight the motivators for each API type identified in the “Types of APIs” post and to show how we arrived at the industry’s current state, so that we may better understand its opportunities and implications.


A bit of history

The latest buzz is around Web APIs, but APIs are neither a new concept nor unique to modern distributed computing.  APIs have been used with procedural languages since well before personal computing and were typically delivered as libraries (first used in the 1960s).  Libraries achieved a separation between the user and the implementation of a function, but they still created strong dependencies between the API consumer and the API producer.  This was acceptable when publication was limited to a few consumers or when libraries were bundled into larger dependencies such as operating systems, but conflicts quickly arose in complex applications, and changes to APIs could not be quickly distributed to a growing number of consumers.  Additionally, library-based APIs are sufficient for functions and routines, but they could not provide sharing of data and resources.


1970s – 1980s

Communication between computers became more common in the 1970s and early 1980s as distributed systems began to emerge.  Techniques arose that allowed a procedural API to be accessed remotely while sparing programmers the overhead of creating all the transport plumbing and data handling (packing and unpacking) necessary to address interoperability across different types of computers.  These Remote Procedure Call (RPC) systems typically used metadata, in the form of an Interface Definition Language (IDL), to express the procedure payloads and interactions.  This allowed the procedure and data handling to be based on a common interface: code was generated from the IDL for the specific machine at each point of communication.  RPC supported critical system intercommunication, such as Sun’s Network File System (NFS) and database access in client-server models, and was also used directly by business systems.  RPC allowed communication across computers, but it depended upon all computers being available to complete operations, with failure recovery left to the consumer.
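To make the stub/skeleton pattern concrete, here is a minimal, hand-written sketch of what IDL-generated code does: the client stub packs (“marshals”) arguments into a wire format, and the server skeleton unpacks them and dispatches to the real procedure.  Real RPC systems generate this code from the IDL; the JSON wire format, the in-process “transport” and the get_balance procedure below are illustrative assumptions, not any particular RPC product.

```python
import json

def get_balance(account_id):
    # The actual server-side procedure; the lookup table is invented data.
    return {"acct-42": 100}.get(account_id, 0)

def server_skeleton(wire_bytes):
    # Server side (would be generated from the IDL): unpack the request,
    # dispatch to the real procedure, pack the reply.
    request = json.loads(wire_bytes)
    result = get_balance(*request["args"])
    return json.dumps({"result": result}).encode()

def client_stub(account_id):
    # Client side (would be generated from the IDL): pack the arguments
    # and hand them to the transport; the direct call stands in for the
    # network hop a real RPC system would make.
    wire_bytes = json.dumps({"proc": "get_balance",
                             "args": [account_id]}).encode()
    reply = server_skeleton(wire_bytes)
    return json.loads(reply)["result"]

print(client_stub("acct-42"))  # 100
```

The consumer calls client_stub as if it were a local function, which is exactly the convenience – and the hidden distribution – that the fallacies discussed below warn about.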

Alternative techniques based on Message Oriented Middleware (message passing and queuing) also arose during this period to address specific Enterprise Application Integration (EAI) needs, first by providing bridges to legacy mainframe systems.  The most notable of these systems was IBM MQSeries, which brought proprietary mainframe messaging technologies out of the IBM world.  These message passing and queuing systems directly addressed many of the problems of distributed computing, such as assuring delivery of messages (without loss), by breaking point-to-point dependencies via intermediaries in the form of “queues”.  A queue is a component that stores lists of messages so that the consumer and the provider can act on them at different times.  This preserved communication even when computers had different availability.  Message passing also depended upon message payload definitions to manage translations and transformations across machine types and systems to ensure interoperability.  This virtual network of message passing required significant software and hardware infrastructure.
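The decoupling a queue provides can be sketched with Python’s standard library: the producer and consumer never interact directly, and neither needs the other to be available at the moment it acts.  The order messages below are invented for illustration; real middleware such as MQSeries adds persistence, assured delivery and payload transformation on top of this basic shape.

```python
import queue

# The intermediary: messages sit here until the consumer is ready.
q = queue.Queue()

# Producer enqueues messages and moves on; it does not wait for,
# or even know about, the consumer.
for order_id in (101, 102, 103):
    q.put({"type": "order", "id": order_id})

# Consumer drains the queue later, at its own pace.
received = []
while not q.empty():
    received.append(q.get())

print([m["id"] for m in received])  # [101, 102, 103]
```

Because the queue holds the messages, the two sides can run at different times – the property that let message-oriented middleware bridge systems with very different availability.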


1980s – 1990s

Through the late 1980s and into the early 1990s, object-oriented programming (OOP) emerged out of academia to support a new paradigm for organizing complex applications into “objects”, which encapsulated data and the sets of procedures that operated on that data.  APIs in OOP are classes representing defined interfaces to a set of functionality, bounded by a context, a set of behaviors and the identity (instance) of the object.  Distributed systems became even more prevalent through the 1990s with advanced client-server topologies and the emergence of the commercial WWW.  Again, techniques arose within the OOP world, based on the RPC model, to allow remote access to object instances, such as the Common Object Request Broker Architecture (CORBA) for open systems and DCOM from Microsoft.  Like RPC, these approaches used an IDL to provide a common interface contract between the server and the client, which was used to generate “server skeleton code” and “client stub code” that abstracted away the underlying transport protocol and data handling.  This provided independence from transport and data handling issues, but it created a critical code dependency on the interface metadata (IDL), as well as middleware infrastructure needs that were often complicated and cumbersome except in very isolated systems.

Out of this hodge-podge of core needs, rotating architectural strategies and solutions, many lessons were learned.  One of the most critical was the realization that distributed computing was still in play, even when the transport overheads were nicely hidden from the programmer.  In the 1990s, several Sun Fellows defined the eight “Fallacies of Distributed Computing” – false assumptions they contend are made by programmers new to distributed computing.  These fallacies are still very relevant today (every web application is a distributed computing problem) and must be considered for any remote access.  They drove home critical findings practitioners were making across the various approaches, such as:

  • Difficulty in managing transport layer configurations
  • Difficulty in managing/navigating transport layer security
  • Fragility in communication across long lived states (such as remote object access)
  • Sub-optimal interoperability support across different computing systems
  • Sub-optimal performance due to high volumes of calls to functions (chattiness) and fine-grained operations (the “multiplier” effect)

Martin Fowler summed up these realizations nicely in his “Patterns of Enterprise Application Architecture” with: “First Law of Distributed Object Design: Don’t distribute your objects.”


Late 1990s

In the late 1990s, many practitioners began to leverage the ubiquity of new WWW capabilities such as HTTP – first to tunnel the earlier RPC and remote object techniques and thereby avoid transport layer management issues, then to take full advantage of mark-up by defining RPC calls in XML, which, like HTML, descends from SGML.  While this transformation was occurring, others saw the need for a different architectural style, and standards to support it, that would provide the proper functional granularity and non-functional practicality for distributed computing: Service-Oriented Architecture (SOA).  The Open Group defines SOA as:

  • Service-Oriented Architecture (SOA) is an architectural style that supports service-orientation.
    • Service-orientation is a way of thinking in terms of services and service-based development and the outcomes of services.
      • A service:
        • Is a logical representation of a repeatable business activity that has a specified outcome (e.g., check customer credit, provide weather data, consolidate drilling reports)
        • Is self-contained
        • May be composed of other services
        • Is a “black box” to consumers of the service

Standards were established over several years under the WS-* umbrella (a notation covering the many Web Service specifications), based on the lessons of EAI solutions and past distributed computing practices.  These standards depended upon a metadata description of the service interface and payloads via the Web Services Description Language (WSDL), with SOAP (originally an acronym for “Simple Object Access Protocol”, an expansion dropped in the 1.2 standard) describing the communication protocol.  Among the driving criteria were that the message payload be expressed as XML and that the service be defined as metadata that was both machine and human readable.  Frameworks were built to leverage the metadata to generate adaptors, like their IDL predecessors, to enforce the contract.  Over time, the complexity of the standards grew in an effort to provide a one-size-fits-all palette of options.  The weight of these standards and the overhead created by the code dependency on the service metadata again led to a complicated standard, a fragile dependency set and a burdensome maintenance effort.
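The idea of tunneling RPC over HTTP with an XML payload survives in Python’s standard library as XML-RPC, a direct ancestor of SOAP.  The sketch below runs a service and a consumer in one process purely for illustration; the port number and the add procedure are arbitrary choices, not part of any standard.

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

def add(a, b):
    # The remote procedure being exposed; trivially simple on purpose.
    return a + b

# The service: an HTTP server that unpacks XML request payloads into
# local procedure calls and packs the results back into XML.
server = SimpleXMLRPCServer(("127.0.0.1", 8731), logRequests=False)
server.register_function(add, "add")
threading.Thread(target=server.serve_forever, daemon=True).start()

# The consumer: the proxy packs arguments into an XML payload and
# POSTs it over HTTP, hiding the transport entirely.
client = ServerProxy("http://127.0.0.1:8731")
result = client.add(2, 3)
server.shutdown()

print(result)  # 5
```

Note how little the consumer sees of the XML or the HTTP – the same convenient opacity that made the fallacies of distributed computing so easy to fall into.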


2000 – Current

Roy Fielding published a description of the Representational State Transfer (REST) architectural style in his 2000 dissertation.  Fielding was an early leader in the evolution of the WWW and had participated in many of the Hypertext Transfer Protocol (HTTP) standards that drive its data communication, including the monumental 1.1 version of the standard that is in place today.  REST was the architectural style he used to govern those updates.  Fielding’s architectural description, together with Tim Berners-Lee’s emerging definition of the Semantic Web, reminded us that access to information was the original intent of the WWW and that the concept of resources was at its center.  Practitioners soon latched onto the notion of resource-oriented web services based on the REST architectural style.  This allowed them to hang their hats on the existing WWW conceptual model to counter the clumsy and complicated attempts of the past, while leveraging the same technical WWW infrastructure to replace the overweight middleware spawned in the late 20th century.
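The resource-oriented shift can be sketched as a small, uniform set of verbs applied to resources named by URIs, rather than an ever-growing list of procedure names.  The in-process dictionary below stands in for an HTTP server, and the /orders/42 resource and its contents are invented for illustration:

```python
# A toy resource store keyed by URI; a real system would be an HTTP
# server and these would be HTTP methods and status codes.
resources = {}

def handle(verb, uri, body=None):
    # One uniform interface covers every resource the "server" holds.
    if verb == "PUT":
        resources[uri] = body
        return 201, body            # Created
    if verb == "GET":
        if uri in resources:
            return 200, resources[uri]
        return 404, None            # Not Found
    if verb == "DELETE":
        return 204, resources.pop(uri, None)  # No Content
    return 405, None                # Method Not Allowed

handle("PUT", "/orders/42", {"status": "shipped"})
status, order = handle("GET", "/orders/42")
print(status, order)  # 200 {'status': 'shipped'}
```

Adding a new kind of resource adds new URIs, not new verbs – the inversion of the RPC habit of minting a new procedure for every operation.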

This brings us to the current “Web API” approach that has been rapidly gaining favor and ground over the past 10-12 years.  ProgrammableWeb has provided a repository of published APIs and tracked them since 2005.  They report that public APIs have grown from nearly non-existent in 2005 (with the exception of a few early notables such as Salesforce, eBay, Flickr and Amazon) to over 12,000 as of January 2015, with growth steepening since 2010.


Growth in Web APIs



The potential value of APIs has long been known, but technology limited the realization of many of these opportunities in a distributed fashion until modern web-based approaches became available.  We must remember that while emerging techniques reduce the friction of distributing data and function, other culprits may affect the usability of an API, such as simply finding the right actions within it and understanding how to use them.  We will continue to explore these culprits, and others, in future posts as we seek to understand what makes a great API.