Automatic Path Creation Service - APC

This document explains the architecture of the Automatic Path Creation (APC) service, a component of ICEBERG that establishes data flows between heterogeneous communication endpoints by composing appropriate transcoding operators. Call agents in the ICEBERG architecture request data path creation from the APC service by providing the required endpoint information. The APC service cleanly separates data paths from control paths by encapsulating the data path creation and instantiation process.

Architecture of APC
API to APC explained
How to add operators and connectors
How to run the APC service
Known caveats/limitations to APC
Javadoc to APC

Architecture of APC:

Terminology:

This section describes the terminology used in the following sections.

Operators

An operator is a unit of computation on some data. Operators are strongly typed: they have a clear definition of their input and output types. In addition, operators have various attributes such as communication protocol, computational requirements, and static data input (e.g. a database). Automatic composition of services is possible only on the basis of these attributes.
Operators can have multiple inputs and outputs (i.e. aggregators, broadcasters). Communication using multicast is one way to deal with this.
Operators have only soft state: if an operator fails and is restarted, there is no need to recover its state information. An application-level protocol is needed to provide guarantees such as reliable, ordered, end-to-end data delivery.

Connectors

A connector is an abstraction of the Application Data Unit (ADU) transport mechanism between two operators (e.g., an RTP connector or a UDP connector). Each connector is characterized by a specific transport protocol.

Overview:

APC service is responsible for creating, maintaining, and eventually freeing up data paths consisting of operators strung together by connectors. The overall path construction process can be illustrated through the following diagram:

It is a continuous process of optimization.

Step 1: Logical Path Creation:
A logical path is an ordered sequence of operators. Operators are registered using the SDS (Service Discovery Service). Each operator has an associated XML description of its attributes, such as input type, output type, and supported communication protocols. More importantly, each operator has various cost metrics associated with it (e.g. computational latency, memory usage, and other application-specific QoS metrics). Users provide the goals for the optimization process; these goals vary by application. The logical path creation process should return a list of possible paths, ordered by decreasing cost according to the user's optimization parameters.
Currently, the implementation is simplified: only the format type is used to describe operators. Furthermore, the APC service does not yet use a bootstrapping SDS to discover operators dynamically; instead, it receives this information from static configuration supplied beforehand. The search for an optimized path is not performed either. A depth-first search finds the first matching path; since the paths constructed so far usually consist of only one to four operators, such optimization is unnecessary.
Step 2: Physical Path Creation:
This step is tightly coupled with the first step. There are two types of operators: those that can be created on the fly by downloading code, and those that cannot be dynamically instantiated, so that only existing instances can be used. Running operator instances are again advertised and discovered through the SDS. Again, for simplification, this implementation deploys only operators that are dynamically created and then destroyed after the path has been torn down.
Step 3: Path instantiation, execution, maintenance, querying:
Given the physical path descriptions, the operator instances are created and data flow is started from the source endpoint. During the lifetime of the path, the APC service actively monitors the liveness of the operators to make sure that they are functional. Any operator can also report problems with its neighboring operators to the APC service, so that the path can be repaired when necessary. The control path plays an important role here, since it is how operator deletion, insertion, and repair are accomplished. The control path is used for exception handling, controlling the parameters of path components, and monitoring and analyzing path performance; thus, it needs to be independent of the data paths and highly robust. A data path can be "walked" using the control path. The APC service holds a handle to every path component. In addition, the control path can overlap the data path: each path component (operator) can have a handle to its two neighboring operators. The control paths allow the work status of the operators to be queried. In the next release, we plan to implement features that allow one to query application-specific information, such as the percentage of the data already processed.
Step 4: Path tear down:
Once the path has finished providing the service, or the end users have sent a termination request, the path can be torn down and its resources freed. As an optimization, the path (both its logical and its physical description) can instead be cached and kept for future reuse rather than being torn down.
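
The depth-first search mentioned in Step 1 can be sketched as follows. This is a minimal illustration, not the APC source: the classes OperatorDesc and LogicalPathFinder, the operator names, and the intermediate pcm format in the usage note are all assumptions. It describes each operator only by its input and output format and returns the first matching sequence of operators.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical operator description: a name plus input and output format. */
final class OperatorDesc {
    final String name;
    final String inputType;
    final String outputType;

    OperatorDesc(String name, String inputType, String outputType) {
        this.name = name;
        this.inputType = inputType;
        this.outputType = outputType;
    }
}

/** Finds the first operator sequence that converts one format into another. */
final class LogicalPathFinder {
    private final List<OperatorDesc> known; // statically configured operators

    LogicalPathFinder(List<OperatorDesc> known) {
        this.known = known;
    }

    /** Returns operator names in path order, or null if no path exists. */
    List<String> findPath(String from, String to) {
        List<String> path = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        visited.add(from);
        return dfs(from, to, visited, path) ? path : null;
    }

    private boolean dfs(String current, String target,
                        Set<String> visited, List<String> path) {
        if (current.equals(target)) {
            return true;
        }
        for (OperatorDesc op : known) {
            // Skip operators that do not accept the current format,
            // and formats that were already explored (avoids cycles).
            if (!op.inputType.equals(current) || visited.contains(op.outputType)) {
                continue;
            }
            visited.add(op.outputType);
            path.add(op.name);
            if (dfs(op.outputType, target, visited, path)) {
                return true;
            }
            path.remove(path.size() - 1); // backtrack
        }
        return false;
    }
}
```

With two made-up operators, one converting mp3 to pcm and one converting pcm to gsm, findPath("mp3", "gsm") returns both names in order. Because the search stops at the first match, it makes no attempt to rank paths by cost.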

Implementation:

This section describes some of the interesting implementation details and functionalities provided by APC in the current release.

Simplifications of the design:

For the purpose of prototyping, we made a number of simplifications to the above ideal design:

Simplified description of operators: currently each operator is uniquely described by its input and output types. No XML descriptions are used. Furthermore, we assume that each operator has a single input and a single output. Operators are created dynamically and destroyed after use. The APC service is statically configured with all known operators beforehand.
Process per operator model: each operator instance is a process that is not shared among other instances of the paths.

APC service:

The APC service itself is a Ninja iSpace cluster service. The idea is that multiple nodes will implement the APC service, providing fault tolerance and load balancing. This will require sharing data among the nodes, which is not yet supported; thus, in this release, we consider only a single-node APC service.

Connection Manager:

This is also an iSpace service, and it must run on every node on which operators will run. The Connection Manager has an API for loading, creating, repairing, and destroying operators. It is also responsible for creating and maintaining connections to and from the operators running on the node where the connection manager resides. A connector consists of two objects, one in each of the two communicating operators: a writer object and a reader object.
For simplicity, the operators created by each connection manager have a process per usage model: operators are not shared among different path instances. Operators are created on demand for each new path instantiated and destroyed after the path is requested to be torn down.
The physical paths created by the APC service always contain the source endpoint operator. In certain cases, it does not make sense to have such an operator; thus, it can be a null operator. This is provisioned by a dummy operator and a dummy connection manager implementing the connection manager interface.

Partial Path Repair (PPR):

At any given time, any operator can fail, or the connection between two operators can be broken. In this release, we support partial path recovery: rather than tearing down and restarting the entire path, failing operators are restarted on new nodes in order to cause minimal disturbance to the end users.
Failure is discovered when an operator catches an I/O exception on the connection to a neighboring operator. The failing operator is identified by the APC service and restarted on a different node.
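
The detection step can be sketched as follows. This is an assumed illustration rather than the APC code: RepairReporter and MonitoredReader are hypothetical names, the callback merely mirrors the repairPath signature from the API section, and the meaning of the connectionType value is not defined in this document.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical callback that mirrors the repairPath signature of the APC API. */
interface RepairReporter {
    void repairPath(String opName, String inOutType, int connectionType, int pathID);
}

/** Reads input for one operator and reports a broken upstream neighbor. */
final class MonitoredReader {
    private final String opName;
    private final int pathId;
    private final int connectionType; // placeholder value; real constants are not shown here
    private final InputStream in;
    private final RepairReporter reporter;

    MonitoredReader(String opName, int pathId, int connectionType,
                    InputStream in, RepairReporter reporter) {
        this.opName = opName;
        this.pathId = pathId;
        this.connectionType = connectionType;
        this.in = in;
        this.reporter = reporter;
    }

    /**
     * Reads one block into buf. Returns the number of bytes read, or -1 at
     * end of stream or after an I/O failure. A failure is reported with
     * inOutType "in", meaning the operator supplying the input is broken.
     */
    int readBlock(byte[] buf) {
        try {
            return in.read(buf);
        } catch (IOException e) {
            reporter.repairPath(opName, "in", connectionType, pathId);
            return -1;
        }
    }
}
```

A matching writer would report "out" in the same way when a write to the downstream operator fails.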

Existing implementation of the types of paths:

from MP3 to gsm
from gsm/pcm to pcm/gsm
from text to gsm/pcm
from pcm/gsm to text

API to APC explained:

The API to APC is rather simple and consists only of the following six methods:

int pathRequest(EndPointInfo srcEndpt, EndPointInfo destEndpt)
This function requests a new data path from the source endpoint described by srcEndpt to the destination endpoint described by destEndpt. EndPointInfo is a data structure containing the information needed for path construction (e.g. data format, IP address, and port number). Details are explained in the IAP section. Each constructed path is identified by a unique path ID, which is returned to the caller and is needed as an input parameter of the other functions.
int pathRequest(int pathid, String inputType, String outputType, String inputHost, String inputPort, String outputHost, String outputPort, Object argList1[], Object argList2[], String cName, String sessionID)
This function is the same as the above function with the contents of the EndPointInfo explicitly passed in rather than encapsulated in a data structure. The functionality is the same.
void changePathEndPt(int pathId, String newAddress, String newPort)
This function changes the destination IP address and port of the first operator of the physical path identified by pathId. This address specifies where the data is sent. Currently, it is used to change the Vat operator's input argument from a unicast to a multicast address or vice versa.
void changePath(int pathId, EndPointInfo newSrcEndPtInfo)
This function is a more general version of changePathEndPt: it changes the properties of the first operator of the physical path identified by pathId. Ideally, the first operator should be changed without restarting the entire path. For simplicity, the current implementation tears down the entire path and restarts it. A future release is expected to optimize this.
void pathTearDown(int pathId)
This function tears down the path identified by pathId and frees any resources it used.
void repairPath(String opName, String inOutType, int connectionType, int pathID)
This function repairs the path with the given path ID from the perspective of the operator named opName. If inOutType is "in", the operator from which the given operator receives its input data is broken. If inOutType is "out", the operator to which the given operator sends its output data is broken. connectionType describes the type of connection being repaired. This function is called by the connection manager of the operator that notices its neighboring operator's failure.
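
To show how a caller might use these methods together, here is a rough sketch. The method signatures are copied from the list above (the long pathRequest overload is left out), but everything else is an assumption: the EndPointInfo fields, the ApcApi interface name, the InMemoryApc class, and the host names in the usage note are made up, and the toy implementation only tracks path IDs without building any real path.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in: the real EndPointInfo fields are not listed here. */
final class EndPointInfo {
    final String dataFormat;
    final String host;
    final String port;

    EndPointInfo(String dataFormat, String host, String port) {
        this.dataFormat = dataFormat;
        this.host = host;
        this.port = port;
    }
}

/** Signatures as listed above; the long pathRequest overload is omitted. */
interface ApcApi {
    int pathRequest(EndPointInfo srcEndpt, EndPointInfo destEndpt);
    void changePathEndPt(int pathId, String newAddress, String newPort);
    void changePath(int pathId, EndPointInfo newSrcEndPtInfo);
    void pathTearDown(int pathId);
    void repairPath(String opName, String inOutType, int connectionType, int pathID);
}

/** Toy implementation: records endpoints per path ID and builds no real path. */
final class InMemoryApc implements ApcApi {
    private final Map<Integer, EndPointInfo[]> paths = new HashMap<>();
    private int nextId = 1;

    @Override public int pathRequest(EndPointInfo srcEndpt, EndPointInfo destEndpt) {
        int id = nextId++;
        paths.put(id, new EndPointInfo[] { srcEndpt, destEndpt });
        return id;
    }

    @Override public void changePathEndPt(int pathId, String newAddress, String newPort) {
        lookup(pathId); // only validates the ID; a real service would redirect the first operator
    }

    @Override public void changePath(int pathId, EndPointInfo newSrcEndPtInfo) {
        lookup(pathId)[0] = newSrcEndPtInfo;
    }

    @Override public void pathTearDown(int pathId) {
        paths.remove(pathId);
    }

    @Override public void repairPath(String opName, String inOutType,
                                     int connectionType, int pathID) {
        lookup(pathID);
        if (!"in".equals(inOutType) && !"out".equals(inOutType)) {
            throw new IllegalArgumentException("inOutType must be \"in\" or \"out\"");
        }
    }

    boolean isActive(int pathId) {
        return paths.containsKey(pathId);
    }

    EndPointInfo source(int pathId) {
        return lookup(pathId)[0];
    }

    private EndPointInfo[] lookup(int pathId) {
        EndPointInfo[] endpoints = paths.get(pathId);
        if (endpoints == null) {
            throw new IllegalArgumentException("unknown path " + pathId);
        }
        return endpoints;
    }
}
```

A typical call sequence is: obtain an ID from pathRequest, pass it to changePath or changePathEndPt while the path is in use, and finally pass it to pathTearDown.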

How to add operators and connectors to the APC package:

To add operators: If the operator is a Unix process command, extend the ProcessOperator class. Otherwise, extend the Operator class and specify the connection mechanism and the other properties of the operator. All operators need to specify their natural input and output block sizes, which helps reduce the overhead of sending small pieces of data over the wire.
To add connectors: If the connector is a stream connector, create a reader that implements the StreamConnectorReaderIF interface and a writer that implements the StreamConnectorWriterIF interface. Currently, only stream connectors are supported: UDP and RTP.
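
The purpose of the output block size can be illustrated with a small buffering sketch. It assumes that output is released only in whole blocks; the BlockBuffer class is hypothetical and is not part of the APC package, whose Operator and ProcessOperator classes are not reproduced here.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

/** Collects operator output and releases it only in full blocks of a fixed size. */
final class BlockBuffer {
    private final int blockSize;
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();

    BlockBuffer(int blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        this.blockSize = blockSize;
    }

    /** Adds data and returns every complete block now available, oldest first. */
    List<byte[]> write(byte[] data) {
        pending.write(data, 0, data.length);
        byte[] all = pending.toByteArray();
        List<byte[]> blocks = new ArrayList<>();
        int offset = 0;
        while (all.length - offset >= blockSize) {
            byte[] block = new byte[blockSize];
            System.arraycopy(all, offset, block, 0, blockSize);
            blocks.add(block);
            offset += blockSize;
        }
        // Keep the incomplete remainder until more data arrives.
        pending.reset();
        pending.write(all, offset, all.length - offset);
        return blocks;
    }

    /** Number of bytes waiting for the next block to fill up. */
    int pendingBytes() {
        return pending.size();
    }
}
```

With a block size of 4, writing 3 bytes releases nothing, and writing 6 more releases two blocks and leaves 1 byte pending. A writer built this way sends fewer, larger packets, at the price of added latency while a block fills.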

How to run the APC service:

In the iSpace configuration file, which is the file passed as argument to ninja.iSpace.Main, one needs to specify this entry to run the APC service:
iceberg.APCpath.services.APC host_file_name
The file host_file_name contains the names of the hosts available to run operators, one host name per line; an example can be found here.
On each of the hosts specified in host_file_name, a connection manager service, or a service that implements the connection manager interface, needs to run. For example, one of the following entries needs to appear in the iSpace configuration file on each of those hosts.
iceberg.APCpath.services.ConnectorMgr
iceberg.APCpath.services.DummyCM
iceberg.APCpath.services.VatCM
iceberg.APCpath.services.GsmCM
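
A host file of the form described above (one host name per line) could be read as sketched below. The HostFile class is hypothetical, and trimming whitespace and skipping blank lines are assumptions; how the APC service itself parses the file is not documented here.

```java
import java.util.ArrayList;
import java.util.List;

/** Parses the text of a host file that lists one host name per line. */
final class HostFile {
    private HostFile() {
    }

    /** Returns the host names in file order, ignoring blank lines. */
    static List<String> parse(String content) {
        List<String> hosts = new ArrayList<>();
        for (String line : content.split("\\R")) { // \R matches any line break
            String host = line.trim();
            if (!host.isEmpty()) {
                hosts.add(host);
            }
        }
        return hosts;
    }
}
```

For example, a file containing the two made-up names node1.example.org and node2.example.org on separate lines yields a list of those two names.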

Known caveats/limitations of APC:

This section documents the known caveats and limitations of APC. Due to time limitations, they have not been fixed, but they do not affect the functionality of the other ICEBERG components. They are expected to be addressed in the next release.

Each path assumes that each operator uniquely identified by its class name appears only once. Bug fix: use sequence number and class name together to uniquely identify each operator within a path.
Process per operator model: each operator instance is a process. No multithreading support exists -- operators are not shared among different instances of paths. The result of this simplified model is noticeable latency due to process execs in real-time applications (text-speech, mp3-gsm paths) and limited scalability in terms of the number of operators that can be supported per node.
Lack of flow control: Real-time applications need to have necessary flow control to buffer or slow down data when necessary. Currently, no generic flow control mechanism exists. Each operator has a known output block size and input block size. The minimal size of output data from any operator is specified by its output block size to limit the overhead of sending data over the wire.
Simplified connector model: The current connector model is fixed to be UDP in the middle and RTP at the two endpoints. This model fits all the paths constructed for ICEBERG applications well; however, it is not general enough and needs to be modified in a future release.
Tight coupling between operators and connectors: Currently there are only two types of connectors: UDP and RTP. An operator can optionally support multiple types of connectors. In order to achieve that, connector requirements need to be part of path specifications.
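
The fix proposed for the first caveat, identifying an operator by its class name together with a sequence number, could look roughly like this. OperatorKey is a hypothetical class, and using the position in the path as the sequence number is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Identifies an operator within one path by class name plus sequence number. */
final class OperatorKey {
    final String className;
    final int sequence;

    OperatorKey(String className, int sequence) {
        this.className = className;
        this.sequence = sequence;
    }

    /** Assigns sequence numbers 0, 1, 2, ... in path order. */
    static List<OperatorKey> forPath(List<String> classNames) {
        List<OperatorKey> keys = new ArrayList<>();
        for (int i = 0; i < classNames.size(); i++) {
            keys.add(new OperatorKey(classNames.get(i), i));
        }
        return keys;
    }

    @Override public boolean equals(Object other) {
        if (!(other instanceof OperatorKey)) {
            return false;
        }
        OperatorKey that = (OperatorKey) other;
        return sequence == that.sequence && className.equals(that.className);
    }

    @Override public int hashCode() {
        return Objects.hash(className, sequence);
    }

    @Override public String toString() {
        return className + "#" + sequence;
    }
}
```

With such a key, a path may contain the same operator class more than once, and a repair request can still name the exact instance that failed.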

Javadoc to APC:

Z. Morley Mao, zmao@cs.berkeley.edu

Last modified: Thu Jun 8 17:29:48 PDT 2000