Automatic Path Creation Service - APC
This document explains the architecture of the APC (Automatic Path
Creation) service, a component of ICEBERG that establishes
data flows between heterogeneous communication endpoints by composing
appropriate transcoding operators. Call agents of the ICEBERG
architecture request data path creation from the APC
service by providing the required endpoint
information. The APC service cleanly separates the data
paths from the control paths by encapsulating the data path creation and
instantiation process.
Architecture of APC:
This section describes the terminology used in the following sections.
Operators
An operator is a unit of computation on some data. Operators are
strongly typed: they have a clear definition of their input and output
types. In addition, operators have various attributes such as
communication protocol, computational requirements, static data input
(e.g. a database), etc. Automatic composition of services is possible
only on the basis of these attributes.
Operators can have multiple inputs
and outputs (e.g. aggregators, broadcasters). Communication using
multicast is one way to deal with this.
Operators have only soft state: if an
operator fails and is restarted, there is no need to recover its state
information. An application-level protocol is needed to provide
guarantees such as reliable, ordered, end-to-end data delivery.
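Strong typing is what makes automatic composition mechanical: one operator can feed another only when the output type of the first matches the input type of the second. The following is a minimal Java sketch of that check. The `OperatorSpec` class and the operator names used with it are hypothetical and are not part of the APC package; only the idea of typed inputs and outputs comes from the text above.

```java
import java.util.List;

// Hypothetical descriptor, not part of the APC package: it carries only the
// input and output data types that every operator is said to declare.
final class OperatorSpec {
    final String name;
    final String inputType;
    final String outputType;

    OperatorSpec(String name, String inputType, String outputType) {
        this.name = name;
        this.inputType = inputType;
        this.outputType = outputType;
    }

    // This operator can feed `next` only if the types line up exactly.
    boolean canFeed(OperatorSpec next) {
        return outputType.equals(next.inputType);
    }

    // A chain is well typed when every adjacent pair of operators composes.
    static boolean isWellTyped(List<OperatorSpec> chain) {
        for (int i = 0; i + 1 < chain.size(); i++) {
            if (!chain.get(i).canFeed(chain.get(i + 1))) {
                return false;
            }
        }
        return true;
    }
}
```

The other attributes mentioned above (protocol, computational requirements, static inputs) would add further conditions to `canFeed`; they are left out here to keep the sketch short.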
Connectors
A connector is an abstraction of the Application Data Unit (ADU)
transport mechanism between two operators (e.g., RTP connector, UDP
connector). Each connector is characterized by a specific transport
protocol.
Overview:
The APC service is responsible for creating, maintaining, and
eventually freeing up data paths consisting of operators strung together
by connectors. The overall path construction process can be
illustrated through the following diagram:
It is a continuous process of optimization.
- Step 1: Logical Path Creation:
A logical path is defined as an ordered sequence of
operators. Operators are registered using the SDS (Service
Discovery Service). Each operator has an XML description associated with
it, describing attributes such as input type, output type, and
supported communication protocols. More importantly, each
operator has various cost metrics associated with it
(e.g. computational latency, memory usage, and other
application-specific QoS metrics). Users provide the goals for the
optimization process; these goals can vary with the
application. The logical path creation process should return a
list of possible paths, ordered by decreasing cost according to
the user's input parameters for optimization.
The current implementation is simplified: only the
format type is used to describe operators. Furthermore,
the APC service does not yet use a bootstrapping SDS to dynamically
discover operators; instead, it receives this information from
static configuration supplied beforehand. No search for an
optimized path is performed either. A depth-first search
is used to find the first matching path. Because
the paths constructed so far usually consist of only
one to four operators, such optimization is
unnecessary.
- Step 2: Physical Path Creation:
This step is tightly coupled with the first step. There are two
types of operators: those that can be created on the fly by
downloading the code, and those that cannot be dynamically
instantiated, so that only existing instances can be used.
Instances of running operators are again advertised and
discovered through the SDS. Again, for simplification,
this implementation only deploys operators that are dynamically
created, and destroyed after the path has been torn down.
- Step 3: Path instantiation, execution, maintenance, querying:
Given the physical path descriptions, the operator instances
are created and data flow is started from the source
endpoint. During the lifetime of the path, the APC service
actively monitors the liveness of the operators to make sure
that they are functional. Any operator can also report problems
with its neighboring operators to the APC service, so that the
path can be repaired when necessary. The control path plays an
important role here, since it is how operator deletion,
insertion, and repair are accomplished. The control path is used for
exception handling, controlling parameters of path components,
and monitoring and analyzing path performance; thus, it needs to be
independent of the data paths and highly robust. A data path can be
"walked" using the control path. The APC service always holds
a handle to every path component. In addition, the control
path can overlap the data path: each path component (operator)
can have a handle to its two neighboring operators. The control
paths allow the work status of the operators to be queried.
In the next release, we plan to implement features that allow
querying application-specific information, such as what
percentage of the data has been processed already.
- Step 4: Path tear down:
Once the path has finished providing the service, or the end users
have sent a termination request, the path can be torn down
and its resources freed. Alternatively, as an optimization, the path
(both its logical and its physical path
description) can be cached and reused in the future, in which case
it is kept rather than torn down.
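The depth-first search mentioned in Step 1 can be sketched as follows. This is an illustration under stated assumptions, not the APC source: each operator is modeled as a hypothetical {name, input type, output type} triple, the operator table is a statically supplied list, and the search returns the first chain whose types match, with no cost-based ranking. The class name `LogicalPathFinder` and the operator names in the usage note are invented for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Illustrative only: class and method names are assumptions, not APC identifiers.
final class LogicalPathFinder {
    // Each operator is a {name, inputType, outputType} triple.
    private final List<String[]> operators;

    LogicalPathFinder(List<String[]> operators) {
        this.operators = operators;
    }

    // Returns the operator names of the first type-matching chain found by
    // depth-first search, or an empty Optional if no chain exists.
    Optional<List<String>> findFirstPath(String srcType, String dstType) {
        List<String> path = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        visited.add(srcType);
        if (dfs(srcType, dstType, visited, path)) {
            return Optional.of(path);
        }
        return Optional.empty();
    }

    private boolean dfs(String current, String goal, Set<String> visited, List<String> path) {
        if (current.equals(goal)) {
            return true;
        }
        for (String[] op : operators) {
            // Follow operators that accept the current type and produce a
            // type that has not been explored yet.
            if (op[1].equals(current) && visited.add(op[2])) {
                path.add(op[0]);
                if (dfs(op[2], goal, visited, path)) {
                    return true;
                }
                path.remove(path.size() - 1); // dead end: backtrack
            }
        }
        return false;
    }
}
```

With a table containing the hypothetical operators `mp3ToPcm` (mp3 to pcm) and `pcmToGsm` (pcm to gsm), a request from `mp3` to `gsm` yields the two-operator chain. Returning a cost-ordered list of paths, as the design calls for, would require enumerating all chains rather than stopping at the first.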
Implementation:
This section describes some of the interesting implementation details
and functionalities provided by APC in the current release.
Simplifications of the design:
For the purpose of prototyping, we made a number of simplifications to
the above ideal design:
- Simplified description of operators: currently each operator is
uniquely described by its input and output types. No XML
descriptions are used right now. Furthermore, we assume that each
operator has a single input and a single output. Operators are
created dynamically and destroyed after use. The APC
service is statically configured with all known operators
beforehand.
- Process-per-operator model: each operator instance is a process
that is not shared among path instances.
APC service:
The APC service itself is a Ninja iSpace cluster service. The idea is
that multiple nodes will implement the APC service, providing
fault tolerance and load balancing. Because this requires
sharing data among the nodes, this release only
considers a single-node APC service.
Connection Manager:
This is also an iSpace service that must run on every node
on which operators will run. The Connection Manager has an API for
loading, creating, repairing, and destroying operators. It is
also responsible for creating and maintaining connections from or to
the operators running on the node where the connection manager
resides. A connector consists of two objects, one in each of the two
communicating operators: a writer object and
a reader object.
For simplicity, the operators created by each connection
manager follow a process-per-use model: operators are not shared among
different path instances. Operators are created on demand for each
newly instantiated path and destroyed when the path is
torn down.
The physical paths created by the APC service always contain a
source endpoint operator. In cases where such an operator does not
make sense, it can be a null
operator. This is provided by a dummy operator and a dummy
connection manager implementing the connection manager interface.
Partial Path Repair (PPR):
At any given time, an operator can fail, or the connection between
two operators can be broken. In this release, we support partial path
repair: rather than tearing down and restarting the entire path,
failing operators are restarted on new nodes in order to cause
minimal disturbance to the end users.
A failure is discovered when an operator catches an I/O exception
while communicating with a neighboring operator. The failing
operator is identified by the APC service and restarted on a different
node.
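The repair step can be pictured as replacing a single entry in the path's operator-to-host placement while leaving the rest untouched. The sketch below is a simplified model, not the APC implementation: the `PathPlacement` class, its method names, and the host-selection rule (first listed host that differs from the failed one) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of one physical path: operator name -> host it runs on.
final class PathPlacement {
    private final Map<String, String> hostByOperator = new LinkedHashMap<>();
    private final List<String> availableHosts;

    PathPlacement(List<String> availableHosts) {
        this.availableHosts = new ArrayList<>(availableHosts);
    }

    void place(String operator, String host) {
        hostByOperator.put(operator, host);
    }

    String hostOf(String operator) {
        return hostByOperator.get(operator);
    }

    // Restart only the failed operator on a different host; every other
    // operator keeps its placement. Returns the newly chosen host.
    String repair(String failedOperator) {
        String oldHost = hostByOperator.get(failedOperator);
        if (oldHost == null) {
            throw new IllegalArgumentException("unknown operator: " + failedOperator);
        }
        for (String candidate : availableHosts) {
            if (!candidate.equals(oldHost)) {
                hostByOperator.put(failedOperator, candidate);
                return candidate;
            }
        }
        throw new IllegalStateException("no alternative host available");
    }
}
```

The sketch covers only the placement decision. Reconnecting the restarted operator to its neighbors, which the connection managers would have to do, is omitted.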
Path types currently implemented:
from MP3 to gsm
from gsm/pcm to pcm/gsm
from text to gsm/pcm
from pcm/gsm to text
API to APC explained:
The API to APC is rather simple and consists only of the following
five methods (pathRequest has two overloads):
- int pathRequest(EndPointInfo srcEndpt, EndPointInfo
destEndpt)
This function is called to request that a
new data path be built from the source endpoint described by srcEndpt
to the destination endpoint described by destEndpt. EndPointInfo is a data
structure containing the information needed for path
construction (e.g. data format, IP address, and port number).
Details are explained in the IAP section.
Each path constructed is identified by a unique path ID, which is
returned to the caller. This ID is needed as an
input parameter to the other functions.
- int pathRequest(int pathid, String
inputType, String outputType,
String inputHost, String inputPort,
String outputHost, String outputPort,
Object argList1[], Object argList2[],
String cName, String
sessionID)
This function is the same as the one above,
except that the contents of the EndPointInfo are passed
explicitly rather than encapsulated in a data structure.
- void changePathEndPt(int pathId, String
newAddress, String
newPort)
This function changes the destination
IP address and port of the first operator of the physical path
identified by pathId. This
address specifies where the data are sent. Currently, it is
used to change the Vat operator's input argument from a unicast to
a multicast address or vice versa.
- void changePath(int pathId, EndPointInfo
newSrcEndPtInfo)
This function is a more
general version of the function above: it changes
the properties of the first operator
of the physical path identified by pathId. Ideally, the
first operator should be changed without restarting the entire
path. For simplicity, the current implementation
tears down the entire path and restarts it. A future release is
expected to optimize this.
- void pathTearDown(int
pathId)
This function tears down the path
identified by pathId to free any resources it uses.
- void repairPath(String opName, String
inOutType, int connectionType, int pathID)
This function repairs
the path with the given path ID from the perspective of the operator
named opName. If inOutType is "in", the
operator from which the given operator receives its input data
is broken. If inOutType is "out", the operator that receives
the given operator's output data is broken. connectionType describes
the type of connection being repaired. This function is called by the
connection manager of the operator that notices its neighboring
operator's failure.
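A call agent's use of this API might look like the sketch below. The `ApcApi` interface copies three of the signatures listed above. Everything else is an assumption made so that the example runs without an iSpace installation: `EndPointInfo` is reduced to three hypothetical fields (the real structure is described in the IAP section), and `InMemoryApc` is a stand-in that only records path endpoints and creates no operators.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, reduced version of the endpoint structure.
final class EndPointInfo {
    final String dataFormat;
    final String host;
    final int port;

    EndPointInfo(String dataFormat, String host, int port) {
        this.dataFormat = dataFormat;
        this.host = host;
        this.port = port;
    }
}

// Subset of the APC API, using the signatures given in the text.
interface ApcApi {
    int pathRequest(EndPointInfo srcEndpt, EndPointInfo destEndpt);
    void changePath(int pathId, EndPointInfo newSrcEndPtInfo);
    void pathTearDown(int pathId);
}

// Stand-in written for this example: it tracks path records only.
final class InMemoryApc implements ApcApi {
    private final Map<Integer, EndPointInfo[]> paths = new HashMap<>();
    private int nextPathId = 1;

    public int pathRequest(EndPointInfo srcEndpt, EndPointInfo destEndpt) {
        int pathId = nextPathId++;
        paths.put(pathId, new EndPointInfo[] {srcEndpt, destEndpt});
        return pathId; // the caller passes this ID to the other methods
    }

    public void changePath(int pathId, EndPointInfo newSrcEndPtInfo) {
        EndPointInfo[] endpoints = paths.get(pathId);
        if (endpoints == null) {
            throw new IllegalArgumentException("unknown path: " + pathId);
        }
        endpoints[0] = newSrcEndPtInfo;
    }

    public void pathTearDown(int pathId) {
        paths.remove(pathId);
    }

    boolean isActive(int pathId) {
        return paths.containsKey(pathId);
    }

    EndPointInfo sourceOf(int pathId) {
        return paths.get(pathId)[0];
    }
}
```

A typical sequence is `pathRequest` to obtain the path ID, zero or more `changePath` calls while the session is live, and `pathTearDown` when the session ends.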
How to add operators and connectors to the APC package:
- To add operators: If the operator is a Unix process
command, extend the ProcessOperator class. Otherwise,
extend the Operator class and specify its connection
mechanism and the other properties of the operator. All operators
need to specify their natural input and output block sizes to help
reduce the overhead of sending small pieces of data over the wire.
- To add connectors: If the connector is a stream connector,
create a reader that implements the
StreamConnectorReaderIF interface and a writer that implements the
StreamConnectorWriterIF interface. Currently, only streaming connectors
are supported: UDP and RTP.
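To make the reader/writer split concrete, here is a small Java sketch of a UDP connector pair. The method signatures of StreamConnectorReaderIF and StreamConnectorWriterIF are not given in this document, so the sketch defines its own hypothetical interfaces, `AduReader` and `AduWriter`, rather than guessing at the real ones. It uses only the standard `java.net` datagram classes and treats each datagram as one ADU.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.util.Arrays;

// Hypothetical stand-ins for the two halves of a connector.
interface AduReader extends AutoCloseable {
    byte[] read() throws IOException;
    void close();
}

interface AduWriter extends AutoCloseable {
    void write(byte[] adu) throws IOException;
    void close();
}

// Reader half: lives in the downstream operator and receives datagrams.
final class UdpReader implements AduReader {
    private final DatagramSocket socket;
    private final byte[] buffer = new byte[2048];

    UdpReader(int port) throws IOException {
        socket = new DatagramSocket(port); // port 0 lets the OS pick a free port
        socket.setSoTimeout(2000);         // fail with an exception instead of blocking forever
    }

    int port() {
        return socket.getLocalPort();
    }

    public byte[] read() throws IOException {
        DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
        socket.receive(packet);
        return Arrays.copyOf(packet.getData(), packet.getLength());
    }

    public void close() {
        socket.close();
    }
}

// Writer half: lives in the upstream operator and sends datagrams.
final class UdpWriter implements AduWriter {
    private final DatagramSocket socket;
    private final InetAddress host;
    private final int port;

    UdpWriter(String host, int port) throws IOException {
        this.socket = new DatagramSocket();
        this.host = InetAddress.getByName(host);
        this.port = port;
    }

    public void write(byte[] adu) throws IOException {
        socket.send(new DatagramPacket(adu, adu.length, host, port));
    }

    public void close() {
        socket.close();
    }
}
```

The receive timeout means a lost datagram or a dead neighbor surfaces as an I/O exception in the reader, which is the kind of signal the failure detection described earlier relies on.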
How to run the APC service:
In the iSpace configuration file, which is the file passed as an argument
to ninja.iSpace.Main, specify this entry to run the
APC service:
iceberg.APCpath.services.APC host_file_name
The host_file_name file contains the names of the hosts available to
run operators, one host name per line; an
example can be found here.
On each host listed in host_file_name, a
connection manager service or a service that implements the
connection manager interface needs to be run. For example, one of
the following entries needs to be in the iSpace configuration file
on each of those hosts.
iceberg.APCpath.services.ConnectorMgr
iceberg.APCpath.services.DummyCM
iceberg.APCpath.services.VatCM
iceberg.APCpath.services.GsmCM
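Reading a host file in the format described above (one host name per line) takes only a few lines of Java. The sketch below is an illustration, not the APC service's own loader: the `HostFileReader` class is hypothetical, and trimming whitespace and skipping blank lines are assumptions about how such a file would reasonably be handled.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: parses a host file with one host name per line.
final class HostFileReader {
    static List<String> readHosts(Path hostFile) throws IOException {
        List<String> hosts = new ArrayList<>();
        for (String line : Files.readAllLines(hostFile)) {
            String host = line.trim();
            if (!host.isEmpty()) { // assumption: blank lines are ignored
                hosts.add(host);
            }
        }
        return hosts;
    }
}
```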
Known caveats/limitations of APC:
This section documents the known caveats and limitations of APC. Due to time
constraints, they have not been fixed, but they do not affect the functionality
of the other ICEBERG components. We expect to address them in the next
release.
- Each path assumes that each operator, uniquely identified by its
class name, appears only once. Proposed fix: use the sequence number
and class name together to uniquely identify each operator
within a path.
- Process-per-operator model: each operator instance is a
process. No multithreading support exists, and operators are not
shared among different path instances. The result of this
simplified model is noticeable latency due to process execs in
real-time applications (text-speech, mp3-gsm paths) and limited
scalability in terms of the number of operators that can be
supported per node.
- Lack of flow control: Real-time applications need
flow control to buffer or slow down data when
necessary. Currently, no generic flow control mechanism
exists. Each operator has a known output block size and input
block size. The minimum size of the output data from any operator
is given by its output block size, which limits the overhead of
sending data over the wire.
- Simplified connector model: The current connector model is fixed
to be UDP in the middle and RTP at the two endpoints. This model
fits all the paths constructed for ICEBERG applications well;
however, it is not general enough and needs to be modified in
a future release.
- Tight coupling between operators and connectors: Currently there
are only two types of connectors: UDP and RTP. An operator can
optionally support multiple types of connectors. In order to
achieve that, connector requirements need to be part of path
specifications.
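The fix proposed in the first item, identifying an operator by its class name together with its sequence number in the path, amounts to using a composite key. A minimal sketch follows; the `OperatorKey` class is hypothetical and is shown only to illustrate the idea.

```java
import java.util.Objects;

// Hypothetical composite key: class name plus position within the path, so
// the same operator class can appear more than once in one path.
final class OperatorKey {
    final String className;
    final int sequenceNumber;

    OperatorKey(String className, int sequenceNumber) {
        this.className = className;
        this.sequenceNumber = sequenceNumber;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof OperatorKey)) {
            return false;
        }
        OperatorKey that = (OperatorKey) other;
        return sequenceNumber == that.sequenceNumber && className.equals(that.className);
    }

    @Override
    public int hashCode() {
        return Objects.hash(className, sequenceNumber);
    }

    @Override
    public String toString() {
        return className + "#" + sequenceNumber;
    }
}
```

Because `equals` and `hashCode` cover both fields, two instances of the same class at different positions can be stored side by side in a map or set keyed by `OperatorKey`.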
Z. Morley Mao,
zmao@cs.berkeley.edu
Last modified: Thu Jun 8 17:29:48 PDT 2000