[ts-gen] Architecture for the downstream [long]
pippin at owlriver.net
Fri Apr 4 20:24:18 EDT 2008
An off-list poster has wondered what architecture should be used by
downstream programs with which to access the IB tws api through the
shim, and commented that our test programs suggest that scripts pipe
commands into the shim:
> ... you propose a command (script) --- pipe --- parser (sub-process)
> model of plugin ...
I'd like to take this opportunity to give trading-shim users some idea
of the downstream architecture that the current roadmap for the shim is
meant to support, list some principles underlying the design of the
shim, and then use divide-and-conquer to sketch a possible autonomous
[no gui input] downstream trading application.
A: First the shim architecture principles:
1. Downstream input to the shim is via some sort of stream, so that
there is in-order delivery of text.
2. The shim command language is to be simple, concise, and, given some
knowledge of database values, human readable.
3. For those request types it supports, the shim should give access
to the full power of the api request.
4. A stored-value database bridges the mismatch between the relatively
brief commands and the longer binary requests that result.
5. The shim should serialize the text-format of all command, request, and
message events to a variety of output channels.
B: Our overall goal is to shield the downstream application from the
complexity of the IB tws api, while leaving the problems of analyzing
market conditions, computing trading signals, and evaluating candidate
trades to the downstream program. This means in particular that the
downstream program should not need to understand the database, or the
full IB tws api. In what follows, I'll use divide and conquer to
partition shim functionality from the downstream design sketch.
1. Adding new symbols to the database is hard for novices, and so we
should ship a complete symbol set with the shim --- we are making
good progress with this goal as I write. What's more, order
submission and accounting is being redesigned so that downstream
applications will not need to write to the database; all
predefined data will already be there at startup, and the shim
will record dynamic order state as orders are made. The
downstream, then, will no longer need to control the database,
which leaves the problem of controlling the shim.
2. For data mode, that is market data, market depth, history query,
and contract data collection in the absence of related orders,
downstream programs can pipe command text into the shim, and
partition the reply stream by the relatively few message formats.
This leaves risk mode, where orders must be actively managed.
3. Planned enhancements to the command language should make the
problem of downstream submission of orders, once given the
intention and market understanding to do so, straightforward.
This leaves processing of the resulting event stream, with all
its events, attributes, and constant values.
4. Although the integrated event stream of commands, requests, and
messages is useful for testing, logging, and forensics, a
downstream risk-mode application need only consider the messages;
it knows what it has sent, trusts the shim to send the requests,
and wants only to read the reply traffic that results. After
all, the overall size of the api is not by itself a problem for
the downstream; only those commands of interest need be used.
So, although the existing output options will continue to
provide an integrated event stream, an additional output option
will be added which will send messages only to the standard
output stream. This narrows the trading application design problem to
that of
reading and responding to input messages.
5. Much of such analysis is the bread-and-butter of downstream
implementation, where programs look at prices and decide how
to respond to market conditions. There are, however, three
kinds of message analysis complexity left to consider that
the downstream client should be shielded from:
a. Error message text decomposition and specialization;
b. Compressed, that is stateful, message event analysis; and
c. Object construction from message text (unmarshalling).
For each of these aspects of message processing, I'd like to
propose that we start with a scripting language that provides
well-designed and easy-to-use facilities for class definition
and object creation, and add to the shim sources text from this
language in three categories:
1. library class files for each of the IB tws message types;
2. a reusable client harness module that would:
* relay command text from clients to the shim,
* construct message objects from the resulting replies, and
* provide these event objects, on demand, to the clients.
3. a sample DownstreamClient class that would emit command
strings to be sent to the shim, and absorb message objects
from the shim in response.
The client harness, as part of message object construction, would
need to decompose error messages, and translate the results to
specialized error objects.
For the overloaded message types, market data and market depth,
the messages are interpreted according to previous events, so
that there is implicit (market data) or explicit (market depth)
state. Although alternatives exist, it is reasonable to ask
the client harness to take care of this event translation, as
part of message object construction above.
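For the explicit-state case, a depth-book sketch shows what the
harness would keep on the client's behalf. The operation codes
(0 insert, 1 update, 2 delete) follow the IB market depth convention,
while the row layout here is simplified:

```ruby
# Hypothetical sketch of market-depth state kept by the harness: replay
# stateful depth events into a flat book that clients can read directly.
class DepthBook
  Row = Struct.new(:price, :size)

  def initialize
    @rows = []
  end

  # Apply one depth event; returns the current flat book as an array.
  def apply(position:, operation:, price: nil, size: nil)
    case operation
    when 0 then @rows.insert(position, Row.new(price, size)) # insert
    when 1 then @rows[position] = Row.new(price, size)       # update
    when 2 then @rows.delete_at(position)                    # delete
    else raise "unknown depth operation #{operation}"
    end
    @rows.map { |r| [r.price, r.size] }
  end
end
```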
It's questionable whether such analysis should be hardwired into
the shim --- its charter has always been to convert the binary
messages to text form, and, leaving aside limited per-attribute
translation and message augmentation at the margins, transmit
the full information of the IB tws api message as a standalone
event. Writing reusable code in some well-designed scripting
language is a reasonable alternative.
Note that, lacking a well-defined specification for the errors
sent by the IB tws api, the task of error object specialization
would continue to be a work-in-progress for some time to come,
and improvements to this code would be coupled to increased use
of, and experience with, downstream client development.
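A first-guess sketch of that specialization, assuming the decomposed
error carries a request id, a numeric code, and free text; the code
ranges chosen below are placeholders precisely because no
specification exists, and would be refined with experience:

```ruby
# Hypothetical error-specialization sketch: translate a decomposed
# (id, code, text) error triple into a specialized error object.
class ShimError < StandardError
  attr_reader :req_id, :code

  def initialize(req_id, code, text)
    @req_id, @code = req_id, code
    super(text)
  end
end
class OrderError        < ShimError; end
class ConnectivityError < ShimError; end

def specialize_error(req_id, code, text)
  klass =
    case code
    when 100..449   then OrderError        # request/order rejections (guess)
    when 1100..1300 then ConnectivityError # connectivity events (guess)
    else ShimError                         # everything not yet classified
    end
  klass.new(req_id, code, text)
end
```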
For other message event processing, pass the buck! It's up to the
downstream developer to add the appropriate decision logic to the
sample DownstreamClient class in order to obtain a finished trading
application. This decision logic would include exceptional condition
handling for errors; though the harness might translate error
messages, it's the client that would have to decide what to do with
them.
I'd like to point out here my desired candidate language for
implementation of the downstream harness, per-message class files,
and sample DownstreamClient class: Ruby.
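With Ruby as the candidate, a skeletal DownstreamClient might look
like this; the TickEvent shape and the order command syntax are
invented for illustration:

```ruby
# Hypothetical DownstreamClient sketch: emit command strings, absorb
# message objects, and keep all decision logic on the client side.
TickEvent = Struct.new(:symbol, :price)

class DownstreamClient
  attr_reader :commands # command strings to be relayed to the shim

  def initialize(limit)
    @limit = limit
    @commands = []
  end

  # Decision logic: react to one message object from the harness.
  def on_event(event)
    case event
    when TickEvent
      # Toy rule: bid when the price dips below our limit.
      @commands << "order buy 100 #{event.symbol} lmt #{@limit}" if event.price < @limit
    when StandardError
      # Exceptional conditions are the client's problem, not the harness's.
      warn "shim error: #{event.message}"
    end
  end
end
```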
To recap, downstream applications for trading will be able to avoid:
1. Database writes; those will be solely the responsibility
of the shim.
2. Bulk data collection; simple, one-off scripts will typically
be used here.
3. Intricate command processing; just stuff text into the shim.
4. Analysis of the integrated (command-request-message) event stream;
just look at the results.
5. Translating from gui-oriented, stateful market data to a flat,
relational form; error message decomposition; and message
unmarshalling for object construction; these will have already
been done by the harness.
The above choices, although probably obvious in retrospect, solve some
design problems:
1. The kitchen sink problem: the number of events, tables, attributes
and predefined attribute values is very large, enough to hinder
downstream application development (B.1 and B.4).
2. The firehose problem: market tick and depth subscriptions, as well
as history data queries, can return enormous amounts of data (B.2).
3. The morph-to-shim problem: any downstream client that parses and
typechecks the integrated event stream eventually morphs into
a less-efficient analog of the shim (B.5).
The last is worth commenting on; the shim accepts an easy-to-generate
command language, marshals the related requests on the wire,
meticulously typechecks the resulting replies, formats them as text
records with newline terminators, and provides logging facilities for
all of the foregoing events.
The downstream need not, and should not, be responsible for *any* of
the above. In particular, the client harness need not do any per-event
message format checking. In the ideal case, the constructor procedures
for the per-message class files written in Ruby could be generated by
the shim as part of a specialized meta mode, so that the maintenance
effort of keeping the class definitions synchronized with the message
log formats would be eliminated.
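As a sketch of that meta mode, generating a per-message Ruby class
from a (name, field-list) pair is mechanical; the TickSize field names
below are invented, since the real lists would come from the shim's
own message tables:

```ruby
# Hypothetical meta-mode sketch: emit Ruby class-file source from a
# message name and field list, so the generated class definitions track
# the shim's log formats automatically.
def generate_message_class(name, fields)
  <<~RUBY
    #{name} = Struct.new(:#{fields.join(", :")}) do
      def self.parse(line)            # unmarshal one tab-separated record
        new(*line.chomp.split("\\t"))
      end
    end
  RUBY
end

src = generate_message_class("TickSize", %w[ticker_id kind qty])
```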