[ts-gen] Architecture for the downstream [long]

Bill Pippin pippin at owlriver.net
Fri Apr 4 20:24:18 EDT 2008


An off-list poster has wondered what architecture downstream programs
should use to access the IB tws api through the shim, and commented
that our test programs suggest that scripts pipe commands into the
shim:

> ... you propose a command (script) --- pipe --- parser (so proccess)
> model of plugin ...

I'd like to take this opportunity to give trading-shim users some idea
of the downstream architecture that the current roadmap for the shim is
meant to support, list some principles underlying the design of the
shim, and then use divide-and-conquer to sketch a possible autonomous
[no gui input] downstream trading application.

A:  First, the shim architecture principles:

  1.  Downstream input to the shim is via some sort of stream, so that
      there is in-order delivery of text.

  2.  The shim command language is to be simple, concise, and, given some
      knowledge of database values, human readable.

  3.  For those request types it supports, the shim should give access
      to the full power of the api request.

  4.  A stored-value database bridges the mismatch between the relatively
      brief commands and the longer binary requests that result.

  5.  The shim should serialize the text form of all command, request, and
      message events to a variety of output channels.
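
As a minimal illustration of principles 1 and 2, a downstream script
can treat the shim as a child process and stream command text to it.
The sketch below is in Ruby (a choice I'll motivate later in this
note); the shim invocation and the command line are placeholders, not
actual shim syntax:

    # Stream commands to the shim over a pipe; read the event stream back.
    # The invocation './shim' and the command text are placeholders.
    IO.popen('./shim', 'r+') do |shim|
      shim.puts 'a command line here'         # in-order, newline-terminated
      shim.close_write                        # no further command input
      shim.each_line { |event| puts event }   # text events, one per line
    end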

B:  Our overall goal is to shield the downstream application from the
complexity of the IB tws api, while leaving the problems of analyzing
market conditions, computing trading signals, and evaluating candidate
trades to the downstream program.  This means in particular that the
downstream program should not need to understand the database or the
full IB tws api.  In what follows, I'll use divide-and-conquer to
partition shim functionality from the downstream design sketch.

  1.  Adding new symbols to the database is hard for novices, and so we
      should ship a complete symbol set with the shim --- we are making
      good progress with this goal as I write.  What's more, order
      submission and accounting are being redesigned so that downstream
      applications will not need to write to the database; all
      predefined data will already be there at startup, and the shim
      will record dynamic order state as orders are made.  The
      downstream, then, will no longer need to control the database,
      which leaves the problem of controlling the shim.
       
  2.  For data mode, that is market data, market depth, history query,
      and contract data collection in the absence of related orders,
      downstream programs can pipe command text into the shim and
      partition the reply stream by the relatively few message formats
      (one way to do this appears in the harness sketch below).  This
      leaves risk mode, where orders must be actively managed.

  3.  Planned enhancements to the command language should make the
      problem of downstream submission of orders, once given the
      intention and market understanding to do so, straightforward.
      This leaves processing of the resulting event stream, with all
      its events, attributes, and constant values.

  4.  Although the integrated event stream of commands, requests, and
      messages is useful for testing, logging, and forensics, a
      downstream risk-mode application need only consider the messages;
      it knows what it has sent, trusts the shim to send the requests,
      and wants only to read the reply traffic that results.  After
      all, the overall size of the api is not by itself a problem for
      the downstream; only those commands of interest need be used.

      So, although the existing output options will continue to
      provide an integrated event stream, an additional output option
      will be added that writes only messages to the standard
      output.

      This narrows the trading application design problem to that of
      reading and responding to input messages.

  5.  Much of this analysis is the bread-and-butter of downstream
      implementation, where programs look at prices and decide how
      to respond to market conditions.  There are, however, three
      kinds of message analysis complexity left to consider that
      the downstream client should be shielded from:

      a.  Error message text decomposition and specialization;
      b.  Compressed, that is stateful, message event analysis; and
      c.  Object construction from message text (unmarshalling).

      For each of these aspects of message processing, I'd like to
      propose that we start with a scripting language that provides
      well-designed and easy-to-use facilities for class definition
      and object creation, and add source text in this language to
      the shim sources in three categories (sketched below):

        1.  library class files for each of the IB tws message types;

        2.  a reusable client harness module that would:

            * relay command text from clients to the shim,
            * construct message objects from the resulting replies, and
            * provide these event objects, on demand, to the clients.

        3.  a sample DownstreamClient class that would emit command
            strings to be sent to the shim, and absorb message objects
            from the shim in response.

      The client harness, as part of message object construction, would
      need to decompose error messages and translate the results to
      specialized error objects.

      For the overloaded message types, market data and market depth,
      the messages are interpreted according to previous events, so
      that there is implicit (market data) or explicit (market depth)
      state.  Although alternatives exist, it is reasonable to ask
      the client harness to take care of this event translation, as
      part of message object construction above.

      It's questionable whether such analysis should be hardwired into
      the shim --- its charter has always been to convert the binary
      messages to text form, and, leaving aside limited per-attribute
      translation and message augmentation at the margins, transmit
      the full information of the IB tws api message as a standalone
      event.  Writing reusable code in some well-designed scripting
      language is a reasonable alternative.

      Note that, lacking a well-defined specification for the errors
      sent by the IB tws api, the task of error object specialization
      would continue to be a work in progress for some time to come,
      and improvements to this code would be coupled to increased use
      of, and experience with, downstream client development. 

For other message event processing, pass the buck!  It's up to the
downstream developer to add the appropriate decision logic to the
sample DownstreamClient class in order to obtain a finished trading
application.  This decision logic would include exceptional-condition
handling for errors; though the harness might translate error
messages, it's the client that would have to decide what to do with
them.  A skeletal version of such a client is sketched below.

I'd like to name here my preferred candidate language for
implementation of the downstream harness, per-message class files,
and sample DownstreamClient class:

    Ruby!
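
To make the harness proposal concrete, here is a rough sketch of
categories 1 and 2 above.  The record layout (a leading tag, then
pipe-separated fields), the tags, the class names, and the field
lists are all invented for illustration; real code would be driven by
the shim's actual log record formats.  The stateful market-depth
translation is omitted for brevity:

    # Invented record layout: a tag field, then pipe-separated attributes.
    class TickPrice                        # a library class, category 1
      attr_reader :ticker_id, :field, :price
      def initialize(f)
        @ticker_id, @field, @price = f[0].to_i, f[1].to_i, f[2].to_f
      end
    end

    class ApiError                         # specialized from raw error text
      attr_reader :ticker_id, :code, :text
      def initialize(f)
        @ticker_id, @code, @text = f[0].to_i, f[1].to_i, f[2]
      end
    end

    class Harness                          # the reusable module, category 2
      TYPES = { 'tick' => TickPrice, 'error' => ApiError }

      def initialize(command)              # e.g. './shim', a placeholder
        @shim = IO.popen(command, 'r+')
      end

      def submit(text)                     # relay command text to the shim
        @shim.puts text
      end

      def each_message                     # partition the reply stream and
        @shim.each_line do |line|          # construct message objects
          tag, *fields = line.chomp.split('|')
          klass = TYPES[tag] or next       # skip types not yet modeled
          yield klass.new(fields)
        end
      end
    end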
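
Continuing the sketch, a skeletal DownstreamClient (category 3) would
emit command strings and dispatch on the message objects it absorbs
in response; the command text and decision stubs are again invented:

    # Skeletal DownstreamClient, category 3; real trading logic replaces
    # the consider and recover stubs.
    class DownstreamClient
      def initialize(harness)
        @harness = harness
      end

      def run
        @harness.submit 'subscribe ibm'     # hypothetical command text
        @harness.each_message do |msg|
          case msg
          when TickPrice then consider msg  # trading-signal logic here
          when ApiError  then recover msg   # the client owns error policy
          end
        end
      end

      def consider(tick)
        # e.g. emit an order command when some price condition holds
      end

      def recover(err)
        warn "api error #{err.code}: #{err.text}"
      end
    end

    DownstreamClient.new(Harness.new('./shim')).run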

To recap, downstream applications for trading will be able to avoid:

  1.  Database writes; those will be solely the responsibility
      of the shim.

  2.  Bulk data collection; simple, one-off scripts will typically
      be used here.

  3.  Intricate command processing; just stuff text into the shim.

  4.  Analysis of the integrated (command-request-message) event stream;
      just look at the results.

  5.  Translating from gui-oriented, stateful market data to a flat,
      relational form; error message decomposition; and message
      unmarshalling for object construction: these will have already
      been done by the harness.

The above choices, although probably obvious in retrospect, solve some
critical problems:

  1.  The kitchen sink problem: the number of events, tables, attributes,
      and predefined attribute values is very large, large enough to
      hinder downstream application development (B.1 and B.4).
      
  2.  The firehose problem: market tick and depth subscriptions, as well
      as history data queries, can return enormous amounts of data (B.2).

  3.  The morph-to-shim problem: any downstream client that parses and
      typechecks the integrated event stream eventually morphs into
      a less-efficient analog of the shim (B.5).

The last is worth commenting on; the shim accepts an easy-to-generate
command language, marshals the related requests on the wire,
meticulously typechecks the resulting replies, formats them as text
records with newline terminators, and provides logging facilities for
all of the foregoing events.

The downstream need not, and should not, be responsible for *any* of
the above.  In particular, the client harness need not do any per-event
message format checking.  In the ideal case, the constructor procedures
for the per-message class files written in Ruby could be generated by
the shim as part of a specialized meta mode, so that the maintenance
effort of keeping the class definitions synchronized with the message
log formats would be eliminated.
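
As an illustration of that ideal case, the generated text for one
message type might look as follows; the message name and field list
are invented, and real meta-mode output would track the shim's actual
log record formats:

    # Hypothetical generated class text for one message type; the field
    # list is invented, and real output would match the shim's log formats.
    class OrderStatus
      FIELDS = [[:order_id,  :to_i], [:status,    :to_s],
                [:filled,    :to_i], [:remaining, :to_i],
                [:avg_price, :to_f]]

      attr_reader(*FIELDS.map { |name, _| name })

      def initialize(fields)               # positional, matching the record
        FIELDS.each_with_index do |(name, conv), i|
          instance_variable_set("@#{name}", fields[i].send(conv))
        end
      end
    end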

Thanks,

Bill Pippin

