[ts-gen] questions on journal tables

Bill Pippin pippin at owlriver.net
Mon May 4 14:53:47 EDT 2009


Nils,

You ask:
 
> Is there something like an timeout parameter which could be tuned to
> ensure that all OrderStatus messages are catched?

In short, no: the shim never times out while waiting for an OrderStatus
message, so there is no such parameter to tune.

There are two separate issues here, of which you're almost certainly
considering just the first.  They are, for OrderStatus messages: first,
their existence; and second, the occurrence of duplicates, and how the
shim handles them.  That is, 0 vs 1, or 1 vs many.

About existence, as far as the shim is concerned, OrderStatus messages
occur when they occur; as a matter of fact, essentially all IB tws api
messages are treated as asynchronous events.  We've considered putting
various timers in the shim, and if we did, order tracking would come
first, but such processing seems to fit better in a downstream script.
We've worked out a division of responsibility where the shim keeps track
of time, and marks log records accordingly, and downstream scripts
decide what to do about such times.
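
As a rough illustration of that division of labor, a downstream script
might flag orders that saw no OrderStatus record within some deadline.
This is only a sketch: the (timestamp, event, order id) tuples, the
"submit" event name, and the 5 second default are all hypothetical, and
do not reflect the shim's actual log format.

```python
from datetime import datetime, timedelta


def orders_missing_status(records, deadline=timedelta(seconds=5)):
    """Return ids of orders with no OrderStatus record within `deadline`.

    `records` is an iterable of (timestamp, event, order_id) tuples in
    time order.  This layout is a hypothetical stand-in for whatever a
    downstream script parses out of the log.
    """
    submitted = {}    # order_id -> time of submission
    answered = set()  # order ids that got a timely OrderStatus
    for stamp, event, order_id in records:
        if event == "submit":
            submitted.setdefault(order_id, stamp)
        elif event == "OrderStatus" and order_id in submitted:
            if stamp - submitted[order_id] <= deadline:
                answered.add(order_id)
    return sorted(set(submitted) - answered)
```

The policy (how long is too long, and what to do about it) stays in the
script; the only thing it needs from upstream is trustworthy time marks.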

I use the tail.window script to "tail -f" the log file, so that I can
see if order status messages occur before my order test scripts end.
You'll probably want to give each shim risk mode process at least a few
seconds after submitting an order before termination.  I kill off
data mode shims with wild abandon, but orders warrant more care; at
least give the poor critter time to get its journal posts in with
the order.

Now, about dups, the handling of which should be invisible to you:
The shim checks for duplicate order status and open order messages
by comparing them to others that have already occurred.  If the shim
fails to recognize a duplicate, there is no problem, since for these
posts the sql insert statements include mysql's "ignore" modifier, to
silently discard duplicates.

If the shim were to falsely believe a message to be a duplicate
when it was not --- and I don't see how this could happen --- then
you might be missing order status posts, which you were concerned
about with an earlier question.  In any case, the shim doesn't cache
order status messages; they are naturally grouped together via tcp
reads, duplicates are dropped, and the remainder written in a single,
batched insert.
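
The two layers of duplicate handling described above can be sketched as
follows.  This is not the shim's code: the table and column names are
made up for illustration, and SQLite's "INSERT OR IGNORE" stands in for
mysql's "INSERT IGNORE" so that the example runs without a server.

```python
import sqlite3


def make_journal():
    """Open an in-memory journal with a hypothetical order_status table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE order_status ("
        " order_id INTEGER, status TEXT, filled INTEGER,"
        " UNIQUE (order_id, status, filled))"
    )
    return conn


def post_batch(conn, seen, messages):
    """Drop messages already seen, then write the rest in one batch.

    Returns the number of rows handed to the database, which may exceed
    the number actually stored if the in-memory check missed a duplicate.
    """
    fresh = []
    for msg in messages:
        if msg not in seen:  # first layer: compare with earlier messages
            seen.add(msg)
            fresh.append(msg)
    with conn:  # one transaction for the whole batch
        # Second layer: the unique key plus OR IGNORE (SQLite's analogue
        # of mysql's INSERT IGNORE) discards any duplicate that slipped by.
        conn.executemany(
            "INSERT OR IGNORE INTO order_status (order_id, status, filled)"
            " VALUES (?, ?, ?)",
            fresh,
        )
    return len(fresh)
```

The point of the second layer is that a missed duplicate costs nothing:
the row is simply not stored a second time.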

The time frame for this batched insert is determined by the
frequency with which the IB tws sends messages; experience
indicates that they come in runs that are on the order of low
double digit milliseconds apart.  The shim uses the select()
system call to wait for IO, with a timeout of 20 milliseconds
from its last read; one request per 20ms is the upper limit on
the average request rate according to the IB tws api.

I call this "at most 20ms time interval" a time slot, and it
corresponds to main loop iterations for the shim.  The shim
waits on IO; wakes up on any command or message input, or the
passage of 20ms if there is no input; and performs main loop
processing each time this return from select() occurs.
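
A minimal sketch of such a time slot loop, using Python's select module
rather than the shim's own code, might look like this.  The function and
parameter names are hypothetical, and for simplicity the 20ms timeout is
measured from each select() call rather than from the last read.

```python
import select

SLOT_SECONDS = 0.020  # at most 20ms per time slot


def run_slots(sock, process, max_slots):
    """Run up to `max_slots` main loop iterations on one socket.

    `process` is called once per slot: with the bytes read if input
    arrived, or with None if the 20ms passed with no input.
    """
    for _ in range(max_slots):
        # Sleep until the socket is readable or the slot expires.
        readable, _, _ = select.select([sock], [], [], SLOT_SECONDS)
        if readable:
            data = sock.recv(4096)
            if not data:
                break  # peer closed the connection
            process(data)
        else:
            process(None)  # idle slot: timer-driven work only
```

Everything that arrives in a single read lands in the same slot, which
is why the messages of one run end up processed together.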

That processing consists of parsing commands and messages, routing
them to various agents for processing, dequeuing a command if one
is waiting, mapping it to a request, sending that request, posting
journalled events, and logging all events to the various log outputs.

So, if you wondered why the time stamps in the log were grouped
the way they are, and how the shim could batch journal dbms posts
together, it's due either to the batching already performed by the
IB tws, or else possibly to the kernel streams code blocking short
writes together.

Thanks,

Bill

