[ts-gen] Fair merge [Was: Do I really need to tail -f ShimText?]

Bill Pippin pippin at owlriver.net
Mon Sep 14 18:35:52 EDT 2009


I'll be posting distinct responses, since you raise various topics, and
here I'll take the simplest one first:

> Currently I also separate between data stream and order data stream
> on FIFO pipes. ...

Why?  Are you using two shim sessions, or have you split the original
log stream earlier?  If the former, why not one shim; and if the latter,
why not dup the original?

> ...  Is there a simple way to concatenate to pipes to a single stream?

This is called a merge, and more precisely, given your concerns for
latency, a *fair* merge.  So, the key question here is how you can
ensure fairness; otherwise, if you don't care about ordering or
latency, cat or cat with sort works just fine.  Since we care about
timing here, select() is the right way to multiplex IO streams.  In
fact, the use of select() is sometimes referred to as multiplexed IO.
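
To make the idea concrete, here is a minimal sketch of a fair merge built
on select(), written in Python.  It is not taken from the shim or from any
ts-gen client; the name fair_merge and the two demo pipes are made up for
illustration.  Fairness here means that each pass of the loop takes at
most one bounded read from every descriptor that select() reports ready.

```python
import os
import select


def fair_merge(fds, chunk_size=4096):
    """Yield (fd, line) pairs from several readable descriptors.

    Hypothetical helper: each pass asks select() which descriptors have
    data, then does at most one bounded read per ready descriptor, so a
    busy stream cannot starve a quiet one.
    """
    buffers = {fd: b"" for fd in fds}      # partial-line buffer per stream
    while buffers:
        ready, _, _ = select.select(list(buffers), [], [])
        for fd in ready:
            chunk = os.read(fd, chunk_size)
            if not chunk:                  # EOF: emit any unterminated tail
                tail = buffers.pop(fd)
                if tail:
                    yield fd, tail
                continue
            # Keep the trailing partial line buffered until it is complete.
            *lines, buffers[fd] = (buffers[fd] + chunk).split(b"\n")
            for line in lines:
                yield fd, line + b"\n"


if __name__ == "__main__":
    # Two anonymous pipes stand in for the data and order streams.
    data_r, data_w = os.pipe()
    order_r, order_w = os.pipe()
    os.write(data_w, b"tick 1\ntick 2\n")
    os.write(order_w, b"order 1\n")
    os.close(data_w)
    os.close(order_w)
    for fd, line in fair_merge([data_r, order_r]):
        source = "data" if fd == data_r else "order"
        print(source, line.decode().rstrip())
```

Lines from one stream always keep their own order; the interleaving
between streams depends on arrival time, which is the point of using
select() instead of cat.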

I expect that the future ruby sample client will use select() to
interleave reads from the stdin [cout option] and the stderr, the
latter to echo exceptional conditions to the terminal that would
otherwise be hidden or delayed when the shim session was opened via

> ...  And in awk there is no select() I think - . ...

Given the lack of typing in awk, even if GNU awk adds access to the
select() library call, I wouldn't want to try to use select() from
an awk script.  If you're committed to awk, and you must multiplex
multiple streams to one input channel for your awk script, you could
always write an IO adaptor script using say ruby to do the merge.
Easier, though, to not split the log stream in the first place.
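
If you do go the adaptor route, a rough sketch might look like the
following.  It is written in Python rather than ruby, and everything
named here (merge_fifos, the FIFO paths, the awk script) is hypothetical.
It assumes each FIFO gets exactly one writer, and it treats that writer
closing as end of stream.  Only whole lines are passed on, so the awk
script never sees a record split across two reads.

```python
import os
import selectors
import sys


def merge_fifos(paths, out, chunk_size=4096):
    """Copy complete lines from each named pipe to `out` as they arrive."""
    sel = selectors.DefaultSelector()
    for path in paths:
        # A blocking open waits here until a writer attaches to the FIFO.
        fd = os.open(path, os.O_RDONLY)
        sel.register(fd, selectors.EVENT_READ, data=bytearray())
    remaining = len(paths)
    while remaining:
        for key, _events in sel.select():
            buf = key.data                 # per-stream partial-line buffer
            chunk = os.read(key.fd, chunk_size)
            if chunk:
                buf += chunk
                cut = buf.rfind(b"\n") + 1  # end of the last whole line
                out.write(bytes(buf[:cut]))
                del buf[:cut]
            else:
                # Writer closed: flush any unterminated tail, drop the fd.
                if buf:
                    out.write(bytes(buf) + b"\n")
                sel.unregister(key.fd)
                os.close(key.fd)
                remaining -= 1
        out.flush()
    sel.close()


if __name__ == "__main__" and len(sys.argv) > 1:
    merge_fifos(sys.argv[1:], sys.stdout.buffer)
```

With the script saved under a name of your choosing, usage would be along
the lines of

    python merge_fifos.py data.fifo orders.fifo | awk -f strategy.awk

where both FIFO names and strategy.awk are placeholders.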

> ...  Maybe it is not necessary to separate these streams at all.

True.  The data mode is provided to *allow* you to separate data
collection from order management, e.g. history retrieval from
signal processing and order submission, say if you're researching
one symbol and trading another; there's no reason why you should
feel *required* to split processing for an individual symbol or

I've always expected that downstream order scripts would use contract
info queries to start, and market data subscriptions to check that the
market feed was working.  It doesn't make sense to separate concurrent
market data subscriptions from place order requests for any given symbol.
Keep in mind here that the command language for risk mode is a proper
superset of that for data mode.

As far as I can recall, data mode currently has only one advantage:
the safety of eliminating the order-related commands from the control
language.  Even in the future, the only further advantage I can imagine
is a more forgiving policy for history query bandwidth exhaustion.
Currently the shim drops the offending history query; in the future,
for data mode only and given the appropriate option, the shim may
instead block until the query period has passed.

I'll deal with your questions about order status records, and the
debugging of problems thereof, in a follow-on post.


