[ts-gen] Some Newbie Questions

R P Herrold herrold at owlriver.com
Sun Jul 13 12:46:26 EDT 2008

On Sun, 13 Jul 2008, Jay Strauss wrote:

> Hi,


> I've read a bunch (if not all) of the docs and perused the source.
> And have a couple of questions, that hopefully you'll answer.
> 1) It looks like you preload the database with "some" universe of
> symbols (Bonds, Indexes, Stocks, Futures...) from all the exchanges.
> Who's job is it to maintain those tables?  What about options upon
> stocks?

I am away from the office, so the date may be wrong, but a 
couple of months ago we gathered all the IB symbols for all 
SecTypes available from a couple of sources, and studied 
cleanup needs.  The quality of the Options data was so 
'nasty' that we could not see much point in doing any 
further post-processing of it.  A script exists, I think in 
the ./sql/ part of the tree, which was then used to do 
contract info detail lookups, for cross-checking the data.  We 
took steps to avoid 'loading' IB's servers as part of that 
process.  The result was then used to generate the 
symbol load scripts.

As this is the first time anyone has asked that question, we 
have not stated an 'external' answer to the 'ongoing 
maintenance' question; our working approach has been to update 
for our needs, and add what we are asked for on the mailing 
list.  We have 'kicked around' doing a periodic re-dump, 
cleanup, and 'diff' to come up with just the new symbols, so 
that prior UIDs are not disturbed, but we are not set in stone on 
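
A rough sketch of that re-dump-and-'diff' idea, in Python rather 
than anything in the tree.  The function name, the dict layout, 
and the sample symbols are all made up for illustration; the 
only point is that already-issued UIDs are never touched and new 
symbols are numbered on from the current maximum:

```python
def assign_new_uids(existing, dumped):
    """Return {symbol: uid} for symbols in `dumped` not yet in `existing`.

    existing: dict of symbol -> UID already issued (left untouched).
    dumped:   iterable of symbols from a fresh dump (hypothetical input).
    """
    # Continue numbering after the highest UID already handed out.
    next_uid = max(existing.values(), default=0) + 1
    additions = {}
    # Sorting makes the assignment deterministic from run to run.
    for symbol in sorted(set(dumped) - set(existing)):
        additions[symbol] = next_uid
        next_uid += 1
    return additions


existing = {"AAA": 1, "BBB": 2, "CCC": 3}
fresh_dump = ["BBB", "DDD", "AAA", "EEE", "CCC"]
print(assign_new_uids(existing, fresh_dump))
```

Only the returned additions would then need to be appended to 
the symbol load scripts.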

> 2) Maybe I'm missing something, or overwhelmed by the 
> documentation, but I don't see a good example of just 
> obtaining tick data.  There is some stuff that talks around 
> the subject.  That is there is some discussion of items in 
> the "subscription" table (I think it was called) that are 
> read to generate requests upstream, and theres "THE 
> TRADING-SHIM COMMAND SET" from the docs:

The mailing list archives for the last six months are fairly 
small, and are the best place to see changes and notifications 
of them.  I put together a 'rough project guide' in March for 
newcomers to find:
see also:

Behind that, the code itself, particularly in the ./exs/ and 
./bin/ directories, should be studied for the sample scripts. 
The PDF is notably stale on the move to 'select' in the 
parser, as, again, it was not clear that anyone was 
reading it.

Certainly the switch to 'select' is not reflected in 
'manual.pdf', as we have been 'frying a bigger fish': getting 
order journalling 'just right' has been the focus of our work 
the past few weeks.

My personal 'draft' notes are at:
I note the change in form in the Introduction at PDF page 38, 
and update the list there of what I have re-written and what 
remains to do as I do it; I then print a fresh copy and work 
through it (usually on the weekends).

PDF page 96 states the form, as seen in ./bin/includes ... but 
that is probably stale as the .RB test scripts have rolled in.

> And in the "exs" directory, file "tick" it lists:
> #!./shim -f
> select tick   2 1;      wait  3;
> cancel tick   2;        wait  1;
> select tick 144 1;      wait  3;
> cancel tick 144;        wait  1;
> exit;

This is the 'correct' approach.

> I don't remember reading about a "select" verb.

As noted, nope ... new since the 'manual.pdf' was last 
addressed some months ago.

> Could you give me an example of just receiving tick data on the QQQQs.
> That is, all the steps, including any DB table queries or
> manipulation?

QQQQ, the ETF stock, or NQ, the Future?  I am not immediately 
sure that we have it, as the retrieval of ETF symbols was also 
a bit 'dirty' in the data, but we can and will certainly add 
both, if not present.

Assuming it is present, it would be a numeric literal -- ugly, 
and it is hard for a new person to find desired symbols.  Bill 
notes this, and we have on our docket extensions to the 'BIND' 
command to make this less baroque.

I am away from the office where I do my shim coding, and so 
cannot readily look up the answer needed.  I'll do so 

It will be as simple as:

echo "select tick 1234 1" | ./shim --data cout

    [stop it with a ^c]

  - or - using the 'shebang' model, a script like this:

#!./shim -f
select tick   1234 1;      wait  3600;

    to yield an hour's data

the data will flow .... (assuming for the sake of 
example, that QQQQ was '1234')
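
If you want the same thing for several contracts at once, here 
is a minimal sketch of a helper that writes out a script in the 
form of the ./exs/tick example.  It is Python, not part of the 
shim, and the function name is invented.  The trailing '1' on 
each 'select' line is simply copied from that example, and I am 
assuming the column spacing in the example is not significant:

```python
def tick_script(contract_ids, seconds):
    """Build a shim script that selects and then cancels ticks per contract.

    The statement forms are copied from the ./exs/tick example; the
    trailing '1' on each select line is kept verbatim from that example.
    """
    lines = ["#!./shim -f"]
    for cid in contract_ids:
        lines.append(f"select tick {cid} 1; wait {seconds};")
        lines.append(f"cancel tick {cid}; wait 1;")
    lines.append("exit;")
    return "\n".join(lines) + "\n"


# '1234' is the made-up stand-in for QQQQ used above.
print(tick_script([1234], 3600), end="")
```

Redirect the output to a file, mark it executable, and run it 
from the directory holding ./shim, as with the other examples.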

> 3) In the docs it talks (several times) about reloading the 
> database (but first saving history data...), when doing 
> things like defining new contracts (which seems like a 
> simple and frequent task). Seriously?  Why wouldn't you just 
> add the respective rows into the RDBMS?  Instead you drop 
> and re-create the entire (I don't know the proper MySQL 
> speak, in Oracle it would be called a "schema") set of 
> tables that define the application.  I must be missing 
> something.

No -- drops, re-creation, and repopulation of the schema and 
table contents happen only with initial (shim not live) 
loads; picking up new 'inserts' onto the end of tables is what 
the database reload commands were contemplating -- this was 
needed in two places: adding a wholly new symbol, and 
(formerly) the way orders were handled.  One is rare (new 
symbols), and the other (orders) so common that we addressed 
the cumbersomeness of the approach.

Hmmmm .... Actually that is quite old text, and was, as it 
turned out, so hard to communicate that we re-designed around 
it. We need to chop those references out.

It turns out that once we had gathered a wide symbology, the 
approach for adding symbols on the fly, and making the shim's 
data structures 'happy', was gross overkill.

There is one variance between Oracle and MySQL which we have 
noted which puzzles us -- we use (extensively) 
'auto_increment' on table UID fields, and these are also 
picked up and used in the C++ data structures, and checked for 
contiguousness as part of the shim's startup. See:
for a discussion of 'auto_increment' column attribute
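
To make the 'contiguousness' idea concrete, a small sketch in 
Python (the shim's own check is in C++ and is not reproduced 
here; the function name is invented, and I am assuming the UIDs 
start at 1, which is the MySQL default for 'auto_increment'):

```python
def first_gap(uids, start=1):
    """Return the first UID missing from start..max(uids), else None.

    Assumes UIDs are meant to run start, start+1, ... with no holes.
    """
    expected = start
    # set() guards against repeats; sorted() walks the UIDs in order.
    for uid in sorted(set(uids)):
        if uid != expected:
            return expected
        expected += 1
    return None


print(first_gap([1, 2, 3, 4]))   # contiguous
print(first_gap([1, 2, 4, 5]))   # hole where 3 should be
```

A hole of this kind is what a deleted row, or a failed insert 
that still consumed a counter value, would leave behind.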

It seems that with Oracle we would have a 'trigger' fire on an 
insert, and consult a look-aside max table, or do a max() on 
the prior values present, to do this.  Cumbersome, slow, and 
subject to race conditions in the general case.  I am probably 
missing something.  Is there something like it natively in 
Oracle? (After a recent inquiry, we looked a bit at porting 
into Oracle)

> 4) Am I correct in saying any/every downstream program needs 
> to read the output file (or pipe) and parse it for the data 
> they are interested in?

Our examples do it this way, for clarity.  We have designed 
the output for Unix 'filtering' to permit 'seeing' just what 
one is interested in, and send output to stdout, a file (which 
can be a pipe), or the syslog, depending on the output options 
specified [cout, file, logd] when the shim is started.
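
As a sketch of what such a downstream filter can look like, in 
Python: the helper below is nothing more than a tiny 'grep'.  
The sample lines are invented for the example and are NOT the 
shim's actual output format; pass sys.stdin as 'lines' to use 
it on the far end of a pipe.

```python
def keep_matching(lines, needle):
    """Yield only the lines that contain `needle` (plain substring match)."""
    for line in lines:
        if needle in line:
            yield line


# Invented sample lines; the real output layout will differ.
sample = [
    "tick 1234 bid 45.10\n",
    "tick 5678 bid 12.00\n",
    "tick 1234 ask 45.12\n",
]

# Spaces around the id keep '1234' from matching inside a longer number.
for line in keep_matching(sample, " 1234 "):
    print(line, end="")
```

In practice grep or awk on the pipe does the same job; the 
point is only that each consumer picks out the lines it wants.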

We have some thoughts on a library services model for the shim 
as well, but have not prototyped it yet.

We look forward to clarifying matters, and I'll look up the 
'QQQQ' tomorrow morning.

-- Russ herrold
