[ts-gen] Some Newbie Questions
R P Herrold
herrold at owlriver.com
Sun Jul 13 12:46:26 EDT 2008
On Sun, 13 Jul 2008, Jay Strauss wrote:
> I've read a bunch (if not all) of the docs and perused the source.
> And have a couple of questions, that hopefully you'll answer.
> 1) It looks like you preload the database with "some" universe of
> symbols (Bonds, Indexes, Stocks, Futures...) from all the exchanges.
> Whose job is it to maintain those tables? What about options upon
I am away from the office, so the date may be wrong but ... a
couple months ago, we gathered all the IB symbols for all
SecTypes available from a couple of sources, and studied
cleanup needs. The quality of the Options data was so 'nasty'
that we could not see much point in doing further
post-processing of it. A script exists, I think in
the ./sql/ part of the tree, which was then used to do
contract info details lookups, for cross checking data. We
took steps to avoid 'loading' IB's servers as part of that
process. The result was then used for generation of the
symbol load scripts.
As this is the first time anyone has asked that question, we
have not stated an 'external' answer to the 'ongoing
maintenance' question; our working approach has been to update
for our needs, and add what we are asked for on the mailing
list. We have 'kicked around' doing a periodic re-dump,
cleanup, and 'diff' to come up with just new symbols, so that
prior UIDs are not disturbed, but we are not set in stone on
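The 're-dump, cleanup, and diff' idea can be sketched roughly as follows. This is only an illustration in Python, not code from the tree: the function name `merge_new_symbols`, the in-memory dict standing in for a symbol table, and the sample symbols and UIDs are all assumptions made for the example. The point it shows is that symbols already present keep their UIDs, and only genuinely new symbols are appended at the end.

```python
def merge_new_symbols(existing, fresh_dump):
    """Append symbols not yet known, leaving every existing UID untouched.

    existing   -- dict mapping symbol -> UID (hypothetical stand-in for a table)
    fresh_dump -- iterable of symbols from a new dump, in dump order
    Returns (merged, added): the updated mapping and the list of new symbols.
    """
    merged = dict(existing)  # copy, so the caller's mapping is not modified
    next_uid = max(merged.values(), default=0) + 1
    added = []
    for symbol in fresh_dump:
        if symbol not in merged:
            merged[symbol] = next_uid  # new rows only go on the end
            added.append(symbol)
            next_uid += 1
    return merged, added


# Hypothetical sample data, for illustration only.
existing = {"IBM": 1, "MSFT": 2}
fresh_dump = ["MSFT", "QQQQ", "IBM", "NQ"]
merged, added = merge_new_symbols(existing, fresh_dump)
print(added)
print(merged)
```

Appending only at the end is what keeps the prior UIDs stable, which matters if anything else already refers to them.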
> 2) Maybe I'm missing something, or overwhelmed by the
> documentation, but I don't see a good example of just
> obtaining tick data. There is some stuff that talks around
> the subject. That is there is some discussion of items in
> the "subscription" table (I think it was called) that are
> read to generate requests upstream, and there's "THE
> TRADING-SHIM COMMAND SET" from the docs:
The mailing list archives for the last six months are fairly
small, and are the best place to see changes and notification
of them. I put together a 'rough project guide' in March for
newcomers to find:
Behind that, the code itself, particularly in the ./exs/ and
./bin/ directories, should be studied for the sample scripts.
The PDF is notably stale on this topic of the move to 'select'
in the parser, as, again, it was not clear that anyone was
Certainly the switch to 'select' is not reflected in
'manual.pdf' as we have been 'frying a bigger fish' in getting
order journalling 'just right' as the focus of our work the
last few weeks.
My personal 'draft' notes are at:
I note the change in form in the Introduction at PDF page 38,
and update the list of what I have re-written and what remains
to do there as I do it, and print a fresh copy and work
through it (usually on the weekends).
PDF page 96 states the form, as seen in ./bin/includes ... but
that is probably stale as the .RB test scripts have rolled in.
> And in the "exs" directory, file "tick" it lists:
> #!./shim -f
> select tick 2 1; wait 3;
> cancel tick 2; wait 1;
> select tick 144 1; wait 3;
> cancel tick 144; wait 1;
This is the 'correct' approach.
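For anyone who wants to produce such a script for a list of contract ids rather than typing it, here is a small sketch in Python. It only reproduces the select / wait / cancel / wait pattern of the quoted ./exs/tick file; the helper name `tick_script` and its parameters are hypothetical, and the trailing `1` on each select line is copied verbatim from that file without any claim about what it means.

```python
def tick_script(contract_ids, hold_seconds=3, settle_seconds=1):
    """Build shim script text in the same shape as the quoted exs/tick file.

    contract_ids   -- numeric contract ids (the 'numeric literal' form)
    hold_seconds   -- seconds to wait after each select
    settle_seconds -- seconds to wait after each cancel
    """
    lines = ["#!./shim -f"]  # shebang line as quoted from exs/tick
    for cid in contract_ids:
        # The trailing '1' is copied as-is from the sample script.
        lines.append(f"select tick {cid} 1; wait {hold_seconds};")
        lines.append(f"cancel tick {cid}; wait {settle_seconds};")
    return "\n".join(lines) + "\n"


script = tick_script([2, 144])
print(script, end="")
```

With the ids 2 and 144 this gives back the same lines as the file quoted above.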
> I don't remember reading about a "select" verb.
As noted, nope ... new since the 'manual.pdf' was last
addressed some months ago.
> Could you give me an example of just receiving tick data on the QQQQs.
> That is, all the steps, including any DB table queries or
QQQQ, the ETF stock, or NQ, the Future? I am not immediately
sure that we have it as the retrieval of ETF symbols was also
a bit 'dirty' in the data, but we can and will certainly add
both, if not present.
Assuming it is present, it would be a numeric literal -- ugly
and hard for a new person to find desired symbols. Bill notes
this, and we have on our docket, extensions to the 'BIND'
command to make this less baroque.
I am away from the office where I do my shim coding, and so
cannot readily look up the answer needed. I'll do so
It will be as simple as:
echo "select tick 1234 1" | ./shim --data cout
[stop it with a ^c]
- or - using the 'shebang' model, a script like this:
select tick 1234 1; wait 3600;
to yield an hour's data
the data will flow .... (assuming for the sake of
example, that QQQQ was '1234')
> 3) In the docs it talks (several times) about reloading the
> database (but first saving history data...), when doing
> things like defining new contracts (which seems like a
> simple and frequent task). Seriously? Why wouldn't you just
> add the respective rows into the RDBMS? Instead you drop
> and re-create the entire (I don't know the proper MySQL
> speak, in Oracle it would be called a "schema") set of
> tables that define the application. I must be missing
no -- Drops and recreation and repopulation of schema and
table contents only happens with initial (shim not live)
loads; picking up new 'inserts' onto the end of tables is what
the database reload commands were contemplating -- this was
needed in two places: adding a wholly new symbol, and (formerly)
with the way orders were handled. One is rare (new symbols)
and the other so common that we addressed the cumbersomeness
of the approach (orders).
Hmmmm .... Actually that is quite old text, and was, as it
turned out, so hard to communicate that we re-designed around
that. We need to chop those references out.
It turns out that once we had gathered a wide symbology, the
approach for adding symbols on the fly, and making the shim's
data structures 'happy', was gross overkill.
There is one variance between Oracle and MySQL which we have
noted which puzzles us -- we use (extensively)
'auto_increment' on table UID fields, and these are also
picked up and used in the C++ data structures, and checked for
contiguousness as part of the shim's startup. See:
for a discussion of 'auto_increment' column attribute
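To make 'checked for contiguousness' concrete, here is a minimal sketch in Python of one way such a check could look. It is not the shim's actual C++ startup code, which is not shown here; the function name `first_gap` and the assumption that UIDs start at 1 (MySQL's default for 'auto_increment') are mine.

```python
def first_gap(uids):
    """Return the first UID missing from 1..max(uids), or None if contiguous.

    Assumes UIDs are positive integers starting at 1, as MySQL's
    auto_increment produces by default. Duplicates are ignored.
    """
    expected = 1
    for uid in sorted(set(uids)):
        if uid != expected:
            return expected  # a hole in the sequence was found here
        expected += 1
    return None


print(first_gap([1, 2, 3, 4]))
print(first_gap([1, 2, 4, 5]))
```

A hole typically appears when a row is deleted or an insert fails after the counter has advanced, which is why a check of this kind is useful before the UIDs are used as indexes into in-memory structures.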
It seems that with Oracle we would have a 'trigger' fire on an
insert, and consult a look-aside max table, or do a max() on
the prior values present, to do this. Cumbersome, slow, and
subject to race conditions in the general case. I am probably
missing something. Is there something like it natively in
Oracle? (After a recent inquiry, we looked a bit at porting
> 4) Am I correct in saying any/every downstream program needs
> to read the output file (or pipe) and parse it for the data
> they are interested in?
Our examples do it this way, for clarity. We have designed
the output for Unix 'filtering' to permit 'seeing' just what
one is interested in, and send output to stdout, a file (which
can be a pipe), or the syslog, depending on the output options
specified [cout, file, logd] when the shim is started.
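A downstream reader in the 'filtering' style might look something like the sketch below, in Python. The shim's real output format is not shown in this message, so the sample lines here are invented placeholders and the helper name `keep_matching` is hypothetical; the only point is the shape of the approach: read lines, keep the ones of interest, pass them on.

```python
def keep_matching(lines, needle):
    """Yield only the lines containing `needle`, with the newline stripped.

    `lines` may be any iterable of text lines, for example sys.stdin when
    the shim's output is piped into the program.
    """
    for line in lines:
        if needle in line:
            yield line.rstrip("\n")


# Invented placeholder lines; NOT the shim's actual output format.
sample = [
    "tick 1234 placeholder-a\n",
    "tick 5678 placeholder-b\n",
    "order 1234 placeholder-c\n",
]
selected = list(keep_matching(sample, "tick 1234"))
print(selected)
```

In real use one would pass `sys.stdin` in place of `sample`, with the shim started with the 'cout' output option and piped into the script.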
We have some thought on a library services model for the shim
as well, but have not prototyped these yet.
We look forward to clarifying matters, and I'll look up the
'QQQQ' tomorrow morning.
-- Russ herrold