I just read an interview of David Aronson over at the CssAnalytics blog


Upload: eric-weinstein

Post on 04-Oct-2015


DESCRIPTION

review of tssb

TRANSCRIPT

Wow, what an amazing book! I have now read this book cover to cover 3 times over the last month while getting my hands dirty with TSSB.

This book gives implementation-level detail of how to create and test predictive models for any class of market (equity, forex, etc.). The implementation is done using the free software called TSSB. The use of TSSB requires learning its associated command line language, which is very very simple to learn yet extremely powerful. As a point of comparison, to program the same models in the R programming language would require perhaps 10 or 100 times more lines of code (I know this because I programmed similar models in R before trying TSSB).

Upon first use, it took me all of 1 hour to create my first model in TSSB, and it was off to the races from there: I have since computed over 900 predictive indicators and over 100 models. A great feature is the ability to combine models in committees or oracles, which is common practice to achieve optimal results.
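To make the committee idea concrete, here is a minimal sketch in plain Python (not TSSB's command language). It assumes the simplest possible combination rule, an unweighted average of each model's forecast per case; TSSB's own committees and oracles may combine models differently, and the function name `committee_predict` is made up for illustration.

```python
def committee_predict(model_predictions):
    """Average the forecasts of several models, case by case.

    model_predictions: list of equal-length lists, one list per model.
    Assumption: a plain unweighted mean; this is only an illustration.
    """
    n_models = len(model_predictions)
    # zip(*...) groups the forecasts that all models made for the same case
    return [sum(case) / n_models for case in zip(*model_predictions)]


if __name__ == "__main__":
    model_a = [0.2, -0.1, 0.4]
    model_b = [0.4, -0.3, 0.0]
    print(committee_predict([model_a, model_b]))
```

Averaging tends to cancel errors that the individual models do not share, which is one intuition for why a committee can beat its members.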

One of the best features is the ability to perform a multitude of tests (bootstrap, Monte Carlo, and permutation) that will greatly aid in one's confidence that the proposed model will in fact generalize to unseen data. In particular, the permutation test separates and measures a model's skill in comparison to its measured bias as well as luck due to trend. It is my hypothesis that the majority of hedge funds do not do such tests, and that this is the reason why so many fail so quickly.
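The basic idea behind a permutation test can be sketched in a few lines of Python. This is a simplified illustration, not TSSB's actual procedure (which also separates out bias and trend): it shuffles the model's positions against the market returns many times and asks how often a random pairing does as well as the real one. The function name and parameters are hypothetical.

```python
import random


def permutation_p_value(positions, returns, n_perm=1000, seed=0):
    """Estimate how often shuffled positions match or beat the real result.

    positions: +1 (long), -1 (short) or 0 (flat) for each bar.
    returns:   market return for each bar, same length as positions.
    A small p-value suggests the result is unlikely to be pure luck.
    """
    rng = random.Random(seed)
    observed = sum(p * r for p, r in zip(positions, returns))
    shuffled = list(positions)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # destroys any real position/return link
        perm_total = sum(p * r for p, r in zip(shuffled, returns))
        if perm_total >= observed:
            count += 1
    # Count the observed ordering itself so the p-value is never exactly 0.
    return (count + 1) / (n_perm + 1)


if __name__ == "__main__":
    rets = [0.01, -0.02, 0.03, -0.01, 0.02, -0.03, 0.01, -0.01] * 5
    perfect = [1 if r > 0 else -1 for r in rets]
    print(permutation_p_value(perfect, rets))
```

Note that an always-long rule gets a p-value of 1.0 here, because shuffling a constant position changes nothing: whatever it earned came from the market's trend, not from any skill in timing.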

Another great thing about the book being at implementation-level detail is that much of the theory and best practices can be abstracted out of it. For example, it is one thing to read about stationarity (which I have read about extensively through complex Greek-lettered equations and proofs in other books), yet it was still not completely clear until I actually saw its impact with my own eyes through the performance of models I created. Sometimes (often, in my case), getting to the implementation-level detail (developing and testing and reviewing results) is the best way to understand a concept. Additionally, I learned exactly how and why serial correlation could dupe someone into thinking a lucky model had any skill.

There are also subtle hints given throughout the book, e.g.: committees often perform better than individual models; regime specialization per model often performs better; models tend to perform better when specializing in long-only or short-only contexts and then combined into either a portfolio or committee; more esoteric indicators (e.g. non-trend type) are better used in more complex (non-linear) models; and use of more than 3 indicators often overfits a model (a big time saver too!).

Other great things about TSSB (and the book): Most of the indicators are scaled and normalized to an extent, which helps take care of those little things that must be done before getting to the actual modelling. There are also specific functions which help create one model for multiple markets (market regression, cross-market normalization, pooling variables, use of indexes, etc.); such implementation complexities are often overlooked in theoretical books.

Indicator selection is very clearly explained with the use of tests such as Chi-Sq, Non-Redundant Indicator scanning, visual examination of indicator-target relationships, find groups, exclusion groups, and model stepwise selection.

It is worth mentioning that currently TSSB is only a research tool; the models created cannot simply be exported to a trading program (e.g. TradeStation), but apparently this is planned for a future release. In the meantime, this is perhaps my biggest gripe and fear: I will create an amazing system but be unable to implement it in real life. On the other hand, with some time and help, it should be possible to re-create a lot of functionality in R (and use a trading program that integrates with R) or directly re-create it in a trading program that supports predictive functions.

I initially found out about the book over at the CssAnalytics blog. Shortly after, I bought it. Here is why: the book fills a HUGE gap.

Over the last 6 months I have taken an interest, as a hobby, in learning predictive modelling and applying it to the Forex market. During those months, I learned the basics of the data mining process, programming and how to implement predictive models in R, and how to apply it to Forex (outside of my day job). It's been a lot of fun and I have gotten somewhat successful results so far.

I learned from the following materials in order:

- "Data Science for Business" by Foster Provost & Tom Fawcett, from my MBA alma mater NYU-Stern. This is a fantastic and clearly written book which describes the data mining process and various models from a conceptual and logical perspective.

- Online course at Stanford from Professors Hastie and Tibshirani, pioneers in the field, which blends theory with practice to show how to implement various classification and regression models using R.

- "Applied Predictive Modeling" by Max Kuhn - an amazing book on implementing predictive models in R - mostly using the R caret package, which is a wrapper to over 100 predictive models in R that essentially automates re-sampling (bootstrapping, cross validation, data splitting) as well as model evaluation and comparison.

- White papers that outline some predictive models in forex by generating a bunch of technical indicators and then running predictive models on them (SVM, Random Forest, etc.).

As you can see from the above, I have learned from some really smart people how to:

1) Create and evaluate predictive models
2) Program in R
3) Apply to markets such as Forex

Of the above, I have gotten a fairly good grasp on #1 and #2. This book fills the gap on #3!

The following points in the blog interview are what caught my initial interest in the book:

"It is possible for a model to have poor error reduction across the entire range of its forecasts while being profitable for trading because when its forecasts are extreme they carry useful information. It is more appropriate to use financial measures such as the profit factor which are all included as objective functions within TSSB."

- I literally just had that realization about the effect of EXTREME forecasts just this last week - plotting the residuals vs. the predictions made this very clear. If I had read the book, I probably wouldn't have needed 6 months to discover that point! Though the process of discovering it myself was fun too.
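As a rough illustration of that point, the sketch below (plain Python, not TSSB) computes a profit factor, taken here in its usual sense of gross profit divided by gross loss, using only the trades where the forecast is extreme. The threshold rule and both function names are assumptions made for this example, not anything taken from the book or the software.

```python
def profit_factor(trade_returns):
    """Gross profit divided by gross loss (both as positive numbers)."""
    gains = sum(r for r in trade_returns if r > 0)
    losses = -sum(r for r in trade_returns if r < 0)
    if losses == 0:
        # No losing trades: the ratio is undefined, report infinity or NaN.
        return float("inf") if gains > 0 else float("nan")
    return gains / losses


def extreme_forecast_returns(forecasts, returns, threshold):
    """Trade only when |forecast| exceeds threshold (hypothetical rule).

    Go long on a strongly positive forecast, short on a strongly
    negative one, and stay flat otherwise.
    """
    traded = []
    for f, r in zip(forecasts, returns):
        if f > threshold:
            traded.append(r)    # long: earn the market return
        elif f < -threshold:
            traded.append(-r)   # short: earn the opposite
    return traded


if __name__ == "__main__":
    forecasts = [0.1, -0.1, 2.0, -2.0, 0.2]
    returns = [-1.0, 1.0, 3.0, -2.0, -1.0]
    print(profit_factor(extreme_forecast_returns(forecasts, returns, 0.0)))
    print(extreme_forecast_returns(forecasts, returns, 1.0))
```

In this toy data the weak forecasts are all wrong and the two extreme ones are right, so the model looks poor on overall error yet trades profitably once the threshold filters out the weak signals.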

"Even the best conventional technical indicators have only small amount predictive information. The vast majority is noise. Thus the task is to model that tiny amount of useful information in each indicator"

Wow, is that true! I'm super interested in learning how to model just the useful information in each indicator. Cool concept.

"In my opinion, the way to differentiate or uncover real opportunities currently lie in the clever engineering of new features- such as better indicators."

I've been using R's TTR package as my sole source of indicators. While there are a LOT of indicators in the TTR package, I'm very interested in the 100 or so you mentioned in the software.

--------------------

From an excited modeler. Cheers!