Data Sets

robsmith
Posts: 76
Joined: Wed Aug 25, 2010 12:19 pm

Guys,

After reading some posts on developing strategies, I have been running an automated strategy on the in-play racing markets. Once I have a reasonable set of data I will look to profile races and remove those with negative expectations. The question I have is: how big a data set do I need before I can draw conclusions? 500 races, 1,000 races, etc.?

Thanks,

Rob
xitian
Posts: 457
Joined: Fri Jul 08, 2011 2:08 pm

I don't think it's an exact science, as it depends on what sort of strategy you're running. If it's primarily a betting strategy (i.e. just taking open positions) rather than a trading strategy (i.e. opening and closing positions quite frequently), then you'll need a lot more historical data.

To generalise the above, you can look at the volatility of your returns by analysing them in Excel or something similar. Betting strategies have bigger ups and downs, so you need more data to verify that they're working.

To give you an idea, I'm also developing an in-play strategy whose daily PnL is positive on 78% of days, and I'm using roughly a 3 month history. Of course this is only from backtesting.
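As a rough sketch of how volatility turns into a number of races, here is a back-of-envelope calculation in Python. It assumes races are independent and that the average return per race is roughly normally distributed over a large sample; the function name and the example figures are made up for illustration and are not from anyone's real results.

```python
import math


def races_needed(mean_return, stdev_return, z=1.96):
    """Rough number of races before the average return per race
    stands clear of zero at about the 95% level (z = 1.96).

    Both inputs are in the same units, e.g. stakes per race.
    """
    if mean_return == 0:
        raise ValueError("mean_return must be non-zero")
    # The standard error of the mean is stdev / sqrt(n); solve
    # z * stdev / sqrt(n) = |mean| for n.
    return math.ceil((z * stdev_return / mean_return) ** 2)


# Hypothetical figures: same edge per race, very different swings.
print(races_needed(0.05, 1.00))  # betting-style returns
print(races_needed(0.05, 0.20))  # trading-style returns
```

The required sample grows with the square of the volatility, which is why a strategy with big swings needs far more races than one that closes positions for small, steady amounts.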

The trickiest part of looking at historical data is that you have to be very careful not to overfit your strategy. So when you come to "profile and remove races with negative expectations" you need to be clear about your reasoning. Removing all horses with names that start with a "J" may have improved your historical results, but it's clearly just coincidence and nonsensical. Really you need to hold back a set of data for out-of-sample testing once you've done any tweaking.
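One simple way to hold data back is to split the history by date, tweak only on the older part, and look at the newer part once at the end. This is a minimal sketch under the assumption that your races are already sorted oldest first; the function name and the 30% holdout are arbitrary choices for illustration.

```python
def split_in_and_out_of_sample(races, holdout_fraction=0.3):
    """Split a chronologically sorted list of races into
    (in_sample, out_of_sample).

    The most recent `holdout_fraction` of races is kept aside and
    should not be looked at while filters are being tuned.
    """
    if not 0 < holdout_fraction < 1:
        raise ValueError("holdout_fraction must be between 0 and 1")
    n_holdout = round(len(races) * holdout_fraction)
    cut = len(races) - n_holdout
    return races[:cut], races[cut:]


# Hypothetical per-race profits, oldest first.
profits = [0.4, -1.0, 0.2, 0.3, -1.0, 0.5, 0.1, -0.2, 0.6, -1.0]
in_sample, out_of_sample = split_in_and_out_of_sample(profits)
print(len(in_sample), len(out_of_sample))
```

Splitting by time rather than at random keeps the test closer to how the strategy would actually be used: rules chosen on past races, then applied to races that came later.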

Perhaps someone else with more formal statistical training can suggest how to calculate some tests for significance. Personally, I just plot a chart and see if it looks like it's going up or not! At the end of the day you need to be comfortable yourself with how volatile the returns are. If my chart doesn't look consistent enough to me, then I don't put money on it.
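For anyone who wants a number to go alongside the chart, one common and simple check is a one-sample t-statistic on the per-race (or per-day) returns. The sketch below uses only the Python standard library and assumes the returns are independent; it is one possible test, not the only one, and the function name is made up.

```python
import math
import statistics


def t_statistic(returns):
    """t-statistic for the hypothesis that the true mean return is zero.

    As a rule of thumb, with a large sample an absolute value above
    about 2 corresponds to roughly the 5% significance level.
    """
    n = len(returns)
    if n < 2:
        raise ValueError("need at least two returns")
    mean = statistics.mean(returns)
    stdev = statistics.stdev(returns)  # sample standard deviation
    if stdev == 0:
        raise ValueError("returns have zero variance")
    return mean / (stdev / math.sqrt(n))


# Hypothetical daily PnL figures.
daily_pnl = [12.0, -5.0, 8.0, 3.0, -2.0, 6.0, 9.0, -4.0]
print(round(t_statistic(daily_pnl), 2))
```

A high value on data that was also used to pick the filters still proves little, so this is best run on the held-back, out-of-sample races.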
Dublin_Flyer
Posts: 847
Joined: Sat Feb 11, 2012 10:39 am

I'll second what Xitian says about overfitting. I've had, and am still having, long long debates on another forum about systems people use on Horseracebase.
Where do you draw the line between using a continuous history that has been profitable and eliminating features like class/distance/track just because they don't make pretty figures, so that you end up backfitting yourself into a corner?
Personally I prefer to leave my systems as wide open as possible so I'm not backfitting stats and hoping the future ones follow, unless there's something that really stands out, e.g. a 3/45 win rate in Beginners Chases and a 7/45 place rate in Beginners Chases, both for massive losses. I'm a gambler so I wouldn't be touching them, but if I was a trader they wouldn't be anywhere near contention for winning either, so I'd steer clear in-running too.
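To get a feel for how much a figure like 3/45 can tell you, a confidence interval on the win rate helps. The sketch below uses the Wilson score interval, which is a standard choice for small samples; the function name is made up and it assumes each race is an independent trial.

```python
import math


def wilson_interval(wins, runs, z=1.96):
    """Wilson score interval for a win rate (about 95% when z = 1.96)."""
    if runs <= 0 or not 0 <= wins <= runs:
        raise ValueError("need runs > 0 and 0 <= wins <= runs")
    p = wins / runs
    denom = 1 + z ** 2 / runs
    centre = (p + z ** 2 / (2 * runs)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / runs + z ** 2 / (4 * runs ** 2)) / denom
    )
    return centre - half_width, centre + half_width


low, high = wilson_interval(3, 45)
print(f"{low:.1%} to {high:.1%}")
```

For 3 wins from 45 runs the interval comes out at roughly 2% to 18%, which is one way to judge how much weight a 45-race sample can carry before a filter is built on it.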

In a nutshell, if you have valid and thoroughly thought-through reasons for including or excluding things in your trading, then you should be OK. If you're filtering willy-nilly for the big + or big -, then shit gets ugly.
Wyndon
Posts: 237
Joined: Sun Nov 13, 2011 10:14 am

Agree with Xitian and Dublin Flyer. I'm developing an in-play automated strategy which seems to be working for All-Weather and the Jumps, but not for ordinary flat racing. It's really bugging me, because before I exclude flat races altogether I would like to have some intuitive reasoning for doing so.