Recently I built a full 10-year dataset of UK & IE horse racing data, which I combined with Betfair historical data. My initial plan was to use it for a back-to-lay strategy, but I went deep into ML and started building and testing models. I was about 5 models in before my initial tests started to show some promise.
The model basically scores pre-race data, adds relative features and makes selections. A second model acts as a gate and decides the picks. Not going to go into details.
I validated with a full walk-forward test, with folds from 2017 to 2024. I only introduced the BSP (Betfair Starting Price) after the walk-forward run, and found a positive ROI, low drawdown and a high enough frequency of picks.
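As a rough illustration of yearly walk-forward splitting (a sketch only, not the actual pipeline): each test year is scored by a model trained only on earlier years. The function name `walk_forward_folds`, the 2015 start year and `min_train=2` are assumptions picked so that the test years come out as 2017 to 2024.

```python
def walk_forward_folds(years, min_train=2):
    """Yield (train_years, test_year) pairs with an expanding training window.

    Every test year is strictly later than all of its training years,
    so nothing from the future is used to fit the model.
    """
    years = sorted(years)
    for i in range(min_train, len(years)):
        yield years[:i], years[i]


# Hypothetical 10-year span; the real start year is an assumption here.
folds = list(walk_forward_folds(range(2015, 2025), min_train=2))
for train_years, test_year in folds:
    print(f"train {train_years[0]}-{train_years[-1]} -> test {test_year}")
```

The same idea applies to anything fitted on the data, such as scalers, encoders or the gate model's threshold: all of it would be refit inside each fold on the training years only.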
I’ve done multiple break tests and checked for leakage, and I can’t find anything wrong.
Overall ROI across 9 years is 13.6%. By year it ranges from 7.8% to 19% (the Covid year).
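For reference, a minimal sketch of how ROI and max drawdown could be computed from a bet log settled at BSP. It assumes flat 1-unit back bets and a hypothetical `commission` rate applied per winning bet. That is a simplification, since Betfair charges commission on net market winnings. The function names and the `(bsp, won)` tuple layout are made up for illustration.

```python
def settle(bets, commission=0.05):
    """Profit per bet, in stake units, for flat 1-unit back bets at BSP.

    bets: iterable of (bsp, won) tuples.
    commission: assumed rate taken from the winnings of each winning bet.
    """
    return [(bsp - 1.0) * (1.0 - commission) if won else -1.0
            for bsp, won in bets]


def roi(profits):
    """Total profit divided by total staked (1 unit per bet)."""
    return sum(profits) / len(profits)


def max_drawdown(profits):
    """Largest peak-to-trough fall of the cumulative P&L, in stake units."""
    peak = cum = worst = 0.0
    for p in profits:
        cum += p
        peak = max(peak, cum)
        worst = max(worst, peak - cum)
    return worst


# Toy log, not real results.
log = [(3.0, True), (4.0, False), (2.0, False), (5.0, True)]
profits = settle(log, commission=0.0)
print(roi(profits), max_drawdown(profits))
```

Running the same functions per year, as well as over the whole period, makes it easier to see whether one unusual season is carrying the overall figure.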
I’m now at the point where I’m starting to test on new races with tiny stakes, logging everything. I expect to have 150 or so bets a month, so perhaps in 6 months I’ll have enough data to verify the model truly works.
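A back-of-envelope way to sanity-check the "enough data" question, as a sketch under stated assumptions: treat each bet's return (in stake units) as an independent draw, and ask how many bets it takes for a given ROI to sit about `z` standard errors above zero. The per-bet standard deviation `sigma` depends heavily on the odds being backed. The value 2.0 below is a placeholder, not a number from my data.

```python
import math
import statistics


def bets_needed(roi, sigma, z=1.96):
    """Bets at which an ROI of `roi` sits about z standard errors above zero.

    sigma: assumed standard deviation of per-bet return in stake units.
    """
    return math.ceil((z * sigma / roi) ** 2)


def t_stat(returns):
    """t-statistic of the mean per-bet return against zero."""
    n = len(returns)
    return statistics.mean(returns) / (statistics.stdev(returns) / math.sqrt(n))


# Placeholder sigma; replace with the stdev measured on the live bet log.
print(bets_needed(roi=0.136, sigma=2.0))
```

With these placeholder inputs the answer comes out at roughly 830 bets, which is in the same region as 6 months at 150 bets a month. A higher `sigma` (longer average odds) pushes the requirement up fast, so it is worth plugging in the measured value. Once live results exist, `t_stat` on the logged per-bet returns gives a rough read on how far the live ROI is from zero.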
But there is a massive temptation to scale sooner. I’m scratching my head trying to find out what I’ve done wrong because I can’t quite believe I’ve found an edge.
Does anyone have advice? I’d like to hear from anyone here who is building models.
