Just chipping in here. This was an area I was very interested in. Sectional data is available, but complete sectional data doesn’t cover 100% of racing across the UK and Ireland.
I have models that use this sectional data, past performance data, and data derived from comments, etc. But it’s not as simple as pointing to one runner in every race that’s going to lead. You have to use the data relationally, comparing the runners in a race against each other, to get an idea of who might lead.
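
To illustrate what "relational" means here, below is a rough sketch only, not the actual models mentioned above. It assumes each runner already has a single early-pace score (hypothetical numbers, which in practice might be built from sectionals, past performances, and comments), and it uses a softmax as a stand-in technique to turn those scores into lead probabilities within one race. The function name and the runner names are made up for the example.

```python
import math


def lead_probabilities(early_pace_scores):
    """Convert per-runner early-pace scores into within-race lead probabilities.

    early_pace_scores: dict mapping runner name -> score (higher = more
    likely to go forward early). The scores are hypothetical inputs.
    """
    # Subtract the highest score before exponentiating for numerical stability.
    top = max(early_pace_scores.values())
    weights = {r: math.exp(s - top) for r, s in early_pace_scores.items()}
    total = sum(weights.values())
    # Each probability depends on every other runner's score in the race.
    return {r: w / total for r, w in weights.items()}


# Hypothetical three-runner race.
race = {"Runner A": 2.0, "Runner B": 1.5, "Runner C": 0.2}
probs = lead_probabilities(race)
likely_leader = max(probs, key=probs.get)

# The same Runner A score in a field with a faster rival.
tougher_race = {"Runner A": 2.0, "Runner X": 3.0}
tougher_probs = lead_probabilities(tougher_race)
```

The point of the sketch is that Runner A's score is identical in both races, yet its chance of leading drops when a faster early-pace rival is in the field. A runner's number on its own says little until it is set against the rest of the race.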
