Hey all, I'm an experienced programmer and mathematician, but I'm at a loss about how one actually starts trading on Betfair. I stupidly built some infra for Betfair with the intention of collecting L2 data for six months or so and then starting to trade in anger. This was clearly a mistake, as I was declined for the prod API key and directed to the historical data purchase to parameterise/research a model first.
This is understandable from their POV, as I guess they have their own infra costs, but £250/month for historical L2 data seems like a bit of an unnecessary barrier to entry. I'd be interested to know what other people did before starting out, as I'm unsure what the best approach is. I could deploy some sort of low-risk algo to generate some monthly volume very quickly and then carry on with my plan, but this feels like the wrong way forward. Is there a minimum volume they expect you to generate before being granted a key? I assume so, otherwise I don't know why they would ask that question in their signup form. Did other people just buy historical data and then launch? From my perspective, I'm just trying to de-risk any up-front costs while I don't even know whether I can find an edge.
Thanks for taking the time to read this!
How do I actually get started
At a basic level you might want to have a historical dataset. Maybe try racing-bet-data.com. It took me a while, but over time I managed to download most of the historical data, and I now have my own local DB with results that I can query easily. The service also produces a daily results spreadsheet with in-play data that you can import into your DB. You might find other data providers or try to scrape the data yourself, but scraping large volumes of historical data is not as easy as it used to be.
For pre-off market data and in-play volume data you will need to get "dirty" with Excel and Bet Angel, or even use the Bet Angel API (I have not done this myself, but I have read that users love the new API integration Bet Angel has).
Other than that, you will need to put hours and hours of work into this before things start clicking a bit. I am 6 months into the project and just starting to break even, which Peter mentions as being the first objective.
Other more experienced users might have other suggestions...
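For anyone wondering what the "local DB with results" idea looks like in practice, here is a minimal sketch using Python's built-in sqlite3. The table layout, column names and sample rows are all made up for illustration (they are not any provider's real export format), and the profit figure ignores commission.

```python
import sqlite3

# Hypothetical schema: these column names are illustrative only,
# not the format of any particular data provider.
SCHEMA = """
CREATE TABLE IF NOT EXISTS results (
    race_date TEXT,
    course    TEXT,
    runner    TEXT,
    bsp       REAL,     -- Betfair starting price (decimal odds)
    won       INTEGER   -- 1 if the runner won, 0 otherwise
)
"""

# Made-up rows, purely to have something to query.
SAMPLE_ROWS = [
    ("2024-01-01", "Course X", "Runner A", 3.5, 1),
    ("2024-01-01", "Course X", "Runner B", 6.0, 0),
    ("2024-01-02", "Course Y", "Runner C", 2.2, 0),
    ("2024-01-02", "Course Y", "Runner D", 4.8, 1),
]


def build_db(rows):
    """Create an in-memory database and load the result rows into it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def level_stake_profit(conn, max_bsp):
    """Back every runner with BSP <= max_bsp to a 1-unit stake.

    Returns (number_of_bets, profit_in_units), before commission.
    """
    cur = conn.execute(
        "SELECT COUNT(*), "
        "COALESCE(SUM(CASE WHEN won = 1 THEN bsp - 1 ELSE -1 END), 0) "
        "FROM results WHERE bsp <= ?",
        (max_bsp,),
    )
    return cur.fetchone()


conn = build_db(SAMPLE_ROWS)
bets, profit = level_stake_profit(conn, 5.0)
print(bets, round(profit, 2))
```

Swapping `":memory:"` for a file path would keep the data between runs, and the daily spreadsheet could be appended with the same `executemany` call once its columns are mapped to the table.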
I think this chap is looking more for the granularity of level two data, i.e., serious data rather than just top-level stuff.
Using a bit of vibe coding, you can get some really good data analysis out of standard data. And you don't need to pay for it, because it's available for free:
https://promo.betfair.com/betfairsp/prices
On the question of detailed level two data and other stuff like that, you're just going to have to pay for it if you want it.
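As a rough idea of the kind of analysis the free starting-price files allow, here is a small sketch that checks how well BSP is calibrated: it groups runners by implied probability (1 / BSP) and compares that with how often they actually won. The column names `BSP` and `WIN_LOSE` are my assumption about the file layout, so check the header of whatever file you download, and the sample rows below are invented.

```python
import csv
import io
from collections import defaultdict

# Invented sample standing in for a downloaded file.
# Column names are an assumption; adjust them to the real header.
SAMPLE_CSV = """SELECTION_NAME,BSP,WIN_LOSE
Runner 1,2.0,1
Runner 2,2.1,0
Runner 3,4.0,0
Runner 4,4.2,1
Runner 5,4.5,0
Runner 6,10.0,0
"""


def calibration(rows, edges=(0.0, 0.2, 0.4, 1.0)):
    """Bucket runners by implied probability and report the observed win rate.

    `edges` defines half-open buckets (lo, hi]; each result entry holds the
    runner count, the observed win rate and the mean implied probability.
    """
    buckets = defaultdict(lambda: {"n": 0, "wins": 0, "implied": 0.0})
    for row in rows:
        implied = 1.0 / float(row["BSP"])
        for lo, hi in zip(edges, edges[1:]):
            if lo < implied <= hi:
                bucket = buckets[(lo, hi)]
                bucket["n"] += 1
                bucket["wins"] += int(row["WIN_LOSE"])
                bucket["implied"] += implied
                break
    return {
        key: {
            "n": b["n"],
            "win_rate": b["wins"] / b["n"],
            "mean_implied": b["implied"] / b["n"],
        }
        for key, b in sorted(buckets.items())
    }


table = calibration(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for bucket, stats in table.items():
    print(bucket, stats)
```

With a real file you would replace `io.StringIO(SAMPLE_CSV)` with an opened file handle. Six rows prove nothing, of course; the point is only the shape of the calculation.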
I've always just put things into the market and learned that way. I've never really paid for or analysed lots of historic data. Actually doing things finds you a lot of interesting strategies without the expense.
"L2 data in a betting exchange context means the full depth of the order book — not just best available back/lay prices, but multiple price levels deep on both sides, showing the volume queued at each tick. Think of it like L2 market data in equities (the full ladder).
On Betfair specifically, this means seeing the full price ladder with stakes available at each odds level, not just the top 3 back/lay prices you get from the basic API. It lets you see where liquidity is stacked, identify support/resistance in the book, spot spoofing, etc.
Your approach wasn't stupid — collecting live L2 data over time to study microstructure before trading is sound in principle. The problem is Betfair gates prod API access and pushes you toward buying their historical data packages first, which contain exactly this kind of depth-of-book data already recorded. They want you to prove you have a viable strategy (and pay for data) before giving you the firehose.
So the practical path is: buy their historical data, do your research/backtesting with that, then apply for prod access with evidence of a real strategy."
Yep, my advice was clearly useless, LOL! Level 2 data, uhhhmmm!
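To make the "full ladder" description quoted above a bit more concrete, here is a minimal sketch of one order-book snapshot and a couple of things you can only compute when you have more than the best price. The class and function names are hypothetical (this is not Betfair's API or data format), the numbers are invented, and the imbalance measure is just an illustration, not a tested signal.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Level:
    price: float  # decimal odds
    size: float   # stake available at that price


@dataclass
class Ladder:
    """One snapshot of a runner's book (hypothetical structure).

    available_to_back is ordered best first (highest odds first),
    available_to_lay is ordered best first (lowest odds first).
    """
    available_to_back: List[Level] = field(default_factory=list)
    available_to_lay: List[Level] = field(default_factory=list)


def top_of_book(ladder):
    """Best back and best lay price: roughly what a top-level feed shows."""
    return ladder.available_to_back[0].price, ladder.available_to_lay[0].price


def depth(levels, n):
    """Total stake queued across the first n price levels of one side."""
    return sum(level.size for level in levels[:n])


def imbalance(ladder, n):
    """Size imbalance over the top n levels, in the range [-1, 1]."""
    back = depth(ladder.available_to_back, n)
    lay = depth(ladder.available_to_lay, n)
    return (back - lay) / (back + lay)


# Invented snapshot, four levels deep on each side.
ladder = Ladder(
    available_to_back=[Level(3.0, 120), Level(2.98, 300),
                       Level(2.96, 80), Level(2.94, 500)],
    available_to_lay=[Level(3.05, 90), Level(3.1, 60),
                      Level(3.15, 400), Level(3.2, 50)],
)

print(top_of_book(ladder))
for n in (1, 3, 4):
    print(n, round(imbalance(ladder, n), 3))
```

The sign of the imbalance flips depending on how many levels you include, which is the practical argument for wanting depth rather than just the best prices.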
