Just on the above as well, I have further questions. So having read through a lot of this forum trying to develop an edge into the markets with automation I have taken the following:
- Begin collecting data - 6 months seems to be the accepted minimum amount on here (to be split so strategies can be developed on one and tested on others).
- While data is collecting, deposits can be made, and even with the strict KYC checks, 500 a month should be achievable.
- 6 months in and you now have 6 months of data to go at and a balance of 3k.
- Non-live mode sounds like a waste of time for automation, and it's better to start out with smaller stakes?
- At this point, you aren't really any further forward, other than having money and data to play with. This is where you begin to develop your strategies. Obviously this is the big grey area. How would knowing the markets are efficient (EMH) help predict small price movements? Or is EMH just how we measure things? I asked ChatGPT for example response data from the Betfair streaming API, but I was unable to wrap my head around how recording this would help, as you still need to see something in it to go at it with?
One other question I have relates to the message above from Shaun too. Strategies come and go, so let's say you have a strategy that is profitable on your 3 months of data from April-June, but when you test it against your Jan-Mar data to make sure you haven't backfitted it, it is not profitable. What would you do in this instance, where this could be a strategy that makes money again over the next 3 months or could equally be a dud that's just been backfitted?
I appreciate there is not some kind of checklist to this, but I am just trying to pull together the advice given all over before diving into this. I am currently not collecting data as, to be honest, I don't see the point until I have something to go at this data with. My presumption is that most of the auto guys had ideas from a manual perspective before going this route?
Developing a New Edge
- wearthefoxhat
- Posts: 3558
- Joined: Sun Feb 18, 2018 9:55 am
ShaunWhite wrote: ↑Mon Jul 07, 2025 7:36 pm
You never sit back because edges can fade without warning, and multiple strats also smooth the variance.
+1
That happened to one of mine a few years back.
Got one chipping away at the moment that works okay on the 4 or 5 runner races. (Overround reduced..etc)
csewell1987 wrote: ↑Wed Jul 09, 2025 2:35 pm
Just on the above as well, I have further questions. [...]
Probably better to just make a new thread for all that.
This one has been renamed for a different purpose and isn't even listed under Sports Trading Topics
- ShaunWhite
- Posts: 10447
- Joined: Sat Sep 03, 2016 3:42 am
As Kai said, start a new thread. There are lots of observations in that which I'll reply to when I'm at my PC tonight.
- ShaunWhite
- Posts: 10447
- Joined: Sat Sep 03, 2016 3:42 am
csewell1987 wrote: ↑Wed Jul 09, 2025 2:35 pm
- Begin collecting data - 6 months seems to be the accepted minimum amount on here (to be split so strategies can be developed on one and tested on others).
Yes. From day 1.
csewell1987 wrote: ↑Wed Jul 09, 2025 2:35 pm
- While data is collecting, deposits can be made, and even with the strict KYC checks, 500 a month should be achievable.
- 6 months in and you now have 6 months of data to go at and a balance of 3k.
I think it's 150/mo without a check, but 3 grand is a lot; 250-500 would be enough.
csewell1987 wrote: ↑Wed Jul 09, 2025 2:35 pm
- Non-live mode sounds like a waste of time for automation, and it's better to start out with smaller stakes?
It weeds out the total rubbish and makes sure the rare edge cases are handled, and on small stakes the issue with fills isn't so bad that it tells you nothing.
But what you don't get is a matched bet history to look into later, which is the valuable stuff.
csewell1987 wrote: ↑Wed Jul 09, 2025 2:35 pm
- At this point, you aren't really any further forward, other than having money and data to play with. This is where you begin to develop your strategies. Obviously this is the big grey area.
...aka the hard bit.
csewell1987 wrote: ↑Wed Jul 09, 2025 2:35 pm
How would knowing the markets are efficient (EMH) help predict small price movements? Or is EMH just how we measure things?
EMH is the theory that markets are always efficient at the mid price, (back price + lay price) / 2, and that betting blind thousands of times would return a gross zero. The exchange is a 'zero sum game'.
But that's just on average and at large scale, and every individual market is almost certainly wrong all the time. The difference between the 'right' price and the market price you get is called EV (expected value), and that's what people measure. But you know your bet price, not necessarily what the 'right' price was, and that's why people often use BSP (Betfair Starting Price) as a known 'right price'. Which it is, but again only as an average, not necessarily each one. But it's close, and although EMH tells us it's no more 'right' than the price at any other time, it varies the least from the true price, so it's useful as a benchmark.
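To put numbers on the mid price and the BSP benchmark, here is a minimal sketch. It assumes decimal odds, treats the benchmark price (for example BSP) as if it were the true price, and ignores commission. The function names are made up for illustration and are not part of any Betfair library.

```python
def mid_price(best_back: float, best_lay: float) -> float:
    """Midpoint of the best available back and lay prices (decimal odds)."""
    return (best_back + best_lay) / 2


def back_ev_vs_benchmark(price_taken: float, benchmark_price: float) -> float:
    """Expected profit per unit staked on a back bet.

    Assumption: the benchmark price (e.g. BSP) is the 'right' price, so the
    implied win probability is 1 / benchmark_price. Commission is ignored.
    """
    true_prob = 1 / benchmark_price
    win = true_prob * (price_taken - 1)   # profit when the selection wins
    lose = (1 - true_prob) * 1            # stake lost when it does not
    return win - lose


def mean_ev(bets):
    """Average EV over many (price_taken, benchmark_price) pairs.

    A single bet's EV against BSP is noisy; the average over a large
    number of bets is the figure that means something.
    """
    evs = [back_ev_vs_benchmark(taken, bench) for taken, bench in bets]
    return sum(evs) / len(evs)
```

Backing at a price above the benchmark gives a positive figure, backing below it gives a negative one, and backing exactly at the benchmark gives zero before commission.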
csewell1987 wrote: ↑Wed Jul 09, 2025 2:35 pm
I asked ChatGPT for example response data from the Betfair streaming API, but I was unable to wrap my head around how recording this would help, as you still need to see something in it to go at it with?
The advantage with API data is the depth and resolution. Every market update, containing every ladder step plus other info about the market and runners, 24/7/365. As many sports as you want, and it's all the same format.
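One way to picture what 'recording' means in practice: store every message exactly as it arrives, one per line, so it can be replayed later through whatever analysis you come up with. The sketch below is only an assumption about how that might look. It treats each stream message as an already-parsed dict of unknown shape, and the file layout and function names are hypothetical.

```python
import json
import time


def append_raw_message(path, message):
    """Append one stream message, untouched, to a JSON-lines file.

    The receive time is stored alongside the raw message. Opening the file
    per message keeps the sketch simple; a real collector would keep it open.
    """
    line = json.dumps({"received_at": time.time(), "raw": message})
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")


def replay(path):
    """Yield the stored messages back in the order they were recorded."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            yield json.loads(line)["raw"]
```

Because nothing is filtered on the way in, an idea you only have years later can still be tested against the full history.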
csewell1987 wrote: ↑Wed Jul 09, 2025 2:35 pm
One other question I have relates to the message above from Shaun too. Strategies come and go, so let's say you have a strategy that is profitable on your 3 months of data from April-June, but when you test it against your Jan-Mar data to make sure you haven't backfitted it, it is not profitable. What would you do in this instance, where this could be a strategy that makes money again over the next 3 months or could equally be a dud that's just been backfitted?
First, it would have to work on any data I had before I went live.
But you can do that by splitting your data a different way. Using Jan-Mar and then April-Jun isn't a true comparison because the markets can evolve and change over time. It's better to take that 6 months of data and randomly split it, maybe by alternate days or by markets with an even/odd number, etc. And you'd look for a steady climb on both your training set and your validation set. That way both sets have any 'over time' changes built into them because they cover the same time period.
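A minimal sketch of that kind of interleaved split, under some assumptions: each record is a dict holding the day the market ran, a market ID shaped like "1.234567890", and the simulated profit/loss of the strategy on that market. The record shape, the function names and the placeholder numbers are all hypothetical.

```python
from datetime import date
from itertools import accumulate


def split_alternating(records, key):
    """Split records into (training, validation) by the parity of key(record)."""
    training, validation = [], []
    for rec in records:
        (training if key(rec) % 2 == 0 else validation).append(rec)
    return training, validation


def cumulative_pnl(records):
    """Running profit/loss in date order, for eyeballing a 'steady climb'."""
    ordered = sorted(records, key=lambda r: r["day"])
    return list(accumulate(r["pnl"] for r in ordered))


# Placeholder records purely to show the shape; not real results.
records = [
    {"day": date(2025, 1, d), "market_id": f"1.{1000 + d}", "pnl": 1.0}
    for d in range(1, 7)
]

# Alternate days: toordinal() is a continuous day count, so the parity keeps
# alternating across month boundaries.
by_day = split_alternating(records, key=lambda r: r["day"].toordinal())

# Even/odd market number: assumes IDs shaped like "1.234567890".
by_market = split_alternating(
    records, key=lambda r: int(r["market_id"].split(".")[-1])
)

training_curve = cumulative_pnl(by_day[0])
validation_curve = cumulative_pnl(by_day[1])
```

Both halves cover the same six months, so a strategy whose curve climbs on one and goes nowhere on the other is a warning sign of backfitting rather than of the market having changed.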
csewell1987 wrote: ↑Wed Jul 09, 2025 2:35 pm
I appreciate there is not some kind of checklist to this, but I am just trying to pull together the advice given all over before diving into this. I am currently not collecting data as, to be honest, I don't see the point until I have something to go at this data with.
If you wait until you have an idea before you collect data, you'll have nothing to look at. Start collecting data yesterday, and collect anything and everything, because you can't tell what you'll be doing in a few years and what you might need.
csewell1987 wrote: ↑Wed Jul 09, 2025 2:35 pm
My presumption is that most of the auto guys had ideas from a manual perspective before going this route?
Varies a lot. In my experience most of the people seriously into automation have never even seen a ladder. They just analyse data, and to them it's a giant sudoku puzzle.
Manual experience can help but it can also hinder. Automation often executes manual trading styles pretty badly because it can't apply any subjectivity. So think about what automation can do that manual traders can't: looking at every selection in every market 24/7/365 and placing thousands of small-margin, small-stakes bets.
But I see the bigger question: so I've got some data, now what?
- csewell1987
- Posts: 26
- Joined: Wed Feb 14, 2024 1:51 pm
Cheers Shaun, really helpful, thanks for taking the time to answer. I guess the last question still remains in my mind, but there's certainly a lot to digest there. I take your point too: if I do think of a way to go at the data, then I need some to go at. I guess one final question is, would you say you found your edge in the data once you had it, or did you already have an idea of what you were going to do when you started collecting?
- ShaunWhite
- Posts: 10447
- Joined: Sat Sep 03, 2016 3:42 am
csewell1987 wrote: ↑Wed Jul 09, 2025 11:50 pm
Cheers Shaun, really helpful, thanks for taking the time to answer. I guess the last question still remains in my mind, but there's certainly a lot to digest there. I take your point too: if I do think of a way to go at the data, then I need some to go at. I guess one final question is, would you say you found your edge in the data once you had it, or did you already have an idea of what you were going to do when you started collecting?
No idea at all. And I started collecting it before I even started work on my simulator, and only when that was built did I start to think about the things I could try with it. It was close to a year from starting the project to strategy hunting. Then about 6 months before the first strategy launch, which went well. (And that first strategy lasted about 4 yrs before it started to fail in a way that couldn't be tweaked back to life.)
But I'm still finding new uses for the OTT data I've been collecting for yrs that weren't around at the time, 7yrs, 7TB and 7 billion market change records are the perfect fuel for deep learning which has now become accessible. Wasn't even possible at the start but always knew tech would get there eventually and it would be useful.
So everyone is different but collect data from day 1 (any and all). And then while it's accumulating start to think about how you'll use it. That will inform the type of strategies you'll think about rather than having an idea first and building around just that. And tbh it's a numbers game, you might try 20 ideas before you find a good one. So the key feature of your system has to be setting up and testing ideas thoroughly but quickly. Fail fast.
(building a system needn't take a year or more, I did it that way because I didn't want to use the freeware that makes collection and backtesting fairly easy. And I was fortunate to have had a career designing and building trading software so I wanted to structure it the way I understood. And it was a challenge. Some people using the freeware are up and running in a couple of months)
'System' sounds fancy but it's just any pipeline that performs that collect/analyse/backtest/deploy function.
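As a rough illustration of the 'test lots of ideas quickly and fail fast' part of such a pipeline, here is a sketch. Everything in it is an assumption: an 'idea' is any function that takes one recorded market and returns the simulated profit/loss of a bet, or None if it would not bet, and the pass rule (enough bets and a profit on both data sets) is deliberately crude.

```python
def evaluate_ideas(ideas, training, validation, min_bets=1):
    """Run every candidate idea over both data sets and keep the survivors.

    ideas: dict mapping a name to a function(record) -> pnl or None.
    Returns {name: [(bets, total_pnl) for training, then validation]}.
    """
    survivors = {}
    for name, idea in ideas.items():
        results = []
        for dataset in (training, validation):
            pnls = [p for p in (idea(rec) for rec in dataset) if p is not None]
            results.append((len(pnls), sum(pnls)))
        # Fail fast: drop the idea unless it placed enough bets and made
        # money on BOTH the training and the validation set.
        if all(n >= min_bets and total > 0 for n, total in results):
            survivors[name] = results
    return survivors
```

Adding a new idea is then just one more entry in the `ideas` dict, which keeps the cost of trying (and discarding) the twentieth idea about the same as the first.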