I don't think I'm alone on this ... you consider a strategy, you set it out for backtesting, you run your backtest and it shows a stupidly positive (or negative) return. Check your code, can't see an error. Must be something in the backtest, run the test again. Same result. Must be the settings, can't see an error. Run the test again. Same result again. Think that backtesting is not a live market, so maybe that's it? Can't be ... no way can it affect the test to that extent. But the return is 'impossible', no way. It's such a simple strategy that it would be more difficult to get the code wrong than correct. Run it again. Same result. Try changing some of the variables and run the code again ... same result (within the scope of the adjusted variables). You check a random sample of individual 'events', all correct.
I'm sure I'm not the only one to have come across this and it's not my first time and I admit to usually finding an error somewhere. I just wondered what others do when they don't believe their backtest results.
I just don't believe it!
- jamesedwards
- Posts: 5169
- Joined: Wed Nov 21, 2018 6:16 pm
Set it running at very low stakes.
There's nothing like a test in live environment to weed out the wheat from the chaff!
- ShaunWhite
- Posts: 10672
- Joined: Sat Sep 03, 2016 3:42 am
Definitely been there and it's always a bug, even if it makes you scratch your head.
Are you sure it's not hindsighting, like using a price or sp that hasn't happened yet?
Maybe the code is right but you're looking at the log wrong, like using -1 for lay loss not price-1 (to a pound stake). If it's the same on different sets (or subsets) of your data then I'd say def a bug.
Are you able to step through the code in debug mode and check it line by line that way? Maybe a variable isn’t set how you think/assume it is, like a null?
The good news is that you know what's too good to be true, the problem ones are where it's wrong, but it's plausible.
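A minimal sketch of the lay-loss point above, assuming decimal odds, a one pound stake and no commission. The function names are hypothetical, not taken from anyone's actual log or code:

```python
def back_pnl(price: float, won: bool, stake: float = 1.0) -> float:
    """Profit or loss on a back bet at decimal odds `price`."""
    return stake * (price - 1.0) if won else -stake


def lay_pnl(price: float, won: bool, stake: float = 1.0) -> float:
    """Profit or loss on a lay bet; `won` means the selection won."""
    # The layer's loss is the liability, stake * (price - 1), not a flat -stake.
    return -stake * (price - 1.0) if won else stake


# Laying at 5.0 to a pound stake: the loss is 4.0 when the selection wins.
print(lay_pnl(5.0, won=True))   # -4.0
print(lay_pnl(5.0, won=False))  # 1.0
```

Logging a flat -1 for every losing lay would understate losses at any price above 2.0, which is one way a result can look far better than it should.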
- firlandsfarm
- Posts: 3517
- Joined: Sat May 03, 2014 8:20 am
Thanks for your input guys. It's actually a spread betting Forex thing.
ShaunWhite wrote: ↑Tue Jan 06, 2026 6:25 pm
Are you sure it's not hindsighting ...
Maybe the code is right but you're looking at the log wrong ...
Hindsighting (back fitting) not involved ... it's the first pass on the data so there's no filtering other than "no new position until any existing position is closed"
No log as such ... spreadsheet of Forex Day prices from 02/01/12 to date, so 3654 data records per pair (all 28 major pairs), so lots of data ... maybe I should split it into two data sets (odd ID/even ID) and see how the return splits. It's a simple "Open Long if this ..." and "Hold or Close as SL/TP if that ..." If not Hold then repeat. The data is an Excel Stockhistory() download, guess I should repeat the test against a download of the broker's data. Currently structured as one sheet per pair and one row of data per day (Daily price, Open, High, Low, Close). I know something is not correct but can't see where ... if not then expect to see me with my new Roller later in the year!
As James says if I can't find the error then play it for real with small stakes and see what happens.
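The odd ID/even ID split mentioned above could be sketched like this. It assumes each trade has already been reduced to a (record ID, return) pair; the function name and the numbers are made up purely for illustration:

```python
from statistics import mean


def split_returns(trade_returns):
    """Split per-trade returns into even-ID and odd-ID halves.

    `trade_returns` is a list of (record_id, return) tuples.
    """
    even = [r for i, r in trade_returns if i % 2 == 0]
    odd = [r for i, r in trade_returns if i % 2 == 1]
    return even, odd


# Made-up trades, not real results.
trades = [(1, 0.5), (2, -0.2), (3, 0.4), (4, -0.1), (5, 0.3), (6, 0.0)]
even, odd = split_returns(trades)
print(f"even: n={len(even)} mean={mean(even):.3f}")
print(f"odd:  n={len(odd)} mean={mean(odd):.3f}")
```

As Shaun says earlier in the thread, an implausible return that shows up equally in both halves points towards a bug in the calculation rather than a quirk of one stretch of data.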
One thing I always remind myself is that backtests do not understand price, they only understand assumptions about price. You can have a perfectly sensible idea, but if the model is assuming fills that would never realistically exist in a live market, the results can quickly drift into fantasy. Ask yourself a simple question. Would I actually get this price, at this time, with this size, day after day? If the honest answer is no, the backtest is already compromised.
Another useful trick is to try and break the strategy deliberately. Add friction, delays, worse prices, or restrict entries so it should obviously perform worse. If the returns barely change, something is leaking through the logic. Equally, flip it on its head. If the reverse of the strategy also makes money, that is not an edge, that is a structural issue in how profit and loss is being calculated.
Reducing the data is also crucial. Not a random handful of trades, but a short, continuous period that you can walk through candle by candle or tick by tick. I still do this by hand when something does not feel right. Most errors live in the boring places.
Changing the data source is another big one. Same logic, different feed. If the performance disappears, the problem is often alignment, timing or how prices are sampled. People underestimate how sensitive strategies are to tiny shifts in when a price is observed.
And finally, nothing beats forward testing at very small stakes. Not to prove it makes money, but to see if it behaves the way the backtest claims.
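The "break it deliberately" and "flip it on its head" checks can be sketched as below. This is a toy model under stated assumptions: one-bar trades on closing prices, a flat cost per trade in price units, and made-up prices and signals. `total_pnl` is a hypothetical helper, not part of any real backtesting tool:

```python
def total_pnl(prices, signals, cost_per_trade=0.0):
    """Sum the P&L of one-bar trades.

    signals[i] is +1 (long), -1 (short) or 0 (flat) for the move from
    prices[i] to prices[i + 1]; cost_per_trade is charged on every
    non-flat bar, in price units.
    """
    pnl = 0.0
    for i, s in enumerate(signals):
        if s != 0:
            pnl += s * (prices[i + 1] - prices[i]) - cost_per_trade
    return pnl


prices = [100.0, 101.0, 100.5, 102.0, 101.0]  # made-up closes
signals = [1, -1, 1, -1]                      # made-up signals
reverse = [-s for s in signals]               # the strategy flipped

base = total_pnl(prices, signals)
flipped = total_pnl(prices, reverse)
with_cost = total_pnl(prices, signals, cost_per_trade=0.25)

# With no costs the flipped strategy must be the exact mirror image.
assert base + flipped == 0
# Adding friction must make the result worse.
assert with_cost < base
```

If a real spreadsheet or script fails either check, for example both directions show a profit, the fault is more likely in how profit and loss is calculated than in the strategy itself.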
- firlandsfarm
- Posts: 3517
- Joined: Sat May 03, 2014 8:20 am
Thanks Peter, good food for thought. When it comes to checking the detail of each datarow I've done that for different scenarios taking about 10 examples each time but it checks out each time.
Euler wrote: ↑Wed Jan 07, 2026 5:26 am
One thing I always remind myself is that backtests do not understand price, they only understand assumptions about price. You can have a perfectly sensible idea, but if the model is assuming fills that would never realistically exist in a live market, the results can quickly drift into fantasy. Ask yourself a simple question. Would I actually get this price, at this time, with this size, day after day? If the honest answer is no, the backtest is already compromised.
I think I may have set something up incorrectly when allowing for the spread between the buy and sell prices. I find I get brain fog when working for a long time with lots of data in one sitting, it's probably an age thing.
Once I've reworked my application of the price spread I think I will create an MT5 EA and run it through the MT Strategy Tester, and maybe also ask AI to check my spreadsheet against the strategy.
Sometimes it helps to just put a problem into words rather than just thought.
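For the spread point above, a minimal sketch of how a long trade might be priced. It assumes the downloaded prices are mid prices and that the spread is fixed; the thread does not say whether the Stockhistory() data is bid, ask or mid, so both are assumptions, and `long_trade_pnl` is a hypothetical name:

```python
def long_trade_pnl(entry_mid, exit_mid, spread, stake_per_point=1.0):
    """P&L of a long trade when the data holds mid prices.

    A long buys at the ask (mid + spread / 2) and sells at the
    bid (mid - spread / 2), so the full spread is paid once per round trip.
    """
    entry_ask = entry_mid + spread / 2
    exit_bid = exit_mid - spread / 2
    return (exit_bid - entry_ask) * stake_per_point


# A flat market should lose the spread, never break even.
print(long_trade_pnl(1.2500, 1.2500, spread=0.0002))
# A one point rise with a half point spread nets half a point.
print(long_trade_pnl(100.0, 101.0, spread=0.5))
```

A quick sanity check on any spreadsheet is the first case: a trade opened and closed at the same quoted price should show a loss equal to the spread. If it shows zero, or a gain, the spread is being applied the wrong way round or not at all.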
- ShaunWhite
- Posts: 10672
- Joined: Sat Sep 03, 2016 3:42 am
Terminology issues: Using something like Excel on static data is Data Analysis. A "backtest" is a time series simulation that accurately considers PIQ, fill rate, available volume and latency. Analysis sorts the wheat from the chaff and a backtest answers the "would it work" questions.
Euler wrote: ↑Wed Jan 07, 2026 5:26 am
One thing I always remind myself is that backtests do not understand price, they only understand assumptions about price. You can have a perfectly sensible idea, but if the model is assuming fills that would never realistically exist in a live market, the results can quickly drift into fantasy. Ask yourself a simple question. Would I actually get this price, at this time, with this size, day after day? If the honest answer is no, the backtest is already compromised.
Ideally people do analysis, and then a backtest. But if people don't have backtesting facilities then they have to run a live trial instead. But that can be inconclusive and time consuming when run for several weeks whereas a backtest is run on several years of data relatively quickly. But it's obviously harder to do.
'Backtest' has become shorthand for 'look at some old data' and carries all the risks and caveats you've described, but a backtest in the true sense of the word addresses most of those. They're different tasks that answer different questions, and conflating the two can mean analysis gets misinterpreted as a test of likely profitability. When it doesn't match live, backtesting gets a bad name because that's what they think they did.
I write all that at least once a year and I'm getting it done early this year.
- firlandsfarm
- Posts: 3517
- Joined: Sat May 03, 2014 8:20 am
Shaun, on a "Terminology issue", if I'm using historical data in Excel (i.e. I'm looking "backwards" in time) and I'm using those backward observations to test a strategy then you will have great difficulty in convincing me I'm not backtesting! ... the clue is in the use of the words "back" and "test"! And for the purposes of what I was testing I really don't think my couple of thousand in one of the major currency pairs' markets will have any effect on the trillions matched daily. "PIQ, fill rate, available volume and latency" don't really come into it. I am a great believer in KISS ... why complicate something unnecessarily?
ShaunWhite wrote: ↑Wed Jan 07, 2026 2:03 pm
Terminology issues: Using something like Excel on static data is Data Analysis. A "backtest" is a time series simulation that accurately considers PIQ, fill rate, available volume and latency. Analysis sorts the wheat from the chaff and a backtest answers the "would it work" questions.
- ShaunWhite
- Posts: 10672
- Joined: Sat Sep 03, 2016 3:42 am
Sure, you only need to test for the questions you're asking. If fill rates and liquidity etc aren't an issue then no need to test for it.
But it comes down to the type of test. My definition of backtest is to test the execution. Yours is to test the hypothesis. So yes, they're both backtesting, but in a methodology where you might do the two stages separately (or skip the 2nd), calling them both backtesting can be a bit confusing.
It's not derogatory, you always need to test the hypothesis and quite often the execution test isn't feasible or applicable. For simplicity (a fan too) I keep my project template the same and depending on the job I might skip backtesting (as n/a), but I don't rename the hypothesis test as 'backtest'.
Thank goodness we only have ourselves to please and aren't managing a team where terminology matters.
- firlandsfarm
- Posts: 3517
- Joined: Sat May 03, 2014 8:20 am
ShaunWhite wrote: ↑Wed Jan 07, 2026 4:30 pm
Thank goodness we only have ourselves to please and aren't managing a team where terminology matters.
