The following analysis responds to a research request asking whether we had tried Renko bars instead of Range bars. The test produced some very interesting results, so it was prioritized, but it also kicked off a complete overhaul of our systems and process (more on this later).
Renko Vs. Range
Renko and Range are two types of bar calculations. We were drawn to this question because, like Range, Renko is a bar calculation that is independent of time. In Range bars, the high and low are the most important price levels; in Renko, closing prices are the most important. In general, Renko bars are used to smooth out noise and keep you in a trending market for a longer period of time. You might even use Renko to spot a trend and Range to trade on it.
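To make the distinction concrete, here is a minimal Python sketch of how the two bar types might be built from a stream of tick prices. It is an illustration only, not NinjaTrader's implementation: the function names, the integer tick prices, and the simplified Renko rule (no extra reversal distance) are all assumptions made for the example.

```python
from typing import List, Tuple

Bar = Tuple[int, int, int, int]  # (open, high, low, close), prices in ticks


def range_bars(ticks: List[int], size: int) -> List[Bar]:
    """Close a bar as soon as its high-low span reaches `size` (simplified)."""
    bars: List[Bar] = []
    o = h = l = None
    for p in ticks:
        if o is None:
            o = h = l = p                  # first tick of a new bar
        else:
            h, l = max(h, p), min(l, p)    # extend the bar's high/low
        if h - l >= size:                  # span reached: the bar is complete
            bars.append((o, h, l, p))
            o = h = l = None               # next tick opens a new bar
    return bars


def renko_bricks(ticks: List[int], size: int) -> List[Tuple[int, int]]:
    """Emit an (open, close) brick each time price moves `size` from the last close.

    Assumption: no additional reversal distance is required, which is
    simpler than many charting packages' Renko rules.
    """
    if not ticks:
        return []
    bricks: List[Tuple[int, int]] = []
    anchor = ticks[0]                      # close of the most recent brick
    for p in ticks[1:]:
        while p >= anchor + size:          # one or more up bricks
            bricks.append((anchor, anchor + size))
            anchor += size
        while p <= anchor - size:          # one or more down bricks
            bricks.append((anchor, anchor - size))
            anchor -= size
    return bricks


ticks = [100, 103, 101, 104, 102, 105, 103, 108]
print(range_bars(ticks, 4))    # bars are defined by their high/low span
print(renko_bricks(ticks, 4))  # bricks are defined by their closes
```

In this toy series both methods produce two bars, but the second Range bar opens at the next traded price (102) while the second Renko brick opens at the previous brick's close (104), which is why Renko charts look smoother.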
Here’s a look at the difference between the two charts across the same time period:
Renko
Range
We then created backtests for all strategies using Renko 36 as the data series. We chose 36 because it’s the bar size we use for Range; we did not optimize on the Renko data series.
These are the results.
The first table below is for Renko, which includes market replay results from 5/10 - 6/25, and the second table is for Range. We only created Renko backtests on Strategies 1-15.
Renko Results - 5/3/2020 - 5/3/2021
Range Results - 5/3/2020 - 5/3/2021
Key takeaways:
Backtest results for Renko were better on almost all strategies except 7 & 8. When we say better, we mean higher net profitability, higher profit factors and lower drawdowns.
For example, net income for Strategy 3 was $834K with a 1.74 profit factor and a max drawdown of only -2.64% for the year. Net income for Strategy 10 was $495K with a 2.8 profit factor and a max drawdown shallower than -0.92%.
These are amazing results (almost holy grail worthy), which made us hopeful but skeptical. The next step was to run the strategies through market replay (see the Profit column at the end of the Renko table, which covers each strategy over the period 5/10 - 6/25).
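For readers unfamiliar with these metrics, the sketch below shows one common way to compute profit factor and maximum drawdown from a list of per-trade profits and losses. The function names, the sample trades, and the starting equity are hypothetical, and a given platform may define drawdown somewhat differently (for example, in currency rather than percent).

```python
from typing import List


def profit_factor(trade_pnls: List[float]) -> float:
    """Gross profit divided by gross loss (infinite if there are no losers)."""
    gross_profit = sum(p for p in trade_pnls if p > 0)
    gross_loss = -sum(p for p in trade_pnls if p < 0)
    return float("inf") if gross_loss == 0 else gross_profit / gross_loss


def max_drawdown_pct(trade_pnls: List[float], starting_equity: float) -> float:
    """Largest peak-to-trough equity decline, as a negative percentage."""
    equity = peak = starting_equity
    worst = 0.0
    for pnl in trade_pnls:
        equity += pnl
        peak = max(peak, equity)                       # new equity high-water mark
        worst = min(worst, (equity - peak) / peak * 100.0)
    return worst


# Hypothetical trades, for illustration only
pnls = [500.0, -200.0, 300.0, -100.0]
print(round(profit_factor(pnls), 2))
print(round(max_drawdown_pct(pnls, starting_equity=10_000.0), 2))
```

A profit factor above 1 means winning trades earned more than losing trades lost; a drawdown closer to zero means the equity curve gave back less from its highs.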
What did market replay tell us?
While it may seem that most strategies performed well over the market replay test period, in the end, all strategies except Strategies 6 and 8 remained negative until 6/17. That is, only Strategies 6 and 8 showed the consistent performance we expected based on the backtest results.
It is important to note the degree of the discrepancy, however: Strategy 8 went from having the worst backtest results to one of the best market replay results. Strategy 8 continued to build slowly throughout the period, which is what we like to see.
Even though the market replay results covered only a short period of time compared to the backtest results, and ultimately proved to be successful, we continue to be deeply concerned about the accuracy of backtests. Backtests are our first line of analysis, so if they are wrong, we're wasting a lot of time. To that end, we've spent the last two weeks prioritizing the accuracy of backtest results. I’ll detail exactly what that means in a follow-up post tomorrow.
Final takeaway: The best way to improve backtest accuracy is to improve backtest/historical data. In particular, it is the calculation of historical data that must be dealt with. After much research, these are the most salient points regarding the calculation of historical data.
In NT7, all backtest data is calculated on bar close even if the strategy is set to calculate on each tick.
The previous bar only contains a limited set of data: open, high, low, and close (OHLC).
Our research shows that this methodology is not specific to NinjaTrader. The most popular method for developers to introduce a backtesting tool is to use OHLC candlestick results because it’s a smaller data set. In other words, only the OHLC of the bar is known and not each tick that made up the bar. Indeed, we thought that comparing backtest data from three separate platforms would add validity/accuracy to our data, but have found that all three platforms are programmed to use the same methodology.
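A small Python sketch illustrates why OHLC-only data can mislead a backtest. Two hypothetical tick paths collapse to the same OHLC bar, yet a long position with a stop and a target inside that bar exits differently depending on the order in which prices traded. All names and prices here are made up for the example.

```python
from typing import List, Optional, Tuple


def to_ohlc(ticks: List[int]) -> Tuple[int, int, int, int]:
    """Collapse a tick path into a single (open, high, low, close) bar."""
    return ticks[0], max(ticks), min(ticks), ticks[-1]


def first_exit(ticks: List[int], stop: int, target: int) -> Optional[str]:
    """For a long position, report which exit level the tick path reaches first."""
    for p in ticks:
        if p <= stop:
            return "stop"
        if p >= target:
            return "target"
    return None  # neither level was touched


path_a = [100, 104, 98, 101]   # rallies first, then sells off
path_b = [100, 98, 104, 101]   # sells off first, then rallies

print(to_ohlc(path_a) == to_ohlc(path_b))        # identical OHLC bars
print(first_exit(path_a, stop=98, target=104))   # winner on this path
print(first_exit(path_b, stop=98, target=104))   # loser on this path
```

With only the OHLC values, a backtest has to guess which level was hit first, so the same bar can be booked as a win or a loss.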
This does not mean that backtests that use limited data are irrelevant, but calculated bars like Renko, Range, Point and Figure, and Heikin-Ashi are going to be more prone to large discrepancies because of how their bars are formed and calculated. Likewise, long range data series (i.e., over 5 minutes, over 18 range, over 1000 tick) are also going to be more prone to large discrepancies.
Testing a strategy against Market Replay data is the most accurate way to assess its performance, as Market Replay plays back the market action just as it occurred. The issue is that Market Replay does not allow for optimization and is slower than running a backtest.
Next steps: We are in the process of making some drastic changes.
We're going to upgrade to NT8 over the next month because it lets us add intra-bar granularity through a second data series (such as a 1 tick or 1 second series added with AddDataSeries), so that the strategy or indicator has the individual ticks or seconds available in the backtest/historical data for use alongside the primary series. This increases accuracy by supplying the correct price at the correct time for each order fill. We had planned on making the move to NT8 at the beginning of 2021, but the discrepancy between backtest results and market replay data while testing Renko bars has created greater urgency. The September update will be based on NT8. For the July 7 update, we will provide updated backtest results for comparison; however, we will also provide two months of market replay data.
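The effect of a finer secondary series can be sketched in Python. This is an analogy for the idea, not NinjaScript and not NinjaTrader's actual fill engine: each primary bar is represented by the list of ticks that formed it, and the function names, prices, and entry rule (go long once price trades at or above a level) are assumptions made for the example.

```python
from typing import List, Optional


def fill_on_bar_close(bars: List[List[int]], entry_level: int) -> Optional[int]:
    """Coarse backtest: the entry rule is only evaluated at each bar's close."""
    for ticks in bars:
        close = ticks[-1]
        if close >= entry_level:
            return close            # filled at the closing price
    return None


def fill_on_ticks(bars: List[List[int]], entry_level: int) -> Optional[int]:
    """Finer backtest: the entry rule is evaluated on every tick inside each bar."""
    for ticks in bars:
        for p in ticks:
            if p >= entry_level:
                return p            # filled at the first tick that qualifies
    return None


# Hypothetical primary bars, each holding the ticks of its secondary series
bars = [
    [100, 101, 99, 100],
    [100, 103, 106, 104],   # trades through 105 intra-bar but closes below it
    [104, 108, 110, 109],
]

print(fill_on_bar_close(bars, entry_level=105))
print(fill_on_ticks(bars, entry_level=105))
```

In this toy data the bar-close version enters one bar later and three ticks higher than the tick-level version, which is the kind of gap between backtest and replay results that a secondary series is meant to reduce.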
To be clear, I want to stress that we like Renko bars. In particular, we like Strategy 8 and Strategy 6. We’re running them both as Renko charts (simulated live) and they are both performing exceptionally well. We'll keep you posted on this.
In addition, on the whole, Renko bars produced better results than Range bars over the same time period. We also know, however, that the backtest results were better than the market replay results: data point extremes were hit within the market replay period, which was much shorter than the backtest period.
We may be making a big deal out of nothing here, but given the implications, we are making it a priority because we never want to be skeptical of backtest results again. Indeed, we want to feel confident about our methodology. Otherwise, what’s the point? So, we’re dealing with the issue of backtest accuracy now and hope to run some tests on our new setup by the end of July. All of the changes we’re making will be detailed in a follow-up post to all subscribers coming out shortly.
Look for an update on Research Requests to be posted with the July 7 update.