The Folly Of Backtests: New Academic Study


Another thing I must point out is that you cannot prove a vague theory wrong. […] Also, if the process of computing the consequences is indefinite, then with a little skill any experimental result can be made to look like the expected consequences. - Richard Feynman [1964]

A backtest is a historical simulation of an algorithmic investment strategy. Among other things, it computes the series of profits and losses that the strategy would have generated had the algorithm been run over that time period. Popular performance statistics, such as the Sharpe ratio or the Information ratio, are used to quantify the backtested strategy’s return on risk. Investors typically study those backtest statistics and then allocate capital to the best performing scheme.
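To make the performance statistic concrete, here is a minimal sketch of an annualized Sharpe ratio computed from a series of per-period returns. The function name `annualized_sharpe`, the 252-trading-day convention, the zero default risk-free rate, and the sample figures in `daily_pnl` are illustrative assumptions, not taken from the study.

```python
import math
import statistics


def annualized_sharpe(returns, periods_per_year=252, risk_free_per_period=0.0):
    """Annualized Sharpe ratio of a series of per-period returns.

    Assumes returns are per-period (e.g. daily) and that 252 periods
    make up one year; both are conventions, not facts from the study.
    """
    excess = [r - risk_free_per_period for r in returns]
    mean = statistics.fmean(excess)
    stdev = statistics.stdev(excess)  # sample standard deviation
    if stdev == 0:
        raise ValueError("Sharpe ratio is undefined for zero volatility")
    return mean / stdev * math.sqrt(periods_per_year)


# Made-up daily returns of a backtested strategy, for illustration only.
daily_pnl = [0.01, -0.005, 0.007, 0.002, -0.003]
sharpe = annualized_sharpe(daily_pnl)
print(f"annualized Sharpe ratio: {sharpe:.2f}")
```

Because the statistic is a ratio of mean to volatility, doubling every return (for instance, through leverage) leaves it unchanged, which is one reason it is popular for comparing backtests.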

Regarding the measured performance of a backtested strategy, we have to distinguish between two very different readings: in-sample (IS) and out-of-sample (OOS). The IS performance is the one simulated over the sample used in the design of the strategy (also known as the “learning period” or “training set” in the machine-learning literature). The OOS performance is simulated over a sample not used in the design of the strategy (a.k.a. the “testing set”). A backtest is realistic when the IS performance is consistent with the OOS performance.

When an investor receives a promising backtest from a researcher or portfolio manager, one of her key problems is to assess how realistic that simulation is. This is because, given any financial series, it is relatively simple to overfit an investment strategy so that it performs well IS.

Overfitting is a concept borrowed from machine learning and denotes the situation when a model targets particular observations rather than a general structure. For example, a researcher could design a trading system based on some parameters that target the removal of specific recommendations that she knows led to losses IS (a practice known as “data snooping”). After a few iterations, the researcher will come up with “optimal parameters”, which profit from features that are present in that particular sample but may well be rare in the population.
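The effect can be illustrated with a small simulation. The sketch below is not the study's method: it assumes a hypothetical stock whose daily returns are pure noise, tries 500 "strategies" that are nothing more than random long/short positions, keeps the one with the best IS Sharpe ratio, and then evaluates that same strategy OOS. All names, sample sizes, and the seed are arbitrary choices made for illustration.

```python
import math
import random


def sharpe(returns, periods_per_year=252):
    """Annualized Sharpe ratio (zero risk-free rate assumed)."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)


rng = random.Random(42)  # fixed seed so the run is repeatable

N_IS, N_OOS, N_TRIALS = 250, 250, 500

# Hypothetical stock: daily returns are zero-mean noise, so no
# strategy can have genuine predictive power.
returns = [rng.gauss(0.0, 0.01) for _ in range(N_IS + N_OOS)]

best_is, best_positions = -math.inf, None
for _ in range(N_TRIALS):
    # Each "strategy" is a random long (+1) or short (-1) position per day.
    positions = [rng.choice((-1, 1)) for _ in range(N_IS + N_OOS)]
    is_pnl = [p * r for p, r in zip(positions[:N_IS], returns[:N_IS])]
    s = sharpe(is_pnl)
    if s > best_is:  # keep whichever strategy looks best in-sample
        best_is, best_positions = s, positions

# Evaluate the selected strategy on data it was not selected on.
oos_pnl = [p * r for p, r in zip(best_positions[N_IS:], returns[N_IS:])]
oos_best = sharpe(oos_pnl)

print(f"best IS Sharpe: {best_is:.2f}")
print(f"its OOS Sharpe: {oos_best:.2f}")
```

Since the selection step only ever sees the IS sample, one would expect the winning strategy to show an impressive IS Sharpe ratio, while its OOS Sharpe ratio is just another draw from noise. The gap between the two numbers is the inconsistency that marks an unrealistic backtest.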

Recent computational advances allow investment managers to methodically search through thousands or even millions of potential options for a profitable investment strategy. In many instances, that search involves a pseudo-mathematical argument which is spuriously validated through a backtest. For example, consider a time series of daily prices for a stock X.

See Full PDF here: SSRN-id2308659

Via: papers.ssrn