**Why Risk Is So Hard To Measure**

London School of Economics – Systemic Risk Centre

De Nederlandsche Bank; Erasmus University Rotterdam (EUR) – Erasmus School of Economics (ESE)

December 30, 2015

*De Nederlandsche Bank Working Paper No. 494*

**Abstract:**

This paper analyses the accuracy and reliability of standard techniques for risk analysis used by the financial industry as well as in regulations. We focus on the difference between value-at-risk and expected shortfall, the small sample properties of these risk measures and the impact of using an overlapping approach to construct data for longer holding periods. Overall, we find that risk forecasts are extremely uncertain at low sample sizes. By comparing the estimation uncertainty, we find that value-at-risk is superior to expected shortfall and the time-scaling approach for risk forecasts with longer holding periods is preferable to using overlapping data.

### Introduction

Financial risk is usually forecasted with sophisticated statistical methods. However, in spite of their prevalence in industry applications and financial regulations, the performance of such methods is poorly understood. This is a concern since minor variations in model assumptions can lead to vastly different risk forecasts for the same portfolio, forecasts that are all equally plausible ex–ante. These results have problematic implications for many practical applications, especially where the cost of type I and type II error is not trivial.

Our aim in this paper is to analyze the most common practices in market risk modeling, identify the conditions under which they deliver reliable answers, establish when and how they fail to live up to expectations, and ultimately make recommendations to practitioners and regulators on their proper use. In particular, we focus on the three main challenges that arise in the forecasting of financial risk: the choice of risk measure, data sample and statistical method.

A large number of statistical methods for forecasting risk have been proposed, but as a practical matter, only a handful have found significant traction, as discussed in Danielsson et al. (2015). Of these, all but one depend on some parametric model, while one, historical simulation (HS), is model independent. Our objective in this paper is not to compare and contrast the various risk forecast methods: after all, a large number of high–quality papers exist on this very topic. Instead, we want to see how a representative risk forecast method performs, identifying results that are related to the technical choice on the other two key issues: risk measure and data.

Given our objectives, it is appropriate to focus on HS. Not only is it a commonly used method (for example, 60% of the US banks considered by O’Brien and Szerszen (2014) use HS); more fundamentally, the good performance of a specific parametric model is usually driven by the fact that the model is close to the data generating process (DGP), and it is not possible to find a parametric model that performs consistently well across all DGPs. Although HS is the simplest estimation method, it has the advantage of not depending on a particular parametric DGP; any other method would be biased towards its own DGP, creating an unlevel playing field.
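To make the discussion concrete, HS reduces to taking an empirical quantile of observed returns. The sketch below is not from the paper; it uses hypothetical Student-t daily returns and an assumed 1% probability level purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical daily portfolio returns; in practice these are observed data,
# not simulated. Heavy-tailed t(4) innovations mimic financial returns.
returns = rng.standard_t(df=4, size=1000) * 0.01

def hs_var(returns, p=0.01):
    """Historical-simulation VaR at probability p: the negative of the
    empirical p-quantile of the return distribution."""
    sorted_r = np.sort(returns)
    k = int(np.ceil(p * len(sorted_r)))  # number of tail observations used
    return -sorted_r[k - 1]

var99 = hs_var(returns, p=0.01)  # 99% one-day VaR, in return units
```

Note that the estimate rests entirely on the `k` most extreme observations in the window, which is what makes HS model-free but also data-hungry.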

Our first contribution is a practical comparison of Value–at–Risk (VaR) and expected shortfall (ES).1 A common view holds that VaR is inherently inferior to ES, a view supported by three convincing arguments. First, unlike ES, VaR is not a coherent measure, as noted by Artzner et al. (1999). Second, as a quantile, VaR is unable to capture the risk in the tails beyond the specified probability, while ES accounts for all tail events. Finally, it is easier for financial institutions to manipulate VaR than ES. Perhaps swayed by these theoretical advantages, ES appears increasingly preferred by both practitioners and regulators, most significantly expressed in Basel III. While the Proposal is light on motivation, the little that is stated refers only to the theoretical tail advantages. The practical properties of ES and VaR, such as their estimation uncertainties, are less well understood, especially since empirically there is no reason to believe the preference ordering is as clear-cut. After all, implementation introduces additional considerations, some working in opposite directions. The estimation of ES requires more steps and more assumptions than the estimation of VaR, giving rise to more estimation uncertainty. However, ES smooths out the tails and therefore might perform better in practice.
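The extra estimation step for ES can be seen directly in the HS setting: ES averages the tail observations beyond the VaR quantile. A minimal sketch, again on hypothetical simulated returns with an assumed probability level:

```python
import numpy as np

rng = np.random.default_rng(1)
returns = rng.standard_t(df=4, size=1000) * 0.01  # hypothetical daily returns

def hs_var_es(returns, p=0.01):
    """HS estimates of VaR and ES: VaR is the negative empirical p-quantile;
    ES is the negative mean of all returns at or below that quantile."""
    sorted_r = np.sort(returns)
    k = max(int(np.ceil(p * len(sorted_r))), 1)
    var = -sorted_r[k - 1]
    es = -sorted_r[:k].mean()  # averages the k most extreme losses
    return var, es

var, es = hs_var_es(returns, p=0.01)
```

By construction ES is at least as large as VaR at the same probability, and it rests on an average over very few tail observations, which is the source of the additional estimation uncertainty discussed above.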

Our second contribution is to investigate how best to use data. Many applications require the estimation of risk over multi-day holding horizons. There are three ways one can calculate such risk: use non–overlapping data, use overlapping data, or time-scale daily risk forecasts. While the first is generally preferred, it may not be possible because of a lack of data.
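The three approaches can be sketched as follows; the sample size, holding period and square-root-of-time scaling rule here are illustrative assumptions, not the paper's specification.

```python
import numpy as np

rng = np.random.default_rng(7)
daily = rng.normal(0, 0.01, size=2000)  # hypothetical daily log returns
h = 10                                  # assumed holding period in days

# 1) Non-overlapping h-day returns: only len(daily) // h observations.
nonoverlap = daily[: len(daily) // h * h].reshape(-1, h).sum(axis=1)

# 2) Overlapping h-day returns: almost as many observations as daily data,
#    but consecutive observations share h-1 days, inducing autocorrelation.
overlap = np.convolve(daily, np.ones(h), mode="valid")

# 3) Time scaling: estimate daily VaR and scale by sqrt(h); this is exact
#    under i.i.d. normality and only an approximation otherwise.
p = 0.01
daily_var = -np.quantile(daily, p)
scaled_var = np.sqrt(h) * daily_var
```

The trade-off is visible in the sample sizes: non-overlapping aggregation discards most of the data, overlapping keeps the observations but makes them dependent, and time scaling keeps the full daily sample at the cost of a distributional assumption.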

In our third and final contribution we study whether the estimation of risk measures is robust in the small sample sizes typical of practical use. Although the asymptotic properties of risk measures can be established using statistical theory, the finite-sample behavior of risk forecast estimators may be very different from their known asymptotic properties. Estimation uncertainty based on asymptotic theory can therefore be misleading in small-sample analysis.
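One simple way to gauge finite-sample estimation uncertainty, without appealing to asymptotics, is a Monte Carlo exercise: draw many small samples from a known heavy-tailed DGP and look at the dispersion of the resulting HS estimates. The DGP, sample size and probability level below are illustrative assumptions, not the paper's design.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, trials = 0.01, 500, 2000  # small sample typical in practice

# Record HS estimates of VaR and ES across many simulated small samples.
var_hat, es_hat = [], []
for _ in range(trials):
    r = np.sort(rng.standard_t(df=4, size=n))
    k = int(np.ceil(p * n))
    var_hat.append(-r[k - 1])
    es_hat.append(-r[:k].mean())

# Relative estimation uncertainty: coefficient of variation of the estimates.
var_cv = np.std(var_hat) / np.mean(var_hat)
es_cv = np.std(es_hat) / np.mean(es_hat)
```

With n = 500 and p = 1%, both estimators rest on only five tail observations per sample, so the spread of the estimates across trials gives a direct, simulation-based picture of the uncertainty that asymptotic theory may understate.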
