**Part 1: Deep Learning And Long-Term Investing by John Alberg and Michael Seckler – Euclidean Technologies Management**

Seventy-five years ago, Benjamin Graham – the father of security analysis – wrote that in the short run the market behaves like a voting machine, but over the long run it more closely resembles a weighing machine. Graham’s point was that fear, greed, and other emotions (the voting machine) can drive short-term market fluctuations which in turn cause disconnects between the price and true value of a company’s shares. Over long periods of time, however, the weighing machine kicks in as a company’s fundamentals ultimately cause the value and market price of its shares to converge.

Traditionally, investors have performed long-term fundamental analysis by studying the income statements, balance sheets, and other publicly available information about a company’s operations. Then, they use this information in the context of the company’s market value to make an informed decision about its prospects as a long-term investment.

The automation of this process, *systematic value investing*, has become possible with the emergence of high-quality data on company fundamentals and the ever-increasing computational power available to researchers. The attractiveness of an automated approach is that rigorous statistical techniques can be applied to the assessment of thousands of opportunities, and that a systematic process can protect investors from well-documented behavioral biases that often detract from investment performance.

In a recent investor letter, we described why deep learning, and in particular recurrent neural networks, might be well suited to the application of long-term systematic value investing. This is the first in a series of blog posts that describes some of our explorations in this area.

### Background

Recent applications of deep learning and recurrent neural networks have resulted in better-than-human performance by computers in many domains. However, there has been very little work in the application of these technologies to investment management. Nonetheless, there are several reasons why deep learning might achieve better results than traditional statistical methods or non-deep machine learning approaches when applied to long-term investing. These reasons include:

- Machine learning approaches are typically structured to predict something from a fixed number of inputs. In the investment world, however, the input data typically come in sequences (for example, how a company’s operating results evolve over time), and the distribution of investment outcomes is conditioned on the evolution of those sequences. Recurrent neural networks, which have claimed many successes in recent years, are designed precisely for this type of sequential data.
- In the quantitative investment field, a great deal of effort is put into “factor engineering,” the process of determining which features of a company are most valuable for forecasting its future stock price. Deep learning offers the opportunity to let the algorithms discover the features from raw financial data. That is, the “deep” in deep learning means that successive layers of a model are able to untangle important relationships in a hierarchical way from data as found “in the wild,” and these relationships may be stronger than the ones found via traditional approaches to factor engineering.
- Some of the greatest progress in deep learning has been in the area of text processing. This capability opens the door to leveraging the enormous corpus of unstructured, qualitative textual data related to companies that can be found in SEC filings, news reports, blog posts, social media, and earnings transcripts.

In our research, we are exploring these types of opportunities created by deep learning.
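To illustrate why recurrent networks suit sequences of varying length, here is a minimal sketch of a single-layer (Elman-style) recurrent cell in plain Python. The weights, the two-feature inputs, and the function name `rnn_forward` are illustrative assumptions and say nothing about the architecture used in this research.

```python
import math

def rnn_forward(sequence, w_in, w_rec, bias):
    """Run a single-layer recurrent cell over a sequence; return the final hidden state.

    sequence: list of input vectors, one per time step (any length)
    w_in:     hidden-by-input weight matrix (hypothetical values)
    w_rec:    hidden-by-hidden weight matrix (hypothetical values)
    bias:     one bias per hidden unit
    """
    hidden = [0.0] * len(bias)
    for x in sequence:
        # Each new state mixes the current input with the previous state,
        # which is how the order of past observations is carried forward.
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[j], x))
                + sum(w * hi for w, hi in zip(w_rec[j], hidden))
                + bias[j]
            )
            for j in range(len(bias))
        ]
    return hidden

# Toy weights for 2 input features and 2 hidden units (made up for illustration).
W_IN = [[0.5, -0.2], [0.1, 0.3]]
W_REC = [[0.1, 0.0], [0.0, 0.1]]
BIAS = [0.0, 0.0]

short_history = [[1.0, 0.0], [0.0, 1.0]]
long_history = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.8]]

# The same weights handle a 2-step and a 4-step history.
state_short = rnn_forward(short_history, W_IN, W_REC, BIAS)
state_long = rnn_forward(long_history, W_IN, W_REC, BIAS)
```

The same parameters process histories of any length, and reversing a sequence changes the final state, so the network is sensitive to how results evolved rather than only to their values.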

### Why Long-Term Investing?

When applying mathematics and technology to investing, there is a tendency to want to model and exploit short-term trading opportunities. So why have we focused on long-term investing? While short-term trading opportunities allow a researcher to test the success of a model with greater frequency, the challenge with stock-trading strategies is that many exogenous factors can wreak havoc on a stock’s price over the course of a day, a week, or even a quarter. On the other hand, there is good statistical evidence that, over the long term, a company’s evolving fundamentals play a primary role in determining its market value. A prominent example comes from the Nobel laureate Robert Shiller, who showed that stock market prices are extremely volatile over short periods but somewhat predictable from their price-to-earnings ratios over long periods.

### The Setup

In this project we used deep neural networks (a term we will use to refer to the class of neural networks that includes multi-layer perceptrons and recurrent neural networks) to predict how a stock will perform relative to the market over a one-year time horizon. Picking the time horizon is a process of finding the best balance between having a sufficient amount of data for learning and having a horizon that is sufficiently long term.

The amount of high-quality data that one can currently get on broad-market company fundamentals spans about 55 years (from 1960 to 2015). If we sought to predict stock performance over a five-year time horizon, then our learning process would have only 11 independent time periods. On the other hand, if our time horizon were one month, we would have 660 independent time periods. Although more independent periods are helpful to learning, there are reasons to believe that prices over very short periods of time are too influenced by exogenous factors for models based on fundamentals to have sufficient predictive power. We therefore seek the shortest time horizon over which successful learning can be achieved. We settled on a one-year horizon.
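The trade-off can be checked with a few lines of arithmetic. This is only a sketch: the helper name `independent_periods` is ours for illustration, and it counts non-overlapping windows in a 55-year span.

```python
def independent_periods(years_of_data: int, horizon_months: int) -> int:
    """Count the non-overlapping forecast windows that fit in the data span."""
    total_months = years_of_data * 12
    return total_months // horizon_months

YEARS_OF_DATA = 55  # roughly 1960 to 2015, as stated above

for label, months in [("five years", 60), ("one year", 12), ("one month", 1)]:
    print(f"{label}: {independent_periods(YEARS_OF_DATA, months)} periods")
# five years: 11 periods
# one year: 55 periods
# one month: 660 periods
```

A one-year horizon therefore leaves 55 independent periods, between the 11 available at five years and the 660 available at one month.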

In this setup, we feed data to our models in time steps spaced by a one-month interval. Monthly is the highest granularity of quality pricing data for equities prior to 1983, so it is the basic time interval we use. The implication is that the model is asked, at each time step (month), to make a prediction about what will happen to the stock’s price 12 time steps (one year) in the future. Visually, it looks like this:

This setup is a bit unusual even for recurrent neural networks. Typically, we are trying to predict some outcome in the next time step, or we are trying to predict a terminal outcome. Nonetheless, it is not difficult to construct the data model for this approach. Specifically, we want to construct a set of sequences where each sequence represents the evolution of a company through time, and each element of a sequence represents a company/month combination (we call this a company-month). Furthermore, each company-month is tagged with an outcome, where the outcome is related to how the company’s stock price performed over the subsequent year.
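As a rough sketch of this data model, the snippet below turns one company's monthly history into a sequence of company-months, each tagged with the price change over the following 12 steps. The function names, the dictionary layout, and the toy price and feature series are illustrative assumptions rather than a description of the actual pipeline.

```python
HORIZON = 12  # time steps (months) between an observation and its outcome

def forward_returns(prices, horizon=HORIZON):
    """Price change from each month to `horizon` months later, where it is known."""
    return [prices[t + horizon] / prices[t] - 1.0
            for t in range(len(prices) - horizon)]

def build_sequence(company, features_by_month, prices, horizon=HORIZON):
    """Build one sequence of company-months, each tagged with its forward return."""
    outcomes = forward_returns(prices, horizon)
    return [
        {
            "company": company,
            "month": t,
            "features": features_by_month[t],
            "forward_return": outcomes[t],
        }
        for t in range(len(outcomes))
    ]

# Toy data: 15 months of prices and a single made-up feature per month.
prices = [100.0 + t for t in range(15)]
features = [[float(t)] for t in range(15)]

sequence = build_sequence("AAA", features, prices)
print(len(sequence))  # 3: only months 0, 1, 2 have a price 12 months later
```

The last 12 months of any history have no known outcome yet, so they cannot be used as training targets in this layout.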

In this research, we model the outcome in a very simple way. Specifically, if the change in price for a stock is greater than the median change in price for all stocks, we assign it an outcome of +1. Otherwise, the outcome is -1.
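A minimal sketch of this labeling rule follows. It assumes the median is taken across all stocks observed in the same month, and the tickers and returns are made up for illustration.

```python
from statistics import median

def label_outcomes(forward_returns):
    """Label each stock +1 if its price change beats the median, otherwise -1."""
    med = median(forward_returns.values())
    return {ticker: (1 if r > med else -1)
            for ticker, r in forward_returns.items()}

# Hypothetical one-year price changes for four stocks in a single month.
returns_for_month = {"AAA": 0.25, "BBB": -0.10, "CCC": 0.05, "DDD": 0.12}

labels = label_outcomes(returns_for_month)
print(labels)  # {'AAA': 1, 'BBB': -1, 'CCC': -1, 'DDD': 1}
```

Because the threshold is the median, the two classes are roughly balanced by construction; a stock exactly at the median falls into the -1 class.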

This model can be depicted visually as a table where each row is a company-month:

One obvious criticism of a simple two-class prediction is that we are not teaching the model to predict the degree of outperformance a stock might achieve. This is true, but we contend that: (1) this “lower bar” approach makes learning from long-term fundamental data much easier, and therefore achievable; and (2) we can still achieve very good investment performance. By training a model to predict the probability that a stock will be in the +1 category (that is, that it will outperform the median-performing stock), we can use the output probabilities as a measure of confidence and then build investment portfolios made up of companies with high confidence in outperformance. Furthermore, this simple setup creates a starting point for more refined approaches.
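One simple way to turn output probabilities into a portfolio is to rank stocks by predicted probability and keep the top few. The sketch below assumes a top-N selection rule; that rule, the function name, and the probabilities shown are illustrative assumptions, not a description of how portfolios are actually built.

```python
def select_portfolio(probabilities, top_n):
    """Return the top_n tickers ranked by predicted probability of outperforming."""
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    return [ticker for ticker, _ in ranked[:top_n]]

# Hypothetical model outputs: P(stock lands in the +1 class).
predicted = {"AAA": 0.71, "BBB": 0.48, "CCC": 0.63, "DDD": 0.55}

portfolio = select_portfolio(predicted, top_n=2)
print(portfolio)  # ['AAA', 'CCC']
```

Other rules, such as a probability threshold or confidence-weighted positions, would fit the same output equally well.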

In the next blog post (Part 2), we will look in detail at the data set we used to train this model.