The Effects Of Twitter Sentiment On Stock Price Returns

Gabriele Ranco

IMT Institute for Advanced Studies, Piazza San Francesco 19, 55100 Lucca, Italy

Darko Aleksovski

Jozef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia

Guido Caldarelli

IMT Institute for Advanced Studies, Piazza San Francesco 19, 55100 Lucca, Italy

Istituto dei Sistemi Complessi (ISC), Via dei Taurini 19, 00185 Rome, Italy

London Institute for Mathematical Sciences, 35a South St. Mayfair, London W1K 2XF, United Kingdom

Miha Grcar

Jozef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia

Igor Mozetic

Jozef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia

Abstract

Social media increasingly reflect and influence the behavior of other complex systems. In this paper we investigate the relations between the well-known micro-blogging platform Twitter and financial markets. In particular, we consider, over a period of 15 months, the Twitter volume and sentiment about the 30 companies that form the Dow Jones Industrial Average (DJIA) index. We find a relatively low Pearson correlation and Granger causality between the corresponding time series over the entire time period. However, we find a significant dependence between the Twitter sentiment and abnormal returns during the peaks of Twitter volume. This holds not only for expected Twitter volume peaks (e.g., quarterly announcements), but also for peaks corresponding to less obvious events. We formalize the procedure by adapting the well-known “event study” from economics and finance to the analysis of Twitter data. The procedure makes it possible to automatically identify events as Twitter volume peaks, to compute the prevailing sentiment (positive or negative) expressed in tweets at these peaks, and finally to apply the “event study” methodology to relate them to stock returns. We show that sentiment polarity of Twitter peaks implies the direction of cumulative abnormal returns. The size of the cumulative abnormal returns is relatively low (about 1–2%), but the dependence is statistically significant for several days after the events.
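
The three steps named in the abstract (peak detection, sentiment polarity, cumulative abnormal returns) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the peak rule (volume above a multiple of the trailing median), the `window`, `factor`, and `horizon` parameters, the market-adjusted definition of abnormal return, and all numbers are assumptions made for the example.

```python
from statistics import median

def find_volume_peaks(volume, window=5, factor=2.0):
    """Flag days whose tweet volume exceeds `factor` times the median of
    the preceding `window` days (an assumed, simplified peak rule)."""
    peaks = []
    for t in range(window, len(volume)):
        baseline = median(volume[t - window:t])
        if baseline > 0 and volume[t] > factor * baseline:
            peaks.append(t)
    return peaks

def sentiment_polarity(pos, neg):
    """+1 if positive tweets prevail, -1 if negative tweets prevail, else 0."""
    return (pos > neg) - (pos < neg)

def cumulative_abnormal_return(stock_ret, market_ret, event_day, horizon=3):
    """Sum of (stock return - market return) from the event day through
    `horizon` days after it. Subtracting the market return is a simplified
    stand-in for the expected return of a full event study."""
    end = min(event_day + horizon, len(stock_ret) - 1)
    return sum(stock_ret[t] - market_ret[t] for t in range(event_day, end + 1))

# Synthetic toy data for one hypothetical stock over 15 trading days.
positive = [60, 65, 55, 60, 58, 300, 70, 60, 50, 62, 58, 80, 60, 55, 52]
negative = [40, 45, 40, 45, 42, 100, 50, 40, 40, 43, 42, 270, 50, 45, 43]
volume = [p + n for p, n in zip(positive, negative)]
market_ret = [0.001, -0.002, 0.000, 0.001, 0.002, 0.002, 0.001, 0.000,
              0.001, -0.001, 0.000, 0.001, 0.000, 0.001, 0.000]
stock_ret = [0.001, -0.001, 0.000, 0.002, 0.001, 0.012, 0.004, 0.002,
             0.001, -0.001, 0.001, -0.010, -0.003, -0.001, 0.000]

peaks = find_volume_peaks(volume)
polarities = [sentiment_polarity(positive[t], negative[t]) for t in peaks]
cars = [cumulative_abnormal_return(stock_ret, market_ret, t) for t in peaks]

for day, pol, car in zip(peaks, polarities, cars):
    print(f"day {day}: polarity {pol:+d}, CAR {car:+.3%}")
```

In this toy series the positive peak is followed by a positive cumulative abnormal return and the negative peak by a negative one, which is the kind of relation the abstract describes; the magnitudes here are invented and carry no empirical meaning.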

Introduction

The recent technological revolution, with the widespread presence of computers and the Internet, has created an unprecedented data deluge, dramatically changing the way in which we look at social and economic sciences. The constantly increasing use of the Internet as a source of information, such as business or political news, has triggered a corresponding increase in online activity. The interaction with technological systems is generating massive datasets that document collective behavior in a previously unimaginable fashion. Ultimately, in this vast repository of Internet activity we can find the interests, concerns, and intentions of the global population with respect to various economic, political, and cultural phenomena.

Among the many fields of application of data collection, analysis and modeling, we present here a case study on financial systems. We believe that social aspects as measured by social networks are particularly useful for understanding financial turnovers. Indeed, financial contagion and, ultimately, crises often originate from collective phenomena such as herding among investors (or, in extreme cases, panic), which signal the intrinsic complexity of the financial system. Therefore, the ability to anticipate anomalous collective behavior of investors is of great interest to policy makers because it may allow for more prompt intervention, when appropriate.

State-of-the-art. We briefly review the state-of-the-art research which investigates the correlation between web data and financial markets. Three major classes of data are considered: web news, search engine queries, and social media. Regarding news, various approaches have been attempted. They study: (i) the connection of exogenous news with price movements; (ii) the stock price reaction to news; (iii) the relations between mentions of a company in financial news, or the pessimism of the media, and trading volume; (iv) the relation between the sentiment of news, earnings and return predictability; (v) the role of news in trading actions, especially of short sellers; (vi) the role of macroeconomic news in stock returns; and finally (vii) the high-frequency market reactions to news. There are several analyses of search engine queries. A relation between the daily number of queries for a particular stock and the daily trading volume of the same stock has been studied. A similar analysis was done for a sample of Russell 3000 stocks, where an increase in queries predicts higher stock prices in the next two weeks. Search engine query data from Google Trends have been used to evaluate stock riskiness. Some other authors used Google Trends to predict market movements. Also, search engine query data have been used as a proxy for analyzing investor attention related to initial public offerings (IPOs).

Regarding social media, Twitter is becoming an increasingly popular micro-blogging platform used for financial forecasting. One line of research investigates the relation between the volume of tweets and financial markets. For example, one study examined whether the daily number of tweets predicts the S&P 500 stock indicators. Another line of research explores the contents of tweets. In a textual analysis approach to Twitter data, the authors find clear relations between the mood indicators and the Dow Jones Industrial Average (DJIA). In another study, the authors show that the Twitter sentiment for five retail companies has a statistically significant relation with stock returns and volatility. A recent study compares the information content of the Twitter sentiment and volume in terms of their influence on future stock prices. The authors relate the intra-day Twitter and price data, at hourly resolution, and show that the Twitter sentiment contains significantly more lead-time information about the prices than the Twitter volume alone. They apply stringent statistics which require a relatively high volume of tweets over the entire period of three months, and, as a consequence, only 12 financial instruments pass the test.

Motivation. Despite the high quality of the datasets used, the level of empirical correlation between financial time series derived from stock prices and time series derived from the web remains limited, especially when a textual analysis of web messages is applied. This observation suggests that the relation between these two systems is more complex and that a simple measure of correlation is not enough to capture the dynamics of their interaction. It is possible that the two systems are dependent only at some moments of their evolution, and not over the entire time period.
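
To make the last point concrete, the sketch below compares the Pearson correlation between a daily sentiment score and daily returns over a whole period with the same correlation restricted to a few event days. The data are synthetic and constructed purely for illustration (returns follow sentiment only on the assumed event days); the names `sentiment`, `returns`, and `event_days` are hypothetical and not taken from the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Synthetic daily sentiment scores and returns (in percent) for 16 days.
sentiment = [0.4, -0.5, 0.3, 0.7, -0.4, 0.5, -0.3, -0.6,
             0.2, -0.6, 0.4, 0.5, -0.2, 0.6, -0.4, -0.8]
returns = [-1.0, 0.9, 0.8, 0.9, -0.7, -0.9, 1.1, -0.8,
           0.6, -0.5, -1.2, 0.7, 0.7, 0.3, 0.2, -1.0]

# Assumed event days (e.g., days with a tweet-volume peak).
event_days = [3, 7, 11, 15]

r_all = pearson(sentiment, returns)
r_events = pearson([sentiment[t] for t in event_days],
                   [returns[t] for t in event_days])

print(f"whole period: r = {r_all:.2f}")
print(f"event days:   r = {r_events:.2f}")
```

On this toy series the whole-period coefficient is weak while the event-day coefficient is close to one, showing how a dependence confined to specific moments can be hidden by a single correlation computed over the entire period.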
