Tweet Sentiments And Crowd-Sourced Earnings Estimates As Valuable Sources Of Information Around Earnings Releases

Jim Kyung-Soo Liew

Johns Hopkins University – Carey Business School

Shenghan Guo

Johns Hopkins University

Tongli Zhang

Johns Hopkins University

August 27, 2015


In this work we examine the confluence of two important financial social media databases: Estimize and iSentium. Both datasets capture “crowdsourced” information that has become increasingly important for financial market research. In particular, we investigate the earnings announcement event. First, we confirm that the crowdsourced Estimize consensus earnings estimates are slightly more accurate than Wall Street’s consensus estimates, and that this result has been robust over the two years 2013-2014. Second, we document that the objectivity of the crowd is one reason for its greater accuracy. Wall Street’s consensus is biased by the “lowballing” phenomenon pervasive in the industry: Wall Street’s consensus falls below the actual reported earnings 65-68% of the time, versus 52-54% of the time for the crowd’s consensus. Third, we find economically and statistically significant evidence that tweet sentiment contains distinct information that is not contained in traditional pre-announcement variables such as Forecast Error, Earnings Surprise, Bias, Coverage, Track Record, and Earnings Volatility. Fourth, we show that tweet sentiment prior to the earnings announcement date can predict post-announcement risk-adjusted excess returns over the short term (a few days). This predictive relationship holds even in the presence of the Earnings Surprise variable. Interestingly, the market quickly incorporates this information, and after only a few days the statistical significance of the relationship wanes. We estimate that, gross of costs, the “alpha” from tweet sentiment after earnings announcements may be as high as roughly 10%-20% per year.

Introduction

One frontier of empirical finance that continues to expand rapidly, thanks to a proliferation of new and exciting data, is research based on listening to the “crowd.” Recently, we have witnessed an explosion of research activity examining crowd-sourced data generated by social network sites such as Twitter/StockTwits (Bollen, Mao, and Zeng (2009), Zhang, Fuehres and Gloor (2011), Ruiz (2012), Forbergskog and Blom (2013), Sul, Dennis, and Yuan (2014)), Seeking Alpha (Chen, De, Yu, and Hwang (2014)), Estimize (Drogen and Jha (2013), Bliss and Nikolic (2015), Jame, Johnston, Markov, and Wolfe (2015)), and iSentium (Liew and Wang (2015)), to name a few.

Researchers have started to find, with striking success, that the “crowd” matters for financial markets. Should this really come as a surprise, however, given that Wikipedia, Yelp, TripAdvisor and Amazon, which rely on peer reviews by the masses, i.e. the “crowd,” are sites that have become woven into our daily lives? We already seek out what the “crowd” has to say when it comes to our own personal lives, and we appear to trust the crowd’s opinion more and more.

The theory of the “wisdom of crowds” argues that a diverse group of responses often, and surprisingly, outperforms responses given by knowledgeable experts (Surowiecki (2004)). What does the crowd say about the recently opened boutique ice cream shop down the street? Just Yelp it. Which hotel should you stay at when you take your family trip to Harry Potter World in Florida? Just TripAdvisor it. What about a subject you want to learn because you have to teach it? Just Wiki it! What about the quarterly earnings of a company, say Apple? Just Estimize it!

Even tech guru Philip Elmer-DeWitt (@philiped) has started to listen in on what the crowd has to say. He recently wrote about the estimated earnings per share of Apple (AAPL). In all, 450 armchair analysts weighed in on AAPL’s earnings for FQ3 2015 on the Estimize platform. This time the Estimize consensus of $1.86 per share, just a penny off, was much closer to the actual EPS of $1.85 per share than the Wall Street consensus of $1.79 per share.
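The comparison above reduces to simple arithmetic on the three quoted figures. The following minimal sketch works it through; the function name `forecast_error` and the sign convention (estimate minus actual) are illustrative assumptions, not definitions taken from this paper.

```python
# EPS figures quoted in the paragraph above (AAPL, FQ3 2015).
actual_eps = 1.85
consensus = {"Estimize": 1.86, "Wall Street": 1.79}


def forecast_error(estimate: float, actual: float) -> float:
    """Signed error (assumed convention): positive means the consensus overshot."""
    # Round to cents to avoid binary floating-point noise in the difference.
    return round(estimate - actual, 2)


errors = {name: forecast_error(est, actual_eps) for name, est in consensus.items()}

for name, err in errors.items():
    print(f"{name}: signed error {err:+.2f}, absolute error {abs(err):.2f}")
```

In this single example the Estimize consensus misses by one cent on the high side, while the Wall Street consensus misses by six cents on the low side.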

Anecdotal evidence aside, some have clearly taken notice and started to examine the data from an academically rigorous vantage point. In light of the Efficient Market Hypothesis (EMH) of Fama (1970, 1991), how does this new source of information relate to security price behavior? Can such information help determine the cross-section of expected returns beyond what CAPM (Sharpe (1964), Lintner (1965)) predicts? CAPM has since been adjusted to include factors related to “size” (Banz (1981)) and “value” (Rosenberg et al (19xx)). The Fama-French 3-factor model (Fama and French (1993, 1996)) has been extended successfully by Carhart (1997) to include the momentum factor attributed to Asness (1994) and Jegadeesh and Titman (1993). Will there be factors related to social networks? It appears that we as a community are headed in that direction. Anecdotal evidence appears to be building, and soon we may have to augment the Carhart 4-factor model with a “social-media” factor. Only time will tell.

In this paper we are concerned with better understanding two datasets and their information content around earnings announcements and the consequent security price behavior. We examine the earnings announcement period for our sample of companies over the period from November 2011 to December 2014. Each year companies are mandated to disclose their earnings four times. It is a heightened event for both Wall Street and the companies. Equity analysts turn to their valuation models and try to predict the level of earnings. Successful Wall Street analysts are rewarded handsomely with lucrative bonuses and name recognition.

Evidence appears to be building that Wall Street analysts have been beaten by the crowd; see Drogen and Jha (2013) and Bliss and Nikolic (2014). In addition, divergence of opinion on earnings appears to influence the speed with which information disseminates into security prices.

We continue this vein of research by examining the Estimize dataset. In this work, however, we also link it to the tweet sentiment dataset provided by iSentium. iSentium’s proprietary sentiment engine takes the text of tweets and converts it to a score between -30 and 30, with 30 being the most bullish and -30 the most bearish. Prior work has shown that these sentiment scores can predict the cross-section of IPOs (Liew and Zhang (2015)). In this paper we merge the two databases and bring in security prices. We are interested in testing the following questions:

(1) Does tweet sentiment help reduce the earnings announcement forecast error?

(2) Does this gain in accuracy hold with the inclusion of standard variables known to influence the earnings announcement forecast error?

(3) Does tweet sentiment have any relationship to post-earnings announcement risk-adjusted drift?
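One way to frame a question such as (3) is as a regression of post-announcement excess return on pre-announcement tweet sentiment, with earnings surprise as a control. The sketch below is only an illustration under stated assumptions: the data are synthetic, the planted coefficients are arbitrary, the helper names `solve_linear` and `fit_ols` are hypothetical, and a plain least-squares fit stands in for whatever specification the paper actually estimates.

```python
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(y, columns):
    """OLS with an intercept via the normal equations.

    Returns [intercept, slope_1, slope_2, ...] in the order of `columns`.
    """
    rows = [[1.0] + list(vals) for vals in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)


# Synthetic placeholder data (NOT results from the paper).
rng = random.Random(0)
n = 500
sentiment = [rng.uniform(-30, 30) for _ in range(n)]  # score range from the text
surprise = [rng.gauss(0, 1) for _ in range(n)]        # assumed standardized
planted = [0.0, 0.02, 0.5]                            # arbitrary illustrative values
excess_return = [
    planted[0] + planted[1] * s + planted[2] * e + rng.gauss(0, 0.1)
    for s, e in zip(sentiment, surprise)
]

coef = fit_ols(excess_return, [sentiment, surprise])
print("intercept, sentiment slope, surprise slope:", coef)
```

If the sentiment slope stays distinguishable from zero with the surprise control included, sentiment carries information beyond the surprise itself. On this synthetic data the fit simply recovers the planted values to within noise.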

Literature Review

A considerable number of studies have demonstrated the “wisdom of crowds” in various disciplines. Page (2007) argues that diversity within a group leads to collective wisdom, and provides empirical evidence, in a cognitive sense, that strongly supports this claim. Collective wisdom is also analyzed by Landemore and Elster (2012) from both theoretical and empirical perspectives; their work presents market prediction as one of the most important applications of such collective wisdom. In finance, prior studies support the “wisdom of crowds” in forecasting equity markets. Schijven and Hitt (2012) analyze the cause of the “wisdom of crowds” and conclude that the buy-side
