We are the whipping boy for a recent article on the dangers of data mining in our field. And the whipping is delivered largely based on an unsupported shot taken by my frequent foil and sparring partner, Rob Arnott. Before I take on this attack1 we need to back up a bit.

Data mining, that is, searching the data to find in-sample patterns in returns that are not real but random, and then believing you’ve found truth, is a real problem in our field. Random doesn’t tend to repeat, so data mining often fails to produce attractive real-life returns going forward. And given the rewards to gathering assets, often made easier with a good “backtest,” the incentive to data mine is great. We’ve talked about it endlessly for years and written on it many times. But we’re not nihilists who believe everything is data mining.2,3 We are more likely to believe in-sample evidence when it’s also accompanied by strong out-of-sample evidence (across time, geography, and asset class4) and an economic story that makes sense.5 In that case, and barring exceptionally convincing evidence something has changed, we not only believe in it but will stick to it like grim death through its inevitable ups and downs. After many years of research and managing portfolios, we believe there are at least four widely known types of factors that are real (that is, they don’t just look good because of data mining).6,7,8 People are often shocked that we believe in only a few core investment concepts – somehow they think there are many more. Nope. For instance: No small firm effect. No January effect. No Super Bowl effect – though if you do believe that indicator, you should be shorting stocks this year because of Tom Brady; sorry if that’s deflating.

But all of this didn’t stop us from being the cannon fodder in this new article. Are we data miners? Heck no. We’ve always explicitly stood for the opposite. The list of things with great backtests we don’t believe in is legion, and the ones we do believe in have, again, tended to work through time, in a multitude of asset classes and across geographies, with many of these being out-of-sample tests of the original findings. But when you are trying your best to come up with a story, you find people who will say what you need them to (a journalistic version of data mining!) about someone vaguely interesting (I guess we’re vaguely so). So the reporter asked a non-objective guy with whom I’ve feuded to opine on me and by extension AQR.9,10 It’s no secret to readers of this perspectives column, and our work in general, that we have had an intellectual dispute with Rob Arnott on the subject of whether the main factors commonly discussed are something one should time (get in/out based on how expensive they look versus history).11 A secondary debate has indeed been whether some of the main factors in finance12 are the result of long-term data mining. Well, perhaps because he lost the first point, Arnott upped the secondary topic to primary and unleashed on me in the Businessweek piece, leading with a tiny bit of honey about my “outstanding” prior work but then bringing on a big bowl of vinegar-flavored whoop-ass.13 Rob says:

“I think Cliff has done some outstanding work over the years,” but adds that he’s “insufficiently skeptical about the pervasiveness of data-mining and its impact even in the factors he uses.”

That is, he says I’m a data-miner. That may seem like an innocuous little comment actually prefaced with a kind of compliment. It’s not. It’s a damning accusation that’s provably false, backwards in fact. Worse, it’s a falsehood meant to deflect and confuse as it kind of rhymes with a separate dispute we’ve been having, the “secondary debate” mentioned above – a dispute Rob’s been ducking. So, if you just read Rob’s comment on its own, by most people’s standards I’m overreacting here. Admittedly that’s kind of my go-to move. But, in the broader context of the ongoing debate, and given what a serious and backwards “shot” he really took, I think I’m reacting appropriately. Of course, I usually think that…

After Rob’s quote, the article provides a response from me. They actually ran a somewhat truncated version of what I said. Here is the verbatim response to Rob’s comment that I sent the reporter:

“Rob and AQR largely believe in a very similar set of factors like value, low risk, and momentum, to which we think we’ve both applied a lot of a priori skepticism. Protestations otherwise are marketing tactics and reflect an ongoing confusion between factor timing, which he believes in more than we do, and long term factor efficacy.”

That kind of says it all but way too briefly and calmly for my taste; hence, this longer version you’re reading now.

In the first part of the above quote I was making a very simple point. Rob and Research Affiliates publicly claim to believe in, and run investment products based on, factors that largely overlap with AQR’s. Now, for competitive reasons I wish they’d stop, but it’s a free country. Value, low risk, and momentum are all things we both believe in to various degrees, and when it comes to investing in equities, they cover a large part of what each of us does.14 Check out how his firm describes one of its products. What exactly does he claim we believe in because of data mining that isn’t in his list here? I mean, if I and AQR are data miners, then double data-mining on you Rob!15 That he’d accuse us of being “insufficiently skeptical” about the dangers of data mining isn’t just at odds with our long history of the exact opposite, but bat**** crazy when it’s mostly the stuff he believes in too. I guess he’s hoping nobody noticed. I noticed. I notice things, particularly when they are about me and they are so very noticeable.16

Let me be clear. Rob doesn’t actually think the factors we at AQR believe in are data mined any more than he believes his own are data mined. That’s a smokescreen. What Rob is doing here is the time-honored strategy of “the best defense is a good offense” combined with the old adage about pounding the table when you’ve got nothing. Separate from this kerfuffle, Rob has actually accused most of the field of applied finance of data mining in a very specific way, and we’ve shown he’s wrong (or at least massively exaggerating). Apparently he doesn’t like that, so we have this deflection. Please note, he’s not wrong that data mining is a big problem; everybody reasonable thinks that, certainly including me. Whoever shouts it louder at others doesn’t necessarily believe it more. But his very specific accusation against the field about a very specific type of data mining has no teeth. Unfortunately, to understand this we have to get much more into the geeky weeds, sorry…

Rob has made claims in various papers that some of the major factors that much
