Ed Thorp: A Mathematician On Wall Street – Statistical Arbitrage
Ed Thorp: Statistical Arbitrage – Part I
The pioneer of statistical arbitrage guides us through a typical day at the office
“Thorp, my advice is to buy low and sell high.” – Mathematician William F. Donoghue
It’s the spring of 2000 and another warm sunny day in Newport Beach. From 600 feet high on the hill I look 30 miles over the Pacific at Wrigley’s 26-mile-long Catalina Island, stretched across the horizon like a huge ship. To the left, 60 miles away, the top of equally large San Clemente Island is visible peeping above the horizon. The ocean ends two and a half miles away, with a ribbon of white surf breaking on wide sandy beaches. An early trickle of fishing and sail boats streams into the sea from Newport Harbor, one of the world’s largest small-boat moorings, with more than 8,000 sail and power vessels, and some of the most expensive luxury homes in the world. Whenever I leave on vacation I look back over my shoulder and wonder if I’m making a mistake.
As I finish breakfast the sun is rising over the hills to the east behind me. It illuminates the tops of three financial towers to the west in the enormous business and shopping complex of Fashion Island. By the time the buildings are in full sun I make the 3-mile drive to my office in one of them.
Statistical arbitrage in action
Logging onto our computer system, I learn that we have already traded more than a million shares electronically and are ahead $400,000 in the first hour of trading. We’re currently managing a temporary high of $340 million, with which we have established positions of $540 million worth of stocks long, and an equal dollar amount short, consistent with our policy of keeping our portfolio dollar neutral. We know both from computer simulations and historical experience that our dollar neutral portfolio will also generally be close to market neutral. Market neutral means that the fluctuations in the value of the portfolio have very little relationship with the price changes in whatever benchmark is chosen to represent the market. For example, one might choose as a benchmark for an equity portfolio the S&P 500 Index or the Wilshire 5000 Index. In early 2000, after the 7 years and 5 months that we have operated the current system, our level of market neutrality as measured by what financial theorists call beta has averaged about 0.06 on a pre-fee unlevered basis, with zero being completely market neutral and 1.0 representing the market itself. Our alpha, which measures risk-adjusted excess return, the amount by which our annualized return has exceeded that from investments of comparable risk, has averaged about 20 per cent per year. This means that our past annual rate of return before fees of 26 per cent can be thought of as the sum of three parts: 5 per cent from Treasury bills with no risk, about 1 per cent from our slight market bias of 0.06 (0.06 times the market’s average annual return over the 7 years and 5 months of our track record is roughly 1 per cent) plus the difference, a risk adjusted excess return of 20 per cent. We were essentially market neutral.
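The three-part breakdown of the 26 per cent return can be checked with a little arithmetic. The sketch below follows the simplified accounting in the text, where the market part is beta times the market's average return. The function name is hypothetical, and the 17 per cent market return is an assumption, chosen only because 0.06 times it is roughly 1 per cent; the article does not state the figure.

```python
def decompose_return(total_return, t_bill_return, beta, market_return):
    """Split an annual return into risk-free, market, and residual (alpha) parts.

    Uses the simplified accounting from the text: the market part is
    beta times the market's average annual return.
    """
    market_part = beta * market_return
    alpha = total_return - t_bill_return - market_part
    return t_bill_return, market_part, alpha


# 26% pre-fee return, 5% from T-bills, beta of 0.06.
# The 17% market return is an assumed figure, not taken from the article.
risk_free, market_part, alpha = decompose_return(0.26, 0.05, 0.06, 0.17)
print(f"risk-free {risk_free:.1%}, market {market_part:.1%}, alpha {alpha:.1%}")
```

Under that assumption the market contributes about one percentage point and the residual is about twenty, matching the figures quoted above.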
Using our proprietary prediction model, our computers continually calculate a “fair” price for each of about one thousand of the largest, most heavily traded companies on the New York and American Stock Exchanges. Stocks with large trading volume are called “liquid”; they are easier to trade without inducing a large “market impact” cost. The latest prices flow into the computers in “real time” and when the deviation from our calculated price for a security is large enough, we buy what we predict are the underpriced stocks and short the overpriced stocks. To control risk, we limit each stock we own to 2.5 per cent of our long portfolio. If every long position were at 2.5 per cent then we would have 40 stocks long. But we are continually entering and exiting positions so at any one time we can have between 150 and 300 stocks on the long side. Even with this level of risk control, plus additional constraints that tend to limit industry or sector concentration, we can have some nasty surprises. These come in the form of unexpected major company developments that we can’t predict, such as a disappointing earnings announcement. If we were to suddenly lose 40 per cent of a 2.5 per cent position, the portfolio could drop 1 per cent. Fortunately we rarely get more than one of these “torpedoes” per month. We also get about as many favorable surprises, leading to comparable windfall gains.
We limit each short position to 1.5 per cent of the portfolio so we would have 67 positions if all were at full size. In practice we typically have 150 to 300 positions because we’re always in the process of building new positions and taking off old ones. Our limit on the size of short positions is lower than for long positions because a sudden adverse move in a short position can be greater than for a long position. The worst outcome for a loss on a stock held long is for the stock to suddenly become worthless, leading to a loss of 100 per cent of the original value of the position.
Similarly, if a stock sold short doubles in price, the seller loses 100 per cent of the original value of the stock sold short. But the stock could triple or quadruple in price, or worse, leading to losses of 200 per cent, 300 per cent or more, of the original value of the stock sold short.
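The position limits and the asymmetry between long and short losses reduce to simple arithmetic. The following is a minimal sketch using the figures quoted above; the function names are hypothetical and nothing here represents the fund's actual risk software.

```python
def min_positions(limit_bp):
    """Fewest positions that fill one side when each sits at its size limit.

    limit_bp is the per-position cap in basis points of that side
    (250 = 2.5 per cent). Integer ceiling division avoids float rounding.
    """
    return -(-10_000 // limit_bp)


def long_return(entry, exit_price):
    """Return on a long position as a fraction of its original value."""
    return (exit_price - entry) / entry


def short_return(entry, exit_price):
    """Return on a short position as a fraction of its original value."""
    return (entry - exit_price) / entry


print(min_positions(250), min_positions(150))   # long cap 2.5%, short cap 1.5%
print(0.40 * 0.025)                              # portfolio hit from a 40% "torpedo"
print(long_return(100, 0))                       # worst case for a long position
for multiple in (2, 3, 4):                       # stock doubles, triples, quadruples
    print(multiple, short_return(100, 100 * multiple))
```

A long position can lose at most its full original value, while the loss on a short grows without limit as the price rises, which is the reason given for the tighter cap on shorts.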
Our caution and our risk control measures seem to work. Our daily, weekly and monthly results are “positively skewed,” meaning that we have substantially more large winning days, weeks and months than losing ones, and the winners tend to be bigger than the losers.
I scan the computer screen, which is showing me the day’s 48 most interesting positions, including the 12 biggest gainers and the 12 biggest losers on the long side, and the same for the short side. By comparing the first few with the rest of the 12 in its group, I can see quickly if any winners or losers seem unusually large. Everything looks normal. Then I walk down the hall to Steve Mizusawa’s office. Steve is watching his Bloomberg terminal checking for unusual events that are not part of our prediction model but might have a big impact on one of the stocks we trade. When he sees one of these events, such as the announcement of a merger, takeover, spinoff or reorganization, he tells the computer to put the stock on the restricted list: don’t initiate a new position and close out any that we have. The restricted list also has those stocks which we are unable to sell short, due to our brokers’ inability to borrow them.
1.5 billion shares a year
Steve tells me, with his characteristic soft spoken modesty, that the broker where we do about two thirds of our business has reduced our commissions by about 0.16¢ per share. Steve has been working to achieve this but, as usual, doesn’t claim credit.
To appreciate the savings you need to realize how much we trade. Our portfolio turns over about once every ten days, and with about 253 trading days per year, that’s about 25 times per year. A turnover at current levels means we sell $540 million of stocks held long and replace them with $540 million of new stocks, for a total value traded of $1,080 million. We also cover (buy back) $540 million of stock previously sold short and sell $540 million of new shorts, for another $1,080 million. So one turnover means about $2,160 million and 25 turnovers a year means we are trading at the rate of about $54 billion per year. With an average price of $36 per share we’re trading 1.5 billion shares a year. Famed hedge fund manager Michael Steinhardt, when he retired recently, astonished many by announcing he had traded a billion shares in one year. The Medallion Fund, a hedge fund closed to new investors, run by mathematician James Simons, includes a similar even larger trading operation with a higher rate of turnover and a greater annual trading volume.
Our 1.5 billion shares a year amounts to about 6 million shares a day, over 0.5 per cent of the total NYSE volume. A reduction of 0.16¢ on two thirds of our trades, or one billion of those shares, saves us $1.6 million a year. At an average trade size of 1,500 shares, we’re making 4,000 trades a day or 1 million trades (tickets) a year. At an all inclusive average cost of about 0.74¢ per share, our 1.5 billion shares a year generate $11.1 million a year in commissions and ticket charges. Add to this another $1.8 million per year profit which the brokers make from lending us $210 million, and another $1.4 million or more per year in profits to them from lending us stock to sell short, and our brokers currently collect about $14.3 million per year from us. Our main broker was smart to stay competitive by reducing rates.
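The trading-volume and commission figures above follow from a handful of inputs. This sketch reproduces the arithmetic with the article's rounded numbers; the variable names are illustrative only.

```python
TRADING_DAYS = 253
long_value = 540e6           # dollars held long (and the same amount short)
turnovers_per_year = 25      # the text rounds 253 / 10 to "about 25"
avg_price = 36.0             # dollars per share
avg_trade_shares = 1_500

# One turnover: sell and replace the longs, cover and replace the shorts.
dollars_per_turnover = 4 * long_value
dollars_per_year = dollars_per_turnover * turnovers_per_year
shares_per_year = dollars_per_year / avg_price
shares_per_day = shares_per_year / TRADING_DAYS
trades_per_day = shares_per_day / avg_trade_shares

# Commissions: 0.16 cents saved on two thirds of the shares,
# 0.74 cents all-in cost on every share.
savings = 0.0016 * (2 / 3) * shares_per_year
commissions = 0.0074 * shares_per_year
broker_total = commissions + 1.8e6 + 1.4e6   # plus margin and stock-loan profits

print(f"traded per year: ${dollars_per_year / 1e9:.0f} billion")
print(f"shares per year: {shares_per_year / 1e9:.1f} billion")
print(f"shares per day:  {shares_per_day / 1e6:.1f} million")
print(f"trades per day:  {trades_per_day:,.0f}")
print(f"savings:         ${savings / 1e6:.1f} million")
print(f"broker revenue:  ${broker_total / 1e6:.1f} million")
```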
Statistical Arbitrage – Part II
In the late 1970s powerful computers and high quality databases were becoming affordable, making a revolution in finance possible
“The harder I work, the luckier I get.” — Alan (Ace) Greenberg, Chairman, Bear Stearns.
Why do we (and others) call it statistical arbitrage? Arbitrage originally meant a pair of offsetting positions that lock in a sure profit. An example might be selling gold in London at $300 an ounce while at the same time buying it at, say $290 in New York. Suppose the total cost to finance the deal and to insure and deliver the New York gold to London is $5, leaving a $5 sure profit. That’s an arbitrage in its original usage. Later the term was expanded to describe investments where risks are expected to be largely offsetting, with a profit that is likely, if not certain. For instance, to illustrate what is called merger arbitrage, company A trading at $100 a share may offer to buy company B, trading at $70 a share, by exchanging one share of company A for each share of company B. The market reacts instantly and company A’s shares gap to, say, $88 while company B’s shares jump to $83. Merger arbitrageurs now step in, buying a share of B at $83 and selling short a share of A at $88. The deal is expected to close in three months and, if it does, the arbitrageur will (without leverage) make $5/$83 or 6 per cent in three months, a simple interest annual rate of 24 per cent. But the deal is not certain until it gets regulatory and shareholder approval, so there is a risk of loss if the deal fails and the prices of A and B reverse. If, for instance, the stocks of A and B returned to their pre-announcement prices, the arbitrageur would lose $12=$100-$88 on his short sale of A and lose $13=$83-$70 on his purchase of B, for a total loss of $25 per $83 invested, or a loss of 30 per cent in three months, a simple annual rate of –120 per cent. For the arbitrageur to take this kind of risk, he must believe the chance of failure to be quite small.
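The merger arbitrage example can be worked through in code. The sketch below assumes the one-for-one share exchange and the prices given above, with no leverage and simple (non-compounded) annualization as in the text; the function name and dictionary keys are hypothetical.

```python
def merger_arb_outcomes(a_pre, b_pre, a_post, b_post, months):
    """Per-share outcomes of buying B and shorting A in a one-for-one stock swap.

    a_pre, b_pre:   prices before the announcement
    a_post, b_post: prices after the announcement, where the trade is opened
    months:         expected time until the deal closes
    """
    capital = b_post                        # unlevered: cash paid for one share of B
    gain_if_closes = a_post - b_post        # the B share becomes an A share, closing the short
    loss_if_fails = (a_pre - a_post) + (b_post - b_pre)   # both legs revert
    periods_per_year = 12 / months
    return {
        "return_if_closes": gain_if_closes / capital,
        "annualized_if_closes": gain_if_closes / capital * periods_per_year,
        "return_if_fails": -loss_if_fails / capital,
        "annualized_if_fails": -loss_if_fails / capital * periods_per_year,
    }


outcomes = merger_arb_outcomes(a_pre=100, b_pre=70, a_post=88, b_post=83, months=3)
for name, value in outcomes.items():
    print(f"{name}: {value:.1%}")
```

The unrounded results are slightly larger in magnitude than the rounded 6, 24, 30 and 120 per cent quoted in the text, because the text annualizes the already rounded three-month figures.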
What we do has the risk reducing characteristics of “arbitrage” but the two hundred or so stocks in each of the two “sides” of the portfolio, the long side and the short side, are generally not “related” or “linked” to each other. We depend on the statistical behavior of a large number of favorable bets to eventually deliver our profit. This is card counting at blackjack again, on a much larger scale. Our average trade size, or “bet,” is $54,000 and we are placing a million bets a year, or about one bet every six seconds that the market is open.
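The average bet size and the "one bet every six seconds" figure can be recovered from numbers given earlier in the article. In this sketch the 6.5-hour trading day is an assumption (the regular NYSE session); the other inputs are the article's own rounded figures.

```python
shares_per_year = 1.5e9
avg_trade_shares = 1_500
avg_price = 36.0
TRADING_DAYS = 253
HOURS_PER_DAY = 6.5          # assumed regular session, 9:30 to 16:00

bets_per_year = shares_per_year / avg_trade_shares
avg_bet = avg_trade_shares * avg_price
seconds_open = TRADING_DAYS * HOURS_PER_DAY * 3600
seconds_per_bet = seconds_open / bets_per_year

print(f"bets per year:   {bets_per_year:,.0f}")
print(f"average bet:     ${avg_bet:,.0f}")
print(f"seconds per bet: {seconds_per_bet:.1f}")
```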
As I walk back to my office I think about how our statistical arbitrage venture came to be.
I first met Steve about 1970 when I was a mathematics professor at the University of California at Irvine and he was a double major in Physics and Computer Science. Steve and a friend did an imaginative special studies summer research project using a computer to study blackjack, under my guidance. Then in 1973 when Princeton-Newport Partners was expanding, I remembered Steve fondly and, fortunately, he was one of the first people I hired. Able to solve difficult problems in both computer hardware and software, Steve earned a reputation in our firm, through his smart, hardworking, understated manner, as “the man who can do anything.”
The Indicators Project and the Discovery of Statistical Arbitrage
By 1978 I had moved from the mathematics department to the Graduate School of Management at UCI, which enabled me to teach courses in finance. After many stimulating discussions with me, Dr. Jerome Baesel, the professor in the next office, came to work full time at Princeton-Newport Partners. One of his major responsibilities was to direct research on an idea of mine which we called the indicators project. Neither Jerry nor I believed the efficient market theory. We had overwhelming evidence of inefficiency, i.e. systems that worked, from blackjack, to the history of Warren Buffett and friends, to our daily success in Princeton-Newport Partners. The question wasn’t “Is the market efficient?” but rather “How inefficient is the market?” and “How can we exploit this?”
The idea of the indicators project was to study how the historical returns of securities were affected by the values of various indicators or characteristics such as the P.E. ratio, the book to price ratio, company size or market value, and scores of other fundamental and technical measures. Now this is a well-known and widely explored idea but back in 1979 it was daring, innovative, and with few exceptions roundly denounced by the massed legions of academia. The idea was timely because the necessary high quality databases and the powerful new computers with which to explore them were just becoming affordable.
By luck, almost immediately after we began the indicators project at the end of 1979, one of our researchers stumbled on the germinal idea for statistical arbitrage – a single indicator that ranked stocks from best to worst and offered a short-term forecast of their performance compared to the others. The idea was to rank stocks by their percentage change in price, corrected for splits and dividends, over a recent past period such as the last two weeks. We found that the stocks that were most up tended to fall relative to the market over the next few weeks and the stocks which were the most down tended to rise relative to the market. Using this forecast our computer simulations showed approximately a 20 per cent annualized return from buying the “best” decile of stocks, and selling short the “worst” decile. We called the system MUD for the recommended portfolio of “most up, most down” stocks. As my friend U.C.I. mathematician William F. Donoghue used to joke, little realizing how close he was to a deep truth, “Thorp, my advice is to buy low and sell high.” (He had the habit of calling even close friends by their last name.) The diversified portfolios of long and short stocks had mostly offsetting market risks so we had what we liked – a market neutral portfolio. But the total portfolio, even though it was approximately market neutral, suffered from moderately large random fluctuations. Spoiled by the continuing low risk and high return performance of Princeton-Newport Partners, we put statistical arbitrage aside for the time being.
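The "most up, most down" ranking is simple enough to sketch. The code below is only an illustration of the idea as described here, not the original MUD system: the function name, the tickers and the return figures are all hypothetical, and details such as the lookback window, weighting and rebalancing are left out.

```python
def mud_portfolio(past_returns, fraction=0.10):
    """Pick the 'most down' stocks to buy and the 'most up' stocks to short.

    past_returns maps ticker -> percentage price change over the lookback
    window (already adjusted for splits and dividends).
    """
    ranked = sorted(past_returns, key=past_returns.get)   # most down first
    n = max(1, int(len(ranked) * fraction))               # size of one decile
    if 2 * n > len(ranked):
        raise ValueError("too few stocks to form separate long and short lists")
    longs = ranked[:n]      # biggest recent losers, expected to rebound
    shorts = ranked[-n:]    # biggest recent gainers, expected to fall back
    return longs, shorts


# Hypothetical two-week percentage changes for 20 made-up tickers.
returns = {f"S{i:02d}": (i * 7 % 20) - 10 for i in range(20)}
longs, shorts = mud_portfolio(returns)
print("buy:  ", longs)
print("short:", shorts)
```

Holding equal dollar amounts in the two lists would give the roughly offsetting market risk described above.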
Unknown to us, in 1982 or 1983 an ingenious researcher at Morgan Stanley invented another statistical arbitrage scheme with characteristics like ours but with substantially less variability. His project probably began trading real time in 1983. As his confidence increased with experience, it expanded in size. By 1985 it was a significant profit center at Morgan Stanley but the credit for its discovery, and the rewards from the firm, reportedly did not attach to the discoverer, Gerry Bamberger. While his boss Nunzio Tartaglia continued to expand the operation with great initial success, a dissatisfied Bamberger chose to leave Morgan Stanley.
The Money Machine Begins Operating
In 1985, as part of our business plan to add diversified “profit centers,” our Princeton office placed an ad saying Princeton-Newport Partners was seeking to bankroll people who had successful low risk market neutral quantitative strategies. Bamberger, now out of a job, was one of those who answered the ad. He described his strategy as high turnover, market neutral, and low risk, with a large number of stocks held long and a large number held short, at any one time. It sounded very much like our unused statistical arbitrage strategy, so even though we only knew the general characteristics of the portfolio, and none of the details of the trading algorithm, we had no difficulty believing the story. After we checked Bamberger’s background, I met with him in Newport Beach. Following lengthy negotiations, he told me how the strategy worked, once I gave my word that I would tell no one else unless either he okayed it or the information entered the public domain by some other route.
Gerry Bamberger was a tall trim Orthodox Jew with a very high IQ, an original way of looking at problems in finance, and a wry sense of humor. He spent several weeks working with us in Newport Beach. After a few days I noticed that Gerry always brought a brown bag for lunch and always ate a tuna salad sandwich. I finally had to ask, “How often do you have a tuna salad sandwich for lunch?” Gerry said, “Every day for the last six years.” He was a heavy smoker and I’m extremely sensitive to tobacco smoke – we did not hire smokers nor allow smoking in our office – so part of our negotiation was about how to handle this. We respected each other and worked out a compromise that met each of our needs.
Whenever Gerry needed a cigarette he would step outside our ground floor garden office. This is not the ordeal in Southern California that it could have been during an east coast winter.
Statistical Arbitrage – Part III
How a STAR was born from CPUs the size of refrigerators, and proved in practice
The Bamberger version of statistical arbitrage was driven by two key ideas. The main source of alpha was the short term reversal effect we had discovered in 1979/80. The main tool for risk reduction was to divide the universe of stocks into industry groups of from two to thirteen stocks and trade each group separately on a dollar-neutral basis. Thus the portfolio reduced risk from the market and various industry factors. To back test the system and to simulate real-time trading, we drew upon Princeton-Newport’s 1,100 square foot computer room filled with two million dollars worth of equipment, a domain ordered, organized and overseen by Steve Mizusawa. Inside were banks of gigabyte disk drives the size of washing machines, plus tape drives and CPUs the size of refrigerators. All this sat on a raised floor consisting of removable panels, under which snaked an ordered jungle of cables, wires and other connectors. The room had its own halogen system. In case of fire the room flooded with non-combustible “halogen” gas automatically within 80 seconds. Once this happened the room had too little oxygen for fire to burn or for people to breathe. We drilled on how to get out in time and how to trigger the halogen manually, if necessary. This was high tech in the mid ‘80s. It has since been obviated by the enormous increase in computer miniaturization, speed, and cheapness. Now, for instance, hard disks the size of a CD can hold several gigabytes. The room was chilled to a constant 60F by its own cooling system and had sealed doors and dust filters to keep the air clean. Since smokers strongly emit tiny particles for an hour or more afterwards, Gerry agreed, with a lot of good natured kidding, not to go in the computer room.
Our joint venture was funded by Princeton-Newport Partners and run in New York by Bamberger as a turn-key operation. We called it BOSS Partners, for “Bamberger (plus) Oakley Sutton Securities,” the latter a broker dealer serving Princeton-Newport Partners and related entities. On capital ranging from 30 to 60 million dollars, BOSS started earning in the 25 to 30 per cent annualized range in 1985. This gradually declined to around 15 per cent or so in 1987. Bamberger then elected to retire a millionaire. He felt that in the booming market of the ‘80s, a 15 per cent return was not worthwhile, and he wanted to simply enjoy life.
Princeton-Newport was returning 25 per cent or so net of fees to investors in 1987 so the current BOSS return of 15 per cent was not compelling. But I had developed another method that I thought would be a substantial improvement. After BOSS closed we programmed it and again earned satisfactory returns when we restarted in January of 1988. By chance, we missed the crash of ‘87. How would we have done? In October of 1987, market volatility began to rise. On Friday, October 16, the Dow, at 2,300 or so, fell more than 100 points, over 4%. On Monday, October 19 it fell again from about 2,200 to about 1,700, an unthinkable 508 points, or 23%, by far the greatest one day percentage drop in history. Computer simulations showed our new statistical arbitrage product would have had a good day. And the violent volatile days that followed produced excellent returns. This was a ship for riding out storms.
To control risk we replaced the segregation into industry groups by the statistical procedure called factor analysis. Factors are common tendencies shared by several, many, or all companies. The most important is the market factor. For each stock, this measures the tendency of that stock to mimic or track the market. Using historical prices, the daily returns on any stock can be expressed as the part due to its tendency to follow the market plus what’s left over, the residual. Financial theorists and practitioners have identified a large number of such factors that help explain securities prices, to a degree which varies with the stock and the factor. Some factors, like participation in a specified industry group or sector (e.g. oil, financial) mainly affect subgroups of stocks. Other macroeconomic factors like the market itself, short term interest rates, long term interest rates, and inflation, affect nearly all stocks.
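Splitting a stock's daily returns into a market-following part and a residual can be illustrated with a one-factor least-squares fit. This is a deliberate simplification: it is an ordinary regression on a single market factor, not the multi-factor analysis the text refers to, and the function name and return series are hypothetical.

```python
def market_model(stock_returns, market_returns):
    """Fit stock = alpha + beta * market by least squares; return the residuals too."""
    n = len(stock_returns)
    mean_s = sum(stock_returns) / n
    mean_m = sum(market_returns) / n
    cov = sum((m - mean_m) * (s - mean_s)
              for m, s in zip(market_returns, stock_returns))
    var = sum((m - mean_m) ** 2 for m in market_returns)
    beta = cov / var                      # tendency to track the market
    alpha = mean_s - beta * mean_m
    residuals = [s - (alpha + beta * m)   # what is left over each day
                 for m, s in zip(market_returns, stock_returns)]
    return alpha, beta, residuals


# Hypothetical daily returns for the market and one stock.
market = [0.010, -0.020, 0.015, 0.000, -0.005, 0.012]
stock = [0.014, -0.021, 0.016, 0.003, -0.009, 0.013]

alpha, beta, residuals = market_model(stock, market)
print(f"beta {beta:.2f}")
print("residuals:", [round(r, 4) for r in residuals])
```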
The beauty of a statistical arbitrage product is that it can be designed to approximately neutralize as many of these factors as one desires. The portfolio becomes market neutral by zeroing out the market effect: constrain the relation between the long and short portfolios so that the total effect of the market factor on the long side is just offset by the total effect on the short side. The portfolio becomes inflation neutral, oil price neutral, etc., by doing the same thing with each of those factors. Of course, there is a trade-off: the reduction in risk is accompanied by a limitation in the choice of possible portfolios (only ones which are market neutral, inflation neutral, oil price neutral, etc., are now allowed) and, therefore, the attempt to reduce risk tends to reduce return.
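Zeroing out a single factor amounts to making the dollar-weighted factor loadings of the two sides equal. The sketch below does this for the market factor by rescaling the short side; it is one simple way to satisfy the constraint, not necessarily the method used in practice, and the function names, positions and loadings are hypothetical.

```python
def factor_exposure(positions):
    """Sum of dollars times factor loading over (dollars, loading) pairs."""
    return sum(dollars * loading for dollars, loading in positions)


def neutralizing_short_scale(longs, shorts):
    """Factor by which to scale every short so the two exposures offset."""
    return factor_exposure(longs) / factor_exposure(shorts)


# Hypothetical positions: (dollars, market beta).
longs = [(100.0, 1.1), (100.0, 0.9)]
shorts = [(100.0, 1.2), (100.0, 1.3)]

scale = neutralizing_short_scale(longs, shorts)
scaled_shorts = [(dollars * scale, loading) for dollars, loading in shorts]
net = factor_exposure(longs) - factor_exposure(scaled_shorts)
print(f"scale shorts by {scale:.2f}, net market exposure {net:.6f}")
```

In this example the shorts shrink to 160 dollars against 200 dollars of longs, so market neutrality is bought at the expense of exact dollar neutrality. That is a small instance of the trade-off described above: each added constraint narrows the set of allowable portfolios.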
We got help applying factor analysis to the model from John Blin, a former professor of probability and statistics, and Steve Bender, a former physicist. Their business, APT, computed and sold their current factor analysis for many traded securities. We called the new method “STAR” for “Statistical Arbitrage.” At the request of one of our investors we sent a trading history to BARRA, a world leader in researching and developing financial products. They tested STAR with their factor model E2, which had 55 industry factors and 13 macroeconomic factors. They found that our returns were essentially factor neutral. Our returns did not appear to come from lucky disguised bets on various factors.
If I could predict the performance of factors like the market, inflation, gold, etc., then I would bet on them instead of neutralizing them in the portfolio.
It was fortunate that we had evolved beyond the Bamberger model because, in simulation, its returns continued to fall. Moreover, after a good 1987, Tartaglia had reportedly expanded statistical arbitrage at Morgan Stanley in 1988 to some $900 million long and $900 million short, which had to drive down overall returns for the system. The rumor was that they lost between 6 per cent and 12 per cent, leading to the winding down of the product. If true, this misfortune may have come in part from the decline in performance of the original product and any modified version they might have evolved to, and also from the increasing market impact costs of the much larger trades. Compare the performance of our new approach, shown in Table 1.
Fees have been calculated as 20 per cent of profits. Expenses of the partnership may affect actual returns. 30 day T-Bill Annualized return was 7.7 per cent.
Along with the decline of statistical arbitrage at Morgan Stanley, people began leaving the quantitative systems group that was in charge of it. Among the departing was David E. Shaw, who in the words of Time magazine was “… a former professor of computer science at Columbia University, [who] had been wooed to Wall Street by Morgan Stanley, where he specialized in the arcane field of quantitative analysis – using computers to spot trends in the market.” Shaw was looking for $10 million in start-up capital. Princeton-Newport Partners was interested.
In the spring of 1988, Shaw and I spent the day in my Newport Beach office, along with some of our key people. We discussed his plan to launch an improved statistical arbitrage product, and expand from that base. Princeton-Newport Partners was able to put up the $10 million he wanted for start-up. We were very favorably impressed by Shaw and his ideas but we decided not to go ahead because we already had a good statistical arbitrage product. Shaw found other backing and created one of the most successful analytic firms on Wall Street. Later Shaw would become a member of the president’s science advisory committee. Using statistical arbitrage as a “core” profit center, he expanded into related hedging and arbitrage areas (the Princeton-Newport Partners business plan again), and hired large numbers of very smart well trained quantitative types from academia. One of his smart hires was Jeff Bezos, who, while researching business opportunities in 1994 for Shaw, got the idea for an online bookstore and left to start Amazon.com.
By August of 1988, it was clear that the government’s investigation of Jay Regan and others in the Princeton office of Princeton-Newport Partners was likely to be serious, protracted and costly. This led us to phase out our STAR venture with Blin and Bender as well as joint ventures with others. It also probably would have destroyed a venture with D.E. Shaw had we elected to proceed. In fact, one of our limited partners, the “fund of funds” Paloma Partners, picked up the Shaw option.
Statistical Arbitrage – Part IV
Life’s turns bring both simplicity and diversification with two new and more powerful approaches to statistical arbitrage
Princeton-Newport Partners began winding down in late 1988 and closed operations in 1989. Amidst the stress, our brains in California were hyperactive and we developed not one but two new and more powerful approaches to statistical arbitrage. But after Princeton-Newport Partners closed I wanted simplicity. We reduced to a small staff and focused on two areas: Japanese warrant hedging1 and investing in other hedge funds. Both went well.
Meanwhile I was sitting on new market neutral statistical arbitrage methods for beating the market that I wasn’t using, and had no immediate plans to use. I expected that continuing innovations by investors using related systems would, as is typical, gradually weaken the power of my methods. During this period, in 1991, I met with John Casti, mathematician, author of several well known science books,2 and a charming and imaginative intellect. He was associated with the Santa Fe Institute and thought it would be mutually beneficial for me to visit the Institute, give some talks, and interact with the distinguished group of people that congregated there. In addition to people like Murray Gell-Mann, Nobel Prize winner in physics, I’d finally meet “maverick” physicists3 Doyne Farmer and Norman Packard. Their roulette adventures were chronicled by Thomas Bass in his book The Eudaemonic Pie. They were key founders of the Institute, and its focus was on their area of expertise, complexity theory. 4 Farmer and Packard had just left the Institute to found The Prediction Company, where they were attempting to conquer the securities markets.
In one of my talks, believing I wouldn’t be using it again, I was going to reveal how statistical arbitrage worked. But as it happened, I didn’t make the trip.
Tales of wondrous returns, a friend’s urging, and a giant investor
Meanwhile, both good friend Jerry Baesel and a former giant investor from Princeton-Newport Partners came to us with tales of extraordinary returns from statistical arbitrage. Among the various operators were D.E. Shaw and company, still other former Morgan Stanley quants whose shops were springing up like dragon’s teeth, and some of my former Princeton-Newport Partners associates. I asked several of these former Morgan Stanley people, then and in subsequent years, if they knew how statistical arbitrage had started at Morgan Stanley. No one did. Only a couple of them had heard rumors of a nameless legendary “discoverer” of the Morgan Stanley statistical arbitrage system, who, presumably, was Bamberger – so thoroughly had recognition for his contribution disappeared.
If our statistical arbitrage system still worked, the giant investor – a multibillion dollar pension and profit sharing plan – was able to take up most or all of the capacity. Note that every stock market system is necessarily limited in the amount of money it can use to produce excess returns. One reason is that buying underpriced securities tends to raise the price, reducing or eliminating the mispricing, and selling short overpriced securities tends to reduce the price, once again reducing or eliminating the mispricing. Thus systems for beating the market are limited in size by the impact of their own trading.
Period of adjustment
By 1991, I had finally simplified to a small staff of just four people. Steve Mizusawa was hedging Japanese warrants, with some theoretical assistance from me. I was working with Steve and also managing a portfolio of hedge funds for myself, with help from Judy McCoy, who also was in charge of tax and financial reporting, helped Steve, and backed up Diane Sawyer, our office manager at the time.
Life was good and I was rich; I had more time to travel, vacation, read and think, and enjoy my family. I was ambivalent about returning to the investment hubbub, so I tried what I thought would be a time-efficient way to cash in on our statistical arbitrage knowledge. I discussed with Steve, who would be crucial in implementing any such venture, how to proceed. I went shopping for a partner to whom we could license our software for royalties.
I contacted Bruce Kovner, a wealthy and successful commodities trader whom I knew from Princeton-Newport days. Kovner had started with the Commodities Corporation in the 1970s, then gone on to run his own commodities hedge fund, eventually making hundreds of millions for himself and more for his investors. On the Forbes list of the 400 wealthiest Americans since 1992, he was estimated to be worth $900 million in 1999.5
Jerry Baesel and I spent an afternoon with Bruce in the 1980s in his Manhattan luxury apartment discussing how he thinks and how he gets his edge in the markets.6 Kovner is a generalist, who sees connections before others do.
About this time he observed that large oil tankers were in such oversupply that the older ones were selling for little more than scrap value. Kovner formed a partnership to buy one. I was one of the limited partners. Here was an interesting hedge. We were partially protected against loss on the downside because we could always sell the tanker for scrap. But we had a substantial upside: Historically, the demand for tankers had fluctuated widely and so had their price. Within a few years, our refurbished 475,000 ton monster, the Empress Des Mers, was profitably plying the world’s sea lanes stuffed with oil. Later the partnership negotiated to purchase the largest tanker of them all, the 500,000 ton Seawise Giant. Unfortunately, while we were in escrow the ship unwisely ventured near Kharg Island in the Persian Gulf and Islamic artillery rendered it unsuitable for delivery to us. The Empress, however, was operating profitably in the twenty-first century and paying me dividends. I liked to think of my part ownership as a 20 foot section just ahead of the forecastle.
The saga of the Empress Des Mers finally ended. A letter dated June 3, 2004 reports:
“Today the Empress ended her many years of service when she was steamed up on a beach in Chittagong, Bangladesh. Tomorrow she will begin to be cut up for scrap. This sad day occurred many years earlier than it should have due to severe changes in the tanker market and international regulations. Nevertheless, the Empress once again performed in the profitable manner that she has so often throughout her career. Scrap prices are at historic highs, which resulted in the Empress fetching almost 23 million.
Since we purchased the Empress almost 18 years ago, the ship has generated approximately 100 million dollars in trading profits and provided a return on investment averaging some 30 per cent annually. This single ship has out performed almost all other tanker companies over the period.”
Kovner referred me to a hedge fund in which he was a major investor, and I made a proposal to the general partner (GP). We would supply the software for a complete operating system and license it to the GP for 15 per cent of the GP’s gross income from the use of the product. I chose gross income to simplify the process of monitoring and verifying correct payment. We would train them and provide continuing counsel. The license fee also declined slowly over time to adjust for improvements they might add and for the obsolescence of the original system. But every time we agreed on a deal, the GP insisted on making yet another change in his favor. After agreeing to some of these, it became apparent that they were endless. My patience at an end, I terminated negotiations.
The economics of haggling
Most of us who have dealt with used-car dealers, with rug merchants, or who have bought and sold real estate, are familiar with a negotiation process perhaps best described as haggling. To illustrate, suppose a house you want is priced at $300,000. You offer $250,000. The seller counters at $290,000. You counter at $265,000, etc. Finally you agree to buy at $275,000. This stylized dance may involve cajolery, trickery and deceit, which you might be familiar with at the used car or rug buying level.
Wouldn’t it be simpler and more satisfying, as Warren Buffett prefers, for the seller to state his price and have the buyer take it or leave it? After all, that’s how it’s done in most stores in the U.S., isn’t it? How could you shop if the prices you compare aren’t firm?
Yet in business deals or “negotiations,” haggling is common, just as it was with the GP who haggled with me. What’s going on here? We’ll address that next time.
1 See Forbes, November 25, 1991, pp. 96-99, “A Three Time Winner,” Risk Arbitrage in the Nikkei put warrant market of 1989-1990. Applied Mathematical Finance 2, 243-271 (1995).
2 Among them, Complexification and Reality Rules: Picturing the World in Mathematics, Vols. I, II.
3 So described by Thomas Bass in his book The Predictors, which tells the story of their attempt to beat the market.
4 See, for example, the books Chaos, the Birth of a New Science by James Gleich and Does God Play Dice: The Mathematics of Chaos by Ian Stewart.
5 Forbes, October 11, 1999, page 352.
6 See Schwager, 1989, pp. 51-83 for a long interview with Kovner.
Statistical Arbitrage – Part V
This issue we continue our discussion of haggling using our house example
Suppose the seller’s real lowest price is $260,000 and that the highest price the buyer is really willing to pay is $290,000. (The seller might find out, for instance, that the buyer will really pay $290,000 by reporting that he has another offer at $289,000, at which point the first buyer offers $290,000.) Thus any price between $260,000 and $290,000 is acceptable to both parties, even though neither party knows this at the time. So $30,000 is “up for grabs.” The objective of the haggle is to capture as much of this $30,000 as possible for one side or the other.
On the other hand, if instead the buyer is only willing to go to $270,000 and the seller’s (secret) lowest price is $280,000, there is no overlap, no price both will accept, and there will be no deal.
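The two cases above come down to simple arithmetic on the parties' secret limits. Here is a minimal sketch in Python; the function names are illustrative assumptions, and the figures are the ones from the house example.

```python
def bargaining_zone(seller_floor, buyer_ceiling):
    """Return (low, high) of prices both sides would accept, or None if no overlap."""
    if buyer_ceiling < seller_floor:
        return None
    return (seller_floor, buyer_ceiling)


def surplus_split(seller_floor, buyer_ceiling, price):
    """Split the amount 'up for grabs' at an agreed price: (seller's share, buyer's share)."""
    zone = bargaining_zone(seller_floor, buyer_ceiling)
    if zone is None or not (zone[0] <= price <= zone[1]):
        raise ValueError("price is not acceptable to both parties")
    return (price - seller_floor, buyer_ceiling - price)


# Seller's real floor is $260,000, buyer's real ceiling is $290,000.
print(bargaining_zone(260_000, 290_000))         # (260000, 290000): $30,000 up for grabs
print(surplus_split(260_000, 290_000, 275_000))  # (15000, 15000) if they settle at $275,000

# Seller's floor is $280,000 but the buyer will only go to $270,000: no deal.
print(bargaining_zone(280_000, 270_000))         # None
```

In practice neither side knows the other's limit, so the haggle is an attempt to discover this zone while giving away as little of it as possible.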
Just such a haggler determined where I lived for 20 years. We had decided to make a local move and had located a spectacular view lot high on a hill in Newport Beach. In the depressed 1979 real estate market it was offered at $435,000. We started at $365,000 and after a series of offers and counter offers, we eventually offered $400,000, which was countered at $410,000. We countered at $405,000, our absolute limit. Rejected. We walked. A few days later the seller relented and offered to meet our $405,000 price. But we didn’t accept. Why not?
At our absolute limit, we were almost indifferent as to whether we did the deal or not. Meanwhile the seller had now alienated us, and we preferred not to have any further dealings with him. Consequently his deal was less attractive and our top price now dropped below $405,000. And, we had begun to consider attractive alternatives. We soon bought a better lot, built a new house, and spent 21 happy years there. The haggler’s lot remained unsold for another decade.
Ironically, when we recently sold this house, we had another example of the losing haggle. After a year on the market we suddenly got two offers the same weekend. We were asking $5,495,000 and expecting to get about $5,000,000. One offer was at $4.6 million and the prospective buyer used his aggressive business partner to open negotiations. The partner’s in-your-face quarrelsome style and nitpicking criticism of the house was designed to beat down the price. He alienated us and our agent. The other offer was for $5 million from an agreeable family who loved the house “as is.” We accepted, upon which the other buyer begged us to reconsider, indicated he would meet or exceed the other price, and wouldn’t use “in-your-face.” Too bad. So they were relegated to being a back-up offer for the next two months in case our buyer dropped out. He didn’t. Lesson: it doesn’t pay to push the other party to their absolute limit. A small extra gain is generally not worth taking the substantial risk the deal will break up.
A Lesson for traders?
Knowing when to haggle for a small extra gain and when not to is valuable for traders. Let’s look at an example that could help you in your trades whether in the market or elsewhere.
In the days of Princeton-Newport one of our head traders used to crow about how, by regularly holding out for an extra eighth or quarter, he saved us large amounts of money in the long run. Here’s the idea. Suppose we want to buy 10,000 shares of Microsoft (MSFT), currently trading at, say, 71 bid for 50,000 shares, and 71 1/4 asked for 10,000 shares. We can pay 71 1/4 now and buy our 10,000 shares. Or, as our trader would do, we can offer to buy our 10,000 shares at 71 1/8 and see if we have any takers. If this works, and it does most of the time, we’ll save $1/8 x 10,000 or $1,250.
This sounds good. Is there any risk? Yes. To see why, notice that we fail to save $1/8 per share only when the stock always trades at 71 1/4 or higher for however long we’re trying to buy. All those stocks we miss out on have moved higher and some of them have run away to the upside. Those runaways would have given us windfall profits. Put simply, you might scalp $1/8 twenty times and lose a $10 gain once. Do you like that arithmetic? I don’t.
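To make that arithmetic concrete, here is a small Python sketch. The twenty fills and the single missed $10 runaway are the illustrative numbers from the text, not measured trading results, and the function names are assumptions made for this example.

```python
from fractions import Fraction


def scalping_net_per_share(saving, fills, missed_gain, misses):
    """Net result per share: savings on filled orders minus gains forgone on missed ones."""
    return saving * fills - missed_gain * misses


def breakeven_missed_gain(saving, fills, misses=1):
    """Largest average missed gain per share at which scalping still breaks even."""
    return saving * fills / misses


eighth = Fraction(1, 8)

# Twenty successful scalps of $1/8 against one missed $10 runaway.
net = scalping_net_per_share(eighth, 20, Fraction(10), 1)
print(float(net))                                 # -7.5 dollars per share

# With twenty fills per miss, scalping only pays if the missed move averages under $2.50.
print(float(breakeven_missed_gain(eighth, 20)))   # 2.5
```

The point of the sketch is only that the answer depends on two quantities a trader has to measure: how often the limit order fails to fill, and how far the stock has run when it does.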
I asked our Princeton-Newport trader how he could tell whether his scalping profits offset his losses from missed opportunities. He could not make a case for what he was doing. I asked other traders around the street the same question and didn’t find anyone who could clearly show that they gained more than they lost by scalping for eighths. But financial theory can give us some help here.
Markets are basic to modern economics and trading is the fundamental activity. Modern financial theorists have, therefore, intensively analyzed how markets work, both by analyzing data and developing theories to explain what they observe. They note that trades are initiated for a variety of reasons. Some of the initiators have no edge – no special advantageous information – probably including most of the people who do think they have an edge. Examples of these so-called noise traders might include an index fund selling a company because it was dropped from the index, or buying a stock that was added to its index, or, an estate liquidating to pay taxes, or a mutual fund buying or selling in response to cash additions and withdrawals. Of course, to the extent some worthwhile information is used in any of these trades, our examples are imperfect.
The other type of trade is initiated by traders who do have an edge. Examples might be the illegal insider trades made famous by the prosecutions of Ivan Boesky and others in the 1980s, and which continue to this day, or the legal trades made by those who act first on public information – an earnings announcement, a takeover, an interest rate change, etc.
Does all this really matter? What’s an eighth of a dollar a share? For Ridgeline Partners, trading one and a half billion shares a year, it can add up. As President Lyndon Johnson once said about congressional spending, a billion dollars here, a billion dollars there, and pretty soon you’re talking about some real money.
Satisficers versus maximizers
The behavior of the hagglers and the traders reminds me of the behavioral psychology distinction between two extremes on a continuum of types: satisficers and maximizers. When a maximizer goes shopping, looks for a handyman, buys gas, plans a trip, etc. he or she searches for the best possible deal. Time and effort don’t matter much. If they miss the very best deal they feel regret and stress. On the other hand, the satisficer factors in the costs of searching and decision-making, as well as the risk of losing a near optimal opportunity and perhaps never finding anything as good again. This is reminiscent of the so-called secretary problem in mathematics.
Assume that you will interview n candidates, from which you can choose only one. You must consider them one at a time and having once rejected a candidate you cannot reconsider. Assuming that, ex post, they have desirability ranks from 1 to n with n being best, what ex ante strategy maximizes the probability of choosing the best? The well known answer is that after seeing a fraction f(n) (which tends to 1/e as n increases) of the candidates you should choose the next one (if any) whose rank exceeds those already seen. Instead of trying for the very best a satisficer might choose to solve an alternate version of the problem: maximize the expected rank (or, more generally, utility) of his choice. The secretary problem has many such variations and has been intensively studied by mathematicians. It is part of the theory of optimal stopping and a google search on “secretary problem” (exact phrase) will turn up many interesting articles.
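The cutoff rule just described is easy to check numerically. The Python sketch below is an illustration, not part of the original analysis: it computes the classical exact success probability for skipping the first k of n candidates, (k/n) times the sum of 1/i for i from k to n-1, and compares it with a simulation. The choice of n = 100 and the function names are assumptions made for the example.

```python
import math
import random


def choose_with_cutoff(ranks, k):
    """Skip the first k candidates, then take the first one better than all seen so far.

    Returns the chosen rank, or None if no later candidate beats the first k.
    """
    best_seen = max(ranks[:k], default=0)
    for r in ranks[k:]:
        if r > best_seen:
            return r
    return None


def exact_success_probability(n, k):
    """Probability that the cutoff rule picks the best of n candidates."""
    if k == 0:
        return 1.0 / n  # taking the first candidate wins only if they are the best
    return (k / n) * sum(1.0 / i for i in range(k, n))


def simulated_success_rate(n, k, trials=20000, seed=0):
    """Estimate the same probability by shuffling ranks 1..n (n is best)."""
    rng = random.Random(seed)
    ranks = list(range(1, n + 1))
    hits = 0
    for _ in range(trials):
        rng.shuffle(ranks)
        if choose_with_cutoff(ranks, k) == n:
            hits += 1
    return hits / trials


n = 100
k = round(n / math.e)  # skip roughly a fraction 1/e of the candidates
print(k, exact_success_probability(n, k), simulated_success_rate(n, k))
```

Both numbers come out close to 1/e, about 0.37. A satisficer's variant would replace the success test with the average rank of the candidate chosen.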
As mentioned earlier, I had discontinued discussions of a joint venture in statistical arbitrage with an endlessly haggling hedge fund general partner. Meanwhile, another group was interested. They were a financial engineering firm of about twenty people and we knew the principals from Princeton-Newport days. I proposed that we jointly build a new Princeton-Newport style hedge fund. We would provide the investors and the investment capital, the roadmap, the key software, and the overall direction and guidance. For this we would own a permanent share. Their organization would implement everything with our help, and eventually they would be running operations with our guidance and participation in decisions.
My business plan, based on the proven Princeton-Newport model, was to start with statistical arbitrage as a core profit center. After establishing that foundation for the business we would add as the next profit center convertible bond, warrant, option and other derivative hedging for which we still had cutting edge experience and computer software. Then we would expand into other areas I knew how to implement.
The venture began auspiciously. Our statistical arbitrage software ran smoothly, first in simulation and then with real money, starting in August of 1992 with a managed account for a large institutional investor. We had an enormous idea backlog which we eagerly anticipated leveraging through our associates to make the model more powerful. Months passed and the pace of research and development seemed negligible. We had continual trips, meetings, memos and telephone calls. My beautiful ideas were rotting on the vine for lack of follow-through. It was clear that if I wanted significant research and development we would have to do it “in house.”
Steve and I hired an outstanding C++ programmer and as we worked together, research output in our office spurted. It was easier and quicker to do everything ourselves than to continue with our associates. We licensed them to use the 1994 version of our software and went our separate business ways. Meanwhile, my friend Frank Meyer (founder of the Glenwood fund of funds, later merged into Mann-Glenwood), along with the young quantitative investment genius Ken Griffin, whom Meyer had discovered trading from his Harvard dorm, was building a hedge fund to which I offered early advice and encouragement, using the same business plan I had proposed to our ex-associates. In ten years Griffin’s market neutral Princeton-Newport style hedge fund operations – The Citadel Group – grew to $4.3 billion under management with annualized returns averaging more than 30 per cent and some 320 employees. It now stands at $11 billion with over 800 employees, one of the most valuable hedge fund franchises in the world. Our ex-associates had blown off the chance of a lifetime.
Statistical Arbitrage – Part VI
The launch of Ridgeline Partners brings the author back to the hedge fund business
Time is the ultimate budget constraint. – Jerome B. Baesel to the author.
Why was I enticed back into the hedge fund business? Because we now had a large managed account from a savvy client that we knew well and for whom we were already trading successfully. I also had my own money to manage and this product seemed equal to or better than the outside hedge fund managers I could join. Best of all, it was intellectual fun to generate ideas to beat the market. So in August of 1994, in addition to the large institutional account we managed from August 1992, we launched Ridgeline Partners, a statistical arbitrage hedge fund. After a slow start in 1994 and 1995, the pretax return to limited partners (i.e. after all fees and expenses) annualized at 18 per cent for its eight and a quarter years of operation.
We begin by presenting results for the large managed account. This was for the pension and profit sharing plan of a Fortune 100 company which for confidentiality we call XYZ. The first table, XYZ Performance Summary, gives basic statistics for 2,544 trading days, just over 10 years. These results are before leverage and before fees. We’ll see later that the results for the investor were better because the gain from leverage more than covered fees.
The annualized return of 9.93 per cent and the annualized standard deviation of 16.91 per cent for the S & P 500 during this period are not far from its long-term values. The unlevered annualized return for XYZ before fees, at 21.10 per cent, is about twice that of the S & P and the standard deviation of 7.11 per cent is 60 per cent less. The μ/σ ratio (annualized return divided by annualized standard deviation) for XYZ at 2.97 is five times that of the S & P. Estimating 5 per cent as the average 3 month T-bill rate over the period, the corresponding Sharpe ratios are 0.29 for the S & P and 2.26 for XYZ.
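These ratios can be reproduced directly from the figures quoted above. The short Python sketch below divides annualized return by annualized standard deviation, and for the Sharpe ratio first subtracts the 5 per cent T-bill estimate; the function names are assumptions made for this illustration.

```python
def return_to_risk(annual_return, annual_stdev):
    """Annualized return divided by annualized standard deviation."""
    return annual_return / annual_stdev


def sharpe_ratio(annual_return, annual_stdev, risk_free):
    """Excess return over the risk-free rate per unit of standard deviation."""
    return (annual_return - risk_free) / annual_stdev


T_BILL = 5.0  # per cent, the estimate used in the text

# (annualized return %, annualized standard deviation %) as quoted in the text
figures = {"S&P 500": (9.93, 16.91), "XYZ": (21.10, 7.11)}

for name, (mu, sigma) in figures.items():
    print(name,
          round(return_to_risk(mu, sigma), 2),
          round(sharpe_ratio(mu, sigma, T_BILL), 2))
# S&P 500 0.59 0.29
# XYZ 2.97 2.26
```

The XYZ return-to-risk ratio of 2.97 against 0.59 for the index is the "five times" mentioned above.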
Regressing the daily returns of XYZ on those for the S&P 500 shows that
R(XYZ) = 0.00074 + 0.05149 R(S&P), where R(·) is the corresponding daily return. Thus alpha was 7.4 basis points, or 0.074 per cent, per day, which is about exp(0.00074 × 252) − 1 = 20.5 per cent annualized.
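As a check on the annualization, the following Python sketch compounds the daily alpha continuously over 252 trading days and evaluates the fitted line for a sample market move. The coefficients are those reported in the regression; the 2 per cent market drop is a hypothetical input chosen for illustration.

```python
import math

ALPHA_DAILY = 0.00074   # regression intercept: 7.4 basis points per day
BETA = 0.05149          # regression slope against the S&P 500
TRADING_DAYS = 252


def annualize_daily(daily_rate, trading_days=TRADING_DAYS):
    """Annual rate implied by a daily rate compounded continuously."""
    return math.exp(daily_rate * trading_days) - 1


def expected_xyz_return(sp_daily_return):
    """Fitted daily XYZ return for a given daily S&P 500 return."""
    return ALPHA_DAILY + BETA * sp_daily_return


print(round(100 * annualize_daily(ALPHA_DAILY), 1))   # 20.5 (per cent per year)

# Hypothetical day on which the S&P 500 falls 2 per cent:
print(expected_xyz_return(-0.02))                     # about -0.00029, i.e. -0.03 per cent
```

With a beta this small, a 2 per cent market decline moves the fitted XYZ return by only about a tenth of a per cent, which is the sense in which the portfolio was nearly market neutral.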
The next graph, XYZ Daily Performance, shows the daily fluctuations in portfolio value. The heavy black horizontal line is the mean daily return of 0.074 per cent.
Outliers, such as positive fluctuations greater than 1.5 per cent and negative fluctuations greater than 1 per cent, indicate a distinct increase in variability from about day 1,500 until about day 2,400 (and perhaps beyond?). With about 252 trading days per year, this corresponds to the period from about August 1, 1998 through the middle of February, 2002. The LTCM disaster occurred at the start and the dot com collapse and 9/11 occurred in the last couple of years of the period.
The graph “XYZ Performance Comparison I” shows the cumulative wealth relatives for XYZ (blue), the S & P 500 (magenta) and T-bills + 2 per cent (yellow). From about day 600 (the end of 1994) until about day 2,000 (about August 1, 2000) we see one of the great bull markets of all time. Over about 5.6 years the S & P 500 exploded at a 26 per cent annual compound rate, a cumulative wealth relative of about 3.7.
However the customary arithmetic graph exaggerates the triumph of XYZ over the S & P and it is instructive to plot the logarithm of the cumulative wealth relatives, which we do in XYZ Performance Comparison II. In this graph straight lines correspond to constant compound growth rates with the growth rate proportional to the slope of the line.
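The reason straight lines appear on the logarithmic plot can be seen in a few lines of Python. This is a sketch under the simplifying assumption of a constant annual compound rate; the 26 per cent rate and 5.6 years are the S & P 500 bull market figures quoted above, and the function names are illustrative.

```python
import math


def wealth_relative(annual_rate, years):
    """Ending wealth per dollar invested at a constant annual compound rate."""
    return (1 + annual_rate) ** years


def log_slope(annual_rate):
    """Slope per year of log(wealth relative) for a constant compound rate."""
    return math.log(1 + annual_rate)


# S & P 500 during the bull market: 26 per cent a year for about 5.6 years.
print(wealth_relative(0.26, 5.6))   # between 3.6 and 3.7, in line with "about 3.7"

# On a log scale, each year adds the same increment, so the plot is a straight line.
for year in range(1, 4):
    step = math.log(wealth_relative(0.26, year)) - math.log(wealth_relative(0.26, year - 1))
    print(year, round(step, 6), round(log_slope(0.26), 6))
```

On an arithmetic scale the same constant rate produces an ever steeper curve, which is why the later years of a fast compounding series dominate the picture.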
Laying a straight edge on the graph of the log of XYZ’s wealth relative, we see what appear to be two major “epochs.” The first, from day one (August 12, 1992) to about day 1,550 (early October, 1998) shows a nearly constant compound growth rate. The second epoch, from about day 1,550 until day 2,544 (September 13, 2002) has a higher overall rate of return, including a remarkable six month spurt just after the LTCM disaster. After the six month spurt (the last quarter of 1998 and the first quarter of 1999), the growth rate returns for the rest of the time to about what it was in the first epoch. However the variability around the trend is noticeably greater, as we’ve already seen in the “Daily Performance” chart.
The explanation for the greater variability might be from any of several causes. Among them may be the election of George W. Bush (the outcome was delayed and disputed until December 2000). Preceded by uncertainty settled around day 2,080, we have an economic sea change from budget surpluses to increased spending and massive deficits, caused by the tax rate reductions, the collapse of the dot com bubble, 9/11 and two wars. Also we had been continually revising, and hopefully improving, our stock selection algorithms. The choices we made may have contributed to increased variability, in hopes of higher expected returns.
The next graph, “RIDG Performance Comparison” shows us the results for Ridgeline Partners and a comparison with the corresponding XYZ chart gives us much new information.
First, the green line plots the log of the cumulative wealth relative as of each month end, as received by investors. Thus it incorporates the gains (in this case) from leverage and the reductions from the general partner’s fees. The overall result was that the increase in performance from leverage more than covered all the fees. These fees were 1 per cent per year, paid quarterly in advance, plus 20 per cent of profits paid only on “new high water.” The general partner also chose to reduce its fee on occasion after periods when performance was mediocre.
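The mechanics of such a fee schedule can be sketched in Python. This is an illustration under stated assumptions, not Ridgeline's actual accounting: the 1 per cent annual fee is taken as 0.25 per cent of opening capital each quarter, the 20 per cent profit share is assessed quarterly on gains above the previous high water mark, and the quarterly gross returns are hypothetical.

```python
def apply_quarter(nav, high_water, gross_return,
                  mgmt_rate=0.01, incentive_rate=0.20):
    """Advance one quarter; return (new_nav, new_high_water, fees_paid).

    Assumes the management fee is taken in advance on opening capital and the
    profit share is charged only on gains above the prior high water mark.
    """
    mgmt_fee = nav * mgmt_rate / 4
    nav_after = (nav - mgmt_fee) * (1 + gross_return)
    incentive_fee = incentive_rate * max(0.0, nav_after - high_water)
    nav_after -= incentive_fee
    return nav_after, max(high_water, nav_after), mgmt_fee + incentive_fee


nav, high_water = 100.0, 100.0
for gross in (0.10, -0.05, 0.03):   # hypothetical quarterly returns before fees
    nav, high_water, fees = apply_quarter(nav, high_water, gross)
    print(round(nav, 2), round(high_water, 2), round(fees, 2))
# 107.78 107.78 2.2
# 102.14 107.78 0.27
# 104.94 107.78 0.26
```

In the second and third quarters only the management fee is paid, because capital has not yet regained its earlier peak; the profit share resumes only after a new high is reached.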
Many of today’s hedge fund managers would consider our fee policies economically irrational. Here’s why: We voluntarily gave back or reduced fees during some periods when we felt disappointed in our recent past performance. This amounted in total to more than a million dollars. We also had a waiting list during most of our history. Ridgeline was closed a large part of the time and even current partners were often restricted from adding capital. There were also occasions where we gave capital back to partners in order to reduce our size. As other hedge fund managers have demonstrated, under these conditions of excess demand we could have chosen to increase our fees either by raising our percentage of the profits or taking in “too much” capital and thereby driving down the net percentage return to limited partners. These strategies to capture nearly all the alpha for the general partner work according to economic theory and seem to also work in practice. Instead I prefer to treat limited partners as I would wish to be treated when I’m a limited partner.
Over the ten years of our latest statistical arbitrage operation, we ran several hundred million dollars using only 3.5 “full time equivalents” from our office. It was a highly automated, lean and profitable operation. The “shrink wrapped” software sits on our shelf and ought to have a tag saying, “add people and data to reactivate.” Why close down? Perhaps the most important reason for me was the increasing marginal value of (expected) time expended had exceeded the decreasing marginal value of (expected) money to be gained.