The Journal of Finance, Volume 52, Issue 1, p. 35-55

The Limits of Arbitrage

Andrei Shleifer and Robert W. Vishny

Shleifer is from Harvard University and Vishny is from The University of Chicago. Nancy Zimmerman and Gabe Sunshine have helped us to understand arbitrage. We thank Yacine Aït Sahalia, Douglas Diamond, Oliver Hart, Steve Kaplan, Raghu Rajan, Jésus Saa-Requejo, Luigi Zingales, Jeff Zwiebel, and especially Matthew Ellman, Gustavo Nombela, René Stulz, and an anonymous referee for helpful comments.

First published: 18 April 2012 https://doi.org/10.1111/j.1540-6261.1997.tb03807.x Citations: 2,738

ABSTRACT

Textbook arbitrage in financial markets requires no capital and entails no risk. In reality, almost all arbitrage requires capital, and is typically risky.
Moreover, professional arbitrage is conducted by a relatively small number of highly specialized investors using other people's capital. Such professional arbitrage has a number of interesting implications for security pricing, including the possibility that arbitrage becomes ineffective in extreme circumstances, when prices diverge far from fundamental values. The model also suggests where anomalies in financial markets are likely to appear, and why arbitrage fails to eliminate them.

One of the fundamental concepts in finance is arbitrage, defined as "the simultaneous purchase and sale of the same, or essentially similar, security in two different markets for advantageously different prices" (Sharpe and Alexander (1990)). Theoretically speaking, such arbitrage requires no capital and entails no risk. When an arbitrageur buys a cheaper security and sells a more expensive one, his net future cash flows are zero for sure, and he gets his profits up front. Arbitrage plays a critical role in the analysis of securities markets, because its effect is to bring prices to fundamental values and to keep markets efficient. For this reason, it is extremely important to understand how well this textbook description of arbitrage approximates reality. This article argues that the textbook description does not describe realistic arbitrage trades, and, moreover, the discrepancies become particularly important when arbitrageurs manage other people's money.

Even the simplest realistic arbitrages are more complex than the textbook definition suggests. Consider the simple case of two Bund futures contracts to deliver DM250,000 in face value of German bonds at time T, one traded in London on LIFFE and the other in Frankfurt on DTB. Suppose for the moment, counterfactually, that these contracts are exactly the same. Suppose finally that at some point in time t the first contract sells for DM240,000 and the second for DM245,000.
An arbitrageur in this situation would sell a futures contract in Frankfurt and buy one in London, recognizing that at time T he is perfectly hedged. To do so, at time t, he would have to put up some good faith money, namely DM3,000 in London and DM3,500 in Frankfurt, leading to a net cash outflow of DM6,500. However, he does not get the DM5,000 difference in contract prices at the time he puts on the trade. Suppose that prices of the two contracts both converge to DM242,500 just after t, as the market returns to efficiency. In this case, the arbitrageur would immediately collect DM2,500 from each exchange, which would simultaneously charge the counterparties for their losses. The arbitrageur can then close out his position and get back his good faith money as well. In this near-textbook case, the arbitrageur required only DM6,500 of capital and collected his profits at some point in time between t and T.

Even in this simplest example, the arbitrageur need not be so lucky. Suppose that soon after t, the price of the futures contract in Frankfurt rises to DM250,000, thus moving further away from the price in London, which stays at DM240,000. At this point, the Frankfurt exchange must charge the arbitrageur DM5,000 to pay to his counterparty. Even if eventually the prices of the two contracts converge and the arbitrageur makes money, in the short run he loses money and needs more capital. The model of capital-free arbitrage simply does not apply. If the arbitrageur has deep enough pockets to always access this capital, he still makes money with probability one. But if he does not, he may run out of money and have to liquidate his position at a loss.

In reality, the situation is more complicated since the two Bund contracts have somewhat different trading hours, settlement dates and delivery terms. It may easily happen that the arbitrageur has to find the money to buy bonds so that he can deliver them in Frankfurt at time T.
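The cash flows in the idealized two-contract Bund example above can be tallied in a few lines. The following is only an illustrative sketch: the function and variable names are hypothetical, the prices and good faith deposits are the ones quoted in the example, and the total capital figure after the divergence assumes that the initial deposits stay posted unchanged, which the text does not state.

```python
# Illustrative sketch of the Bund futures example; all names are hypothetical.
LONDON_MARGIN = 3_000      # good faith money posted in London (DM)
FRANKFURT_MARGIN = 3_500   # good faith money posted in Frankfurt (DM)


def mark_to_market(long_entry, long_now, short_entry, short_now):
    """Return (gain on the long London leg, gain on the short Frankfurt leg)."""
    long_pnl = long_now - long_entry      # long position gains when price rises
    short_pnl = short_entry - short_now   # short position gains when price falls
    return long_pnl, short_pnl


# Capital tied up when the trade is put on at time t.
initial_outlay = LONDON_MARGIN + FRANKFURT_MARGIN

# Case 1: both contracts converge to DM242,500 just after t.
converge = mark_to_market(240_000, 242_500, 245_000, 242_500)

# Case 2: Frankfurt rises to DM250,000 while London stays at DM240,000.
diverge = mark_to_market(240_000, 240_000, 245_000, 250_000)

# Assumption: the original deposits remain posted, so any net loss
# must be covered with additional cash.
capital_after_divergence = initial_outlay + max(0, -sum(diverge))

print(initial_outlay, converge, diverge, capital_after_divergence)
```

In the convergence case the arbitrageur collects DM2,500 on each leg; in the divergence case the short leg alone loses DM5,000, which is the short-run capital need the example describes.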
Moreover, if prices are moving rapidly, the value of bonds he delivers and the value of bonds delivered to him may differ, exposing the arbitrageur to additional risks of losses. Even this simplest trade then becomes a case of what is known as risk arbitrage. In risk arbitrage, an arbitrageur does not make money with probability one, and may need substantial amounts of capital to both execute his trades and cover his losses. Most real world arbitrage trades in bond and equity markets are examples of risk arbitrage in this sense. Unlike in the textbook model, such arbitrage is risky and requires capital. One way around these concerns is to imagine a market with a very large number of tiny arbitrageurs, each taking an infinitesimal position against the mispricing in a variety of markets. Because their positions are so small, capital constraints are not binding and arbitrageurs are effectively risk neutral toward each trade. Their collective actions, however, drive prices toward fundamental values. This, essentially, is the model of arbitrage implicit in Fama's (1965) classic analysis of efficient markets and in models such as CAPM (Sharpe (1964)) and APT (Ross (1976)). The trouble with this approach is that the millions of little traders are typically not the ones who have the knowledge and information to engage in arbitrage. More commonly, arbitrage is conducted by relatively few professional, highly specialized investors who combine their knowledge with resources of outside investors to take large positions. The fundamental feature of such arbitrage is that brains and resources are separated by an agency relationship. The money comes from wealthy individuals, banks, endowments, and other investors with only a limited knowledge of individual markets, and is invested by arbitrageurs with highly specialized knowledge of these markets. In this article, we examine such arbitrage and its effectiveness in achieving market efficiency. 
In particular, the implications of the fact that arbitrage—whether it is ultimately risk-free or risky—generally requires capital become extremely important in the agency context. In models without agency problems, arbitrageurs are generally more aggressive when prices move further from fundamental values (see Grossman and Miller (1988), De Long et al. (1990), Campbell and Kyle (1993)). In our Bund example above, an arbitrageur would in general increase his positions if London and Frankfurt contract prices move further out of line, as long as he has the capital. When the arbitrageur manages other people's money, however, and these people do not know or understand exactly what he is doing, they will only observe him losing money when futures prices in London and Frankfurt diverge. They may therefore infer from this loss that the arbitrageur is not as competent as they previously thought, refuse to provide him with more capital, and even withdraw some of the capital—even though the expected return from the trade has increased. We refer to the phenomenon of responsiveness of funds under management to past returns as performance based arbitrage. Unlike arbitrageurs using their own money, who allocate funds based on expected returns from trades, investors may rationally allocate money based on past returns of arbitrageurs. When arbitrage requires capital, arbitrageurs can become most constrained when they have the best opportunities, i.e., when the mispricing they have bet against gets even worse. Moreover, the fear of this scenario would make them more cautious when they put on their initial trades, and hence less effective in bringing about market efficiency. This article argues that this feature of arbitrage can significantly limit its effectiveness in achieving market efficiency. We show that performance-based arbitrage is particularly ineffective in extreme circumstances, where prices are significantly out of line and arbitrageurs are fully invested. 
In these circumstances, arbitrageurs might bail out of the market when their participation is most needed. Performance based arbitrage, then, is even more limited than arbitrage described in earlier models of inefficient markets, such as Grossman and Miller (1988), De Long et al. (1990), and Campbell and Kyle (1993). Ours is obviously not the first study of the consequences of delegated portfolio management. Early articles in this area include Allen (1990) and Bhattacharya-Pfleiderer (1985). Scharfstein and Stein (1990) model herding by money managers operating on incentive contracts. Lakonishok, Shleifer, Thaler, and Vishny (1991) and Chevalier and Ellison (1995) consider the possibility that money managers "window dress" their portfolios to impress investors. In two interesting recent articles, Allen and Gorton (1993) and Dow and Gorton (1994) show how money managers can churn assets to mislead their investors, and how such churning can sustain inefficient asset prices. Unlike this work, our article does not focus as much on the distortions in the behavior of arbitrageurs, as on their limited effectiveness in bringing prices to fundamental values. The next section of the article presents a very simple model that illustrates the mechanics of arbitrage. For simplicity, our model focuses on the case where mispricing may deepen in the short run, even though there is no long run fundamental risk in the trade. We thus focus on a case that is closest to pure arbitrage, as opposed to risk arbitrage. Section II establishes the main results of the article, including our results on the effectiveness of arbitrage in extreme circumstances when prices are very far from fundamentals. Section III explores the performance-based arbitrage assumption in more detail. In section IV, we examine some empirical implications of the model. 
In particular, we extend the logic of the model to the more realistic case of risk arbitrage, rather than the pure arbitrage case modeled in the article. We first ask what are the characteristics of markets in which we expect risk arbitrage resources to be concentrated. We then analyze return predictability and pricing anomalies more generally. Section V concludes.

I. An Agency Model of Limited Arbitrage

The structure of the model follows Shleifer and Vishny (1990). We focus on the market for a specific asset, in which we assume there are three types of participants: noise traders, arbitrageurs, and investors in arbitrage funds who do not trade on their own. Arbitrageurs specialize in trading only in this market, whereas investors allocate funds between arbitrageurs operating in both this and many other markets. The fundamental value of the asset is $V$, which arbitrageurs, but not their investors, know. There are three time periods: 1, 2, and 3. At time 3, the value $V$ becomes known to arbitrageurs and noise traders, and hence the price is equal to that value. Since the price is equal to $V$ at $t = 3$ for sure, there is no long run fundamental risk in this trade (this is not risk arbitrage). For $t = 1, 2$, the price of the asset at time $t$ is $p_t$. For concreteness, we only consider pessimistic noise traders. In each of periods 1 and 2, noise traders may experience a pessimism shock $S_t$, which generates for them, in the aggregate, the demand for the asset given by:

$$QN(t) = [V - S_t] / p_t. \tag{1}$$

At time $t = 1$, the first period noise trader shock, $S_1$, is known to arbitrageurs, but the second period noise trader shock is uncertain. In particular, there is some chance that $S_2 > S_1$, i.e., that noise trader misperceptions deepen before they correct at $t = 3$. De Long et al. (1990) stressed the importance of such noise trader risk for the analysis of arbitrage. Both arbitrageurs and their investors are fully rational.
Risk-neutral arbitrageurs take positions against the mispricing generated by the noise traders. Each period, arbitrageurs have cumulative resources under management (including their borrowing capacity) given by $F_t$. These resources are limited, for reasons we describe below. We assume that $F_1$ is exogenously given, and specify the determination of $F_2$ below. At time $t = 2$, the price of the asset either recovers to $V$, or it does not. If it recovers, arbitrageurs invest in cash. If noise traders continue to be confused, then arbitrageurs want to invest all of $F_2$ in the underpriced asset, since its price rises to $V$ at $t = 3$ for sure. In this case, the arbitrageurs' demand for the asset is $QA(2) = F_2 / p_2$ and, since the aggregate demand for the asset must equal the unit supply, the price is given by:

$$p_2 = V - S_2 + F_2. \tag{2}$$

We assume that $F_2 < S_2$, so the arbitrage resources are not sufficient to bring the period 2 price to fundamental value, unless of course noise trader misperceptions have corrected anyway. In period 1, arbitrageurs do not necessarily want to invest all of $F_1$ in the asset. They might want to keep some of the money in cash in case the asset becomes even more underpriced at $t = 2$, so they could invest more in that asset. Accordingly, denote by $D_1$ the amount that arbitrageurs invest in the asset at $t = 1$. In this case, $QA(1) = D_1 / p_1$, and

$$p_1 = V - S_1 + D_1. \tag{3}$$

We again assume that, in the range of parameter values we are focusing on, arbitrage resources are not sufficient to bring prices all the way to fundamental values, i.e., $F_1 < S_1$.

To complete the description of the model, we need to specify the organization of the arbitrage industry and the relationship between arbitrageurs and their investors, which determines $F_2$. Recall that we are focusing on a particular narrow market segment in which a given set of arbitrageurs specialize. A "segment" here should be interpreted as a particular arbitrage strategy.
We assume that there are many such segments and that within each segment there are many arbitrageurs, so that no arbitrageur can affect asset prices in a segment. For simplicity, we can think of $T$ investors each with one dollar available for investment with arbitrageurs. We are concerned with the aggregate amount $F_2 \ll T$ that is invested with the arbitrageurs in a particular segment. Arbitrageurs compete in the price they charge for their services. For simplicity, we assume constant marginal cost per dollar invested, such that all arbitrageurs in all segments have the same marginal cost. We also assume that each arbitrageur has at least one competitor who is viewed as a perfect substitute, so that Bertrand competition drives price to marginal cost. Each of the $T$ risk-neutral investors allocates his $1 investment to maximize expected consumer surplus, i.e., the difference between the expected return on his dollar and the price charged by the arbitrageur. Investors are Bayesians, who have prior beliefs about the expected return of each arbitrageur. Since prices are equal, an investor gives his dollar to the arbitrageur with the highest expected return according to his beliefs. Different investors hold different beliefs about various arbitrageurs' abilities, so one arbitrageur does not end up with all the funds. The market share of each arbitrageur is just the total fraction of investors who believe that he has the highest expected return. The total share of money allocated to a given segment is just the sum of these market shares across all arbitrageurs in the segment. Importantly, we assume that arbitrageurs across many segments have, on average, earned high enough returns to convince investors to invest with them rather than to index.¹

The key remaining question is how investors update their beliefs about the future expected returns of an arbitrageur.
We assume that investors have no information about the structure of the model determining asset prices in any segment. In particular, they do not know the trading strategy employed by any arbitrageur. This assumption is meant to capture the idea that arbitrage strategies are difficult to understand, and a lot of specialized knowledge is needed for investors to evaluate them. In part, this is because arbitrageurs do not share all their knowledge with investors, and cultivate secrecy to protect their knowledge from imitation. Even if the investors were told more about what arbitrageurs were doing, they would have a difficult time deciding whether what they heard was true. Implicitly, we are assuming that the underlying structural model is sufficiently nonstationary and high dimensional that investors are unable to infer the underlying structure of the model from past returns data. As a result, they only use simple updating rules based on past performance. In particular, investors are assumed to form posterior beliefs about future returns of the arbitrageur based only on their prior and any observations of his arbitrage returns.

Under these informational assumptions, individual arbitrageurs who experience relatively poor returns in a given period lose market share to those with better returns. Moreover, since all arbitrageurs in a given segment are taking the same positions, they all attract or lose investors simultaneously depending on the performance of their common arbitrage strategy. Specifically, investors' aggregate supply of funds to the arbitrageurs in a particular segment at time 2 is an increasing function of arbitrageurs' gross return between time 1 and time 2 (call this performance-based arbitrage or PBA).
Denoting this function by $G$, and recognizing that the return on the asset is given by $p_2 / p_1$, the arbitrageurs' supply of funds at $t = 2$ is given by:

$$F_2 = F_1 \cdot G\left\{ \frac{D_1}{F_1} \cdot \frac{p_2}{p_1} + \frac{F_1 - D_1}{F_1} \right\}, \quad \text{with } G(1) = 1,\ G' \ge 1,\ \text{and } G'' \le 0. \tag{4}$$

If arbitrageurs do as well as some benchmark given by performance of arbitrageurs in other markets, which for simplicity we assume to be zero return, they neither gain nor lose funds under management. However, they gain (lose) funds if they outperform (underperform) that benchmark. Because of the extremely poor quality of investors' information, past performance of arbitrageurs completely determines the resources they get to manage, regardless of the actual opportunities available in their market.

The responsiveness of funds under management to past performance (as measured by $G'$) is the solution to a signal extraction problem in which investors are trying to ascribe an arbitrageur's poor performance to one of three causes: 1) a random error term, 2) a deepening of noise trader sentiment (bad luck), and 3) inferior ability. High cross-sectional variation in ability across arbitrageurs will tend to increase the responsiveness of invested funds to past performance. On the other hand, if the variance of the noise trader sentiment term is high relative to the variation in (unobserved) ability, this will tend to decrease the responsiveness to past performance. In the limit, if ability is known or does not vary across arbitrageurs, poor performance could be ascribed only to a deepening of the noise trader shock (or a pure noise term), which would only increase the investor's estimate of the arbitrageur's future return.
The seemingly perverse behavior of taking money away from an arbitrageur after noise trader sentiment deepens, i.e., precisely when his expected return is greatest, is a rational response to the problem of trying to infer the arbitrageur's (unobserved) ability and future opportunities jointly from past returns.

Since our results do not rely on the concavity of the $G$ function, we focus on a linear $G$, given by

$$G(x) = ax + 1 - a, \quad \text{with } a \ge 1, \tag{5}$$

where $x$ is the arbitrageur's gross return. In this case, equation (4) becomes:

$$F_2 = a\left\{ D_1 \cdot \frac{p_2}{p_1} + (F_1 - D_1) \right\} + (1 - a) F_1 = F_1 - a D_1 \left( 1 - \frac{p_2}{p_1} \right). \tag{6}$$

With this functional form, if $p_2 = p_1$, i.e., the arbitrageur earns a zero net return, he neither gains nor loses funds under management. If $p_2 > p_1$, he gains funds and if $p_2 < p_1$, he loses funds. Note also that the higher is $a$, the more sensitive are the resources under management to past performance. The case of $a = 1$ corresponds to the arbitrageur not getting any more money when he loses some, whereas if $a > 1$, funds are actually withdrawn in response to poor performance.

One could in principle imagine more complicated incentive contracts that would allow arbitrageurs to signal their opportunities or abilities and attract funds based not just on past performance. For example, arbitrageurs who feel that they have superior investment opportunities might try to offer investors contracts that pay arbitrageurs a fixed price below marginal cost and a share of the upside. That is, if, at a particular point of time, arbitrageurs believe that they can earn extremely high returns with a high probability (as happens artificially at $t = 2$ in our model), they can try to attract investors by partially insuring them against further losses. We do not consider such "separating" contracts in our model, since they are unlikely to emerge in equilibrium under plausible circumstances.
First, with limited liability or risk aversion, arbitrageurs might be unwilling or unable after mispricing worsens to completely retain (or increase) funds under management by insuring the investor against losses, or pricing below marginal cost. Second, these contracts are less attractive when the risk-averse arbitrageur himself is highly uncertain about his own ability to produce a superior return. We could model this more realistically by adding some noise into the third period return. In sum, under plausible conditions, the use of incentive contracts does not eliminate the effect of past performance on the market shares of arbitrageurs.² Empirically, most money managers in the pension and mutual fund industries work for fees proportional to assets under management and rarely get a percentage of the upside.³ As documented by Ippolito (1992) and Warther (1995), for example, mutual fund managers lose funds under management when they perform poorly. Interestingly, Warther (1995) also shows that fund flows in and out of mutual funds affect contemporaneous returns of securities these funds hold, consistent with the results established below.

PBA is critical to our model. In conventional arbitrage, capital is allocated to arbitrageurs based on expected returns from their trades. Under PBA, in contrast, capital is allocated based on past returns, which, in the model, are low precisely when expected returns are high. At that time, arbitrageurs face fund withdrawals, and are not very effective in betting against the mispricing. Breaking the link between greater mispricing and higher expected returns perceived by those allocating capital drives our main results.

To complete the model, we need to set up an arbitrageur's optimization problem. For simplicity, we assume that the arbitrageur maximizes expected time 3 profits.
Since arbitrageurs are price-takers in the market for investment services and marginal cost is constant, maximizing expected time 3 profit is equivalent to maximizing expected time 3 funds under management. For concreteness, we examine a specific form of uncertainty about $S_2$. We assume that, with probability $q$, $S_2 = S > S_1$, i.e., noise trader misperceptions deepen. With a complementary probability $1 - q$, noise traders recognize the true value of the asset at $t = 2$, so $S_2 = 0$ and $p_2 = V$. When $S_2 = 0$, arbitrageurs liquidate their position at a gain at $t = 2$, and hold cash until $t = 3$. In this case, $W = a(D_1 \cdot V / p_1 + F_1 - D_1) + (1 - a) F_1$. When $S_2 = S$, in contrast, arbitrageurs' third period funds are given by $W = (V / p_2) \cdot [a\{D_1 \cdot p_2 / p_1 + F_1 - D_1\} + (1 - a) F_1]$. Arbitrageurs then maximize:

$$EW = (1 - q)\left\{ a\left( \frac{D_1 V}{p_1} + F_1 - D_1 \right) + (1 - a) F_1 \right\} + q \left( \frac{V}{p_2} \right) \left\{ a\left( \frac{D_1 p_2}{p_1} + F_1 - D_1 \right) + (1 - a) F_1 \right\}. \tag{7}$$

II. Performance-Based Arbitrage and Market Efficiency

Before analyzing the pattern of prices in our model, we specify what the benchmarks are. The first benchmark is efficient markets, in which arbitrageurs have access to all the capital they want. In this case, since noise trader shocks are immediately counteracted by arbitrageurs, $p_1 = p_2 = V$. An alternative benchmark is one in which arbitrageurs' resources are limited, but PBA is inoperative, i.e., arbitrageurs can always raise $F_1$. Even if they lose money, they can replenish their capital up to $F_1$. In this case, $p_1 = V - S_1 + F_1$ and $p_2 = V - S + F_1$. Prices fall one for one with noise trader shocks in each period. This case corresponds most closely to the earlier models of limited arbitrage. There is one final interesting benchmark in this model, namely the case of $a = 1$. This is the case in which arbitrageurs cannot replenish the funds they have lost, but do not suffer withdrawals beyond what they have lost.
We will return to this special case below.

The first order condition to the arbitrageur's optimization problem is given by:

$$(1 - q)\left( \frac{V}{p_1} - 1 \right) + q \left( \frac{p_2}{p_1} - 1 \right) \frac{V}{p_2} \ge 0 \tag{8}$$

with strict inequality holding if and only if $D_1 = F_1$, and equality holding if $D_1 < F_1$. The first term of equation (8) is an incremental benefit to arbitrageurs from an extra dollar of investment if the market recovers at $t = 2$. The second term is the incremental loss if the price falls at $t = 2$ before recovering at $t = 3$, and so they have foregone the option of being able to invest more in that case. Condition (8) holds with a strict equality if the risk of price deterioration is high enough, and this deterioration is severe enough, that arbitrageurs choose to hold back some funds for the option to invest more at time 2. On the other hand, equation (8) holds with a strict inequality if $q$ is low, if $p_1$ is low relative to $V$ ($S_1$ is large), and if $p_2$ is not too low relative to $p_1$ ($S$ not too large relative to $S_1$). That is to say, the initial displacement must be very large and prices should be expected to recover with a high probability rather than fall further. If they do fall, it cannot be by too much. Under these circumstances, arbitrageurs choose to be fully invested at $t = 1$ rather than hold spare reserves for $t = 2$. We describe the case in which mispricing is so severe at $t = 1$ that arbitrageurs choose to be fully invested as "extreme circumstances" and discuss it at some length. This discussion can be summarized more formally in:

Proposition 1. For a given $V$, $S_1$, $S$, $F_1$, and $a$, there is a $q^*$ such that, for $q > q^*$, $D_1 < F_1$, and for $q < q^*$, $D_1 = F_1$.

If equation (8) holds with equality, the equilibrium is given by equations (2), (3), (6), and (8). If equation (8) holds with inequality, then equilibrium is given by $D_1 = F_1$, $p_1 = V - S_1 + F_1$, as well as equations (2) and (6).
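The fully invested equilibrium and the threshold in Proposition 1 can be checked numerically. The sketch below is one possible reading of equations (2), (3), (6), and (8), not the authors' own procedure: the function names are hypothetical, the parameter values are those of the numerical example that follows, and the closed form for $F_2$ assumes $a F_1 / p_1 < 1$ so that substituting equation (2) into equation (6) has a well-defined solution.

```python
# Illustrative sketch; function names are hypothetical, not from the article.

def fund_flow(F1, D1, p1, p2, a):
    """Equation (6): period-2 funds under the linear PBA rule."""
    return F1 - a * D1 * (1.0 - p2 / p1)


def full_investment_equilibrium(V, S1, S, F1, a):
    """Prices and funds when D1 = F1 and noise trader sentiment deepens.

    Substitutes p2 = V - S + F2 (equation (2)) into equation (6) and solves
    the resulting linear equation for F2. Assumes a * F1 / p1 < 1.
    """
    p1 = V - S1 + F1                                  # equation (3), D1 = F1
    F2 = (F1 * (1.0 - a) + a * F1 * (V - S) / p1) / (1.0 - a * F1 / p1)
    p2 = V - S + F2                                   # equation (2)
    return p1, p2, F2


def threshold_q(V, p1, p2):
    """Value of q at which the left-hand side of condition (8) is zero."""
    gain = V / p1 - 1.0                   # payoff term if the price recovers
    loss = (1.0 - p2 / p1) * V / p2       # cost term if mispricing deepens
    return gain / (gain + loss)


# Parameter values taken from the numerical example in the text.
V, F1, a, S1, S = 1.0, 0.2, 1.2, 0.3, 0.4

p1, p2, F2_deepen = full_investment_equilibrium(V, S1, S, F1, a)
F2_recover = fund_flow(F1, F1, p1, V, a)   # price recovers to V at t = 2
q_star = threshold_q(V, p1, p2)

print(round(p1, 4), round(p2, 4), round(F2_deepen, 4),
      round(F2_recover, 4), round(q_star, 3))
```

Under this reading the prices and fund levels agree with the figures quoted in the example to the reported precision, and the computed threshold comes out at roughly 0.36, close to the 0.35 the example states.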
To illustrate the fact that both types of equilibria are quite plausible, consider a numerical example. Let $V = 1$, $F_1 = 0.2$, $a = 1.2$, $S_1 = 0.3$, $S_2 = 0.4$. For this example, $q^* = 0.35$. If $q < 0.35$, then arbitrageurs are fully invested and $D_1 = F_1 = 0.2$, so that the first period price is 0.9. In this case, regardless of the exact value of $q$, we have $F_2 = 0.1636$ and $p_2 = 0.7636$ if noise trader sentiment deepens, and $F_2 = 0.227$ and $p_2 = V = 1$ if noise trader sentiment recovers. On the other hand, if $q > 0.35$, then arbitrageurs hold back some of the funds at time 1, with the result that $p_1$ is lower than it would be with full i