Technology is an increasingly influential factor in the lives of ordinary people around the globe, and the internet has expanded to the point where, in some countries, living without it has become all but impossible. Currently, Artificial Intelligence and Quantum Computing are on the verge of breakthroughs and could become as influential in society as the internet is in our daily lives. Correspondingly, in the world of finance, the rise of the internet and subsequent technological developments are greatly impacting financial markets. For instance, transactions have become electronic, and the time it takes to execute a trade has decreased to milliseconds, and even nanoseconds. Fixnetix, a UK-based company, is launching a custom-built chip that can execute trades within 740 nanoseconds. According to Johnson et al. (2012), this technological race is likely to be pushed further until the physical limits of the speed of light are met.
The Rise of Algorithmic Trading
Among these technological developments in financial markets, automated trading is perhaps the most prominent present-day revolution. An algorithm can be defined as a precise plan of steps that uses computations to transform input values into an output value (Leshik & Cralle, 2011). Supply and demand on stock markets are increasingly in the hands of these computational algorithms, which decide fully autonomously whether to buy or sell a stock on behalf of their “owner.” As presented in Figure 1 by Glantz & Kissel (2013, p. 258), the percentage of market volume attributable to algorithmic trading has risen greatly in the past twenty years, with asset managers, high-frequency traders, and hedge funds accounting for most of the volume. Our proxy for algorithmic trading based on CRSP data supports these findings and likewise shows a clear rise in algorithmic trading activity, as observed in Figure 2.
Figure 1: Algorithmic trading as a percentage of market volume. Reprinted from Multi-asset risk modeling: Techniques for a global economy in an electronic and algorithmic trading era (p. 258), by M. Glantz & R. Kissel, 2013. Copyright by Academic Press.
Figure 2: Proxy for Algorithmic Trading based on CRSP data.
Historical Context of Algorithmic Trading
Nevertheless, algorithmic trading is still a young topic: even though its foundations can be traced back to 1949, it has only become widespread in the last two decades (Leshik & Cralle, 2011). To give an example, a Google Scholar search for “algorithmic trading” (date: 18/7/2017) returns only 500 results that contain “algorithmic trading” in the title, most of which are working papers, and only 20 of which were written before 2005. Put into context, these 500 papers and books amount to only about 0.75% of the 67,000 articles that contain “financial crisis” in their title.
For this reason, many of the sources used here remain books and working papers, as published information on algorithmic trading is still limited. However, according to Kaya (2016), high-frequency trading already accounted for 49 percent of all volume in U.S. equity markets in 2014, keeping in mind that high-frequency trading is merely a subgroup of algorithmic trading.
Human Influence vs. Algorithmic Trading
The connection between algorithmic trading and human behavior is barely touched upon in the existing financial literature. It is likely that algorithmic trading, in combination with improved artificial intelligence and quantum computing, will completely change the financial markets as we know them now. Its relevance is undeniable, and yet little is known about how the automation revolution impacts financial markets. Quantum computing and artificial intelligence still lie in the future; nevertheless, human traders are already being substituted by computers on a great scale, and the effects of this substitution should be measurable using quantitative data. Measuring the effects of algorithmic trading is likely to give insights into how financial markets will behave in the future.
The rise of algorithmic trading implies a decline in direct human influence within the financial markets. Trading algorithms differ in behavior from human investors in that they are assumed never to deviate from their set of predefined rules unless instructed to within those rules. In other words, a trading algorithm will always behave within its programmed boundaries while accounting for all the information delivered to it. Human traders, on the other hand, are more likely to act on intuition and on what is happening in their environment, with a tendency to value certain information above other information.
These influences can be identified as behavioral biases: recurring patterns in human behavior that make it more predictable (Heiner, 1983). Humans are rational, but only boundedly so, and are often attracted to a majority opinion (Kahneman, 2003). In the world of finance, this pull of social gravity toward the majority opinion, together with bounded rationality, amplifies inefficiencies in the stock market as investors consistently overprice popular stocks and underprice less favored equities (Dreman & Lufkin, 2000). Furthermore, Kim and Kim (2014) state that investor sentiment is affected by historical share price performance, which further strengthens market inefficiencies. Considering that the stock market is already inefficient to a certain extent, investor sentiment is likely often biased by unrepresentative share prices, which in turn could lead to more inaccurate forecasts. Additionally, Chaboud, Chiquoine, Hjalmarsson & Vega (2014) find evidence that “algorithmic trading contributes to a more efficient price discovery process via the elimination of triangular arbitrage opportunities.” All in all, it can be assumed that the market is becoming more efficient with the increased influence of algorithms.
Algorithmic Trading and Market Efficiency
According to the efficient market hypothesis developed by Fama (1995), this development should reinforce the random walk of stock prices and consequently their unpredictability. Research on price dispersion in relation to algorithmic trading has not been performed before; the most closely related literature is on transaction cost dispersion by Engle, Russell & Ferstenberg (2007), which uses only Morgan Stanley data rather than complete stock market data. The link between algorithmic trading and market predictability likewise has no predecessors and explores new terrain in the field, building on the fundamental relationships between algorithmic trading, market quality, and information previously researched by Hendershott, Jones & Menkveld (2011) and Lyle & Naughton (2015).
For this reason, the main aim of this study is to evaluate how increased algorithmic trading has affected analysts’ capabilities to predict future market movements. Removing emotional entities from the market is expected to improve market efficiency and hence decrease market predictability. A sub-question is used to develop an empirical foundation for answering the main question: does algorithmic trading lead to less price dispersion within the stock market?
Chaboud et al. (2014) show that automated trading strategies are less diverse than strategies used by human investors and that humans are responsible for a larger part of the variance in returns than their algorithmic counterparts. Because algorithms are more similar to one another than human traders are, the size of the range of returns, also known as dispersion, is suspected to have decreased with increased algorithmic trading. Moreover, when looking at our data graphically, return dispersion shows a clear downtrend over time, except for some extreme values during the financial crisis of 2008/2009 (see Figure 3). Regressing dispersion against time confirms the downward slope, resulting in a negative, statistically significant coefficient on time with a p-value of 0.001. Considering that algorithmic trading increased over time, this could imply a relation with dispersion.
Figure 3: Dispersion against time
Investigating Algorithmic Trading’s Impact
The current study investigates the effects of algorithmic trading in more detail by systematically performing fixed effects panel data regressions. This might enable us to see how increased algorithmic trading has affected return dispersion and market predictability. The regression findings lead to the conclusion that dispersion is indeed reduced through increased algorithmic trading. Furthermore, it is found that more algorithmic trading led to smaller prediction errors and hence improved market predictability.
Research Questions
In the next chapter, the theoretical framework that was used to establish this research will be discussed, built on the following research questions:
- Does increased algorithmic trading within the market affect analysts’ capabilities to predict future market movements?
- Sub-question: Does algorithmic trading lead to less price dispersion in the stock market?
Theoretical Background
Current State of Literature
To determine the influence of algorithmic trading on dispersion and market predictability, first of all, the origins of trading algorithms and the use of automated trading systems must be investigated. Additionally, to find how fewer human traders impact market predictability and dispersion, financial behavioral biases and market predictability should be examined as well.
Algorithmic Trading and Automated Trading Systems (ATS)
Leshik & Cralle (2011) explain that algorithms used for trading can be traced back to 1949, when Alfred Winslow Jones used an algorithm to balance long and short positions in a hedge fund. An algorithm can be defined as a precise plan of steps that uses computations to transform input values into an output value. Fundamental to computer software and computations, algorithms have become a mainstream aid to the daily trader. It was not until the 1980s that algorithmic or black-box trading became hugely profitable, owing to the invention of pair trading. Decreased costs, improved control mechanisms with self-documenting trade records, and speed of execution are some of the advantages algorithmic trading offers to increase the likelihood of a successful trade.
First of all, in order to understand how exactly financial markets are affected by algorithmic trading, it is necessary to get to the very basis of how a trading algorithm works. For that reason, an example algorithm for a coke vending machine is introduced. The algorithm can be constructed as simply as:
```python
total = sum(COINS_INSERTED)
if ABORTED:
    RETURN(COINS_INSERTED)              # abort: give every inserted coin back
elif total > 1.00:
    DROP_CAN(); RETURN(total - 1.00)    # dispense the can and return the change
elif total == 1.00:
    DROP_CAN()
else:
    SHOW_MESSAGE("Insufficient Amount")
```
In this example, the coins inserted are the main input; their total value instructs the vending machine to drop the coke can and return any change if necessary. The algorithm simply follows its set of rules to transform input into output and never deviates from these rules during the process. Like the example algorithm, trading algorithms are merely a set of predefined rules that convert input into output. Trading algorithms are therefore implemented within Automated Trading Systems, which collect the data used as input values and transform the output values into actual actions.
Automated Trading Systems, also known as ATS, are combinations of hardware and software that, by using trading algorithms, manage orders and positions within a stock portfolio based on real-time data feeds and historical data stored in a database. The input is usually a combination of factors such as the share price, volume, number of trades, and technical indicators; for the more advanced learning algorithms, even news events can serve as input values (van Vliet, 2007). The Automated Trading System autonomously creates orders based on these inputs and implements them on the exchange, all within milliseconds, competing with human investors (van Vliet, 2007). Hence, it can be argued that an ATS is to a trading algorithm what a physical coke vending machine is to a coke vending algorithm.
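To make this concrete, below is a minimal sketch of such a system in Python. The `feed` and `broker` objects and their method names are hypothetical placeholders for the real-time data feed and order gateway described by van Vliet (2007), and the single moving-average rule stands in for a full predefined rule set.

```python
from collections import deque

class MovingAverageATS:
    """Minimal ATS sketch: consumes a real-time feed, applies one rule, emits orders."""

    def __init__(self, feed, broker, window=20):
        self.feed = feed          # hypothetical real-time data feed object
        self.broker = broker      # hypothetical order-execution gateway
        self.prices = deque(maxlen=window)

    def on_tick(self, tick):
        # Input values: the latest trade price from the feed.
        self.prices.append(tick.price)
        if len(self.prices) < self.prices.maxlen:
            return  # not enough history yet to compute the indicator
        moving_average = sum(self.prices) / len(self.prices)
        # Predefined rule: buy below the moving average, sell above it.
        if tick.price < moving_average:
            self.broker.submit_order(tick.ticker, side="buy", quantity=100)
        elif tick.price > moving_average:
            self.broker.submit_order(tick.ticker, side="sell", quantity=100)

    def run(self):
        for tick in self.feed:    # blocks on the real-time data feed
            self.on_tick(tick)
```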
To construct an ATS, one has to be familiar with computer science, quantitative finance, trading strategy, and quality management. As “data is the lifeblood of electronic markets,” the basis of an ATS lies in the underlying data, which can be managed using Microsoft Visual C++ or .NET applications. Technological superiority through an ATS can offer an enormous advantage over competitors, but it does not by itself imply profitability (van Vliet, 2007).
Leshik and Cralle (2011) consider the most popular and widely used algorithms to be Volume Weighted Average Price (VWAP), Time Weighted Average Price (TWAP), Percentage of Volume (POV), Search for Liquidity (Black Lance), Stay Parallel with the Market (The PEG), Large Order Hiding (Iceberg), Pair Trading Strategy, Leshik-Cralle, Recursive, Serial, Parallel, and Iterative. Izumi, Toriumi & Matsui (2009), by contrast, evaluated a distinct set of automated trading strategies. They compared the risk and return of all strategies within their sample and concluded that the strategies provide better information than conventional methods. Moreover, their research showed that the impact of automated trading strategies on markets does not depend merely on their code; the way the strategies are combined and influence one another can affect the market even more.
The common factor amongst almost all popular trading algorithms seems to lie in technical analysis: most are largely based on technical indicators, such as the moving average and the relative strength index, to generate the buy or sell decision. Technical analysis pertains to predicting future stock prices by studying past stock price performance and several other trading statistics, such as trading volume and the number of trades (Brock, Lakonishok & LeBaron, 1992).
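To illustrate, the two indicators just named can be computed in a few lines of pandas; the window lengths and the 30/70 signal thresholds are conventional textbook defaults, not parameters taken from the cited literature.

```python
import pandas as pd

def simple_moving_average(prices: pd.Series, window: int = 20) -> pd.Series:
    """Rolling mean of the price series."""
    return prices.rolling(window).mean()

def relative_strength_index(prices: pd.Series, window: int = 14) -> pd.Series:
    """Classic RSI: average gains relative to average losses, scaled to 0-100."""
    delta = prices.diff()
    avg_gain = delta.clip(lower=0).rolling(window).mean()
    avg_loss = (-delta.clip(upper=0)).rolling(window).mean()
    rs = avg_gain / avg_loss
    return 100 - 100 / (1 + rs)

def signal(prices: pd.Series) -> pd.Series:
    """Oversold (RSI < 30) -> buy, overbought (RSI > 70) -> sell, else hold."""
    rsi = relative_strength_index(prices)
    out = pd.Series("hold", index=prices.index)
    return out.mask(rsi < 30, "buy").mask(rsi > 70, "sell")
```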
Technical analysis is often considered non-scientific due to its non-fundamental nature; nonetheless, a survey study by Menkhoff (2010) shows that the vast majority of fund managers rely on technical analysis. Additionally, Bessembinder & Chan (1997) demonstrate that even rather simple technical analysis holds statistically significant forecasting power within financial markets. Technical analysis is more related to psychology than to fundamentals, and the more technical analysis is used, the more it reinforces its own predictive powers, almost like a self-fulfilling prophecy.
Figure 4: Standard deviations versus returns of ATS. Reprinted from “Evaluation of automated-trading strategies using an artificial market,” by K. Izumi, F. Toriumi & H. Matsui, 2009, Neurocomputing, 72(16), p. 3474.
Figure 4 displays the risk and return outcomes of the automated trading strategy agents tested by Izumi et al. (2009, p. 3474), partly to illustrate some available strategies other than the ones mentioned by Leshik & Cralle (2011). The results were achieved by backtesting on several stock markets. For these trading strategies to work, several parameters for the input variables can be used. It is essential that the parameters take on values that reflect fundamental information about the firm and economic conditions, preferably using adaptive agents. The parameters and code used by Izumi et al. (2009) can be found in Appendix B. Moreover, from the parameters it can be derived that actual trading algorithms are very similar to the coke vending machine algorithm illustrated above. For most of these algorithms, technical indicators based on price or volume information, such as moving averages or upper and lower bands, are used as input values.
Incorporating Machine Learning into ATS
Not only can an ATS use price and volume information or technical indicators as input values, but the algorithms can also be integrated with machine learning to automatically read news feeds and turn these into input values for the algorithm. According to Nuij et al. (2014), automating the incorporation of news feeds into stock trading strategies can boost the returns of individual technical indicators compared to strategies without news messages. By extracting an event from a news feed text and pairing it with an impact based on historical stock price deviations for that specific event, the news variable can be used in addition to existing technical indicators.
Subsequently, the rules created through news-associated events can be mutated within the trading algorithm, with improved versions of the rules that have led to higher returns replacing the originals. Such automatic reprogramming on the basis of previous return outcomes is one example of how machine learning can be implemented in an ATS.
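A stylized sketch of this mechanism is given below. The event labels, impact scores, and mutation scheme are invented for illustration and are not the actual implementation of Nuij et al. (2014).

```python
import random

# Hypothetical event-impact table, as if learned from historical price
# reactions to each event type; labels and values are invented.
EVENT_IMPACT = {"ceo_resigns": -0.8, "earnings_beat": +0.6}

def rule(news_event, rsi, impact_weight, rsi_threshold):
    """Combine a news-impact score with a technical indicator, in the
    spirit of Nuij et al. (2014), in stylized form."""
    score = impact_weight * EVENT_IMPACT.get(news_event, 0.0)
    if score > 0.3 or rsi < rsi_threshold:
        return "buy"
    return "sell" if score < -0.3 else "hold"

def mutate(params):
    """Perturb the rule parameters; variants whose backtested return
    improves would replace the originals."""
    impact_weight, rsi_threshold = params
    return (impact_weight + random.gauss(0, 0.1),
            rsi_threshold + random.gauss(0, 2))
```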
Predictability & Biases in Behavioral Finance
Algorithmic trading is connected to behavioral finance in the sense that algorithms are often programmed to trade on investor biases that arise from individual or group behavior. The technical indicators incorporated in trading algorithms work because of these behavioral regularities; therefore, it could even be argued that technical economic indicators are actually socio-economic indicators. Behavioral finance often contradicts the efficient market theory, suggesting that stock prices are to a certain extent predictable because of psychological and social factors that cause inefficiencies in the stock market (Shiller, 2003).
There is a polarity in human behavior that is reflected in how stocks oscillate between up and down trends, mirroring the state of mind and mood of a human or group of humans. All forms of emotion seem to exert forces on the stock market in one way or another. To name an example, even reaching new physical highs in the form of a tall building reverberates in the stock market, leaving a peak in the graph followed by a fall: the Dubai stock market rose significantly after the completion of the Burj Khalifa, the world’s tallest building (Mitroi, 2014). Moreover, there are recurring patterns in some financial anomalies, such as the day-of-the-week effect, that are not yet understood; the evidence seems to suggest that these anomalies arise from mass psychology (Shiller, 2003).
Vasiliou, Eriotis & Papathanasiou (2008) mention that moving averages stress where a trend is headed and flatten out fluctuations caused by the noise of irrational investors, also known as noise traders. Additionally, Vasiliou et al. find that the utility of the technical trading rules used in their research improved over time.
Market Efficiency and Predictability
Litzenberger, Castura & Gorelick (2012) state that market quality has improved in the past decades. A clear cause of this trend is increased competition through more automation and high-frequency trading, which decreases bid-ask spreads and improves liquidity. This improved liquidity causes orders in limit order books to be exercised at a faster pace. Moreover, relating market quality to algorithmic trading, Lyle, Naughton, and Weller (2015) found that algorithmic trading strategies that provide liquidity, such as market-making strategies, increase market quality, whereas liquidity-taking, non-market-maker algorithmic trading activity harms it. Bouchaud, Farmer & Lillo (2008) conclude that prices in markets remain close to perfectly unpredictable in the short run, for two reasons. First, outstanding liquidity is always small, meaning that prices cannot immediately mirror all information available to the market. Second, on electronic markets there is no way to distinguish informed from uninformed trades, as all trades have the same impact. It follows that all informative aspects of a trade should be internal to the market, meaning that trades, order flow, and cancellations carry information.
Beja and Goldman (1980) rightly state that a market constructed by humans cannot possibly be so mechanically perfect and efficient that all information is integrated into prices before it can be observed. This implies that price anomalies will always be present, leaving room for predictability. Moreover, Pesaran (2003) reinforces predictability by stating that “a large number of studies in the finance literature have confirmed that stock returns can be predicted to some degree by means of interest rates, dividend yields and a variety of macroeconomic variables exhibiting clear business cycle variations.” According to Pesaran, market efficiency should be distinguished from predictability.
Methodology
Data Collection & Processing
Most of the data and queries used for this research were obtained through the Wharton Research Data Services (WRDS) database and query tool of the University of Pennsylvania. Three datasets within the WRDS database are used: CRSP – Daily Stock, IBES – Price Target, and Federal Reserve Bank – Interest Rates. These sub-datasets are eventually merged before the hypotheses can be tested and are elaborated on in the following sections. Further details on the datasets can be found in Table A1, where all query extraction specifications are listed.
The chosen data period, 1999 to 2017, is a trade-off between covering as extensive a period as possible and keeping the data manageable within Stata, given the limited computing power at the research’s disposal. Moreover, since IBES data is only available from 1999 onwards, this automatically marks the start of the period. Furthermore, it can be argued from Glantz & Kissel’s (2013, p. 258) Figure 1 that algorithmic trading before 1999 amounted to such a small percentage of market volume that it is not of critical value in answering the research question.
Additionally, only NASDAQ and NYSE equity price data is used, as the U.S.-based stock exchanges were the first to establish facilities supporting the development of algorithmic trading. Consequently, high-frequency trading gained volume share in the U.S. more rapidly than in Europe, as shown in Figure 5 (Kaya, 2016, p. 2). Given these arguments, and considering the limited computing power, U.S. data follows as the more established choice.
Figure 5: Share of high-frequency trading in total equity trading per year (%). Reprinted from “High-frequency trading: Reaching the limits,” by O. Kaya, 2016, p. 2. Copyright by Deutsche Bank Research.
CRSP – Daily Stock
First of all, daily prices and trading data, such as the daily number of trades and daily volume, are extracted from the CRSP U.S. Stock database within WRDS. This CRSP query functions as the master dataset within the Stata environment and contains end-of-day prices for equity securities on the NYSE and NASDAQ exchanges. CRSP also contains quote data, holding period returns, shares outstanding, and trading volume information. Initially, the entire database is extracted for the period 1999 to 2017, containing over 34 million observations. To start, only common stock observations are kept in order to improve post-merger compatibility with the IBES Price Target dataset; for common stock, the CRSP share code equals either 10 or 11, hence only these share codes are kept in the sample. Moreover, tickers with multiple different share classes are dropped, as those are not properly comparable to the IBES identifiers, which will be elaborated on later.
Additionally, a .TXT file of the remaining company ticker identifiers is derived from the dataset within Stata to simplify the extraction of subsequent queries within WRDS: only information on those predetermined companies is withdrawn from WRDS, reducing the file size. Within the daily stock price query, the actual price, bid, ask, and shares outstanding are adjusted using so-called adjustment factors in order to make these variables comparable over the entire 1999-2017 period. These adjustment factors are constructed by CRSP and adjust for corporate actions such as stock splits, dividends, and rights offerings. Additionally, an effective spread variable is created in the manner of Hendershott et al. (2011), by taking the absolute difference between the actual transaction price of the day and the midpoint of the closing bid and ask; a volatility variable is calculated as the difference between the daily high and the daily low.
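As an illustration, the construction of these variables could look as follows in pandas. The CRSP column names (prc, bid, ask, askhi, bidlo, cfacpr) follow common CRSP conventions but are assumptions here, and the thesis itself performs these steps in Stata.

```python
import pandas as pd

def build_daily_variables(crsp: pd.DataFrame) -> pd.DataFrame:
    df = crsp.copy()
    # CRSP stores bid/ask-average prices as negative values; take absolute values.
    price = df["prc"].abs()
    # Adjust the price with the CRSP cumulative price adjustment factor (splits etc.).
    df["adj_prc"] = price / df["cfacpr"]
    # Effective spread: distance between the transaction price and the quote midpoint.
    midpoint = (df["bid"] + df["ask"]) / 2
    df["effective_spread"] = (price - midpoint).abs()
    # Volatility: difference between the daily high and the daily low.
    df["volatility"] = df["askhi"] - df["bidlo"]
    return df
```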
IBES – Price Target
IBES, also known as the Institutional Brokers’ Estimate System, is a Thomson Reuters database that holds historical analyst estimates for more than twenty forecast measures, such as earnings per share, revenue, price targets, buy-hold-sell recommendations, and gross profits, for over 60,000 companies. After extracting the price target estimation data, including horizon and analyst name, from WRDS for the same 1999-2017 period used before, it was found that the IBES data could not directly be merged with the CRSP data. IBES contains two ticker variables, and only the “official ticker” variable (oftic) is compatible with the ticker variable in CRSP; it should not be confused with the “ticker” variable in the IBES dataset. Hence, “oftic” is renamed to its CRSP name: ticker.
Additionally, it must be mentioned that the so-called “announcement date” in IBES is taken as the leading date. Finally, price target estimation values are matched with their respective future actual prices by lagging each forecast by its horizon, meaning that an estimation with a horizon of 6 months is lagged 6 months.
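Conceptually, the matching step can be sketched as below in pandas; the column names (anndats for the announcement date, horizon in months) are assumptions about the extract, not verified field names.

```python
import pandas as pd

def match_targets_to_actuals(targets: pd.DataFrame, prices: pd.DataFrame) -> pd.DataFrame:
    targets = targets.copy()
    # Shift each estimate forward by its own horizon so that it lines up with
    # the price it was trying to predict (a 6-month horizon is lagged 6 months).
    targets["target_date"] = [
        d + pd.DateOffset(months=int(m))
        for d, m in zip(targets["anndats"], targets["horizon"])
    ]
    actuals = prices.rename(columns={"date": "target_date", "adj_prc": "actual_price"})
    return targets.merge(actuals, on=["ticker", "target_date"], how="inner")
```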
Federal Reserve Bank – Interest Rates
The WRDS RATES database used in this research is based upon the Federal Reserve Board’s H.15 release that contains selected interest rates for U.S. Treasuries and private money market and capital market instruments. Daily rates are per business day and reported in annual terms. To include interest rates as a controlling factor within the regressions, the rates of U.S. treasury bills with a maturity of 3 months are extracted from the WRDS RATES database for the period 1995 to 2017. The rates are merged with the master dataset using date as the common variable.
Data Analysis Methodology
To shed light on the automation process that entails the shift from human traders to automated trading systems, analyst predictions and their accuracy are analyzed in relation to algorithmic trading. First, however, the focus is on how algorithmic trading is measured and how dispersion has changed with algorithmic trading. All independent variables used in the regressions are standardized to facilitate economic interpretation: standardization is performed by subtracting the corresponding time series’ mean from the variable and dividing this deviation by the time series’ standard deviation.
By standardizing all independent variables in this fashion, the standardized regression coefficients represent the effect of a one-standard-deviation change in the independent variable on the dependent variable. Hence, independent variable X is standardized such that:
$$X'_{tj} = \frac{X_{tj} - \mu(X)}{\sigma(X)}$$
Algorithmic Trading Measure
As a first step, a proxy is developed to measure the development of algorithmic trading over time within the available CRSP data. To quantify algorithmic trading in a variable, Hendershott, Jones, and Menkveld (2011) and Boehmer, Fong & Wu (2015) use the daily number of electronic messages from the TAQ database per $100 of trading volume as a proxy for algorithmic trading. This is the most established measure in academic research; however, the TAQ database is not at this research’s disposal, and hence an inferior but comparable proxy is created. The inferiority lies in the fact that electronic messaging traffic information is not available in CRSP. As volume data is available, the best alternative is a proxy that replaces the number of electronic messages with a comparable variable. Our data shows that volume did not increase over time while the number of trades did, in a way comparable to the electronic messages used by Hendershott et al., making the number of trades a simplified but functioning replacement within our proxy for algorithmic trading. Moreover, algorithmic trading is associated with improved liquidity and an increased number of trades with smaller volume per trade (Hendershott et al., 2011).
Hence the new proxy for algorithmic trading is calculated as the daily number of trades executed for ticker j per dollar trading volume of that day derived from the CRSP database.
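Written out (notation ours, following the verbal definition above), the proxy for ticker $j$ on day $t$ is:

$$AT_{tj} = \frac{\text{number of trades}_{tj}}{\text{dollar trading volume}_{tj}}$$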
Although it is a much noisier proxy, it gives a representation of the development of algorithmic trading over time very similar to that established by Glantz & Kissel (2013), as can be noted in Figures 1 and 2.
Effects of Algorithmic Trading on Dispersion
It is assumed that algorithms have more similarities than their human counterparts, and for this reason, dispersion is expected to decrease with more algorithmic trading. As flash crashes are known to happen with algorithmic trading (Johnson et al., 2012), extreme short-term dispersion might have increased instead. However, considering that this study is only able to use daily data, flash crashes are not expected to influence the results. Hence, the hypotheses are formulated as:
H0: Dispersion does not change with increased algorithmic trading.
H1: Dispersion changes with increased algorithmic trading.
Idiosyncratic or stock-specific volatility is used to measure dispersion. Idiosyncratic risk can be calculated in numerous ways; the various measures, however, all give comparable results (Malkiel & Xu, 2003). Moreover, according to Bello (2008), there are no significant differences between the Capital Asset Pricing Model, the Fama French Three Factor Model, and the Carhart Model regarding their outcomes. Hence, in this study, the CAPM is used to calculate idiosyncratic volatility, as this suits the dataset best. The CAPM formula used is as follows:
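$$r_{tj} - r_{ft} = \alpha_j + \beta_j\,(r_{mt} - r_{ft}) + \epsilon_{tj}$$

Here $r_{tj}$ is the return on stock $j$ on day $t$, $r_{ft}$ the risk-free rate (the 3-month T-bill rate described above), and $r_{mt}$ the market return. This is the standard excess-return form of the CAPM (notation ours), and idiosyncratic volatility is measured as the standard deviation of the residuals $\epsilon_{tj}$.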
Finally, idiosyncratic volatility, hereafter called dispersion, is regressed on the algorithmic trading measure, in line with the hypotheses, to analyze whether return dispersion has changed through an increase in algorithmic trading. The model is also estimated while controlling for firm fixed effects and year fixed effects, as Figure 3 makes clear that dispersion varies considerably across years, particularly during the financial crisis.
Fixed effects are used instead of random effects because the Hausman test for random versus fixed effects is significant at the 99.9% confidence level for this regression, meaning that the unique errors $\epsilon_{tj}$ are correlated with the regressors; hence fixed effects panel data regressions are used to analyze dispersion. In regressions (8) and (9), firm fixed effects and year fixed effects are added, respectively, to see if and how firm- and year-specific effects influence the model. Comparing the results of regressions (7) and (8) shows the effect of firm-specific effects, whereas comparing (8) and (9) displays the influence of year fixed effects.
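Schematically (notation ours), the fullest specification, regression (9), takes the form:

$$\text{Dispersion}_{tj} = \beta_0 + \beta_1\, AT_{tj} + \gamma' C_{tj} + \mu_j + \tau_t + u_{tj}$$

where $AT_{tj}$ is the algorithmic trading proxy, $C_{tj}$ the vector of control variables, $\mu_j$ the firm fixed effects, and $\tau_t$ the year fixed effects; regression (7) omits both sets of fixed effects, and regression (8) omits $\tau_t$.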
Effects of Algorithmic Trading on Analyst Forecast Accuracy
To analyze the prediction accuracy of the remaining human analysts in the market, historical Thomson Reuters analyst estimations obtained from the IBES dataset are used to compute the prediction error of each forecast. The difference between the estimation value and the adjusted price on the target date t, divided by the adjusted price on that date, gives the prediction error of an estimation by analyst i for stock j. Additionally, the prediction error is squared to emphasize the analysts whose forecasts were furthest off, whether below or above; as the squared prediction error returns only positive values, it focuses on the size of the deviation, since its direction is not of concern.
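Formally (symbols ours), with $EST_{itj}$ the estimate of analyst $i$ for stock $j$ matched to its target date $t$, and $P_{tj}$ the adjusted price on that date:

$$PE_{itj} = \left( \frac{EST_{itj} - P_{tj}}{P_{tj}} \right)^{2}$$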
Subsequently, the analyst prediction error variable is tested using regression analysis within the Stata statistical software to see whether analysts’ predictions have become statistically more accurate since the development of automation within stock markets. The dataset can be described as an unbalanced three-dimensional panel dataset in which stock ticker, date, and analyst name represent the dimensions; for every ticker there are different numbers of analyst estimations on varying dates. The “missing” data is due to analysts specializing in specific stocks, and because the dates at which estimations are placed are irregular, there is no actual missing data.
The ticker and analyst variables are combined into a new variable called “tic_alys” (see the sketch below), where each group represents the forecasts by analyst i for ticker j. This procedure removes the need to drop the third dimension in order to run a multi-dimensional fixed effects panel data regression within Stata. These dimensions are combined only for regressions (14) and (16), where firm and analyst fixed effects are included jointly.
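In pandas terms, the combination amounts to the sketch below; the thesis performs the equivalent step in Stata, and the data and column names here are purely illustrative.

```python
import pandas as pd

# Stand-in for the merged CRSP/IBES panel; illustrative data only.
df = pd.DataFrame({
    "ticker":  ["AAPL", "AAPL", "MSFT"],
    "analyst": ["SMITH", "JONES", "SMITH"],
    "date":    pd.to_datetime(["2005-01-03", "2005-01-03", "2005-01-04"]),
})

# Every (ticker, analyst) pair becomes a single panel unit.
df["tic_alys"] = df.groupby(["ticker", "analyst"]).ngroup()

# The data can then be treated as a two-dimensional panel indexed by
# (tic_alys, date) for the fixed effects regressions.
panel = df.set_index(["tic_alys", "date"]).sort_index()
```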
To answer the research question, the following hypotheses are developed:
H0: Analysts’ prediction error is not influenced by increased algorithmic trading.
H1: Analysts’ prediction error is influenced by increased algorithmic trading.
These hypotheses lead to the regressions below. It is expected that the analyst prediction error has increased in the period in which automation has taken place, as it seems unlikely that analysts can predict the direction of future stock prices when algorithms can execute transactions faster than they can. Nevertheless, it is hard to form a definite hypothesis, as algorithmic trading probably also leads to less dispersion, which could facilitate analyst predictions. For this reason, the hypothesis is two-sided; time t is in date format, per day. Testing analyst prediction error against algorithmic trading is the most direct way of examining the effect that algorithmic trading has on analyst forecast accuracy. As many other factors potentially affect forecast accuracy, sufficient control variables are added, and fixed or random effects are controlled for. To determine whether the regressions require fixed or random effects, the Hausman test is used again; it gives a significant outcome at the 99.99% confidence level, hence H0 is rejected and fixed effects are applied within the panel data regressions.
It follows that six different panel data regressions are tested within Stata to determine how prediction error is influenced. The first is a plain panel regression merely testing the effect of algorithmic trading on the analyst prediction error, whereas the remaining five are fixed effects panel data regressions that each control for a certain fixed effect. Regression (11) is the plain panel data regression; firm fixed effects are then added in (12) to see how firm-specific effects change the output compared to the plain model. Thirdly, year fixed effects are controlled for using year dummies to capture a time trend; comparing regression (13) with (12) delivers insight into the effect that time exerts on the dependent variable. Next, analyst fixed effects are controlled for in regression (14); again, by merely adding this factor to the model, it should become clear if and how the model is influenced by analyst-specific properties. By comparing the outcomes of these regressions, it should become clear if, how, and which fixed effects affect prediction error.
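Schematically (notation ours), a specification including all three sets of fixed effects can be written as:

$$PE_{itj} = \beta_0 + \beta_1\, AT_{tj} + \gamma' C_{tj} + \mu_j + \tau_t + \nu_i + \epsilon_{itj}$$

with $\mu_j$, $\tau_t$, and $\nu_i$ the firm, year, and analyst fixed effects respectively; regressions (11) through (14) start from the plain model and add these terms one at a time.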
Reference List
- Beja, A., & Goldman, M. B. (1980). Cited for the argument that a human-made market cannot be perfectly, mechanically efficient.
- Bello, Z. Y. (2008). Mentioned in relation to the comparability of the CAPM, the Fama-French three-factor model, and the Carhart model.
- Bessembinder, H., & Chan, K. (1997). Referenced for demonstrating the forecasting power of simple technical analysis in financial markets.
- Boehmer, E., Fong, K., & Wu, J. (2015). Referenced for their electronic-message-based proxy for algorithmic trading.
- Bouchaud, J. P., Farmer, J. D., & Lillo, F. (2008). Discussed in relation to short-run market unpredictability and the information carried by trades, order flow, and cancellations.
- Brock, W., Lakonishok, J., & LeBaron, B. (1992). Referenced for the definition of technical analysis.
- Chaboud, A. P., Chiquoine, B., Hjalmarsson, E., & Vega, C. (2014). Cited for their findings on price discovery and the diversity of automated trading strategies.
- Dreman, D., & Lufkin, E. (2000). Cited on the systematic overpricing of popular stocks and underpricing of less favored equities.
- Engle, R., Russell, J., & Ferstenberg, R. (2007). Referenced for their study of transaction cost dispersion using Morgan Stanley data.
- Fama, E. F. (1995). Cited in relation to the efficient market hypothesis and the random walk of stock prices.
- Glantz, M., & Kissel, R. (2013). Mentioned for their work on algorithmic trading and its share of market volume.
- Heiner, R. A. (1983). Referenced for the definition of behavioral biases as recurring, predictable patterns in human behavior.
- Hendershott, T., Jones, C. M., & Menkveld, A. J. (2011). Discussed in the context of measuring algorithmic trading and its effects on liquidity.
- Izumi, K., Toriumi, F., & Matsui, H. (2009). Cited for their evaluation of automated trading strategies using an artificial market.
- Johnson, N. F., Jefferies, P., & Hui, P. M. (2012). Referenced in relation to the technological speed race in financial markets.
- Kahneman, D. (2003). Mentioned for his work on bounded rationality and behavioral finance.
- Kaya, O. (2016). Referenced for data on high-frequency trading in equity markets.
- Kim, K. A., & Kim, T. S. (2014). Discussed regarding investor sentiment and its impact on market inefficiencies.
- Leshik, E., & Cralle, J. (2011). Referenced for their definition of an algorithm and the history of algorithmic trading.
- Litzenberger, R., Castura, J., & Gorelick, R. (2012). Cited on the improvement of market quality over the past decades.
- Lyle, M. R., & Naughton, J. P. (2015). Discussed, together with Weller (2015), in relation to algorithmic trading strategies and market quality.
- Malkiel, B. G., & Xu, Y. (2003). Mentioned in the context of measuring idiosyncratic risk.
- Menkhoff, L. (2010). Referenced for survey evidence that the vast majority of fund managers rely on technical analysis.
- Mitroi, A. (2014). Cited for the example of the Dubai stock market and the Burj Khalifa.
- Nuij, W., et al. (2014). Referenced for automating the incorporation of news feeds into stock trading strategies.
- Pesaran, M. H. (2003). Cited for his statement on stock return predictability.
- Shiller, R. J. (2003). Referenced for his views on behavioral finance and market inefficiencies.
- Van Vliet, B. (2007). Mentioned in the context of automated trading systems (ATS).
- Vasiliou, D., Eriotis, N., & Papathanasiou, S. (2008). Cited on moving averages and the improving utility of technical trading rules.