Exploring High Frequency Trading
Kapil Kapre | Thursday, Oct 11, 2013

High Frequency Trading has been around for thirteen-odd years. The benefits of HFT, and algorithmic trading (AT) in general, have always been contentious. The pro-HFT people say it increases market liquidity, reduces transaction costs, allows for faster price discovery, and other good stuff. The anti-HFT people point to events like the 2010 Flash Crash, where a major market dislocation disrupts HFT activity and liquidity dries up because the algorithms pull their orders as the risk profile changes. Anyway, that is not the focus of this article. I am mainly interested in the technical aspects of HFT.

Starting at the top: the volumes, strategies, margin per trade, risks, etc. in HFT are unlike those in regular equities trading. Even so, at the micro level, as in regular trading, people exploit some asymmetry over other traders to make money. All else being equal, that asymmetry usually comes down to the information you use to make decisions: you have more information than the others, you get the information faster, or you use the same information better. The information could be something as elusive as an accurate valuation, or market data, trends, insights, and so on - all of which get lumped together as signals. HFT algorithms consume these signals and execute trades automatically. The 'faster' part is where the technical aspects of HFT come in. As a person interested in low-level systems, I care deeply about latency, performance, and other such measures of computation, which makes HFT a topic of great interest to me. Let's look at the basics of how an order is executed.

[Figure: HFT system diagram - live feeds and signals flow into the strategy engine, which drives the order and risk management engines toward the exchange.]

The strategy engine takes in the intra-day live feeds from all the exchanges, as well as any external signals useful for making trade decisions. It may also need to perform error checking, arbitration, alpha adjustment, etc. when receiving data from multiple sources. Based on this data it generates orders and feeds them to the order engine, which sends them to the exchange. The order engine, along with the risk management engine, has discretion over splitting an order into multiple parts, calculating the PnL risk, timing the order, choosing routing strategies, and so on. It feeds risk information back to the strategy engine for future orders. The strategy engine also interacts with a database module that stores financial models and other related information.
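To make the moving parts concrete, here is a minimal C++ sketch of how these components might hang together. Every type and function name here is mine and purely illustrative - a real system speaks exchange-specific wire formats and is vastly more involved:

```cpp
#include <cstdint>

// Illustrative types only; real feeds use exchange-specific binary formats.
struct Tick  { std::uint32_t symbol; std::int64_t price; std::int64_t size; };
struct Order { std::uint32_t symbol; std::int64_t price; std::int64_t qty; bool buy; };

struct RiskEngine {
    // Stub: a real check would enforce position and PnL limits.
    bool approve(const Order&) const { return true; }
};

struct OrderEngine {
    RiskEngine& risk;
    void submit(const Order& o) {
        if (!risk.approve(o)) return;  // veto before the order leaves the box
        // ...split, time, and route the order to the exchange gateway...
    }
};

struct StrategyEngine {
    OrderEngine& orders;
    // Called on every tick of the consolidated live feed.
    void on_tick(const Tick& t) {
        // ...error-check, arbitrate between feeds, evaluate signals...
        // and, when a signal fires:
        orders.submit(Order{t.symbol, t.price, 100, true});
    }
};

int main() {
    RiskEngine risk;
    OrderEngine orders{risk};
    StrategyEngine strat{orders};
    strat.on_tick(Tick{1, 10050, 200});  // pretend a tick just arrived
}
```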

While it's crucial for the strategy engine to react to incoming pricing signals, some processing also has to happen offline. The time required to create strategies far exceeds the sub-millisecond response times that typical HFT trades exhibit, so strategies are pre-baked and subsequently fine-tuned against the pricing data. Of course, that is not the end of it. Just the act of executing a pre-determined order creates a new set of challenges. The problem: attempting to liquidate X units of an asset over period T carries a non-zero risk of moving the price against you during T, since the extra supply pushes the price down. This is an optimization problem where we have to choose the discrete quantities to sell at discrete time intervals so as to minimize execution costs.
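Here is a toy illustration of why the split matters, under the (big) assumption that selling x units in one interval incurs a quadratic temporary-impact cost of eta * x^2 and nothing else. Under that assumption, equal slices are optimal by convexity; the classic Almgren-Chriss treatment adds a price-risk term that tilts the schedule toward selling earlier. The numbers below are made up:

```cpp
#include <cstdio>
#include <vector>

// Toy model: selling x units in one interval costs eta * x^2 (temporary
// impact only). The cost of a schedule x_1..x_N with sum(x_i) = X is then
// minimized by equal slices x_i = X / N.
double impact_cost(const std::vector<double>& slices, double eta = 1e-6) {
    double cost = 0.0;
    for (double x : slices) cost += eta * x * x;
    return cost;
}

int main() {
    // Liquidate X = 100,000 units over N = 10 intervals.
    std::vector<double> uniform(10, 10000.0);  // 10k per slice
    std::vector<double> lumpy = {55000, 5000, 5000, 5000, 5000,
                                 5000,  5000, 5000, 5000, 5000};
    std::printf("uniform: %.2f  lumpy: %.2f\n",
                impact_cost(uniform), impact_cost(lumpy));
    // Prints: uniform: 1000.00  lumpy: 3250.00
    // Dumping half the position up front costs over 3x more in this model.
}
```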

Let's look at some of the technical aspects. To arrive at a ballpark figure, we make certain assumptions. A millisecond of response time on a CPU with a clock speed of 3GHz gives us (3 * 10^9 cycles/sec * 0.001s), or about 3 * 10^6 cycles to work with. A modern superscalar, pipelined CPU can stagger the {fetch,decode,execute,writeback} work as well as parallelize it, so let's assume an Instructions Per Cycle count of about 4. With that we have roughly 12,000,000 instructions to play with. If we look at a non-FPGA solution, we're going to lose a chunk of those instructions to the OS - drivers, memory management, I/O and other misc. overhead. Let's assume that to be around 15%, which leaves us roughly 10,200,000 instructions. So that seems like a reasonably large number. One way to blow through this time/instruction budget would be to simply sort a bunch of strings inside a std::vector. Yes, it's that easy!

A natural follow-up thought is: given that it's so easy to blow through this budget, what calculations could HFT algorithms possibly be performing that are so complicated? A trick question? Well, the fact is that they aren't performing any complicated calculations at all, because they can't. All the complicated calculations/models are pre-baked into the system. In some cases even the models aren't that complex. Statistical arbitrage is based on the simple concept of mining historical data and finding a correlation, with a high enough statistical confidence, between past price movements and the current one. Another generic arbitrage strategy is simply to track live data for relative temporal mispricing of securities - e.g. an ETF that tracks a particular index might lag behind the index, and there is 'free' money to be made if you're quick enough.
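To make the ETF example concrete, here is a sketch of what such a trigger might look like. The threshold, the notion of 'fair value', and every number here are illustrative assumptions of mine - a real desk would compute the basket value from the constituent feeds and net out fees and fair-value spreads before calling anything 'free' money:

```cpp
#include <cmath>
#include <cstdio>

struct ArbSignal { bool fire; bool buy_etf; };

// Fire when the ETF trades far enough from the index's implied fair value.
// threshold_bps is an illustrative guess, not a real calibration.
ArbSignal check_etf_vs_index(double etf_px, double index_fair_px,
                             double threshold_bps = 5.0) {
    double spread_bps = (etf_px - index_fair_px) / index_fair_px * 1e4;
    if (std::fabs(spread_bps) < threshold_bps) return {false, false};
    // ETF cheap relative to the basket -> buy ETF, sell the basket
    // (and vice versa), betting the gap closes.
    return {true, spread_bps < 0};
}

int main() {
    ArbSignal s = check_etf_vs_index(99.93, 100.00);  // ETF lags the index
    std::printf("fire=%d buy_etf=%d\n", s.fire, s.buy_etf);  // fire=1 buy_etf=1
}
```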

While HFT strategies themselves might employ interesting algorithms, the technical challenge in optimizing HFT infrastructure is primarily latency reduction more than anything - especially for arbitrage strategies. What are the main sources of latency? The OS runtime/scheduler, concurrent data processing, network driver latency, and general I/O latency. Almost all of these can be attacked using various techniques - FPGAs, TCP/IP Offload Engines, GPGPUs, etc. Some recent products even integrate the CPU with the network card, letting it process data while the TCP/IP packet is being assembled.
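Even without exotic hardware, there are userland tricks for the scheduler-latency item. A common one is to pin the hot thread to a dedicated core and busy-poll instead of blocking, trading a 100%-utilized core for microseconds of wakeup latency. A minimal sketch, assuming Linux (sched_setaffinity is Linux-specific) and a core you can afford to burn; in reality the data would arrive via a kernel-bypass NIC, not an atomic flag:

```cpp
#include <atomic>
#include <sched.h>  // sched_setaffinity: Linux-specific

std::atomic<bool> tick_ready{false};

// Pin the calling thread to one core so the scheduler never migrates it.
void pin_to_core(int core) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    sched_setaffinity(0, sizeof(set), &set);  // pid 0 == calling thread
}

// Busy-poll instead of sleeping on a condition variable: no wakeup latency.
// Run this on its own thread; it deliberately never returns.
void hot_loop() {
    pin_to_core(3);  // ideally a core isolated from general OS work
    for (;;) {
        while (!tick_ready.load(std::memory_order_acquire)) { /* spin */ }
        tick_ready.store(false, std::memory_order_relaxed);
        // ...handle the tick within the instruction budget...
    }
}
```

I am also interested in the algorithmic side of HFT. Maybe someday I'll set aside a chunk of time and go research that.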