How fast is state of the art HFT trading systems today?

I'm the CTO of a small company that makes and sells FPGA-based HFT systems. Building our systems on-top of the Solarflare Application Onload Engine (AOE) we have been consistently delivering latency from an "interesting" market event on the wire (10Gb/S UDP market data feed from ICE or CME) to the first byte of the resultant order message hitting the wire in the 750 to 800 nanosecond range (yes, sub-microsecond). We anticipate that our next version systems will be in the 704 to 710 nanosecond range. Some people have claimed slightly less, but that's in a lab environment and not actually sitting at a COLO in Chicago and clearing the orders.

The comments about physics and "speed of light" are valid but not relevant. Everybody that is serious about HFT has their servers at a COLO in the room next to the exchange's server.

To get into this sub-microsecond domain you cannot do very much on the host CPU except feed strategy implementation commands to the FPGA, even with technologies like kernel bypass you have 1.5 microseconds of unavoidable overhead... so in this domain everything is playing with FPGAs.

One of the other answers is very honest in saying that in this highly secretive market very few people talk about the tools they use or their performance. Every one of our clients requires that we not even tell anybody that they use our tools nor disclose anything about how they use them. This not only makes marketing hard, but it really prevents the good flow of technical knowledge between peers.

Because of this need to get into exotic systems for the "wicked fast" part of the market you'll find that the Quants (the folks that come up with the algorithms that we make go fast) are dividing their algos into event-to-response time layers. At the very top of the technology heap are the sub-microsecond systems (like ours). The next layer are the custom C++ systems that make heavy use of kernel bypass and they're in the 3-5 microsecond range. The next layer are the folks that cannot afford to be on a 10Gb/S wire only one router hop from the "exchange", they may be still at COLO's but because of a nasty game we call "port roulette" they're in the dozens to hundreds of microsecond domain. Once you get into milliseconds it's almost not HFT any more.

Cheers


You've received very good answers. There's one problem, though - most algotrading is secret. You simply don't know how fast it is. This goes both ways - some may not tell you how fast they work, because they don't want to. Others may, let's say "exaggerate", for many reasons (attracting investors or clients, for one).

Rumors about picoseconds, for example, are rather outrageous. 10 nanoseconds and 0.1 nanoseconds are exactly the same thing, because the time it takes for the order to reach the trading server is so much more than that.

And, most importantly, although not what you've asked, if you go about trying to trade algorithmically, don't try to be faster, try to be smarter. I've seen very good algorithms that can handle whole seconds of latency and make a lot of money.


Good article which describes what is the state of HFT (in 2011) and gives some samples of hardware solutions which makes nanoseconds achievable: Wall Streets Need For Trading Speed: The Nanosecond Age

With the race for the lowest “latency” continuing, some market participants are even talking about picoseconds–trillionths of a second.

EDIT: As Nicholas kindly mentioned:

The link mentions a company, Fixnetix, which can "prepare a trade" in 740ns (i.e. the time from an input event occurs to an order being sent).


"sub-40 microseconds" if you want to keep up with Nasdaq. This figure is published here http://www.nasdaqomx.com/technology/


For what its worth, TIBCO's FTL messaging product is sub-500 ns for within a machine (shared memory) and a few micro seconds using RDMA (Remote Direct Memory Access) inside a data center. After that, physics becomes the main part of the equation.

So that is the speed at which data can get from the feed to the app that makes decisions.

At least one system has claimed ~30ns interthread messaging, which is probably a tweaked up benchmark, so anyone talking about lower numbers is using some kind of magic CPU.

Once you are in the app, it is just a question of how fast the program can make decisions.