How One Piece of Hardware Took Down a $6 Trillion Stock Market

An anonymous reader quotes a report from Bloomberg on how a data storage and distribution device brought down Tokyo’s $6 trillion stock market: At 7:04 a.m. on an autumn Thursday in Tokyo, the stewards of the world’s third-largest equity market realized they had a problem. A data device critical to the Tokyo Stock Exchange’s trading system had malfunctioned, and the automatic backup had failed to kick in. It was less than an hour before the system, called Arrowhead, was due to start processing orders in the $6 trillion equity market. Exchange officials could see no solution. The full-day shutdown that ensued was the longest since the exchange switched to a fully electronic trading system in 1999. It drew criticism from market participants and authorities and shone a spotlight on a lesser-discussed vulnerability in the world’s financial plumbing — not software or security risks but the danger when one of hundreds of pieces of hardware that make up a trading system decides to give up the ghost.

The TSE’s Arrowhead system launched to much fanfare in 2010, billed as a modern-day solution after a series of outages on an older system embarrassed the exchange in the 2000s. The “arrow” symbolizes speed of order processing, while the “head” suggests robustness and reliability, according to the exchange. The system of roughly 350 servers that process buy and sell orders had had a few hiccups but no major outages in its first decade. That all changed on Thursday, when a piece of hardware called the No. 1 shared disk device, one of two square-shaped data-storage boxes, detected a memory error. These devices store management data used across the servers, and distribute information such as commands and ID and password combinations for terminals that monitor trades. When the error happened, the system should have carried out what’s called a failover — an automatic switching to the No. 2 device. But for reasons the exchange’s executives couldn’t explain, that process also failed. That had a knock-on effect on servers called information distribution gateways that are meant to send market information to traders.
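The failover behavior described above can be sketched in a few lines. This is a generic illustration of an active/standby pair, not the Arrowhead design or the vendor's implementation: the names `SharedDisk`, `FailoverPair`, `DeviceError`, and the `auto_failover` flag are all hypothetical, and the "memory error" is simulated with a boolean.

```python
from dataclasses import dataclass


class DeviceError(Exception):
    """Raised when a device cannot serve a request (hypothetical)."""


@dataclass
class SharedDisk:
    """Stand-in for one of the two shared disk devices (hypothetical model)."""
    name: str
    healthy: bool = True

    def read(self, key: str) -> str:
        # Simulate the hardware fault with a simple flag.
        if not self.healthy:
            raise DeviceError(f"{self.name}: memory error")
        return f"{key} served by {self.name}"


class FailoverPair:
    """Routes reads to the active device and swaps to the standby on error."""

    def __init__(self, primary: SharedDisk, standby: SharedDisk,
                 auto_failover: bool = True):
        self.active = primary
        self.standby = standby
        self.auto_failover = auto_failover  # assumption: a single on/off switch

    def read(self, key: str) -> str:
        try:
            return self.active.read(key)
        except DeviceError:
            if not self.auto_failover:
                # No switchover happens: the error reaches every caller.
                raise
            # Swap roles, then retry once on the former standby.
            self.active, self.standby = self.standby, self.active
            return self.active.read(key)


no1 = SharedDisk("No. 1")
no2 = SharedDisk("No. 2")
pair = FailoverPair(no1, no2)

no1.healthy = False  # simulate the memory error on the No. 1 device
print(pair.read("terminal-credentials"))  # terminal-credentials served by No. 2
```

If the switchover itself does not happen (modeled here by `auto_failover=False`, or by both devices being unhealthy), the error propagates to every server that depends on the pair, which is the kind of knock-on effect the report describes for the information distribution gateways.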

At 8 a.m., traders preparing at their desks for the market open an hour later should have been seeing indicative prices on their terminals as orders were processed. But many saw nothing, while others reported seeing data appearing and disappearing. They had no idea if the information was accurate. At 8:36 a.m., the bourse finally informed securities firms that trading would be halted. Three minutes later, it issued a press release on its public website — although only in Japanese. A confusingly translated English release wouldn’t follow for more than 90 minutes. It was the first time in almost fifteen years that the exchange had suffered a complete trading outage. The Tokyo bourse has a policy of not shutting even during natural disasters, so for many on trading floors in the capital, this experience was a first.

After trading was called off for the day, four TSE executives held a press conference, “discussing areas such as systems architecture in highly technical terms,” reports Bloomberg. “They also squarely accepted responsibility for the incident, rather than trying to deflect blame onto the system vendor Fujitsu Ltd.”

One of the biggest questions that remains unanswered is whether the same kind of hardware-driven failure could happen in other stock markets. “There’s nothing uniquely Japanese about this,” said Nicholas Smith of CLSA Ltd. in Tokyo. “I think we’ve just got to put that in the box of ‘stuff happens.’ These things happen. They shouldn’t, but they do.”