Google’s scalable supercomputers now publicly available

08 May 2019

In what it says is a bid to accelerate the largest-scale machine learning (ML) applications deployed today, Google has opened up its supercomputers.

The global tech giant has created silicon chips called Tensor Processing Units (TPUs). When assembled into multi-rack ML supercomputers called Cloud TPU Pods, they can complete in minutes or hours ML workloads that previously took days or weeks on other systems.

Now, Google Cloud TPU v2 Pods and Cloud TPU v3 Pods are publicly available in beta to help ML researchers, engineers, and data scientists iterate faster and train more capable machine learning models.

“Google Cloud is committed to providing a full spectrum of ML accelerators, including both Cloud GPUs and Cloud TPUs. Cloud TPUs offer highly competitive performance and cost, often training cutting-edge deep learning models faster while delivering significant savings,” says Google Brain Team Cloud TPUs senior product manager Zak Stone.

The benefits for ML teams building complex models and training on large data sets, Stone says, include shorter time to insight, higher accuracy, frequent model updates, and rapid prototyping.

“While some custom silicon chips can only perform a single function, TPUs are fully programmable, which means that Cloud TPU Pods can accelerate a wide range of state-of-the-art ML workloads, including many of the most popular deep learning models,” says Stone.

“Cloud TPU customers see significant speed-ups in workloads spanning visual product search, financial modeling, energy production, and other areas. In a recent case study, Recursion Pharmaceuticals iteratively tests the viability of synthesized molecules to treat rare illnesses. What took over 24 hours to train on their on-prem cluster completed in only 15 minutes on a Cloud TPU Pod.”

According to Stone, a single Cloud TPU Pod can contain more than 1,000 individual TPU chips, which are connected by an ultra-fast, two-dimensional toroidal mesh network. The TPU software stack uses this mesh network to let many racks of machines be programmed as a single, giant ML supercomputer via a variety of flexible, high-level APIs.
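To make the "two-dimensional toroidal mesh" idea concrete, the sketch below models chips as points on a grid whose edges wrap around, so every chip has exactly four neighbours. The 32 x 32 grid size and the function names `torus_neighbors` and `torus_hops` are assumptions chosen for illustration; the article says only that a Pod can hold more than 1,000 chips and does not describe the actual layout or routing.

```python
def torus_neighbors(x, y, width, height):
    """Return the four neighbours of chip (x, y) on a width x height 2D torus.

    The modulo makes the grid wrap around, so edge chips link to the
    opposite edge instead of having fewer neighbours.
    """
    return [
        ((x - 1) % width, y),   # left (wraps to the right edge)
        ((x + 1) % width, y),   # right (wraps to the left edge)
        (x, (y - 1) % height),  # up (wraps to the bottom edge)
        (x, (y + 1) % height),  # down (wraps to the top edge)
    ]


def torus_hops(a, b, width, height):
    """Minimum number of links between chips a and b on the torus."""
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    # In each dimension, take the shorter way around the ring.
    return min(dx, width - dx) + min(dy, height - dy)


if __name__ == "__main__":
    # Hypothetical 32 x 32 layout (1,024 chips), for illustration only.
    WIDTH, HEIGHT = 32, 32
    print(torus_neighbors(0, 0, WIDTH, HEIGHT))
    print(torus_hops((0, 0), (31, 31), WIDTH, HEIGHT))
```

In this model the wraparound links mean opposite corners of the grid are only two hops apart, while the farthest pair of chips on a 32 x 32 torus is 32 hops apart, roughly half the worst case of a grid without wraparound.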

“The latest-generation Cloud TPU v3 Pods are liquid-cooled for maximum performance, and each one delivers more than 100 petaFLOPs of computing power. In terms of raw mathematical operations per second, a Cloud TPU v3 Pod is comparable with a top 5 supercomputer worldwide (though it operates at lower numerical precision),” says Stone.

“It’s also possible to use smaller sections of Cloud TPU Pods called ‘slices.’ We often see ML teams develop their initial models on individual Cloud TPU devices (which are generally available) and then expand to progressively larger Cloud TPU Pod slices via both data parallelism and model parallelism to achieve greater training speed and model scale.”
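Data parallelism, one of the two scaling approaches Stone mentions, can be sketched in a few lines: each worker computes gradients on its own shard of a batch, and the results are combined before a single shared update. The example below is a plain-Python illustration under stated assumptions (a one-parameter linear model, a squared-error loss, and the hypothetical helper names `shard`, `local_gradient` and `data_parallel_step`); it is not Cloud TPU code and does not use any TPU API.

```python
def shard(batch, num_workers):
    """Split a batch into num_workers contiguous, near-equal shards."""
    size, extra = divmod(len(batch), num_workers)
    shards, start = [], 0
    for i in range(num_workers):
        end = start + size + (1 if i < extra else 0)
        shards.append(batch[start:end])
        start = end
    return shards


def local_gradient(w, examples):
    """Summed gradient of 0.5 * (w * x - y) ** 2 with respect to w."""
    return sum((w * x - y) * x for x, y in examples)


def data_parallel_step(w, batch, num_workers, lr):
    """One gradient-descent step with the batch split across workers."""
    shards = shard(batch, num_workers)
    # On real hardware each shard would be processed concurrently.
    grads = [local_gradient(w, s) for s in shards]
    # Summing the per-worker gradients stands in for an all-reduce.
    total = sum(grads)
    return w - lr * total / len(batch)


if __name__ == "__main__":
    # Toy data drawn from y = 2x, for illustration only.
    batch = [(1, 2), (2, 4), (3, 6), (4, 8)]
    print(data_parallel_step(0.0, batch, num_workers=1, lr=0.1))
    print(data_parallel_step(0.0, batch, num_workers=2, lr=0.1))
```

Both calls produce the same updated weight, which is the point of the technique: adding workers changes how the gradient work is divided, not the result of the step. Model parallelism, by contrast, splits the model itself across devices and is not shown here.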