Google could have needed about 12 new data centers, but a small chip prevented this. The search giant built the chip, called a Tensor Processing Unit (TPU), to meet its massive computing demands for running deep neural networks.
Why such a chip was needed
Four years ago, when the search giant forayed into voice recognition on Android phones, a major concern was the flood of incoming voice commands: its data centers had no capacity to handle it. If every Android phone user issued voice commands for just three minutes, the company would need double its current number of data centers, it stated.
Instead of building more data centers, the company focused its efforts on developing hardware to handle such loads, and the TPU is the result. Compared to standard processors, it performs 30 to 80 times better in TOPS/Watt (a measure of energy efficiency).
“It makes sense to have a solution there that is much more energy efficient,” says Norm Jouppi, one of the more than 70 engineers who worked on the chip.
The custom chip saved the company from building 15 new data centers, according to Wired.
What is a TPU?
The world first learned about the ASIC in May 2016. Google's TPU delivers 15 to 30 times higher speed and 30 to 80 times better performance per watt than NVIDIA's K80 GPU and Intel's Haswell CPU, notes EE Times.
Jouppi and his team have released a paper that explains the project in detail, including how the chip operates and the problems it solves. The new AI chip is used exclusively for executing neural networks, such as when a user issues a voice command on an Android phone.
Google tested its chips on six different neural-network applications, representing 95% of all such applications in its data centers. Among them was DeepMind's AlphaGo, the system that defeated Lee Sedol at the game of Go in a five-game match last year, notes PC World. Further, the TPU was tested against hardware launched in roughly the same time period to create a level playing field for comparison.
The 40-W TPU is a 28-nm chip running at 700 MHz, customized to accelerate Google's TensorFlow framework. The chip's main logic unit contains 65,536 8-bit multiply-accumulate units and a 24-Mbyte cache, delivering 92 tera-operations/second.
Google considered FPGAs too?
Jouppi stated that Google initially considered running its neural networks on FPGAs, the type of chip used at Microsoft, because they would not have taken long to deploy and could have been reprogrammed for other tasks as needed. However, initial tests showed that those chips would not deliver the required speedup.
“There’s a lot of overhead with programmable chips,” Jouppi says; FPGAs would have been no better than GPUs in terms of speed.
Google is already using TPUs in its data centers, but it has not disclosed how broadly they are deployed or what enhancements are planned.