Google’s Making Its Own Chips Now. Time for Intel to Freak Out
Monday, 17 October 2016
Posted by ARM Servers
The Internet’s most powerful company sent a few shock waves through the tech world
yesterday when it revealed that a new custom-designed chip helps run what is
surely the future of its vast online empire: artificial intelligence.
In building its own chip, Google has taken yet another step along a path that has
already remade the tech industry in enormous ways. Over the past decade, the
company has designed all sorts of new hardware for the massive data centers
that underpin its myriad online services, including computer servers,
networking gear, and more. As it created services of unprecedented scope and
size, it needed a more efficient breed of hardware to run these services. Over
the years, so many other Internet giants have followed suit, forcing a seismic
shift in the worldwide hardware market.
With its new chip, Google’s aim is the same: unprecedented efficiency. To take AI to
new heights, it needs a chip that can do more in less time while consuming less
power. But the effect of this chip extends well beyond the Google empire. It
threatens the future of commercial chip makers like Intel and
nVidia—particularly when you consider Google’s vision for the future. According
to Urs Hölzle, the man most responsible for the global data center network that
underpins the Google empire, this new custom chip is just the first of many.
No, Google will not sell its chips to other companies. It won’t directly compete
with Intel or nVidia. But with its massive data centers, Google is by far the
largest potential customer for both of those companies. At the same time, as
more and more businesses adopt the cloud computing services offered by Google,
they’ll be buying fewer and fewer servers (and thus chips) of their own, eating
even further into the chip market.
Indeed, Google revealed its new chip as a way of promoting the cloud services that let
businesses and coders tap into its AI engines and build them into their own
applications. As Google tries to sell other companies on the power of its AI,
it’s claiming—in rather loud ways—that it boasts the best hardware for running
this AI, hardware that no other company has.
Google’s Need for Speed
Google’s new chip is called the Tensor Processing Unit, or TPU. That’s because it helps
run TensorFlow, the software engine that drives Google’s deep neural
networks, networks of hardware and software that can learn particular tasks by
analyzing vast amounts of data. Other tech giants typically run their deep
neural nets with graphics processing units, or GPUs—chips that were originally
designed to render images for games and other graphics-heavy applications.
These are well-suited to running the types of calculations that drive deep
neural networks. But Google says it has built a chip that’s even more
efficient.
According to Google, it tailored the TPU specifically to machine learning so that it
needs fewer transistors to run each operation. That means it can squeeze more
operations into the chip with each passing second.
For now, Google is using both TPUs and GPUs to run its neural nets. Hölzle declined
to go into specifics on how exactly Google was using its TPUs, except to say
that they handle “part of the computation” needed to drive voice recognition on
Android phones. But he said that Google would be releasing a paper describing
the benefits of its chip and that Google will continue to design new chips that
handle machine learning in other ways. Eventually, it seems, this will push
GPUs out of the equation. “They’re already going away a little,” Hölzle says.
“The GPU is too general for machine learning. It wasn’t actually built for
that.”
That’s not something nVidia wants to hear. As the world’s primary seller of GPUs,
nVidia is now pushing to expand its own business into the AI realm. As Hölzle
points out, the latest nVidia GPU offers a mode specifically for machine
learning. But clearly, Google wants the change to happen faster. Much faster.
The Smartest Chip
In the meantime, other companies, most notably Microsoft, are exploring another
breed of chip. The field-programmable gate array, or FPGA, is a chip you can
re-program to perform specific tasks. Microsoft has tested FPGAs with machine
learning, and Intel, seeing where this market was going, recently acquired a
company that sells FPGAs.
Some analysts think that’s the smarter way to go. An FPGA provides far more
flexibility, says Patrick Moorhead, the president and principal analyst at Moor
Insights and Strategy, a firm that closely follows the chip business. Moorhead
wonders if the new Google TPU is “overkill,” pointing out that such a chip
takes at least six months to build, a long time in the incredibly competitive
marketplace where the biggest Internet companies operate.
But Google doesn’t want that flexibility. More than anything, it wants speed. Asked
why Google built its chip from scratch rather than using an FPGA, Hölzle said:
“It’s just much faster.”
Core Business
Hölzle also points out that Google’s chip doesn’t replace CPUs, the central processing
units at the heart of every computer server. The search giant still needs these
chips to run the tens of thousands of machines in its data centers, and CPUs
are Intel’s main business. Still, if Google is willing to build its own chips
just for AI, you have to wonder if it would go so far as to design its own CPUs
as well.
Hölzle plays down the possibility. “You want to solve problems that are not solved,”
he says. In other words, CPUs are a mature technology that pretty much works as
it should. But he also said that Google wants healthy competition in the chip
market. That is, it wants to buy from many sellers, not just, say, Intel.
After all, more competition means lower prices for Google. As Hölzle explains,
expanding its options is why Google is working with the OpenPower Foundation,
which seeks to offer chip designs that anyone can use and modify.
That’s a powerful idea, and a potentially powerful threat to the world’s biggest chip
makers. According to Shane Rau, an analyst with research firm IDC, Google buys
about 5 percent of all server CPUs sold on Earth. Over a recent year-long
period, he says, Google bought about 1.2 million chips. And most of those
likely came from Intel. (In 2012, Intel exec Diane Bryant told WIRED that
Google bought more server chips from Intel than all but five other
companies—and those were all companies that sell servers.)
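Taken together, the two IDC figures quoted above imply a rough size for the whole server CPU market. The Python sketch below works through that arithmetic. It assumes the 5 percent share and the 1.2 million chips describe the same year-long period, which the article does not state outright, and the function name is purely illustrative.

```python
def implied_market_size(units_bought: int, share_percent: int) -> int:
    """Estimate total units sold from one buyer's purchases and market share.

    Integer arithmetic keeps the result exact for whole-number percentages.
    """
    if share_percent <= 0:
        raise ValueError("share_percent must be positive")
    return units_bought * 100 // share_percent


# Figures attributed to IDC in the article: about 1.2 million chips,
# about 5 percent of all server CPUs sold.
total = implied_market_size(1_200_000, 5)
print(f"Implied server CPU market: about {total:,} chips per year")
```

On those assumptions, the whole market comes to roughly 24 million server CPUs a year, which gives a sense of how much volume a single buyer of Google's size represents.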
Whatever its plans for the CPU, Google will continue to explore chips specifically
suited to machine learning. It will be several years before we really know what
works and what doesn’t. After all, neural networks are constantly evolving as
well. “We’re learning all the time,” Hölzle says. “It’s not clear to me what the
final answer is.” And as it learns, you can bet that the world’s chip makers
will be watching.
New Cavium ThunderX2 adopts 64-bit ARM-based servers to address application and workload requirements
Friday, 22 July 2016
Posted by ARM Servers
Semiconductor vendor Cavium announced Monday ThunderX2, its second generation of
workload-optimized ARM server SoCs, which targets high-performance volume servers
deployed by public and private cloud and telecommunications data centers, as well
as high-performance computing applications. It is optimized for data center workloads
such as compute, security, storage, data analytics, network function
virtualization and distributed databases.
The ThunderX2 line currently includes four workload-optimized processor families, each targeting a different class of workload.
The ThunderX2_CP has been optimized for cloud compute workloads such as private and public clouds, web serving, web caching and web search, and for commercial HPC workloads such as computational fluid dynamics (CFD) and reservoir modeling. This family supports multiple 10/25/40/50/100 GbE network interfaces and PCIe Gen3 interfaces. It also includes accelerators for virtualization and vSwitch offload.
The ThunderX2_ST has been optimized for big data, cloud storage, massively parallel processing (MPP) databases and data warehousing workloads. This family supports multiple 10/25/40/50/100 GbE network interfaces, PCIe Gen3 interfaces and SATAv3 interfaces. It also includes hardware accelerators for data protection, integrity and security, and for efficient user-to-user data movement.
The ThunderX2_SC has been optimized for secure web front ends, security appliances and cloud RAN workloads. This family supports multiple 10/25/40/50/100 GbE interfaces and PCIe Gen3 interfaces. Integrated hardware accelerators include Cavium’s industry-leading, fifth-generation NITROX security technology with acceleration for IPSec, RSA and SSL.
The ThunderX2_NT has been optimized for media servers, scale-out embedded applications and NFV workloads. This family supports multiple 10/25/40/50/100 GbE interfaces. It also includes OCTEON-style hardware accelerators for packet parsing, shaping, lookup, QoS and forwarding.
“The Cavium ThunderX2 will expand the market opportunity for ARM-based server technologies by addressing demanding application and workload requirements for compute, storage networking and security,” said Simon Segars, CEO, ARM. “ThunderX2 demonstrates Cavium’s ability to deliver a combination of innovation and engineering execution and the new product family increases the momentum for server deployments powered by ARM processors in large scale data centers and end user environments.”
Cavium’s ThunderX2 SoC line is supported by a comprehensive software ecosystem ranging from platform level systems management and firmware to commercial operating systems, development environments and applications.
Cavium has actively engaged in server industry standards groups such as UEFI and delivered numerous reference platforms to an array of community and corporate partners. Cavium has also demonstrated its position in the open source software community by driving upstream kernel enablement for ThunderX, actively contributing to Linaro’s enterprise and networking groups, investing in Linux Foundation projects such as Xen and OPNFV, and sponsoring the FreeBSD Foundation’s ARMv8 server implementation.
According to Cavium, ThunderX2 will deliver two to three times the performance of ThunderX across a range of standard benchmarks and applications. It also extends the market reach of the ThunderX line to applications that require high single-thread performance, such as web search and graph analytics, to enterprise applications such as massively parallel processing (MPP) databases and data warehousing, and to enterprise HPC applications such as computational fluid dynamics (CFD) and reservoir modeling. Cavium says ThunderX2 will deliver comparable performance at a better total cost of ownership than the next generation of traditional server processors.
Cavium Rolls Out ThunderX Servers with GIGABYTE Technology
Wednesday, 20 July 2016
Posted by ARM Servers
Today GIGABYTE Technology and Cavium announced a new set of servers built on the
industry-leading ThunderX family of workload-optimized ARM server SoCs.
According to Cavium, the collaboration brings the world’s most powerful 64-bit
ARM-based servers to market to address increasingly demanding application and
workload requirements.
“The momentum for ARM-based servers is building and the new range of server products from GIGABYTE and Cavium enhances choice for companies seeking to match compute needs with the most energy and cost-effective solutions,” said Lakshmi Mandyam, senior marketing director of server program, ARM. “It is excellent to see ARM partners at the heart of driving innovative solutions that are delivering to the rigorous demands of cloud data center application and workload diversity.”
The server launch comes on the heels of news that ARM is being acquired by Japan’s SoftBank investment group. At a joint event in Shanghai, GIGABYTE and Cavium officially announced a range of 14 server SKUs, the result of cooperation that pairs the Cavium ThunderX platform with GIGABYTE’s almost 20 years of experience in the server industry. The companies say the partnership has produced a compelling, high-performance alternative to the incumbent solutions in the market.
Ecosystem partners ARM, Innodisk, Linaro, QLogic, Red Hat, and SUSE, all of which have committed resources to bringing ARM-based servers to the mainstream enterprise market, were guest speakers at the event. GIGABYTE and Cavium are working with these stakeholders to bring higher performance per dollar to the server market and open up a range of potential new applications.
This solution targets high-performance volume servers deployed by public and private cloud and telco data centers. It is optimized for key data center workloads including compute, security, storage, and distributed databases. According to the companies, GIGABYTE ThunderX servers deliver comparable performance at a more compelling TCO than traditional x86 server systems.
Key GIGABYTE ThunderX Server Features:
- Adoption of the first dual-socket ARM SoC architecture that scales up to 48 cores per processor with up to 2.0 GHz core frequency
- The highest integrated I/O capability with up to 160Gb of I/O bandwidth
- Four DDR4 72-bit memory controllers capable of supporting up to 1TB of memory in a dual-socket configuration at 2133MHz
- Best-in-class performance per watt and performance per dollar for storage and compute applications
- A comprehensive range of designs, from cost-focused entry-level solutions to high-density storage and compute-focused platforms
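A couple of derived numbers follow from the headline figures in the feature list. The Python sketch below is a back-of-the-envelope calculation, not a vendor specification. It assumes that "2133MHz" means DDR4-2133 (2133 million transfers per second), that each 72-bit controller carries 64 data bits plus 8 ECC bits, and that the four controllers are per processor. The announcement states none of these explicitly, and the function names are illustrative.

```python
def total_cores(cores_per_processor: int = 48, sockets: int = 2) -> int:
    """Maximum core count of a fully populated system."""
    return cores_per_processor * sockets


def peak_memory_bandwidth_mb_s(
    mega_transfers_per_s: int = 2133,  # assumed: DDR4-2133
    data_bits: int = 64,               # assumed: 72-bit bus = 64 data + 8 ECC
    controllers: int = 4,              # assumed: four controllers per processor
) -> int:
    """Theoretical peak memory bandwidth per processor in MB/s (ECC bits excluded)."""
    bytes_per_transfer = data_bits // 8
    return mega_transfers_per_s * bytes_per_transfer * controllers


print("Cores in a dual-socket system:", total_cores())
print("Peak memory bandwidth per processor (MB/s):", peak_memory_bandwidth_mb_s())
```

These are theoretical ceilings. Sustained bandwidth on real workloads will be lower, and the result changes if any of the assumptions above does not match the shipping hardware.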
“GIGABYTE has developed and is already shipping a range of Cavium ThunderX-based server products to customers in US, Europe and Asia,” said Andy Chen, AVP, Network and Communications Business Unit, GIGABYTE. “Our comprehensive portfolio of ThunderX-based systems is available for order and a number of customers have already received production units. We are seeing strong demand for these ARM-based platforms – especially from cloud service providers.”