In
the ideal hyperscaler and cloud world, there would be one processor
type with one server configuration and it would run any workload that
could be thrown at it. Earth is not an ideal world, though, and it takes
different machines to run different kinds of workloads.
In fact, if Google is any measure –
and we believe that it is – then the number of different types of compute that
need to be deployed in the datacenter to run an increasingly diverse
application stack is growing, not shrinking. It is the end of the General
Purpose Era, which began in earnest during the Dot Com Boom and which started
to fade a few years ago even as Intel locked up the datacenter with its Xeons.
It is the beginning of the Cambrian Compute Explosion era, which got rolling as
Moore’s Law improvements in compute hit a wall. Something had to give, and it
was the ease of use and volume economics that come from the homogeneity enabled
by using only a few SKUs of the X86 processor.
Bart Sano is the lead of the
platforms team inside of Google, which has been around almost as long as Google
itself, but Sano has only been at the company for ten years. Sano reports to
Urs Hölzle, senior vice president of technical infrastructure at the search
engine and cloud giant, and is responsible for the design of the warehouse-scale
computers, including the datacenters themselves, everything inside of them
including compute and storage, and the network hardware and homegrown software
that interconnects them.
As 2016 wound down, Google made
several hardware announcements, including bringing GPUs from Nvidia and AMD to
Cloud Platform, and also that it would be rolling out “Skylake” Xeon processors
on Cloud Platform ahead of the official Intel launch later this year and that
certain machine learning services on Cloud Platform are running on its custom
Tensor Processing Unit (TPU) ASICs or GPUs. In the wake of these announcements, The
Next Platform sat
down with Sano to have a chat about Google’s hardware strategies, and more
specifically about how the company leverages the technology it has created for
search engines, ad serving, media serving, and other aspects of the Google
business for the public cloud.
Timothy Prickett Morgan: How big of a deal are the Skylake Xeons? We think they
might be the most important processor to come out of Intel since the “Nehalem”
Xeon 5500s way back in 2009.
Bart Sano: We
are really excited about deploying Skylake, because it is a material difference
for our end customers, who are going to benefit a lot from the higher
performance, the virtualization it provides in the cloud environment, and the
computational enhancements that it has in the instruction set for SIMD processing
for numerical computations. Skylake is an important improvement for the cloud.
Again, I think the broader context here is that Google is really committed to
working on and providing the best infrastructure not only for Google, but also
to the benefit of our cloud. We are trying to ensure that cloud customers
benefit from all of the efforts inside of Google, including machine learning
running on GPUs and TPUs.
TPM: Google
was a very enthusiastic user of AMD Opterons the first time around, and I have seen the motherboards because Urs showed them to me,
but to my way of thinking about this, 2017 is one of the most interesting years
for processing and coprocessing that we have seen in a long, long time. It is a
Cambrian Explosion of sorts. So the options are there, and clearly Google has
the ability to design and have others build systems and put your software on
lots of different things. It is obvious that Skylake is the easiest thing for
Google to endorse and move quickly to. But has Google made any commitment to
any of these other architectures? We know about Power9 and the work Google is
doing there, but has Google said it will add Zen Opterons into the mix, or is
it just too early for that?
Bart Sano: I
can say that we are committed to the choice of these different architectures,
including X86 – and that includes AMD – as well as Power and ARM. The principle
that we are investing in heavily is that competition breeds innovation, which
directly benefits our end customers. And you are right, this year is going to
be a very interesting year. There are a lot of technologies coming out, and
there will be a lot of interesting competition.
TPM: It
is easy for me to conceive of how some of these other technologies might be
used by Google itself, but it is harder for me to see how you get cloud
customers on board at an infrastructure level with some of these alternatives
like Power and ARM because they have to get their binaries ported and tuned for
them. What is the distinction between the timing for a technology that will be
used by Google internally and one that will be used by Cloud Platform?
Bart Sano: Our
end goal is that whatever technology that we are going to bring forward to the
benefit of Google we will bring to bear on the cloud. We view cloud as just
another product pillar of Google. So if something is available to ads or search
or whatever, it will be available to cloud. Now, you are right, not all of the
binaries will be highly optimized, but as it relates to Google’s binaries, our
intention is to make all of these architectures equally competitive. You are
right, there is obviously a lag in the porting efforts on this software. But
our ultimate goal is to get all of them on equal footing.
TPM: We
know that Intel has already done an early release of the Skylake Xeons, and we
assume that some HPC shops and other hyperscalers and cloud builders like
Google have early access to these processors already. So my guess is that you
have been playing in the labs with Skylakes since maybe September or October
last year, tops. When do you deploy internally at Google with Skylake and when
do you deploy to the cloud?
Bart Sano: I
can’t speak to the specifics internally, but what I can say is that the cloud
will have Skylake in early 2017. That is all that I can really say with
precision. But you would assume that we have had these in the labs and that we
did a lot of testing before we made the announcement.
TPM: My
guess is that Intel will launch in June or July, and that you can’t have them
much before March in production on GCP, and that January or February of this
year is just not possible…
Bart Sano: We
could make some bets. [Laughter]
TPM: Are
you doing special SKUs of Skylake Xeons, or do you use stock CPUs?
Bart Sano: I
can’t talk about SKUs and such, but what I can say is that we have Skylake.
[Laughter]
TPM: AMD
is obviously pleased that Google has endorsed its GPUs as accelerators. What is
the nature of that deal?
Bart Sano: It
is about choice, and what architecture is best for what workloads. We think
there are cases where AMD will provide a good choice for our end customers.
It is always good to have those options, and not everything fits onto one
architecture, whether it is Intel, AMD, Nvidia or even our own TPUs. You said
it best in that this is an explosion of diversity. Our position is that we should
have as many of these architectures as possible as options for our customers
and let competition choose which one is right for different customers.
TPM: Has
Google developed its own internal framework that spans all of these different
compute elements, or do you have different frameworks for each kind of compute?
There is CUDA for Nvidia GPUs, obviously, and you can use ROCm from AMD to run
OpenCL or port CUDA code onto its Polaris GPUs. There is TensorFlow for deep learning,
and other frameworks. My assumption is that Google is smart about this, so
prove me right.
Bart Sano: It
is a challenge. For certain workloads, we can leverage common pipes. But for
the end customers, there is an issue in that there are different stacks for the
different architectures, and it is a challenge. We do have our own internal
ways to get commonality so that we are able to run more efficiently; at least
from a programmer perspective, it is all taken care of by the software. I think
that what you are pointing out is that if cloud customers have binaries and
they need to run them, we have to be able to support that. That diversity is
not going to go away.
We are trying to get the marketplace
to adopt more standardization in that area, and in certain domains we are trying
to abstract it out so it is not as big of an issue – for instance, with
TensorFlow for machine learning. If that is adopted and we have Intel support
TensorFlow, then you don’t have that much of a problem. It just becomes a
matter of compilation and it is much more common.
TPM: What
is Google’s thinking about the Knights family of processors and coprocessors,
especially the current “Knights Landing” and the future “Knights Crest” for
machine learning and “Knights Hill” for broader-based HPC?
Bart Sano: Like
I said, we have a basic tenet that we do not turn anything away and that we
have to look at every technology. That is why choice is so important. We want
to choose wisely, because whatever we put into the infrastructure is going to
go to not only our own internal customers, but the end customers of our cloud
products. We look at all of these technologies and assess them according to
total cost of ownership. Whether it is for search, ads, geo, or whatever
internally or for the cloud, we are constantly assessing all technologies.
TPM: How
do you manage that? Urs told me that Google has three different server designs
each year coming out of the labs into production, and servers stay in
production for several years. It seems to me that if you start adding more
different kinds of compute, it will be more complex and expensive to build and
support all of this diversity of machinery. If you have an abstraction layer in
the software and a build process that lets applications be deployed to any type
of compute, that makes it easier. But you still have an increasing number of
types and configurations of machines. Doesn’t this make your manufacturing and
supply chain more complex, too?
Bart Sano: You
are right, having all of these different SKUs makes it difficult to handle. In
any infrastructure, you have a mix of legacy versus current versus new stuff,
and the software has to abstract that. There are layers in the software stack,
including Borg internally or Kubernetes on the cloud as well as others.
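The layered abstraction Sano describes can be illustrated with stock Kubernetes primitives; this is a generic sketch of scheduling against a heterogeneous fleet, not Google’s internal tooling, and the node names, label key, and container image are hypothetical:

```shell
# Label nodes according to the hardware they carry (the "accelerator"
# label key and node names are illustrative, not a Google convention).
kubectl label nodes node-a accelerator=nvidia-gpu
kubectl label nodes node-b accelerator=none

# A pod that must land on GPU hardware declares a nodeSelector; the
# scheduler, not the application, deals with the fleet's diversity.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: gpu-worker
spec:
  nodeSelector:
    accelerator: nvidia-gpu
  containers:
  - name: worker
    image: example.com/worker:latest   # hypothetical image
EOF
```

The application image is identical no matter which machine type it lands on; only the scheduling constraint names the hardware.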
TPM: You
can do a lot in the hardware, too, right? An ARM server based on ThunderX from
Cavium has a similar BIOS and baseboard management controller to a Xeon server,
and ditto for a “Zaius” Power9 machine like the one that Google is creating in
conjunction with Rackspace Hosting. You can get the form factors the same, and
then you are differentiating in other aspects of the system such as memory
bandwidth or capacity. But we have to assume that the number of servers that
Google is supporting is still growing as more types of compute are added to the
infrastructure.
Bart Sano: The
diversity is growing, and when we helped found the OpenPOWER consortium, we
knew what that meant. And the implications are a heterogeneous environment and
much more operational complexity. But this is the reality of the world that we
are entering. If we are to be the solution to more than just the products of
Google, we have to support this diversity. We are going into it with our eyes
wide open.
TPM: Everybody
assumes that Google has lots of Nvidia GPUs for accelerating machine learning
and other workloads, and now you have Radeon GPUs from AMD. Have you been doing
this for a long time internally and now you are just exposing it on Cloud
Platform?
Bart Sano: We
have actually been doing the GPUs and the TPUs for a while, and we are now
exposing them to the cloud. What became apparent is that cloud customers
wanted them. That was the question: would people want to come to the cloud for
this sort of functionality.
The cloud infrastructure is not remarkably
different from the internal Google infrastructure – and it should not be
because we are trying to leverage the cost structures of both together and the
lessons we learn from the Google businesses.
TPM: With
the GPUs and TPUs, are you exposing them at an infrastructure level, where
customers can address them directly like they would internally on their own
iron, or are they exposed at a platform level, where customers buy a Google
service that they just pour data into and run and they never get under the hood
to see?
Bart Sano: The
GPUs are exposed at more of an infrastructure level, where you have to see them
to run binaries on them. It is not like a platform, and customers can pick
whether they want Nvidia or AMD GPUs. They will be available attached to
Compute Engine virtual machines, and for the Cloud Machine Learning services.
For those who want to interact at that level, they can. We provide support for
TPUs at a higher level, with our Vision API for image search service or
Translation API for language translation, for example. They don’t really
interact with TPUs, per se, but with the services that run on them.
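The infrastructure-level exposure Sano describes can be sketched with the gcloud CLI; the zone, machine type, and accelerator type below are illustrative rather than a statement of what Google offered, and at the time of the beta the command may have required the `gcloud beta` component:

```shell
# Sketch: create a Compute Engine VM with one GPU attached (values are
# illustrative; check gcloud for the types offered in your region).
# GPU instances need a TERMINATE maintenance policy because they cannot
# live-migrate.
gcloud compute instances create gpu-instance \
    --zone us-east1-d \
    --machine-type n1-standard-8 \
    --accelerator type=nvidia-tesla-k80,count=1 \
    --maintenance-policy TERMINATE \
    --image-family ubuntu-1604-lts \
    --image-project ubuntu-os-cloud
```

Once the VM boots, the customer installs the driver and runs binaries against the GPU directly, which is the “see them to run binaries on them” model, as opposed to the service-level TPU exposure.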
TPM: What
design does Google use to add GPUs to its servers? Does it look like an IBM
Power Systems LC or Nvidia DGX-1 system? Do you have a different way of
interconnecting and pooling GPUs than what people currently are doing? Have you
adopted NVLink for lashing together GPUs?
Bart Sano: I
would say that GPUs require more interconnection and cluster bandwidth. I can’t
say that we are the same as the examples you are talking about, but what I can
say is that we match the configuration so the GPUs are not starved for memory
bandwidth and communication bandwidth. We have to architect these systems so
they are not starved. As for NVLink, I can’t go into details like that.
TPM: I
presume that there is a net gain with this increasing diversity. There is more
complexity and lower volumes of any specific server type, but you can precisely
tune workloads for specific hardware. We see it again and again that companies
are tuning hardware to software and vice versa because general purpose,
particularly at Google’s scale, is not working anymore. Unit costs rise, but
you come out way ahead. Is that the way it looks to Google?
Bart Sano: That
is the driver behind why we are providing these different kinds of computation.
As for the size of the jump in price/performance, it is really different for
different customers.
Tech giant Dell EMC has announced a new collection of high performance computing (HPC) cloud offerings, software and systems to make more HPC services available to enterprises of all sizes, optimize HPC technology innovations, and advance the HPC community.
"The global HPC market forecast exceeds $30 billion in 2016 for all product and services spending, including servers, software, storage, cloud, and other categories, with continued growth expected at 5.2 percent CAGR through 2020," said Addison Snell, CEO of Intersect360 Research, in a statement. "Bolstered by its combination with EMC, Dell will hold the number-one position in total HPC revenue share heading into 2017."
Democratizing HPC
Among the new products and services is the new HPC System for Life Sciences, which will be available with the PowerEdge C6320p Server by the first quarter of 2017. The company said the new life sciences service accelerates results for bioinformatics centers to identify treatments in clinically relevant timeframes while protecting confidential data.
"Highly parallelized computing plays an important role in high performance computing," said Ed Turkel, HPC Strategist at Dell EMC, in the statement. "Compared to serial computing, parallel computing is much better suited for modeling, simulating and understanding complex, real world phenomena. In many cases, serial programs 'waste' potential computing power." The PowerEdge C6320p Server is specifically designed to address this parallel processing environment to drive improved performance and faster big data analysis, Turkel said.
The company also said that it will begin offering new cloud bursting services from Cycle Computing to enable cloud orchestration and management between some of the largest public cloud services, including Azure and AWS. Dell said the service allows customers to more efficiently utilize their on-premises systems while providing access to the resources of the public cloud for HPC needs.
The company will also offer customers the Intel HPC Orchestrator later this quarter to help simplify the installation, management and ongoing maintenance of high-performance computing systems. HPC Orchestrator, which is based on the OpenHPC open source project, can help accelerate enterprises' installations and management.
Optimizing the HPC Portfolio
Dell EMC has been increasingly placing its bets on HPC services, unveiling a portfolio of several new HPC technologies earlier this month. For example, the company introduced its PowerEdge C4130 and R730 servers designed to boost throughput and improve cost savings for HPC and hyperscale data centers to support more deep learning applications and artificial intelligence techniques in technological and scientific fields such as DNA sequencing.
"Dell EMC is uniquely capable of breaking through the barriers of data-centric HPC and navigating new and varied workloads that are converging with big data and cloud," said Jim Ganthier, senior vice president, Validated Solutions and HPC Organization, Dell EMC, in the statement. "We are collaborating with the HPC community, including our customers, to advance and optimize HPC innovations while making these capabilities easily accessible and deployable for organizations and businesses of all sizes."
High Performance Computing Market - Opportunities and Forecasts, 2014 - 2022
Monday, 17 October 2016
Posted by ARM Servers
High
Performance Computing is the practice of aggregating computing power to
deliver high performance in handling large problems in science, business,
or engineering. HPC systems involve all types of servers and micro servers
that are used for highly computational or data-intensive tasks. As HPC has
become firmly linked to economic competitiveness and scientific advances,
it is becoming important to nations. The worldwide study showcases that 97%
of the companies surveyed have adopted supercomputing platforms and say
that they would not survive without them.
Faster computing capabilities of micro servers or HPC systems,
improved performance efficiency and smarter deployment & management with high
quality of service are some key factors driving the growth of HPC market. The
major challenges for these HPC systems are power, cooling system management and
storage & data management. The importance of storage & data management
would continue to grow in the future. In addition to this, software hurdles
continue to grow, restraining the growth of the HPC market. HPC
technology is being rapidly adopted by the academic institutions and various
industries to build reliable and robust products that would enable them to maintain
a competitive edge in the business. Various vendors are also targeting to
provide high performance converged technology solutions. As this trend is
gaining significant relevance, the market is growing steadily and it would
continue its growth in future.
High Performance Computing market analysis by Components
HPC involves various components, which can be listed
as hardware and architecture, software and system management, and professional
services. Hardware components are the most essential parts of any HPC system.
The efficiency of the system is totally dependent on the hardware entities in
HPC. Hardware and architecture segment of HPC includes memory capacity
(storage), energy management, servers and network devices. Servers consist of
supercomputers, divisional, and departmental & workgroup units. Supercomputers and
departmental units are the fastest elements to be sold in hardware and
architecture section. Another essential component of HPC is software and
system management. It comprises middleware, programming tools, performance
optimization tools, cluster management and fabric management. Finally,
professional services provided are design & consulting, integration &
deployment and Training & outsourcing.
High Performance Computing market analysis by Deployment
The different deployment methods for HPC are cloud-based
and on-premises methods. Cloud deployment is the most popular in the industry,
as cloud-computing technologies are widely adopted by players in
different industries. The research shows that the cloud technology market is
expected to grow due to its high adoption rate, while the usage of the
on-premises deployment method would decline slowly.
High Performance Computing market analysis by Application
The major application sections of HPC are high performance
technical computing and high performance business computing. Technical
computing of the HPC includes various sectors such as Government, Chemicals,
Bio-sciences, Academic institutions, Consumer products, Energy, Electronics and
Others. High performance data analysis is being used in government sector for
national security & crime fighting. In addition to this, HPCs are used in
fraud detection and customer acquisition/retention across other sectors. High
Performance Business Computing includes media entertainment, online gaming,
retail, financial service, ultra scale internet, transportation and others.
High Performance Computing market analysis by Geography
The high performance computing market is being analyzed in
different geographic regions such as North America, Europe, Asia-Pacific and
LAMEA. North America is the largest market for HPC technology due to the
technological advancements and early adoption of technology in the region
followed by Europe.
Competitive Landscape
The key market players are adopting product launches as their
principal strategy to provide high performance solutions in different
industries. Cisco is providing a high performance computing solution for
financial services that overcomes low-latency requirements, high message rate
and throughput requirements, and the need for predictability to avoid jitter
& spikes, while building large computing grids in a cost-effective manner.
Some major players in the HPC market are IBM, Intel, Fujitsu, AMD,
Oracle, Microsoft, HP, Dell, Hitachi Data System and Cisco.
HIGH LEVEL ANALYSIS
Study of the market showcases the current market trends, market
structures, driving factors, limitations and opportunities of the global HPC
market. Porter’s Five Forces Model helps in analyzing the market forces,
barriers, strengths, etc., of the global market. Bargaining power of the buyer
is low as the product is highly differentiated and threat of backward
integration is low. The suppliers in this market are more concentrated than
buyers, due to which the bargaining power of suppliers is high. Threat of
substitutes in the global market is high as the switching costs are minimal. As
HPC is a novel concept, threat of new entrants in the industry is high, while
the moderate number of market players leads to moderate intersegment rivalry in
the market. Value chain analysis helps in analyzing the role of key
stakeholders in the supply chain of the market and would provide new entrants
with knowledge about the value chain of the existing market.
KEY BENEFITS
- Porter’s Five Forces model helps in analyzing the potential of buyers & suppliers, and the competitive sketch of the market, which would guide the market players to develop strategies accordingly
- Assessments are made according to the current business scenario and the future market structure & trends are forecast for the period 2013-2020 by considering 2013 as base year
- The analysis gives a wider view of the global market including its market trends, market structure, limiting factors and opportunities
- The advantages of the market are analyzed to help the stakeholders identify the opportunistic areas in a comprehensive manner
- The value chain analysis provides a systematic study on the key intermediaries involved, which would in turn help the stakeholders in the market to make appropriate strategies
HIGH PERFORMANCE COMPUTING MARKET KEY DELIVERABLES
Access Report @ https://www.wiseguyreports.com/reports/512543-world-high-performance-computing-market-opportunities-and-forecasts-2014-2022
About Us
Wise Guy Reports is part of Wise Guy Consultants Pvt. Ltd. and offers premium
progressive statistical surveying, market research reports, analysis &
forecast data for industries and governments around the globe. Wise Guy
Reports understands how essential statistical surveying information is for
your organization or association. Therefore, we have associated with top
publishers and research firms, all specialized in specific domains, ensuring
you will receive the most reliable and up-to-date research data available.
Contact Us:
Norah Trent
+1 646 845 9349 / +44 208 133 9349