
In the ideal hyperscaler and cloud world, there would be one processor type with one server configuration and it would run any workload that could be thrown at it. Earth is not an ideal world, though, and it takes different machines to run different kinds of workloads.

Hyperscale Computing and the Cloud World
In fact, if Google is any measure – and we believe that it is – then the number of different types of compute that need to be deployed in the datacenter to run an increasingly diverse application stack is growing, not shrinking. The General Purpose Era, which began in earnest during the Dot Com Boom, started to fade a few years ago even as Intel locked up the datacenter with its Xeons. It is giving way to a Cambrian Compute Explosion era, which got rolling as Moore’s Law improvements in compute hit a wall. Something had to give, and it was the ease of use and volume economics that come from the homogeneity of using only a few SKUs of the X86 processor.
Bart Sano is the lead of the platforms team inside of Google, a group that has been around almost as long as Google itself, although Sano has only been at the company for ten years. Sano reports to Urs Hölzle, senior vice president of technical infrastructure at the search engine and cloud giant, and is responsible for the design of the company’s warehouse-scale computers, including the datacenters themselves, everything inside of them – compute and storage alike – and the network hardware and homegrown software that interconnects them.
As 2016 wound down, Google made several hardware announcements: it is bringing GPUs from Nvidia and AMD to Cloud Platform, it will roll out “Skylake” Xeon processors on Cloud Platform ahead of the official Intel launch later this year, and certain machine learning services on Cloud Platform are already running on its custom Tensor Processing Unit (TPU) ASICs or on GPUs. In the wake of these announcements, The Next Platform sat down with Sano to chat about Google’s hardware strategies, and more specifically about how the company leverages the technology it has created for search, ad serving, media serving, and other aspects of the Google business for the public cloud.

Timothy Prickett Morgan: How big of a deal are the Skylake Xeons? We think they might be the most important processor to come out of Intel since the “Nehalem” Xeon 5500s way back in 2009.

Bart Sano: We are really excited about deploying Skylake, because it is a material difference for our end customers, who are going to benefit a lot from the higher performance, the virtualization features it provides in the cloud environment, and the computational enhancements in its instruction set for SIMD processing in numerical computations. Skylake is an important improvement for the cloud. Again, I think the broader context here is that Google is really committed to working on and providing the best infrastructure not only for Google, but also for the benefit of our cloud. We are trying to ensure that cloud customers benefit from all of the efforts inside of Google, including machine learning running on GPUs and TPUs.

TPM: Google was a very enthusiastic user of AMD Opterons the first time around, and I have seen the motherboards because Urs showed them to me, but to my way of thinking about this, 2017 is one of the most interesting years for processing and coprocessing that we have seen in a long, long time. It is a Cambrian Explosion of sorts. So the options are there, and clearly Google has the ability to design and have others build systems and put your software on lots of different things. It is obvious that Skylake is the easiest thing for Google to endorse and move quickly to. But has Google made any commitment to any of these other architectures? We know about Power9 and the work Google is doing there, but has Google said it will add Zen Opterons into the mix, or is it just too early for that?

Bart Sano: I can say that we are committed to the choice of these different architectures, including X86 – and that includes AMD – as well as Power and ARM. The principle that we are investing in heavily is that competition breeds innovation, which directly benefits our end customers. And you are right, this year is going to be a very interesting year. There are a lot of technologies coming out, and there will be a lot of interesting competition.

TPM: It is easy for me to conceive of how some of these other technologies might be used by Google itself, but it is harder for me to see how you get cloud customers on board at an infrastructure level with some of these alternatives like Power and ARM because they have to get their binaries ported and tuned for them. What is the distinction between the timing for a technology that will be used by Google internally and one that will be used by Cloud Platform?

Bart Sano: Our end goal is that whatever technology we are going to bring forward for the benefit of Google, we will bring to bear on the cloud. We view cloud as just another product pillar of Google. So if something is available to ads or search or whatever, it will be available to cloud. Now, you are right, not all of the binaries will be highly optimized, but as it relates to Google’s binaries, our intention is to make all of these architectures equally competitive. You are right, there is obviously a lag in the porting efforts on this software. But our ultimate goal is to get all of them on equal footing.

TPM: We know that Intel has already released Skylake Xeons early to select customers, and we assume that some HPC shops and other hyperscalers and cloud builders like Google already have access to these processors. So my guess is that you have been playing with Skylakes in the labs since maybe September or October of last year at the earliest. When do you deploy Skylake internally at Google, and when do you deploy it to the cloud?

Bart Sano: I can’t speak to the specifics internally, but what I can say is that the cloud will have Skylake in early 2017. That is all I can really say with precision. But you can assume that we have had these in the labs and that we do a lot of testing before we make an announcement.

TPM: My guess is that Intel will launch in June or July, and that you can’t have them much before March in production on GCP, and that January or February of this year is just not possible. . . .

Bart Sano: “We could make some bets.” [Laughter]

TPM: Are you doing special SKUs of Skylake Xeons, or do you use stock CPUs?

Bart Sano: I can’t talk about SKUs and such, but what I can say is that we have Skylake. [Laughter]

TPM: AMD is obviously pleased that Google has endorsed its GPUs as accelerators. What is the nature of that deal?

Bart Sano: It is about choice, and about what architecture is best for what workloads. We think there are cases where the AMD GPUs will be a good choice for our end customers. It is always good to have those options, and not everything fits onto one architecture, whether it is Intel, AMD, Nvidia, or even our own TPUs. You said it best in that this is an explosion of diversity. Our position is that we should have as many of these architectures as possible as options for our customers and let competition decide which one is right for different customers.
TPM: Has Google developed its own internal framework that spans all of these different compute elements, or do you have different frameworks for each kind of compute? There is CUDA for Nvidia GPUs, obviously, and you can use ROCm from AMD to do OpenCL or to port CUDA code onto its Polaris GPUs. There is TensorFlow for deep learning, and other frameworks. My assumption is that Google is smart about this, so prove me right.

Bart Sano: It is a challenge. For certain workloads, we can leverage common pipes. But for the end customers, there is an issue in that there are different stacks for the different architectures, and it is a challenge. We do have our own internal ways to try to get commonality so we are able to run more efficiently; at least from a programmer perspective, it is all taken care of by the software. I think what you are pointing out is that if cloud customers have binaries and they need to run them, we have to be able to support that. That diversity is not going to go away.
We are trying to get the marketplace to move toward more standardization in that area, and in certain domains we are trying to abstract it out so it is not as big of an issue – for instance, with TensorFlow for machine learning. If that is adopted and we have Intel supporting TensorFlow, then you don’t have that much of a problem. It just becomes a matter of compilation, and it is much more common.
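To make that abstraction point concrete, here is a minimal sketch of what framework-level portability looks like from a programmer's point of view, written against the public TensorFlow 1.x API of the era (this is an illustration only, not Google's internal tooling): the same graph definition runs on a CPU or a GPU simply by changing a device string, and soft placement lets the framework fall back to whatever hardware is actually present.

```python
import tensorflow as tf  # TensorFlow 1.x-style API assumed

def build_matmul(device):
    """Define the same compute graph regardless of the target device."""
    with tf.device(device):
        a = tf.random_normal([1024, 1024])
        b = tf.random_normal([1024, 1024])
        return tf.matmul(a, b)

# '/cpu:0' always exists; swapping in '/gpu:0' requires no other code changes.
op = build_matmul('/cpu:0')

# allow_soft_placement lets TensorFlow pick an available device if the
# requested one is missing, hiding the hardware diversity from the caller.
with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
    result = sess.run(op)
    print(result.shape)  # (1024, 1024)
```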
TPM: What is Google’s thinking about the Knights family of processors and coprocessors, especially the current “Knights Landing” and the future “Knights Crest” for machine learning and “Knights Hill” for broader-based HPC?

Bart Sano: Like I said, we have a basic tenet that we do not turn anything away and that we have to look at every technology. That is why choice is so important. We want to choose wisely, because whatever we put into the infrastructure is going to go to not only our own internal customers, but the end customers of our cloud products. We look at all of these technologies and assess them according to total cost of ownership. Whether it is for search, ads, geo, or whatever internally or for the cloud, we are constantly assessing all technologies.

TPM: How do you manage that? Urs told me that Google has three different server designs coming out of the labs into production each year, and that servers stay in production for several years. It seems to me that if you start adding more kinds of compute, it will be more complex and expensive to build and support all of this diversity of machinery. If you have an abstraction layer in the software and a build process that lets applications be deployed to any type of compute, that makes it easier. But you still have an increasing number of types and configurations of machines. Doesn’t this make your manufacturing and supply chain more complex, too?

Bart Sano: You are right, having all of these different SKUs makes it difficult to handle. In any infrastructure, you have a mix of legacy versus current versus new stuff, and the software has to abstract that. There are layers in the software stack, including Borg internally or Kubernetes on the cloud as well as others.

TPM: You can do a lot in the hardware, too, right? An ARM server based on ThunderX from Cavium has a BIOS and baseboard management controller similar to those of a Xeon server, and ditto for a “Zaius” Power9 machine like the one that Google is creating in conjunction with Rackspace Hosting. You can get the form factors the same, and then you are differentiating in other aspects of the system such as memory bandwidth or capacity. But we have to assume that the number of server types that Google is supporting is still growing as more kinds of compute are added to the infrastructure.

Bart Sano: The diversity is growing, and when we helped found the OpenPower consortium, we knew what that meant. And the implications are a heterogeneous environment and much more operational complexity. But this is the reality of the world that we are entering. If we are to be the solution to more than just the products of Google, we have to support this diversity. We are going into it with our eyes wide open.

TPM: Everybody assumes that Google has lots of Nvidia GPUs for accelerating machine learning and other workloads, and now you have Radeon GPUs from AMD. Have you been doing this internally for a long time and now you are just exposing it on Cloud Platform?

Bart Sano: We have actually been doing the GPUs and the TPUs for a while, and we are now exposing it to the cloud. What became apparent is that the cloud customer wanted them. That was the question: would people want to come to the cloud for this sort of functionality.
The cloud infrastructure is not remarkably different from the internal Google infrastructure – and it should not be because we are trying to leverage the cost structures of both together and the lessons we learn from the Google businesses.
TPM: With the GPUs and TPUs, are you exposing them at an infrastructure level, where customers can address them directly like they would internally on their own iron, or are they exposed at a platform level, where customers buy a Google service that they just pour data into and run and they never get under the hood to see?

Bart Sano: The GPUs are exposed at more of an infrastructure level, where you have to see them to run binaries on them. It is not like a platform, and customers can pick whether they want Nvidia or AMD GPUs. They will be available attached to Compute Engine virtual machines, and for the Cloud Machine Learning services. For those who want to interact at that level, they can. We provide support for TPUs at a higher level, with our Vision API for image search service or Translation API for language translation, for example. They don’t really interact with TPUs, per se, but with the services that run on them.
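As an aside on the platform-level side of that split, the sketch below shows roughly how a customer would consume one of those TPU-backed services: through a client library call, with no visibility into the accelerator behind it. It assumes the public google-cloud-vision Python client and a hypothetical local file name; exact class names vary a little between client-library versions.

```python
# pip install google-cloud-vision  (credentials must be configured separately)
from google.cloud import vision

def label_image(path):
    """Send a local image to the Cloud Vision API and print detected labels.

    Whatever runs behind the API (CPU, GPU, or TPU) is invisible here; the
    customer only interacts with the service, not the hardware.
    """
    client = vision.ImageAnnotatorClient()
    with open(path, "rb") as f:
        image = vision.Image(content=f.read())
    response = client.label_detection(image=image)
    for label in response.label_annotations:
        print(label.description, round(label.score, 2))

if __name__ == "__main__":
    label_image("example.jpg")  # hypothetical file name
```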

TPM: What design does Google use to add GPUs to its servers? Does it look like an IBM Power Systems LC or Nvidia DGX-1 system? Do you have a different way of interconnecting and pooling GPUs than what people currently are doing? Have you adopted NVLink for lashing together GPUs?

Bart Sano: I would say that GPUs require more interconnection and cluster bandwidth. I can’t say that we are the same as the examples you are talking about, but what I can say is that we match the configuration so the GPUs are not starved for memory bandwidth and communication bandwidth. We have to architect these systems so they are not starved. As for NVLink, I can’t go into details like that.

TPM: I presume that there is a net gain with this increasing diversity. There is more complexity and lower volumes of any specific server type, but you can precisely tune workloads for specific hardware. We see it again and again that companies are tuning hardware to software and vice versa because general purpose, particularly at Google’s scale, is not working anymore. Unit costs rise, but you come out way ahead. Is that the way it looks to Google?
Bart Sano: That is the driver behind providing these different kinds of computation. As for the size of the jump in price/performance, it is really different for different customers.
2017 promises to be an exciting year for servers and the competitiveness of compute offerings. The scope of that impact will include not only enterprise datacenters and the public cloud, but will also extend to the emergence of "edge computing", defined here as the compute required to deal with data at or near the point of creation. Among other things, these edge devices will include the “ocean” of remote, smart sensors commonly included in internet of things (IoT) discussions.

Server CPU Predictions For 2017

Here is a list of a few things we’ll see concerning specific CPUs.

It should come as no surprise that Intel continues to dominate the server market (with more than 99 percent share), but it is under enormous pressure on all fronts. Xeon and its evolution continue to be Intel's compute vanguard. Xeon Phi (and now the addition of Nervana) make up its engines for high-performance computing and machine learning. Phi has seen some success, but it isn’t clear yet how the Nervana offerings will materialize.

Advanced Micro Devices (AMD) has its best shot in years at fielding an Intel competitor that just about everyone (except perhaps Intel) is eager to see. If the AMD Zen server CPU is simply good enough (meaning it shows up, works, and has at least some performance value), it will take market share simply by being an x86 competitor. AMD is encouraged by early indicators. It also has its ATI-derived GPGPU technology, which will provide additional opportunities.

ARM Holdings will continue to dominate the mobile and embedded device space, but the fight is hard in these segments. The more likely opportunity for ARM expansion will be at the "edge" and not so much in the server space. The death of Vulcan at Avago, along with the acquisition of Applied Micro Circuits (APM) and its plan to find a place for X-Gene, leaves the Cavium ThunderX, the yet-to-be-launched Qualcomm Centriq CPU, and a few other very focused ARM initiatives still standing. After years of "This is the Year for ARM Servers", the outlook could be better, and if AMD produces a plausible Intel competitor (capable of running x86 software), it will put extreme pressure on the whole ARM server CPU initiative.

OpenPOWER, on the other hand, seems to have a lot of momentum, but to date it has not significantly impacted the x86 server market. 2017 may end on a different note. OpenPOWER’s (IBM’s) willingness to embrace NVIDIA (the darling of the machine learning segment) and embed an NVLink interface is going to play well with much of the AI and HPC communities. By the end of the year, we will have seen some interesting OpenPOWER offerings emerge based on advanced silicon process technology from a variety of sources, and 2018 may be a whole different story, especially if an embrace from Google, which has been flirting with OpenPOWER for a while now, materializes and creates a tipping point.

The real challenge to all CPUs is the way they do work. Their philosophy is built on the principle that data must come into the chip, be operated on by the chip, and then have results or even new data pushed out of the chip. This whole process creates a natural bottleneck that we have wrestled with for decades. As the magnitude and scope of data increase, something has got to give, and a favorite candidate is more parallelism. So far, this has favored GPGPUs and other accelerators.

At the bigger-picture business level for datacenters and the public cloud, the real question is not so much which CPU (in fact, the business folks probably couldn't care less), but the economics of private, public or hybrid solutions. It is safe to say enterprise computing will not disappear any time soon, and while there is much activity, the implementations and economics of hybrid solutions have proven to be difficult. According to Gartner, by 2020 more compute power will have been sold by IaaS and PaaS cloud providers than sold and deployed into enterprise datacenters. The fact that companies (especially smaller ones) are either being born in or moving to the cloud at a rapid pace is undeniable. However, not all of them are seeing the expected savings materialize from the move. 2017 will certainly see some careful thinking and maybe even some rethinking of strategy.

The explosion of data at the edge is simply going to change data processing as we know it and will create a variety of computing problems that are difficult to handle in the cloud (even though the results may end up there). However, these workloads may not land in the enterprise datacenters as we know them either, and we may find them “stuck” all over the place. For more than sixty years, we have seen compute follow the data: first from the original mainframe datacenter to the desktop, then to departmental servers, into enterprise datacenters, and now significantly into the cloud. It is my opinion that if you plan to put just your data into the cloud, economics (the cost of network usage) will drive your compute there sooner or later. You want to consider this carefully based on your actual needs and usage; depending on your size and ability to operate your own datacenter, there might be a better overall business outcome in keeping both there.

The major emerging source of data is at the edge, and it will drive the need for much compute there. By the way, all the CPUs mentioned here should be able to handle the edge reasonably well, so … game on again!

Disclosure: My firm, Moor Insights & Strategy, like all research and analyst firms, provides or has provided research, analysis, advising, and/or consulting to many high-tech companies in the industry including Advanced Micro Devices, Applied Micro Circuits, ARM Holdings, IBM, Intel, NVIDIA and Qualcomm. I do not hold any equity positions with any companies cited in this column.
"We invite submissions introducing a wide range of topics, levels and considerations in HPC architectures, applications and usage – from fundamentals to the latest advances and hot topic areas. Submissions can be proposed as papers or presentation only (without papers). Each submission should indicate any unique criteria and requirements along with scheduling and format preferences in your proposal. Sessions can be defined as technical sessions, workshop(s) and/or as part of a ‘mini’ series of quick take tutorials."

HPC Architectures and HPC Applications
“Over two days we’ll delve into a wide range of interests and best practices – in applications, tools and techniques and share new insights on the trends, technologies and collaborative partnerships that foster this robust ecosystem. Designed to be highly interactive, the open forum will feature industry notables in keynotes, technical sessions, workshops and tutorials. These highly regarded subject matter experts (SME’s) will share their works and wisdom covering everything from established HPC disciplines to emerging usage models from old-school architectures and breakthrough applications to pioneering research and provocative results. Plus a healthy smattering of conversation and controversy on endeavors in Exascale, Big Data, Artificial Intelligence, Machine Learning and much much more!”
"Updates for Intel® Xeon® processors, Intel® HPC Orchestrator, Intel® Deep Learning Inference Accelerator and other forthcoming supercomputing technologies available soon"

Intel® HPC Challenges


SC16 revealed several important pieces of news for supercomputing experts. In case you missed it, here’s a recap of announced updates from Intel that will provide even more powerful capabilities to address HPC challenges like energy efficiency, system complexity, and the ability for simplified workload customization. In supercomputing, one size certainly does not fit all. Intel’s new and updated technologies take a step forward in addressing these issues, allowing users to focus more on their applications for HPC, not the technology behind it.

In 2017, developers will welcome a next generation of Intel® Xeon® and Intel® Xeon Phi™ processors. As you would expect, these updates offer increased processor speed and more, thanks to improved technologies under the hood. The next-generation Intel Xeon Phi processor (code name “Knights Mill”) will exceed its predecessor’s capability with up to four times better performance in deep learning scenarios.¹

Of course, as developers know, the currently-shipping Intel Xeon Phi processor (formerly known as “Knights Landing”) is no slouch! Nine systems utilizing this processor now reside on the TOP500 list. Of special note are the Cori (NERSC) and Oakforest-PACS (Japan Joint Center for Advanced High Performance Computing) supercomputing systems with both claiming a spot among the Top 10.

HPC Customization

The next-generation Intel Xeon processor (code name “Skylake”) is also expected to join the portfolio in 2017. Demanding applications involving floating point calculations and encryption will benefit from both Intel® Advanced Vector Extensions 512 (Intel® AVX-512) and Intel® Omni-Path Architecture (Intel® OPA). These improvements will further extend the processor’s capability, giving commercial, academic and research institutions another step forward against taxing workloads.

A third processing technology anticipated in 2017 enables an additional level of HPC customization. The combined hardware and software solution, known as Intel® Deep Learning Inference Accelerator, sports a field-programmable gate array (FPGA) at its heart. By leveraging industry-standard frameworks and libraries like Intel® Distribution for Caffe* and Intel® Math Kernel Library for Deep Neural Networks, the solution gives end users the opportunity for even greater flexibility in their supercomputing applications.

At SC16, Intel also highlighted additional momentum for Intel® Scalable System Framework (Intel® SSF). HPC is an essential tool for advances in health-related applications, and Intel SSF is taking center stage as a mission-critical tool in those scenarios, as Intel demonstrated in its SC16 booth. Dell* offers Intel SSF for supercomputing scenarios involving drug design and cancer research. Other applications, like genomic sequencing, create a challenge for any supercomputer. For this reason, Hewlett Packard Enterprise* (HPE) taps Intel SSF as a core component of the HPE Next Generation Sequencing Solution.

Additional performance isn’t the only thing supercomputing experts need, though. Feedback from HPC developers, administrators and end users expresses the need for improved tools during critical phases of system setup and usage. Help is on the way. Now available, Intel® HPC Orchestrator, based upon the OpenHPC software stack, addresses that feedback. With over 60 features integrated, it assists with testing at full scale, deployment scenarios, and simplified systems management. Currently available through Dell* and Fujitsu*, Intel HPC Orchestrator should provide added momentum for the democratization of HPC.

Demonstrating further momentum, Intel Omni-Path Architecture has seen quite an uptick in adoption since its release nine months ago. It is used in about 66 percent of the TOP500 HPC systems that employ 100Gb/s interconnects.

With so many technical advancements on the horizon, 2017 is shaping up as a year of major changes in the HPC industry. We are excited to see how researchers, developers and others will utilize these technologies to take their supercomputing systems to the next level of performance and tackle problems that were impossible just a few years ago.

¹ For more complete information about performance and benchmark results, visit www.intel.com/benchmarks
Tech giant Dell EMC has announced a new collection of high performance computing (HPC) cloud offerings, software and systems to make more HPC services available to enterprises of all sizes, optimize HPC technology innovations, and advance the HPC community.

HPC Technology and Advancing the HPC Community

"The global HPC market forecast exceeds $30 billion in 2016 for all product and services spending, including servers, software, storage, cloud, and other categories, with continued growth expected at 5.2 percent CAGR through 2020," said Addison Snell, CEO of Intersect360 Research, in a statement. "Bolstered by its combination with EMC, Dell will hold the number-one position in total HPC revenue share heading into 2017."

Democratizing HPC
Among the new products and services is the new HPC System for Life Sciences, which will be available with the PowerEdge C6320p Server by the first quarter of 2017. The company said the new life sciences offering accelerates results for bioinformatics centers, helping them identify treatments in clinically relevant timeframes while protecting confidential data.

"Highly parallelized computing plays an important role in high performance computing," said Ed Turkel, HPC Strategist at Dell EMC, in the statement. "Compared to serial computing, parallel computing is much better suited for modeling, simulating and understanding complex, real world phenomena. In many cases, serial programs 'waste' potential computing power." The PowerEdge C6320p Server is specifically designed to address this parallel processing environment to drive improved performance and faster big data analysis, Turkel said.

The company also said that it will begin offering new cloud bursting services from Cycle Computing to enable cloud orchestration and management between some of the largest public cloud services, including Azure and AWS. Dell said the service allows customers to more efficiently utilize their on-premises systems while providing access to the resources of the public cloud for HPC needs.

The company will also offer customers the Intel HPC Orchestrator later this quarter to help simplify the installation, management and ongoing maintenance of high-performance computing systems. HPC Orchestrator, which is based on the OpenHPC open source project, can help accelerate enterprises' installations and management.

Optimizing the HPC Portfolio

Dell EMC has been increasingly placing its bets on HPC services, unveiling a portfolio of several new HPC technologies earlier this month. For example, the company introduced its PowerEdge C4130 and R730 servers designed to boost throughput and improve cost savings for HPC and hyperscale data centers to support more deep learning applications and artificial intelligence techniques in technological and scientific fields such as DNA sequencing.

"Dell EMC is uniquely capable of breaking through the barriers of data-centric HPC and navigating new and varied workloads that are converging with big data and cloud," said Jim Ganthier, senior vice president, Validated Solutions and HPC Organization, Dell EMC, in the statement. "We are collaborating with the HPC community, including our customers, to advance and optimize HPC innovations while making these capabilities easily accessible and deployable for organizations and businesses of all sizes."
When the movie The Terminator was released in 1984, the notion of computers becoming self-aware seemed so futuristic that it was almost difficult to fathom. But just over three decades later, computers are rapidly gaining the ability to autonomously learn, predict, and adapt through the analysis of massive datasets. And luckily for us, the result is not the nuclear holocaust the movie predicted, but new levels of data-driven innovation and opportunities for competitive advantage for a variety of enterprises and industries.
HPC Core Technologies of Deep Learning
Artificial intelligence (AI) continues to play an expanding role in the future of high-performance computing (HPC). As machines increasingly become able to learn and even reason in ways similar to humans, we’re getting closer to solving the tremendously complex social problems that have always been beyond the realm of compute. Deep learning, a branch of machine learning, uses multi-layer artificial neural networks and data-intensive training techniques to refine algorithms as they are exposed to more data. This process emulates the decision-making abilities of the human brain, which until recently was the only network that could learn and adapt based on prior experiences.

Deep learning networks have grown so sophisticated that they have begun to deliver even better performance than traditional machine learning approaches. One advantage of deep learning is that there is little need to hand-engineer and define the features that might be useful for modeling and prediction. With only basic labeling, machines can now learn these features independently as more data is introduced to the model. Deep learning has even begun to surpass the capabilities and speed of the human brain in some areas, including image, speech, and text classification, natural language processing, and pattern recognition.
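For readers who want to see what "multi-layer" and "training" actually mean, here is a deliberately tiny, self-contained sketch in plain NumPy: a two-layer network that learns the XOR function through repeated gradient-descent updates. It is purely illustrative and orders of magnitude smaller than any production deep learning network.

```python
import numpy as np

# Toy dataset: XOR, the classic function a single-layer model cannot learn.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 8)), np.zeros((1, 8))  # input -> hidden layer
W2, b2 = rng.normal(size=(8, 1)), np.zeros((1, 1))  # hidden -> output layer

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for step in range(5000):
    # Forward pass through the two layers.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass: gradients of the squared error through each layer.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    # "Training" is just repeating this update as more data flows through.
    W2 -= lr * (h.T @ d_out)
    b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * (X.T @ d_h)
    b1 -= lr * d_h.sum(axis=0, keepdims=True)

print(np.round(out, 2))  # should approach [[0], [1], [1], [0]]
```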

HPC Hardware Platforms for Deep Learning

The core technologies required for deep learning are very similar to those necessary for data-intensive computing and HPC applications. Here are a few technologies that are well-positioned to support deep learning networks.

Multi-core processors:
Deep learning applications require substantial amounts of processing power, and a critical element in the success and usability of deep learning is the ability to reduce execution times. Multi-core processor architectures currently dominate the TOP500 list of the most powerful supercomputers available today, with 91% based on Intel processors. Multiple cores can run numerous instructions at the same time, increasing the overall processing speed for compute-intensive programs like deep learning, while reducing power requirements and allowing for fault tolerance.

The Intel® Xeon Phi™ Processor, which features a whopping 72 cores, is geared specifically for high-level HPC and deep learning. These many-core processors can help data scientists significantly reduce training times and run a wider variety of workloads, something that is critical to the computing requirements of deep neural networks.

Software frameworks and toolkits:
There are various frameworks, libraries, and tools available today to help software developers train and deploy deep learning networks, such as Caffe, Theano, Torch, and the HPE Cognitive Computing Toolkit. Many of these tools are built as resources for those new to deep learning systems, and aim to make deep neural networks available to those that might be outside of the machine learning community. These tools can help data scientists significantly reduce model training times and accelerate time to value for their new deep learning applications.

Deep learning hardware platforms:
Not every server can efficiently handle the compute-intensive nature of deep learning environments. Hardware platforms that are purpose-built to handle these requirements will offer the highest levels of performance and efficiency. New HPE Apollo systems contain a high ratio of GPUs to CPUs in a dense 4U form factor, which enables scientists to run deep learning algorithms faster and more efficiently while controlling costs.

These enabling technologies for deep learning are ushering in a new era of cognitive computing that promises to help us solve the world’s greatest challenges with more efficiency and speed than ever before. As these technologies become faster, more available, and easier to implement, deep learning will secure its place in real-world applications – not in science fiction.
The CloudLightning Project in Europe has published preliminary results from a survey on Barriers to Using HPC in the Cloud.

Cloud Computing for HPC

"Cloud computing is transforming the utilization and efficiency of IT infrastructures across all sectors. Historically, cloud computing has not been used for high performance computing (HPC) to the same degree as other use cases for a number of reasons. This executive briefing is a preliminary report of a larger study on demand-side barriers and drivers of cloud computing adoption for HPC. A more comprehensive report and analysis will be published later in 2016. From June to August 2016, the CloudLightning project surveyed over 170 HPC discrete end users worldwide in the academic, commercial and government sectors on their HPC use, perceived drivers and barriers to using cloud computing, and uses of cloud computing for HPC."

Cloud Computing for HPC Workloads

As shown in Figure 2 of the report, trust in cloud computing would appear to be a significant barrier to adopting cloud computing for HPC workloads. Data management concerns dominate the responses, which is not surprising given the large number of bio-science, university and academic respondents within the sample. The main technical barriers relate to communication speeds, reflecting a perceived lack of cloud infrastructure capable of meeting the communications and I/O requirements of high-end technical computing. Government policy is again ranked low; it would seem it is neither a driver nor a barrier. Unsurprisingly, availability and capital expenditure are not barriers, reflecting their positive impact on adoption.

According to the report, a full shift of high performance computing workloads to the cloud is unlikely in the short term. However, there is evidence of demand for the cloud to cover the capacity limitations of internal infrastructures, as well as for use cases such as testing the viability of the cloud or of specific software. This is consistent with previous research.

"Funded by the European Commission’s Horizon 2020 Program for Research and Innovation, CloudLightning brings together eight project partners from five countries across Europe. The project proposes to create a new way of provisioning heterogeneous cloud resources to deliver services, specified by the user, using a bespoke service description language. Our goal is to address energy inefficiencies particularly in the use of resources and consequently to deliver savings to the cloud provider and the cloud consumer in terms of reduced power consumption and improved service delivery, with hyperscale systems particularly in mind."




TAIPEI, Taiwan, Sept. 21 — TYAN, an industry-leading server platform design manufacturer and subsidiary of MiTAC Computing Technology Corporation, announces support and availability of the NVIDIA Tesla P100, P40 and P4 GPU accelerators, based on the new NVIDIA Pascal architecture, in its server platforms. Incorporating NVIDIA’s state-of-the-art technologies allows TYAN to offer exceptional performance and data-intensive application features to HPC users.

HPC Platforms Add Support for NVIDIA

“Real-time, intelligent applications are transforming our world, thus our customers need an efficient compute platform to deliver responsive and cost-effective AI,” said Danny Hsu, Vice President of MiTAC Computing Technology Corporation’s TYAN Business Unit. “TYAN is pleased to work with NVIDIA to bring the FT77C-B7079 and TA80-B7071 servers with the P100, P40 and P4 to market. The TYAN NVIDIA-based server platforms allow hyperscale customers to deploy accurate, responsive AI solutions and to reduce inference latency by up to 45x. The high throughput and best-in-class efficiency of Pascal GPUs make it possible to process exploding volumes of data to offer cost-effective, accurate AI applications.”

“The NVIDIA Pascal architecture is the computing engine for modern data centers. Powered by Pascal, Tesla GPUs offer massive leaps in performance and efficiency required by the ever increasing demand of AI applications,” said Roy Kim, Tesla Product Lead at NVIDIA. “We’re partnering with TYAN to deliver the accelerated solutions customers need to deploy HPC applications and AI services.”

TYAN HPC platforms with support for NVIDIA Tesla P100, P40, P4

4U/8 GPGPU FT77C-B7079 – Supports up to 2x Intel Xeon E5-2600 v3/v4 (Haswell-EP/Broadwell-EP) processors, 24x DDR4 DIMM slots, 1x PCI-E x8 mezzanine slot for a high-speed I/O option, 10x 3.5″/2.5″ hot-swap SATA 6Gb/s HDDs/SSDs, dual-port 10GbE/GbE LOM, and (2+1) 3,200W redundant, 80 Plus Platinum-rated power supplies.

2U/4 GPGPU TA80-B7071 – Supports up to 2x Intel Xeon E5-2600 v3/v4 (Haswell-EP/Broadwell-EP) processors, 16x DDR4 DIMM slots, 1x PCI-E x8 slot for a high-speed I/O option, 8x 2.5″ hot-swap SAS or SATA 6Gb/s plus 2x 2.5″ internal SATA 6Gb/s HDDs/SSDs, dual-port 10GbE/GbE LOM, and (1+1) 1,600W redundant, 80 Plus Platinum-rated power supplies.

About TYAN
TYAN, a leading server brand of MiTAC Computing Technology Corporation under the MiTAC Holdings Corporation (TSE:3706), designs, manufactures and markets advanced x86 and x86-64 server/workstation board and system products. The products are sold to OEMs, VARs, System Integrators and Resellers worldwide for a wide range of applications. TYAN enables customers to be technology leaders by providing scalable, highly-integrated and reliable products such as appliances for cloud service providers (CSPs) and high-performance computing servers/workstations used in CAD, DCC, E&P and HPC markets. For more information, visit MiTAC Holdings Corporation’s website at http://www.mic-holdings.com or TYAN’s website at http://www.tyan.com