- the purposes for which high-performance computing (HPC) from the public cloud is suitable,
- the prerequisites that must be fulfilled and
- what companies should pay attention to.
Whether load simulations in mechanical engineering, gene analyses in medicine or machine learning in the field of artificial intelligence: highly complex processes require extreme amounts of computing power. Just a few years ago, such calculations were possible only on expensive supercomputer hardware; today, companies can book the corresponding capacities on demand from the public cloud – for example from the Open Telekom Cloud. This lets them use highly specialized, high-performance hardware exactly as needed, process huge amounts of data in the shortest possible time and solve complex problems much faster than before. In addition, they do not need to set up and operate their own expensive hardware and thus avoid high fixed costs.
But before companies use high-performance computing (HPC) from the public cloud, a number of important questions arise:
- How does HPC differ from traditional public cloud resources?
- For which scenarios is HPC useful – and for which is it not?
- How should the network connection be dimensioned?
- Which data should go to the cloud and which should not?
- Where is the best value for money?
- Which software and which certificates are indispensable?
- And what role do security and data protection play?
How does HPC differ from traditional public cloud resources?
HPC capacities from the cloud are designed to handle highly complex, extremely extensive and very specific problems in the shortest possible time. For this purpose, cloud providers have specially tailored computing resources in their portfolio. One example is the GPU flavors in the Open Telekom Cloud’s Elastic Cloud Server range: they offer graphics cards with very powerful graphics processing units (GPUs). Because their processor architecture differs from that of conventional central processing units (CPUs), GPUs are not only particularly suitable for graphics applications; they are also well suited to problems that consist of a large number of uniform operations that must be processed in parallel in a short time. GPU flavors are therefore ideal for applications in the areas of machine learning or artificial intelligence.
In addition, the Open Telekom Cloud offers virtual machines in the “high performance” category for complex problems that GPUs, due to their special architecture, are not suited to. They are intended, for example, for high-performance scenarios such as complex simulations. These include virtual experiments with chemical reactions, simulations of air flows or crash tests. What is special here is that the booked CPUs are reserved exclusively for the respective company. This guarantees full performance over the entire period of use.
Another option is field-programmable gate arrays (FPGAs), which can also be booked from the cloud as required. The advantage here is that these freely programmable hardware cards can be adapted individually to each process and are therefore more flexible. Because they run in parallel to the remaining peripherals of a virtual machine, they act like a turbo that significantly accelerates applications and processes without putting any load on the server's CPU.
For which scenarios is HPC useful – and for which is it not?
HPC resources from the cloud are often used in a so-called bursting scenario: Companies use their own IT capacities until they are fully utilized. For everything else, they use highly specialized capacities from the public cloud: resources that their own IT department cannot provide and workloads that require highly specialized IT resources are outsourced to the cloud as needed. Companies pay for costly HPC resources only as long as they need them. This method, in which the public cloud is basically used as an overflow basin or as a supplement to one's own IT resources in a hybrid cloud model, is called "cloud bursting."
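The bursting idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Job` class, the `place_jobs` function, the job names and the core counts are all hypothetical and do not correspond to any Open Telekom Cloud API or scheduler.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    cores: int  # CPU cores the job needs (hypothetical sizing)


def place_jobs(jobs, on_prem_cores):
    """Fill local capacity first; anything that no longer fits bursts to the cloud."""
    free = on_prem_cores
    placement = {}
    for job in jobs:
        if job.cores <= free:
            placement[job.name] = "on-premises"
            free -= job.cores
        else:
            placement[job.name] = "cloud"
    return placement


# Hypothetical workload and a hypothetical local cluster with 96 cores.
jobs = [Job("crash-sim", 64), Job("cfd-run", 48), Job("ml-training", 32)]
placement = place_jobs(jobs, on_prem_cores=96)
print(placement)
```

In this sketch only the job that exceeds the remaining local capacity is sent to the cloud, so cloud costs arise only for the overflow. A real scheduler would also account for data location, licenses and queue times.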
However, to benefit from advantages such as demand-oriented use, companies should deploy HPC resources in a targeted way. After all, high-performance computing resources from the cloud cost more than general-purpose virtual machines; moreover, depending on the application, in-depth expert knowledge is required. Ultimately, three aspects determine when companies should use specific HPC technology and when they should use traditional (cloud) resources:
- The quantity of data to be processed.
- The time available.
- The complexity of the task.
"Unfortunately, there is no formula to calculate the effectiveness of high-performance computing compared to traditional IT resources – this varies depending on the deployment scenario," says Max Guhl of T-Systems. "It is therefore best to seek the advice of experts from your cloud provider, who can make a recommendation based on their experience. And if necessary, they can help get the required cloud resources up and running right away."
One example: HPC is often used when time is of the essence, for instance when a process in the company cannot continue until an analysis result is available.
Another example: medicine. High-performance computing resources are required if artificial intelligence is to provide information as quickly as possible, based on thousands of MRI images, to help determine whether a patient needs immediate surgery. The same applies to so-called personalized medicine: with the help of gene sequencing and analysis, doctors can now prescribe individual, precise medications on the basis of patient gene information, which are considerably more effective than standard dosages. Here, too, very high-performing resources are required.
How should the network connection be dimensioned?
Many companies connect their sites to their own data centers with a fast and secure wide-area network (WAN). To benefit from demand-oriented HPC capacities from the public cloud, however, the company's own WAN must be connected to the public cloud with the fastest possible connection. How should this network connection be dimensioned? "That depends on the intended use," says Max Guhl. To find out, companies should first work out the amount of data that has to go into the cloud. Depending on the scenario, this can be much smaller than some people think. Guhl: "An example from chemistry: For the simulation of new materials, only about 100 to 200 MB of data have to be transferred to the cloud. The result of the calculations is up to 100 GB. However, only 10 GB of this data has to be returned to the company."
The situation is quite different, for example, in the search for oil and gas deposits: Researchers use seismic methods to calculate images of the subsurface. This results in data sets that can reach hundreds of gigabytes or even terabytes. Transferring data of this size to the cloud can take a long time: For example, it takes more than 22 hours to transfer a terabyte of data to the cloud at a speed of 100 Mbit/s. When speed is of the essence, a direct connection such as Direct Connect and Private Link Access Service (PLAS) is indispensable. This enables transfer rates of up to 10 Gbit/s into the Open Telekom Cloud. In this way, the upload of one terabyte of data is reduced from 22 hours to just 13 minutes.
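The transfer times quoted above can be checked with a short calculation. The sketch below assumes a decimal terabyte (10^12 bytes) and an ideal link with no protocol overhead or latency; the function name `transfer_time_seconds` is chosen here for illustration only.

```python
def transfer_time_seconds(data_bytes: float, link_bits_per_second: float) -> float:
    """Ideal transfer time: payload in bits divided by the line rate."""
    return data_bytes * 8 / link_bits_per_second


TERABYTE = 10**12  # decimal terabyte in bytes (assumption)

slow = transfer_time_seconds(TERABYTE, 100 * 10**6)  # 100 Mbit/s
fast = transfer_time_seconds(TERABYTE, 10 * 10**9)   # 10 Gbit/s

print(f"100 Mbit/s: {slow / 3600:.1f} hours")   # 100 Mbit/s: 22.2 hours
print(f"10 Gbit/s: {fast / 60:.1f} minutes")    # 10 Gbit/s: 13.3 minutes
```

Real transfers are somewhat slower than these ideal values because of protocol overhead and shared links, so the figures are best read as lower bounds.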
Which data should go to the cloud – and which should not?
But often it is not even necessary to transfer analysis data completely into the cloud. This data can be pre-processed where it is generated by edge computing units. "An edge device on a gas turbine, for example, calculates the audio signal of a microphone in real time in the frequency spectrum in order to transport only these small amounts of data to the backend," describes Crisp Research in a recent article on edge computing.
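The kind of edge pre-processing described in the quote can be illustrated with a small, self-contained Python sketch. It reduces a raw signal to its strongest frequency components so that only a handful of numbers need to be sent to the backend. Everything here is an assumption for illustration: the synthetic test signal, the 1,000 Hz sample rate, and the helper names `dft_magnitudes` and `strongest_bins`. A production edge device would typically use an optimized FFT library rather than this naive transform.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Naive discrete Fourier transform; returns magnitudes for bins 0..N/2."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def strongest_bins(samples, sample_rate, top=3):
    """Reduce a raw signal to (frequency in Hz, magnitude) of its strongest bins."""
    mags = dft_magnitudes(samples)
    n = len(samples)
    ranked = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)[:top]
    return [(k * sample_rate / n, mags[k]) for k in ranked]


SAMPLE_RATE = 1000  # Hz (assumed)
N = 200             # samples per analysis window (assumed)

# Synthetic "microphone" signal: a 50 Hz tone plus a weaker 120 Hz tone.
signal = [
    math.sin(2 * math.pi * 50 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 120 * t / SAMPLE_RATE)
    for t in range(N)
]

peaks = strongest_bins(signal, SAMPLE_RATE, top=2)
print(peaks)  # two (frequency, magnitude) pairs instead of 200 raw samples
```

In this sketch, 200 raw samples shrink to two frequency/magnitude pairs, which is the reduction in data volume that makes a modest network connection sufficient.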
Here, too, the question arises: How quickly are results needed? Does the data delivered to the cloud have to be analyzed in (almost) real time? If not, the data can be transferred gradually without any problems. It is first collected in the cloud and then evaluated quickly in regular batch runs using high-performance resources. Companies then pay for the HPC resources as required – for example per batch run – and benefit at the same time from high-performance resources and the pay-as-you-go principle, which only public cloud computing offers. After all, costs play an important, if not the central, role when companies decide for or against HPC from the public cloud.
Where is the best value for money?
Although CPU prices per hour have fallen significantly in recent years, the CPU price alone should not be the selection criterion; after all, not every CPU performs the same. What matters instead is the price/performance ratio, and here there are sometimes considerable differences. A benchmark analysis by Cloud Spectator from spring 2019 shows that the Open Telekom Cloud clearly beats its competitors when it comes to high-performance computing. The cloud analysts compared the high-performance flavors of Amazon Web Services, Microsoft Azure, Google Cloud Engine and the Open Telekom Cloud. The result: The HPC VMs of the Open Telekom Cloud ranked far ahead in terms of both relative CPU performance and storage performance, with the best relative read and write speed.
Which software and which certificates are indispensable?
Among other things, HPC applications require corresponding HPC software and middleware. Companies that already use HPC resources on-premises can often keep their existing software. The prerequisite is that this software not only works on-premises, but is also supported by the cloud. In addition, companies should ensure that the cloud provider with whom they want to host their HPC workloads can also take over the operation of their software if required.
A frequently used HPC solution is the one offered by Altair. The platform is like a command center for HPC administrators to deploy, manage, monitor and optimize HPC appliances in any cloud, public, private or hybrid. For example, Altair applications are compatible with the Open Telekom Cloud's HPC infrastructure. Altair also supports cloud deployments of HPC clusters through its PBS Works product line (Portable Batch System). By running HPC workloads in the cloud, customers can extend HPC clusters locally or carry out their projects entirely in the cloud.
The Open Telekom Cloud also supports Moab Cloud/NODUS cloud bursting. This highly flexible and expandable solution makes it possible to relocate processes to the cloud as required. All the necessary workload resources are automatically provided on demand and shut down again when they are no longer needed. Other software used to operate HPC resources and supported by the Open Telekom Cloud includes UNIVA, SGE, IntelMPI and SpectrumMPI, as well as open source services such as OpenMPI and SLURM.
In addition to the right software, however, companies should also pay attention to important certificates. For example, some industries require suppliers and service providers to have special certifications when using cloud resources. Companies in the automotive industry, for example, are not allowed to use IT capacities without a TISAX 3 (Trusted Information Security Assessment Exchange) certificate – partly because this certificate proves particularly high standards with regard to IT security. TISAX is a mutually recognized auditing standard for information security. A large number of automobile manufacturers and suppliers to the German automotive industry require business partners to have an existing TISAX certification. The Open Telekom Cloud is the only large public cloud provider currently offering TISAX 3.
What role do security and data protection play?
Of course, the software used should meet all the necessary standards to ensure the highest level of IT security and data protection. After all, data that is processed via HPC can also be business-critical or personal. Take the automotive industry, for example: simulation data in connection with the development of new designs or components are subject to strict internal company guidelines. Or medicine: the analysis of gene sequences involves personal data, as does the evaluation of medical images when patient data is stored alongside them in order to avoid mix-ups. Under certain conditions, even IP addresses fall into this category and are therefore protected under the European Union’s General Data Protection Regulation (GDPR). Companies should therefore look for a cloud provider that can prove that the data centers it uses meet the highest requirements in terms of data security and data protection.
The Open Telekom Cloud, for example, has undergone certification in line with the requirements of the TCDP 1.0 (Trusted Cloud Data Protection Profile). This attests that the Open Telekom Cloud is currently one of the few cloud offerings on the market to have a legally compliant data protection certification for defined cloud services.
In addition, for data protection reasons the location of the cloud provider is still relevant for many companies. Non-European cloud providers have long had data centers in Europe and are therefore obliged to comply with European data protection laws. However, there is uncertainty as to whether these cloud providers are legally obliged to allow intelligence services to access their data – for example if the US authorities make enquiries. To avoid unpleasant surprises, some companies therefore prefer to entrust their data to European providers such as Deutsche Telekom, which, for example, operates its own data centers in Saxony-Anhalt with the Open Telekom Cloud.
In many cases, it makes sense for companies to not only use their own computing resources on-premises for high-performance computing, but also to use HPC resources from the public cloud. This can help companies to solve extremely demanding tasks in the shortest possible time and thus gain a competitive advantage – at a manageable cost. But for targeted use, they must first define,
- which goal they would like to achieve with HPC,
- which and how much data is required for this,
- whether their network connection is sufficient and
- what kind of cloud technology is right for that.
To find out, companies can draw on Telekom's expertise and experience. Interested companies can contact Deutsche Telekom at the following e-mail address: firstname.lastname@example.org.