Open Telekom Cloud for Business Customers

NVIDIA A100 GPU – computing power and AI from the cloud

by Andreas Walz, Product Manager at T-Systems
A100 GPUs are suitable for many high-performance requirements – and can be used cost-effectively from the cloud.
 

In this article, you can read about:

  • what characterizes NVIDIA's A100 graphics processing unit (GPU),
  • which application scenarios are possible for companies and
  • how you can use the computing power flexibly as required in the cloud.


What was once a favorite among gamers has long since been adopted by researchers, engineers and companies. They all appreciate the enormous computing power of modern graphics processing units (GPUs). NVIDIA is currently at the forefront of this development with the A100 GPU, which enables a wide range of application scenarios.

Boom in the graphics processor market

The market for graphics processing units (GPUs) has gained considerable momentum in recent years. The performance of graphics cards has increased dramatically, giving PC users a much richer experience with graphics-intensive applications. But their architecture also makes GPUs well suited to high-performance workloads: they split large workloads into many small tasks and process them concurrently, which makes them an important enabler for high-performance computing in science and research. Last but not least, graphics cards play a key role in the training and inference of artificial intelligence (AI). For such high-end purposes, GPU manufacturers have designed specific graphics cards for use in data centers.

NVIDIA A100 – the GPU standard in data centers

It is therefore no coincidence that NVIDIA positions itself as the “world's leading provider of AI computing”. With its products, NVIDIA is constantly setting new benchmarks for GPU performance. The current price/performance benchmark for AI training is the A100 GPU.

Even competing chips such as AMD's MI250 cannot currently hold a candle to the A100. The A100 GPU is based on the Ampere architecture and, according to NVIDIA, delivers up to 20 times the performance of the previous generation. It combines this with a memory bandwidth of over two terabytes per second, which NVIDIA describes as the highest in the world.

Universal, but demanding

The A100 is not designed for just one specific use case: it trains AI models, runs complex simulations and analyzes large amounts of data. But the true added value of the A100 lies in its flexibility and its ability to handle multiple workloads simultaneously. Multi-Instance GPU (MIG) technology makes it possible to divide the A100 into up to seven independent instances, allowing several small jobs to be handled at the same time.
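To illustrate the idea of dividing one GPU into up to seven instances, the following Python sketch checks whether a set of jobs stays within the slice budget of a single A100. It is a deliberately simplified model, not NVIDIA tooling: the function name and the job lists are hypothetical, the allowed instance sizes are assumed from NVIDIA's published A100 MIG profiles, and the check is only a necessary condition, because NVIDIA's actual placement rules exclude some combinations that pass it.

```python
# Simplified capacity check for Multi-Instance GPU (MIG) on one A100.
# This is an illustrative model, not an NVIDIA API.

COMPUTE_SLICES_PER_GPU = 7                 # from the text: up to seven instances
ALLOWED_INSTANCE_SIZES = {1, 2, 3, 4, 7}   # assumption: slice counts of A100 MIG profiles


def within_slice_budget(requested_sizes):
    """Return True if the requested instances do not exceed seven slices.

    A True result is necessary but not sufficient: real MIG placement
    rules are stricter than this simple sum.
    """
    if any(size not in ALLOWED_INSTANCE_SIZES for size in requested_sizes):
        return False
    return sum(requested_sizes) <= COMPUTE_SLICES_PER_GPU


# Seven small jobs, one slice each, stay within the budget.
print(within_slice_budget([1] * 7))
# Two four-slice instances would need eight slices and do not.
print(within_slice_budget([4, 4]))
```

In practice, the partitioning itself is done with NVIDIA's own management tools; the sketch only shows why several small jobs can share one card while two large ones cannot.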

Technical data of the A100

  • Number of cores: 6912 CUDA cores
  • Memory capacity: 80GB HBM2e
  • Computing power: 19.5 TeraFLOPS (peak FP32)
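The 19.5 TeraFLOPS value corresponds to the A100's peak single-precision (FP32) throughput and can be reproduced from the core count. The short Python sketch below shows the arithmetic. The boost clock of 1410 MHz is not stated in this article and is taken as an assumption from NVIDIA's published specifications; the function name is chosen for illustration only.

```python
# Rough estimate of peak FP32 throughput:
# cores x floating-point operations per core per cycle x clock frequency.

CUDA_CORES = 6912               # from the technical data above
FLOPS_PER_CORE_PER_CYCLE = 2    # one fused multiply-add counts as two operations
BOOST_CLOCK_HZ = 1.41e9         # assumption: boost clock of 1410 MHz


def peak_fp32_tflops(cores, flops_per_cycle, clock_hz):
    """Return the theoretical peak throughput in TeraFLOPS."""
    return cores * flops_per_cycle * clock_hz / 1e12


peak = peak_fp32_tflops(CUDA_CORES, FLOPS_PER_CORE_PER_CYCLE, BOOST_CLOCK_HZ)
print(f"{peak:.1f} TFLOPS")  # prints 19.5 TFLOPS
```

This is a theoretical upper bound. Real workloads usually achieve less, and Tensor Core operations at lower precision reach considerably higher figures.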

Possible applications for companies

The A100 GPUs support modern high-performance requirements for simulation, modeling and AI in various industries. In many cases, they make application scenarios possible in the first place or shorten their cycle times, for example from weeks to days, thanks to the reduced computing time. This gives companies more frequent insights and better control options. Some examples:

  • Energy:
    Smart grids are becoming a reality thanks to the A100: energy consumption forecasts and real-time analyses of grid performance are becoming faster and more accurate. Real-time analysis of data from wind turbines optimizes their operation, and the analysis of seismic data to identify new energy deposits is accelerated as well.
  • Production and maintenance:
    In predictive maintenance concepts, forecasting algorithms evaluate the condition of machines and identify maintenance requirements at an early stage. With the A100 GPU, faster and more detailed evaluations are also possible here, enabling optimal production planning and minimizing downtimes.
  • Health:
    The A100 enables faster training of deep learning algorithms. The use (inference) of deep learning services is also accelerated: doctors and researchers receive faster analysis of medical data for studies and prognoses. Individual diagnostics also benefit from the increase in speed.

GPU power from the cloud: cost-efficient and accessible to all

It's no secret that quality comes at a price. Buying an A100 is an expensive undertaking: list prices are around €20,000, provided you can get one at all.

The cloud offers an elegant way to use an A100 on demand, whenever it is needed, and thus significantly reduces or spreads the costs. On the Open Telekom Cloud, the use of an A100 as a p3.2xlarge.8 flavor with 80GB HBM2e GPU memory currently (October 2023) costs €3.57 per hour. At that rate, the purchase price corresponds to well over 5,000 operating hours. And, as usual in the cloud, costs are only incurred when the GPU is actually used.

If a GPU is heavily utilized over the long term, a purchase can be considered, provided you are willing to take on the management effort. But for users who only need high-performance resources sporadically or temporarily, using them from the cloud makes more business sense. In addition, the GPU is available at all times and the “delivery time” is significantly shorter.
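The buy-or-rent reasoning can be sketched as a simple break-even calculation. The Python example below uses the list price of around €20,000 and the hourly rate of €3.57 quoted above. The monthly usage levels are hypothetical, and the model deliberately ignores power, hosting, administration and discounted reserved packages, so it is a rough orientation rather than a full cost comparison.

```python
# Break-even between buying an A100 and renting it by the hour.

PURCHASE_PRICE_EUR = 20_000   # approximate list price from the article
HOURLY_RATE_EUR = 3.57        # on-demand price from the article (October 2023)


def break_even_hours(purchase_price, hourly_rate):
    """Number of rented GPU hours that cost as much as the purchase."""
    return purchase_price / hourly_rate


def months_to_break_even(purchase_price, hourly_rate, hours_per_month):
    """Months of cloud use until the rental cost reaches the purchase price."""
    return break_even_hours(purchase_price, hourly_rate) / hours_per_month


hours = break_even_hours(PURCHASE_PRICE_EUR, HOURLY_RATE_EUR)
print(f"Break-even after about {hours:,.0f} GPU hours")

# Hypothetical usage levels: occasional, regular, and around-the-clock use.
for hours_per_month in (40, 200, 730):
    months = months_to_break_even(PURCHASE_PRICE_EUR, HOURLY_RATE_EUR, hours_per_month)
    print(f"{hours_per_month:>3} h/month: about {months:.1f} months")
```

With these figures, the break-even point is at roughly 5,600 GPU hours. At sporadic usage it takes years to get there, while near-continuous usage reaches it within months.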

At the same time, this demonstrates the “democratic principle” of the cloud: it also gives smaller companies access to high-end computing capacities. A comprehensive analysis over the weekend costs just under €200 – a hundredth of the purchase price and an amount that is affordable even for the smallest companies. Incidentally, the Open Telekom Cloud also offers discounted reserved packages for long-term use.
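The weekend figure can be checked with a few lines of Python. The sketch assumes a run of 56 hours, which is a hypothetical duration chosen here because it matches the "just under €200" mentioned above at the quoted rate of €3.57 per hour; the function name is for illustration only.

```python
# Cost of a single on-demand job and its share of the purchase price.

HOURLY_RATE_EUR = 3.57        # on-demand price from the article (October 2023)
PURCHASE_PRICE_EUR = 20_000   # approximate list price from the article
WEEKEND_HOURS = 56            # assumption: e.g. Friday afternoon to Sunday midnight


def on_demand_cost(hours, hourly_rate=HOURLY_RATE_EUR):
    """Cost in euros of running one GPU flavor for the given number of hours."""
    return hours * hourly_rate


weekend_cost = on_demand_cost(WEEKEND_HOURS)
share = weekend_cost / PURCHASE_PRICE_EUR

print(f"Weekend run: €{weekend_cost:.2f}")          # prints Weekend run: €199.92
print(f"Share of purchase price: {share:.1%}")      # prints Share of purchase price: 1.0%
```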

Focus on flexibility in the future too

So buy or rent? With the latter, companies get a flexible alternative that meets their needs and avoids the high acquisition costs. Because one thing is certain: the A100 does not mark the end of GPU development. Competitors are not sleeping and are bringing dynamism to the market. Wherever GPU performance is headed in the future, the cloud gives companies access to the latest GPU generations, flexibly and in line with their needs.

