
Nvidia built a massive dual GPU to power models like ChatGPT

Nvidia’s semi-annual GPU Technology Conference (GTC) usually focuses on advancements in AI, but this year, Nvidia is responding to the massive rise of ChatGPT with a slate of new GPUs. Chief among them is the H100 NVL, which stitches two of Nvidia’s H100 GPUs together to deploy large language models (LLMs) like ChatGPT.

The H100 isn’t a new GPU. Nvidia announced it a year ago at GTC, sporting its Hopper architecture and promising to speed up AI inference in a variety of tasks. The new NVL model with its massive 94GB of memory is said to work best when deploying LLMs at scale, offering up to 12 times faster inference compared to last-gen’s A100.
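To see why 94GB of memory matters for deploying LLMs, a back-of-envelope calculation helps: a model's weights alone take roughly (parameter count × bytes per parameter). The sketch below is illustrative arithmetic only — the 175-billion-parameter figure and FP16 storage are assumptions for the example, not specifications from Nvidia, and real deployments also need memory for the KV cache, activations, and framework overhead.

```python
# Back-of-envelope check: do a model's weights fit in GPU memory?
# Illustrative only -- ignores KV cache, activations, and overhead.

def weights_gib(num_params: float, bytes_per_param: int) -> float:
    """Memory needed just for the weights, in GiB."""
    return num_params * bytes_per_param / 2**30

# A hypothetical 175-billion-parameter model stored in FP16 (2 bytes/param):
model_gib = weights_gib(175e9, 2)

# One 94GB H100 NVL board vs. a dual-GPU pair:
single_gib = 94 * 1e9 / 2**30   # vendor GB converted to GiB
pair_gib = 2 * single_gib

print(f"weights: {model_gib:.0f} GiB")
print(f"single board: {single_gib:.0f} GiB, dual pair: {pair_gib:.0f} GiB")
print("fits on the pair:", model_gib <= pair_gib)
```

Even a dual-GPU pair falls short for a model of that hypothetical size, which is consistent with the article's note that far larger NVLink clusters exist for training and the biggest deployments.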

Nvidia's H100 NVL being installed in a server. (Image: Nvidia)

These GPUs are at the heart of models like ChatGPT. Nvidia and Microsoft recently revealed that thousands of A100 GPUs were used to train ChatGPT, a project more than five years in the making.


The H100 NVL works by combining two H100 GPUs over Nvidia’s high-bandwidth NVLink interconnect. This is already possible with current H100 GPUs — in fact, you can connect up to 256 H100s together through NVLink — but this dedicated unit is built for smaller deployments.
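The idea of pooling two boards into one larger effective device can be sketched as a simple layer-placement routine. This is a hedged, framework-free illustration — the greedy first-fit strategy and the made-up layer sizes are assumptions for the example, not Nvidia's actual scheme; real frameworks use far more sophisticated tensor- and pipeline-parallel partitioning.

```python
# Minimal sketch of placing model layers across a pool of GPUs by memory,
# in the spirit of treating two NVLink-ed boards as one larger device.
# Greedy first-fit placement; real parallelism schemes are more involved.

def place_layers(layer_sizes_gib, device_capacity_gib):
    """Assign each layer (in order) to the first device with room left."""
    free = list(device_capacity_gib)
    placement = []
    for size in layer_sizes_gib:
        for dev, room in enumerate(free):
            if size <= room:
                free[dev] -= size
                placement.append(dev)
                break
        else:
            raise MemoryError(f"layer of {size} GiB fits on no device")
    return placement

# Two 94 GiB devices, twelve 12 GiB layers (sizes are invented):
print(place_layers([12] * 12, [94, 94]))
# the first seven layers fill device 0; the rest spill over to device 1
```

The point of the sketch is that once the interconnect is fast enough, splitting layers across boards behaves much like having one GPU with the combined memory — which is the pitch behind bonding two H100s into a single NVL unit.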


This is a product built for businesses more than anything, so don’t expect to see the H100 NVL pop up on the shelf at your local Micro Center. However, Nvidia says enterprise customers can expect to see it around the second half of the year.

In addition to the H100 NVL, Nvidia also announced the L4 GPU, which is specifically built to power AI-generated videos. Nvidia says it’s 120 times more powerful for AI-generated videos than a CPU, and offers 99% better energy efficiency. In addition to generative AI video, Nvidia says the GPU sports video decoding and transcoding capabilities and can be leveraged for augmented reality.

Nvidia says Google Cloud is among the first to integrate the L4. Google plans on offering L4 instances to customers through its Vertex AI platform. Nvidia said the GPU will be available from partners later, including Lenovo, Dell, Asus, Gigabyte, and HP, among others.

Jacob Roach
Lead Reporter, PC Hardware