Nvidia Claims Doubled Inference Performance with H100

September 11, 2023

Via: Tom's Hardware

Nvidia says its new open-source TensorRT-LLM software can dramatically boost the performance of large language models (LLMs) on its GPUs. According to the company, TensorRT-LLM doubles the performance of its H100 compute GPU on GPT-J, an LLM with six billion parameters. Importantly, the software delivers this improvement without retraining the model.

Nvidia developed TensorRT-LLM specifically to speed up LLM inference, and performance graphs provided by Nvidia do show a 2x speed boost for the H100 that the company attributes to software optimizations. A standout feature of TensorRT-LLM is its in-flight batching technique, which addresses the dynamic and diverse workloads of LLMs, whose computational demands can vary greatly from request to request.
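
To make the idea behind in-flight batching concrete, here is a toy Python simulation. It is only a sketch under simplifying assumptions and does not use or reflect the TensorRT-LLM API: each request is reduced to the number of decoding steps it needs, every step costs the same, and the function names and request lengths are invented for illustration. It compares a static batch, which waits for its slowest request, with a batch that frees a slot as soon as a request finishes.

```python
from collections import deque


def static_batching_steps(lengths, batch_size):
    """Static batching: each batch runs until its longest request is done."""
    steps = 0
    for i in range(0, len(lengths), batch_size):
        steps += max(lengths[i:i + batch_size])
    return steps


def inflight_batching_steps(lengths, batch_size):
    """In-flight batching: a finished request leaves the batch immediately
    and a waiting request takes over its slot on the next step."""
    waiting = deque(lengths)  # requests not yet started (steps still needed)
    active = []               # remaining steps of requests in the batch
    steps = 0
    while waiting or active:
        # Fill any free slots before running the next decoding step.
        while waiting and len(active) < batch_size:
            active.append(waiting.popleft())
        steps += 1
        # Every active request advances one step; finished ones are dropped.
        active = [r - 1 for r in active if r - 1 > 0]
    return steps


# Hypothetical workload: six requests with very different output lengths.
requests = [2, 8, 3, 7, 1, 9]
print(static_batching_steps(requests, batch_size=2))    # 24
print(inflight_batching_steps(requests, batch_size=2))  # 18
```

In this made-up workload the static schedule spends 24 steps because short requests sit idle until the longest request in their batch completes, while the in-flight schedule needs 18. The numbers only illustrate the scheduling effect and say nothing about real H100 throughput.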
