Shining Brighter Together: Google’s Gemma Optimized to Run on NVIDIA GPUs

Web & Cloud

Feb 21, 2024 - 19:44

Feb 21, 2024 - 20:06

New open language models from Google accelerated by TensorRT-LLM across NVIDIA AI platforms — including local RTX AI PCs.

NVIDIA, in collaboration with Google, today launched optimizations across all NVIDIA AI platforms for Gemma — Google’s state-of-the-art new lightweight 2 billion– and 7 billion-parameter open language models that can be run anywhere, reducing costs and speeding innovative work for domain-specific use cases.

Teams from the companies worked closely together to accelerate the performance of Gemma — built from the same research and technology used to create the Gemini models — with NVIDIA TensorRT-LLM, an open-source library for optimizing large language model inference, when running on NVIDIA GPUs in the data center, in the cloud and on PCs with NVIDIA RTX GPUs.

This allows developers to target the installed base of over 100 million NVIDIA RTX GPUs available in high-performance AI PCs globally.

Developers can also run Gemma on NVIDIA GPUs in the cloud, including on Google Cloud’s A3 instances based on the H100 Tensor Core GPU and, soon, on NVIDIA’s H200 Tensor Core GPUs, which feature 141GB of HBM3e memory with 4.8 terabytes per second of memory bandwidth and which Google will deploy this year.
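To put those memory figures in context, here is a rough back-of-the-envelope sketch in Python. It assumes the nominal parameter counts quoted above (2 billion and 7 billion), counts only the model weights, and ignores activations, the KV cache and runtime overhead, so real footprints will be larger. The function names are illustrative and not part of any NVIDIA or Google API.

```python
# Rough sizing sketch. Assumptions: nominal parameter counts from the
# article, weights only (no KV cache, activations or runtime overhead).
BYTES_PER_PARAM = {"fp16": 2, "fp8": 1}

H200_MEMORY_BYTES = 141e9     # 141 GB of HBM3e, as quoted in the article
H200_BANDWIDTH_BPS = 4.8e12   # 4.8 TB/s, as quoted in the article


def weight_bytes(num_params: float, precision: str) -> float:
    """Approximate memory needed to hold the weights alone."""
    return num_params * BYTES_PER_PARAM[precision]


def min_weight_read_seconds(num_params: float, precision: str) -> float:
    """Lower bound on the time to stream every weight once from HBM."""
    return weight_bytes(num_params, precision) / H200_BANDWIDTH_BPS


if __name__ == "__main__":
    for name, params in [("Gemma 2B", 2e9), ("Gemma 7B", 7e9)]:
        for precision in ("fp16", "fp8"):
            size_gb = weight_bytes(params, precision) / 1e9
            read_ms = min_weight_read_seconds(params, precision) * 1e3
            print(f"{name} {precision}: ~{size_gb:.0f} GB of weights, "
                  f">= {read_ms:.2f} ms per full pass over the weights")
```

Under these assumptions, even the larger model's 16-bit weights occupy only a small fraction of a single H200's memory, and the FP8 version halves that again.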

Enterprise developers can additionally take advantage of NVIDIA’s rich ecosystem of tools, including NVIDIA AI Enterprise with the NeMo framework and TensorRT-LLM, to fine-tune Gemma and deploy the optimized model in their production applications.

Learn more about how TensorRT-LLM is revving up inference for Gemma, along with additional information for developers. This includes several model checkpoints of Gemma and the FP8-quantized version of the model, all optimized with TensorRT-LLM.

Experience Gemma 2B and Gemma 7B directly from your browser on the NVIDIA AI Playground.

Gemma Coming to Chat With RTX

Adding support for Gemma soon is Chat with RTX, an NVIDIA tech demo that uses retrieval-augmented generation and TensorRT-LLM software to give users generative AI capabilities on their local, RTX-powered Windows PCs.

The Chat with RTX tech demo lets users personalize a chatbot with their own data by easily connecting local files on a PC to a large language model.

Since the model runs locally, it provides results fast, and user data stays on the device. Rather than relying on cloud-based LLM services, Chat with RTX lets users process sensitive data on a local PC without the need to share it with a third party or have an internet connection.
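For readers unfamiliar with retrieval-augmented generation, the general idea can be sketched in a few lines of Python. This is a toy illustration only, not how Chat with RTX is implemented: it ranks local documents by simple word-overlap similarity and pastes the best match into a prompt, whereas real systems typically use learned embeddings and a vector index. All function names and file names here are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words counts (a stand-in for a real embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: Dict[str, str], k: int = 1) -> List[str]:
    """Return the names of the k documents most similar to the query."""
    q = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda name: cosine(q, tokenize(documents[name])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, documents: Dict[str, str], k: int = 1) -> str:
    """Prepend the retrieved text to the question before calling an LLM."""
    context = "\n".join(documents[name] for name in retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    # Hypothetical local files, represented here as in-memory strings.
    local_files = {
        "notes.txt": "The quarterly budget review is on Friday.",
        "recipe.txt": "Mix flour, sugar and eggs for the cake.",
    }
    print(build_prompt("When is the budget review?", local_files))
```

The resulting prompt would then be passed to a locally running language model, so neither the files nor the question need to leave the machine.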
