

LLMOps: Accelerate LLM Inference in GPU Using TensorRT-LLM

The Machine Learning Engineer via YouTube

Overview

Discover how to accelerate Large Language Model (LLM) generation and inference using TensorRT-LLM in this 17-minute tutorial. Learn to leverage the TensorRT-LLM runtime to optimize LLM performance on GPUs. Access the accompanying Jupyter notebook for hands-on practice and implementation. Gain valuable insights into LLMOps, data science, and machine learning techniques to enhance your AI development skills.
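For context, TensorRT-LLM ships a high-level Python LLM API that compiles a Hugging Face model into a TensorRT engine and runs generation on the GPU. The following is a minimal sketch of that workflow; the model name, sampling settings, and prompt are illustrative and may differ from what the video's notebook actually uses.

```python
# Minimal sketch of GPU inference with TensorRT-LLM's high-level LLM API.
# The model and sampling parameters below are illustrative assumptions,
# not necessarily those used in the video's notebook.
from tensorrt_llm import LLM, SamplingParams

# Build (or load a cached) TensorRT engine for the model and place it on the GPU.
llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

# Generation settings (temperature, nucleus sampling, output length).
sampling = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

prompts = ["Explain in one sentence what TensorRT-LLM does."]

# Run accelerated generation and print the decoded text for each prompt.
for output in llm.generate(prompts, sampling):
    print(output.outputs[0].text)
```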

Syllabus

LLMOps: Accelerate LLM Inference in GPU using TensorRT-LLM #datascience #machinelearning

Taught by

The Machine Learning Engineer

