YouTube videos curated by Class Central.
Classroom Contents
Deploy LLM to Production on Single GPU - REST API for Falcon 7B with QLoRA on Inference Endpoints
- 1 - Introduction
- 2 - Text Tutorial on MLExpert.io
- 3 - Google Colab Setup
- 4 - Merge QLoRA adapter with Falcon 7B
- 5 - Push Model to HuggingFace Hub
- 6 - Inference with the Merged Model
- 7 - HuggingFace Inference Endpoints with Custom Handler
- 8 - Create Endpoint for the Deployment
- 9 - Test the REST API
- 10 - Conclusion