Get an introduction to the architecture of the popular open-source LLaMA model, along with the process of fine-tuning, deploying, and prompting it.
Overview
Syllabus
Introduction
- Developing AI models using LLaMA
- Using LLaMA online
- Running LLaMA in a notebook
- Accessing LLaMA in an enterprise environment
- The LLaMA architecture
- The LLaMA tokenizer (see the tokenizer sketch after this syllabus)
- The LLaMA context window
- Differences between LLaMA 1 and 2
- Fine-tuning LLaMA with a few examples
- Fine-tuning LLaMA and freezing layers
- Fine-tuning LLaMA using LoRA (see the LoRA sketch after this syllabus)
- Reinforcement learning with RLHF and DPO
- Fine-tuning larger LLaMA models
- Resources required to serve LLaMA
- Quantizing LLaMA (see the quantization sketch after this syllabus)
- Using TGI for serving LLaMA
- Using vLLM for serving LLaMA
- Using DeepSpeed for serving LLaMA
- Explaining LoRA and SLoRA
- Using a vendor for serving LLaMA
- Differences between LLaMA and commercial LLMs
- Few-shot learning with LLaMA (see the prompting sketch after this syllabus)
- Chain of thought with LLaMA
- Using schemas with LLaMA
- Optimizing LLaMA prompts with DSPy
- Challenge: Generating product tags
- Solution: Generating product tags
- Continue your LLaMA AI model development journey
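
To make a few of the syllabus topics concrete, the sketches below are minimal illustrations rather than course material; the model IDs, hyperparameters, and example data are assumptions. First, the tokenizer lesson: a Hugging Face `transformers` sketch showing how LLaMA splits text into subword pieces (the checkpoint name is illustrative and gated behind Meta's license).

```python
# Minimal sketch: inspecting how the LLaMA tokenizer splits text.
# Requires the `transformers` library and access to a LLaMA checkpoint
# (the model ID below is illustrative).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

text = "Fine-tuning LLaMA with a few examples"
ids = tokenizer.encode(text)

print(ids)                                   # token IDs
print(tokenizer.convert_ids_to_tokens(ids))  # subword pieces
```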
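
For the LoRA fine-tuning lesson, a minimal sketch using the `peft` library: small low-rank adapter matrices are trained while the base weights stay frozen. The rank, alpha, and target modules shown are common defaults, not the course's exact recipe.

```python
# Minimal sketch of attaching LoRA adapters to LLaMA with `peft`.
# Model ID and hyperparameters are illustrative assumptions.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update
    lora_alpha=16,                        # scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small LoRA matrices are trainable
```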
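
For the quantization lesson, a sketch of loading LLaMA in 4-bit with `bitsandbytes`, which reduces memory use at some cost in precision; it assumes a CUDA GPU and the same illustrative model ID.

```python
# Minimal sketch of 4-bit quantized loading with bitsandbytes.
# Assumes a CUDA GPU; the model ID is illustrative.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # normal-float 4-bit weights
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for stability
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",  # spread layers across available devices
)
```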
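
Finally, for the few-shot learning lesson (in the spirit of the product-tag challenge), a sketch of a few-shot prompt: a handful of labeled examples are placed in the prompt so the model continues the pattern. The products and tags are invented for illustration.

```python
# Minimal sketch of few-shot prompting via a text-generation pipeline.
# The example products and tags are made up for illustration.
from transformers import pipeline

generator = pipeline("text-generation", model="meta-llama/Llama-2-7b-hf")

prompt = """Product: Wireless noise-cancelling headphones
Tags: audio, electronics, travel

Product: Stainless steel water bottle, 1 litre
Tags: kitchen, outdoors, reusable

Product: Ergonomic mesh office chair
Tags:"""

result = generator(prompt, max_new_tokens=20, return_full_text=False)
print(result[0]["generated_text"])  # model's suggested tags
```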
Taught by
Denys Linkov