Learn to build an interactive Large Language Model using Llama 2 70B with Retrieval Augmented Generation (RAG) in this 52-minute coding tutorial. Master the process of integrating external knowledge from web pages to improve the accuracy of AI responses without relying on frameworks like LangChain or LlamaIndex. Explore a hands-on implementation of web scraping using ChatGPT with Scrapy and BeautifulSoup, text chunking, and embedding creation with sentence-transformers. Dive into vector space operations using hnswlib for indexing and similarity search to identify the top 10 matches. Understand how cross-encoder reranking scores those matches for semantic similarity, and learn to feed the reranked results back into Llama 2 through in-context learning (ICL) prompt augmentation. Follow along with practical code examples sourced from Hugging Face Spaces and OpenAI, gaining essential skills in building intelligent systems in plain code, without a framework.
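The retrieval steps named above (chunking, embedding, similarity search, prompt augmentation) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the tutorial's code: the function names `chunk_text`, `embed`, `retrieve`, and `build_prompt` are hypothetical, a toy bag-of-words vector stands in for a sentence-transformers model, a brute-force cosine search stands in for an hnswlib index, and the cross-encoder reranking step is omitted.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into word-based chunks that overlap by `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed(text):
    """Toy bag-of-words vector (assumption: stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, top_k=10):
    """Brute-force similarity search (assumption: stand-in for an ANN index)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(query, passages):
    """Place the retrieved passages in front of the question (ICL augmentation)."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


if __name__ == "__main__":
    page_text = (
        "LoRA adds low rank matrices to attention layers. "
        "Bananas are rich in potassium. "
        "Llama 2 is a family of language models."
    )
    chunks = chunk_text(page_text, chunk_size=8, overlap=0)
    question = "What does LoRA add to attention layers?"
    print(build_prompt(question, retrieve(question, chunks, top_k=2)))
```

In a real pipeline the string returned by `build_prompt` would be sent to the language model, and a reranker would reorder the retrieved passages before the prompt is built.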
LLama 2 and PEFT Documentation - Building Interactive LLM with Retrieval Augmented Generation
Discover AI via YouTube
Overview
Syllabus
LLama 2 + PEFT Docs: CODE interactive LLM w/ RAG
Taught by
Discover AI