Learn to build an interactive Large Language Model application using Llama 2 70B with Retrieval Augmented Generation (RAG) in this 52-minute coding tutorial. Master the process of integrating external knowledge from web pages to improve the accuracy of AI responses without relying on frameworks like LangChain or LlamaIndex. Explore hands-on implementation of web scraping using ChatGPT with Scrapy and BeautifulSoup, text chunking, and embedding creation with sentence-transformers. Dive into vector space operations using hnswlib for indexing and similarity search to identify the top 10 matches. Understand how cross-encoder reranking scores those matches for semantic similarity, and learn to feed the reranked results back into Llama 2 through in-context learning (ICL) prompt augmentation. Follow along with practical code examples sourced from Hugging Face Spaces and OpenAI, gaining essential skills in building intelligent systems with pure code implementation.
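The retrieval flow described above (chunk the text, embed it, find the top matches, augment the prompt) can be sketched in plain Python. This is a minimal illustration, not the tutorial's code. The function names, chunk sizes, and prompt wording are assumptions. The bag-of-words `embed` and the brute-force `top_k` are toy stand-ins for a sentence-transformers model and an hnswlib index. A cross-encoder reranking step would sit between `top_k` and `build_prompt`.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping word windows (sizes are illustrative)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy bag-of-words vector; a stand-in for a sentence-transformers model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=10):
    """Brute-force similarity search; a stand-in for an hnswlib index query."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, passages):
    """Place the retrieved passages in the prompt ahead of the question."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Use only the context below to answer.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )
```

The string returned by `build_prompt` is what would be sent to the model. Swapping in real embeddings and an approximate nearest-neighbour index changes only `embed` and `top_k`, not the overall flow.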