Autogen and Local LLMs Create Realistic Stable Diffusion Model Autonomously

kasukanra via YouTube

Overview

This video tutorial shows how AI agents and local large language models can autonomously build a realistic style model for SDXL, using tools and models that include LLaVA, Mistral, and llama.cpp. It covers installing Chrome and Chromedriver, testing Selenium WebDriver, and walking through the Autogen agent code and the fetch_images script, including how low-resolution image links are renamed to their highest-resolution versions. The downloaded images are then organized and upscaled with Topaz Gigapixel AI. The tutorial goes on to install text-generation-webui, download and load llava-v1.5-13b-GPTQ, call its API, and compare API output with the GUI as well as concise versus verbose prompt instructions. The presenter also runs the original llava-v1.5-13b model through Replicate, sets up llama.cpp with llava-v1.5-7b, and points out problems met along the way, such as an unavailable Chromedriver and a model responding in another language.

Syllabus

Introduction
Technical Design Flowchart
Installing Chrome
Chromedriver not available and how to fix it
Testing Selenium WebDriver
Autogen code overview
AI agents in more depth
Fetch image overview
Accessing page source
Gotcha with page source
Renaming low resolution link to highest resolution link
Testing the fetch_images script
Revisiting the Autogen code
Autogen in action
Checking the downloaded images
Organizing the images
Using Topaz Gigapixel AI to upscale images
Loading LLM framework overview
Installing text-generation-webui
Showing git hash for text-generation-webui
Downloading llava-v1.5-13b-GPTQ
Support LLaVA v1.5 pull request
Commit log for LLaVA v1.5
Original LLaVA v1.5-13b repository
Possibly loading llava-v1.5-13b with the --load-in-4bit flag mentioned in the README
Downloading the model through CLI
Model placement in the text-generation-webui directory
Multimodal documentation for starting up API
Command to start the server
text-generation-webui GUI
Looking at pull request to see suggested settings
Changing presets in text-generation-webui
Initial trials in the GUI
Comparing concise and verbose prompt instruction
Testing out the text-generation-webui API
Getting the IP address of Windows from inside Linux
Finding the endpoint/API examples
Testing the API request
Comparing results between the API and the GUI
llava-v1.5-13b responding in another language: hallucination?
Using Replicate's original llava-v1.5-13b model
Bringing up concise vs. verbose prompt again
Setting up the Replicate API key locally
Setting up the Python call to Replicate
Running iterate replicate code
Downloading the llava-v1.5-7b model
Setting up llama.cpp framework
Adding models to llama.cpp
Showing llama.cpp commit hash
Starting up llama.cpp server
llama.cpp GUI
llama.cpp API code overview
llama.cpp server API documentation

Taught by

kasukanra
