Learn how to use modern multimodal AI systems that work with text, images, video, and audio.
Overview
Syllabus
Introduction
- GenAI with multimodal prompts
- What is multimodality?
- Visual modality
- Textual and auditory modality
- GPT-4 and GPT-4o
- Text to image in GPT-4
- GPT-4 API with various input types
- Challenge: Drawing to code
- Solution: Drawing to code
- What is Gemini?
- Images in Gemini
- Gemini video inputs
- Challenge: Video narration
- Solution: Video narration
- Audio in generative AI
- Prompt and audio
- Generating music
- Challenge: Soundtrack creation
- Solution: Soundtrack creation
- Next steps
Taught by
Ronnie Sheer