Azure OpenAI Deployment Types and Resiliency - Understanding Models, Capacity, and High Availability

John Savill's Technical Training via YouTube

Overview

This technical video covers Azure OpenAI deployment architectures and resilience strategies. It starts from the stateless nature of the generative API and the regional scope of an Azure OpenAI resource, then compares the model deployment types, including standard and global options. It explains capacity management through capacity pools, quotas, and intelligent routing, and distinguishes network latency from inference latency. It also covers data zones and data residency requirements, the availability benefits of each option, the use of multiple regional resources, and how to enable them in an application, including through API Management. The pricing discussion covers pay-as-you-go features, Provisioned Throughput Units (PTU), and Azure reservations. The video closes with the impact of prompt caching and the capabilities of the batch service.

Syllabus

- Introduction
- Generative API is stateless
- Regional Azure OpenAI resource
- Capacity pools
- Responsible AI
- Model deployment types
- Standard
- Global
- Network vs inference latency
- Intelligent routing
- Quota vs available capacity
- Data zone and data residency
- Availability benefits?
- Resource is regional
- Multiple regional resources
- Enabling in the application
- API Management
- Prompt caching impact
- Provisioned service
- PayGo features
- PTU features
- Azure reservations
- Batch service
- Summary
- Close

Taught by

John Savill's Technical Training
