PeriFlow: High-Performance Generative AI Serving Engine

Overview

Watch a technical presentation from SK TECH SUMMIT 2023 on PeriFlow, presented as the fastest generative AI serving engine on the market. Learn how, according to the talk, PeriFlow reduces the GPU resources needed to serve generative AI models such as Llama 2 by 70-90%. Explore PeriFlow's specialized batching technology, which is patented in the United States and Korea and improves throughput significantly while keeping latency low. The speaker, Dr. Kyungin Yu, holds a Ph.D. in Computer Science from Seoul National University and specializes in building efficient systems for AI models, including LLMs. Understand how PeriFlow is delivered both as a container and as a cloud (SaaS) service, so it can fit different deployment needs. The 20-minute talk is part of SK TECH SUMMIT 2023, where leading technology companies and professionals share how today's technology can shape a more convenient and secure tomorrow.