Running an Open Source LLM - Deployment and Cost Considerations

Overview

Explore the process of running an open-source Large Language Model (LLM) in this conference talk from Conf42 LLMs 2024. Gain insights into how the product was envisioned, get a basic overview of LLMs, and see how Hugging Face is used for model hosting. Learn about the deployment infrastructure on Google Cloud, GPU requirements, and the Kubernetes implementation. Discover experimentation results, challenges with open LLMs, cost considerations, and key learnings from the speaker's experience. Understand why there is no one-to-one switch between different LLMs and how to approach the implementation process.

Syllabus

intro
preamble
the product
the product we envisioned
basic overview
Hugging Face
hosting LLM
based on Google Cloud
gpu requirements
deployment infrastructure
Kubernetes
experimentation and results
open LLMs
no one-to-one switch
how much is this going to cost?
learnings
thank you