Open Source Models with Hugging Face

DeepLearning.AI via Coursera

Overview

When models and their weights are available for anyone to download, a broader range of developers can innovate and create. In this course, you’ll select open source models from Hugging Face Hub to perform NLP, audio, image and multimodal tasks using the Hugging Face transformers library. You’ll also package your code into a user-friendly app that you can run on the cloud using Gradio and Hugging Face Spaces.

You will:

1. Use the transformers library to turn a small language model into a chatbot capable of multi-turn conversations, so it can answer follow-up questions.
2. Translate between languages, summarize documents, and measure the similarity between two pieces of text, which can be used for search and retrieval.
3. Convert audio to text with Automatic Speech Recognition (ASR), and convert text to audio using Text to Speech (TTS).
4. Perform zero-shot audio classification to classify audio without fine-tuning the model.
5. Generate an audio narration describing an image by combining object detection and text-to-speech models.
6. Identify objects or regions in an image by prompting a zero-shot image segmentation model with points that mark the object you want to select.
7. Implement visual question answering, image search, image captioning and other multimodal tasks.
8. Share your AI app using Gradio and Hugging Face Spaces to run your applications in a user-friendly interface on the cloud or as an API.

The course provides building blocks that you can combine into a pipeline to build your own AI-enabled applications.
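
To make the text-similarity topic more concrete, here is a minimal sketch of how similarity scores can drive search and retrieval. It is not taken from the course materials: it assumes each text has already been turned into an embedding vector by some model, and the function names (cosine_similarity, rank_by_similarity) and the three-dimensional vectors are made up for illustration. Real embeddings typically have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical embeddings, invented for this example (not model output).
query = [1.0, 0.0, 1.0]
docs = {
    "doc_a": [1.0, 0.0, 1.0],  # same direction as the query
    "doc_b": [0.0, 1.0, 0.0],  # orthogonal to the query
    "doc_c": [1.0, 1.0, 0.0],  # partially overlapping
}

ranking = rank_by_similarity(query, docs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

In practice the vectors would come from an embedding model rather than being written by hand; the ranking step stays the same.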

Syllabus

  • Open Source Models with Hugging Face

Taught by

Younes Belkada, Marc Sun, and Maria Khalusova
