Using OpenAI APIs: Using Image & Audio APIs

Generative AI    |    Intermediate
  • 9 videos | 1h 12m 52s
  • Includes Assessment
  • Earns a Badge
Rating: 5.0 (1 user)
DALL-E and Whisper are OpenAI's image and audio models. DALL-E is an image generation model that creates images from text prompts. Whisper is an automatic speech recognition (ASR) system whose transcription accuracy makes it useful in applications ranging from voice assistants to transcription services. You will begin this course by working with DALL-E: generating images from text prompts, creating variations of existing images, and performing image inpainting using natural language. Then, you will work with the Whisper model, which handles speech transcription and translation. You will transcribe and translate audio in different languages and accents, and you will evaluate the performance of these models.

WHAT YOU WILL LEARN

  • Discover the key concepts covered in this course
  • Generate images using DALL-E
  • Create image variations and perform inpainting
  • Transcribe clips of audio
  • Perform translation and text-to-speech conversion
  • Evaluate audio transcription
  • Set up the Whisper model locally
  • Interpret images with the chat application programming interface (API)
  • Summarize the key concepts covered in this course

IN THIS COURSE

  • 1m 27s
    In this video, we will discover the key concepts covered in this course. FREE ACCESS
  • 10m 51s
    Learn how to generate images using DALL-E. FREE ACCESS
  • 3.  Working with Image Variations and Inpainting
    11m 24s
    In this video, find out how to create image variations and perform inpainting. FREE ACCESS
  • 4.  Performing Audio Transcription
    10m 19s
    During this video, discover how to transcribe clips of audio. FREE ACCESS
  • 5.  Performing Translation and Text-to-Speech Conversion
    6m 43s
    In this video, you will learn how to perform translation and text-to-speech conversion. FREE ACCESS
  • 6.  Evaluating Transcribed Audio
    10m 37s
    Find out how to evaluate audio transcription. FREE ACCESS
  • 7.  Installing and Using the Whisper Model Locally
    9m 42s
    Discover how to set up the Whisper model locally. FREE ACCESS
  • 8.  Using Chat Completions to Interpret Images
    9m 53s
    Learn how to interpret images with the chat application programming interface (API). FREE ACCESS
  • 9.  Course Summary
    1m 57s
    In this video, we will summarize the key concepts covered in this course. FREE ACCESS

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft gives you the opportunity to earn a digital badge upon successful completion of some of our courses. You can share the badge on any social network or business platform.

Digital badges are yours to keep, forever.
