Offered By: IBM

Easy Speech-to-Text with Python

This project explores multilingual automatic speech recognition (ASR) and the signal-processing architecture behind it, using Python. Today, ASR systems are available from multiple sources, including IBM Watson® Speech to Text and publicly available systems from OpenAI.

Guided Project

Artificial Intelligence

557 Enrolled
3.9
(35 Reviews)

Why should you do this guided project?

Suppose you are a podcast creator and want to transcribe your podcast so that it can be translated into multiple languages, or so that hearing-impaired people can read your content. You may also want to improve the discoverability of your podcast through search engine optimization (SEO): transcribing it lets search engines index the text, making the podcast easier to find.

The purpose of this guided project is to introduce you to automatic speech recognition (ASR) systems and to help you understand how the underlying signal processing works. The project also covers the architecture of the transformer model behind ASR and includes examples of how to recognize, transcribe, and translate audio and video files using a publicly available ASR tool.
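
As a rough illustration of the kind of signal processing an ASR front end performs, the sketch below cuts one short frame from an audio signal, applies a window, and computes its magnitude spectrum to find the dominant frequency. This is not taken from the project itself: the 8000 Hz sample rate, the 256-sample frame, the synthetic 500 Hz tone, and all function names are assumptions chosen for the example, and it uses a plain discrete Fourier transform rather than any particular library's implementation.

```python
import cmath
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this example)
FRAME_SIZE = 256    # samples per analysis frame (assumed)


def sine_wave(freq_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Generate a synthetic tone that stands in for real audio."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def hann_window(size):
    """Hann window: tapers the frame edges to reduce spectral leakage."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (size - 1))
            for n in range(size)]


def magnitude_spectrum(frame):
    """Magnitudes of the DFT bins from 0 Hz up to half the sample rate."""
    n = len(frame)
    windowed = [s * w for s, w in zip(frame, hann_window(n))]
    spectrum = []
    for k in range(n // 2 + 1):
        total = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed))
        spectrum.append(abs(total))
    return spectrum


def dominant_frequency(frame, sample_rate=SAMPLE_RATE):
    """Frequency (in Hz) of the strongest bin in the frame's spectrum."""
    spectrum = magnitude_spectrum(frame)
    peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
    return peak_bin * sample_rate / len(frame)


tone = sine_wave(500.0, FRAME_SIZE)
peak = dominant_frequency(tone)
print(f"Dominant frequency: {peak} Hz")
```

Repeating this over successive overlapping frames gives a spectrogram, which is the kind of time-frequency representation that speech models commonly take as input instead of the raw waveform.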


A look at the project ahead

After completing this project, you will be able to:
  • Understand how signal processing works.
  • Load an audio file and detect the spoken language.
  • Transcribe and translate an audio file or a YouTube video.
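
The last two outcomes above could look roughly like the sketch below. The page does not name the publicly available ASR tool, so this assumes the open-source `openai-whisper` package (which also needs `ffmpeg` installed) and its `base` model; the helper names and the `podcast.mp3` file mentioned afterwards are hypothetical. Downloading audio from YouTube is a separate step that is not shown.

```python
def most_likely_language(probs):
    """Pick the language code with the highest probability."""
    return max(probs, key=probs.get)


def detect_language(audio_path, model_name="base"):
    """Detect the spoken language from the first 30 seconds of audio."""
    # Assumption: the openai-whisper package is installed. It is imported
    # here so the helpers can be defined without it.
    import whisper

    model = whisper.load_model(model_name)
    audio = whisper.pad_or_trim(whisper.load_audio(audio_path))
    # The default mel settings match the "base" model assumed here.
    mel = whisper.log_mel_spectrogram(audio).to(model.device)
    _, probs = model.detect_language(mel)
    return most_likely_language(probs)


def transcribe(audio_path, model_name="base", translate=False):
    """Return (detected language, text) for an audio file.

    With translate=True, the text is translated into English instead of
    being transcribed in the spoken language.
    """
    import whisper  # assumption: openai-whisper, as above

    model = whisper.load_model(model_name)
    task = "translate" if translate else "transcribe"
    result = model.transcribe(audio_path, task=task)
    return result["language"], result["text"].strip()
```

For instance, `detect_language("podcast.mp3")` would return a language code such as `"en"`, and `transcribe("podcast.mp3", translate=True)` would return the detected language together with an English rendering of the speech. Larger models are generally more accurate but slower to run.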
 

Prerequisites 

You just need a web browser! Basic Python programming knowledge is recommended but not required.
Everything else is provided through the IBM Skills Network Labs environment, which gives you access to a Cloud IDE and Python runtimes. The environment comes with many tools pre-installed (e.g., Docker) to save you the hassle of setting everything up. Note that the platform works best with current versions of Chrome, Edge, Firefox, Internet Explorer, or Safari.

Estimated Effort

Add PyTorch

Level

Intermediate

Skills You Will Learn

Data Analysis, Data Science, Embeddable AI, Machine Learning, Python, PyTorch

Language

English

Course Code

GPXX0EPMEN
