From the course: OpenAI API and MCP Development
Audio API: Translate audio sample
Let's continue with the demonstration. This time we want to look at another endpoint, translations, which is also part of the speech-to-text Audio API. The supported languages are listed here in the documentation. To find out how to use it, the simplest route is the API reference, which shows how to create an API request using the translations endpoint.
Let's go back to the project. For your convenience, you also have access to another function there, speech to translation, at line 53. Inside, you can recognize the translations endpoint. Once the request completes successfully, you can read the translated text from the original audio.
So let's try that. Let's go back to the main.py file, which is where we're going to set it up. The function is already available in the scope of your project, so you just need to call it. Here we want the translated text, using speech to translation with the same temp file path. Once you upload the file, you can access it through the temporary file path.
Once that is done, we display the text once it is translated, just like we did for the transcription, under its own label so things look nice. I'm just going to change the color here: instead of blue, the label for reading the translated text will be in purple.
So let's try that. We run the app again with this command: streamlit run main.py. Let's run the same example with the German file, and then we can submit. It starts with the transcription. Once it is complete, we can read it below: "Unsere Erfahrung mit Connecticut war durchweg positiv." Okay, and we can also listen to it.
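To make the helper described above concrete, here is a minimal sketch of what a speech-to-translation function could look like with the official OpenAI Python SDK. The function name speech_to_translation, its parameters, and the whisper-1 model choice are assumptions for illustration; the course project's actual helper may differ. The client is passed in as an optional argument so the function can be tried with a stub instead of a live API key.

```python
from pathlib import Path


def speech_to_translation(audio_path, client=None, model="whisper-1"):
    """Translate the speech in an audio file into English text.

    `speech_to_translation`, `client`, and `model` are illustrative names,
    not necessarily the ones used in the course project.
    """
    if client is None:
        # Imported lazily so the function can also be called with a stub client.
        from openai import OpenAI

        client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # The translations endpoint takes an audio file and returns English text.
    with Path(audio_path).open("rb") as audio_file:
        result = client.audio.translations.create(model=model, file=audio_file)

    return result.text
```

In main.py, this would be called with the temporary file path of the uploaded audio, for example translated_text = speech_to_translation(temp_file_path), where temp_file_path is whatever variable holds that path in your app.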
"Unsere Erfahrung mit Connecticut war durchweg positiv." Okay, I can make out a little bit of German. Mine is very rusty, but I can tell that this matches the translation below.
What would also be nice is to be able to listen, not just read: once the audio is translated, we have the option to listen to it in English below. For that we first need to save the translation to a file, because we're going to need to access it, and that is something we haven't done yet. Here at line 62 you see that we have this save file call for the original text. Let's do the same: once the translation is complete, we save it in the translations folder. You'll find it here. So you have a folder for the transcriptions, one for the translations, and then one for temporary files.
This time we use the temporary files to do something special: we're going to go from text to voice, text to speech. Let me show you. Let's go back to the documentation. What we want to do is create speech, so we take text as the input. For the model we can use tts-1, which is short for text-to-speech. And you have access to different voices.
Let's go to the documentation to listen to one example. "The sun rises in the east and sets in the west. This simple fact has been observed by humans for thousands of years." This example uses the Alloy voice, which is generated with AI. It sounds very lifelike, very human-like; you could almost take it for a real person. And you have other voices to choose from: Ash, which sounds like a male voice, or Nova, which sounds like a female voice.
So depending on your preference, you can choose between these voices. Let's see how to set it up. It is already provided here as a helper function, so let's go check it out. Here you have text to speech. I'm using the Alloy voice and the tts-1 model. Let's use it, because what I'd like is to be able to not only read but also listen to the translated text.
Once that is done, we add the audio here, which is another option in Streamlit. For any audio, if we go back to the utils, it's accessible through a file path, so let's use that same file path. The one up here is already done: for the original audio, we're using the temp file path, the same one we use once we upload through the file uploader. So we're always using a temporary file path.
Okay, so let's try that. Let's run another example, and this time I'd like to listen to the French testimonial, so the original audio is in French. Here we have the transcription, so we can read the first few lines: "Notre expérience avec le Connecticut n'a été que positive," and we can also listen to it. "Notre expérience avec le Connecticut n'a été que positive. Nous nous sommes lancés dans le projet avec très peu de connaissances sur l'énergie solaire, et nous n'étions même..."
All right, so we can read a few sentences, and right below you have the translation. I can tell that it corresponds very closely to the original text. We can read below: "Our experience with Connecticut was only positive. We started the project with very little knowledge." So this is correct. And now we can listen: "Our experience with Connecticut was only positive. We started the project with very little knowledge of solar energy. And we were not even sure..." Perfect. So this is working just fine.
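The text-to-speech helper discussed above could be sketched roughly as follows, using the speech endpoint of the OpenAI Python SDK with the tts-1 model and the alloy voice. The function name, its parameters, and the output path handling are assumptions for illustration, not the project's actual code. As before, the client is injectable so the function can be exercised without network access.

```python
from pathlib import Path


def text_to_speech(text, out_path, client=None, model="tts-1", voice="alloy"):
    """Synthesize `text` as spoken audio and save it to `out_path`.

    The function name and parameters are illustrative assumptions; only the
    model ("tts-1") and voice ("alloy") come from the walkthrough.
    """
    if client is None:
        # Imported lazily so the function can also be called with a stub client.
        from openai import OpenAI

        client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # The speech endpoint returns binary audio (MP3 by default).
    response = client.audio.speech.create(model=model, voice=voice, input=text)

    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_bytes(response.read())
    return out_path
```

In the Streamlit app, the returned path can then be handed to the audio element, for example st.audio(str(speech_path)), so the translated text can be played back as well as read.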
So let me just adjust a few things to make it look nicer, similar to what we have up here. The translation goes first here, then the audio feature, and then the text we can read. So from the audio to the text.
Excellent. So now we have a fully functional app, up and running, that lets us convert any audio sample into written text. On top of that, the original audio doesn't have to be in English. We can use any supported language, so even if the audio is originally in French, German, or Spanish, we can use these features to transcribe it and also translate it into English.
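As a recap of the flow built in this demonstration, here is a small sketch of how the steps could be chained: transcribe, translate, save both texts, then turn the translation back into speech. Everything here is hypothetical scaffolding: process_audio, the folder names, and the three callables (transcribe, translate, speak) stand in for the project's own helpers, whose exact names and signatures are not shown in the transcript.

```python
from pathlib import Path


def process_audio(audio_path, transcribe, translate, speak, out_dir="."):
    """Run transcription, translation, and text-to-speech for one audio file.

    `transcribe` and `translate` take an audio path and return text;
    `speak` takes (text, output_path) and returns the path of the audio it wrote.
    All names and folder names here are illustrative assumptions.
    """
    out_dir = Path(out_dir)
    stem = Path(audio_path).stem

    original_text = transcribe(audio_path)
    translated_text = translate(audio_path)

    # Save each text in its own folder (folder names are assumed).
    for folder, text in (
        ("transcriptions", original_text),
        ("translations", translated_text),
    ):
        target = out_dir / folder
        target.mkdir(parents=True, exist_ok=True)
        (target / f"{stem}.txt").write_text(text, encoding="utf-8")

    # Turn the English translation into audio in a temporary-files folder.
    temp_dir = out_dir / "temp_files"
    temp_dir.mkdir(parents=True, exist_ok=True)
    speech_path = speak(translated_text, temp_dir / f"{stem}_translated.mp3")

    return {
        "transcription": original_text,
        "translation": translated_text,
        "speech_path": Path(speech_path),
    }
```

Passing the helpers in as arguments keeps the sketch independent of any particular API client; in the app, the Streamlit code would display the returned texts and play the returned audio path.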