Build voice-enabled AI assistants using Azure OpenAI's Realtime API. Create a multi-agent system for customer service applications.
| Module | Focus | Documentation |
|---|---|---|
| 1. Function Calling | Azure OpenAI Realtime API Integration | Guide |
| 2. Multi-Agent System | Customer Service Implementation | Guide |
| 3. Voice RAG | Voice-Optimized Document Retrieval | Guide |
- Install and run Docker
- Open the devcontainer in VS Code
- Once the devcontainer has finished setting up, run `pip install -r requirements.txt`
- Set up an Azure AI Foundry Hub + Project using the Azure Portal | Guide
- Create a deployment of either `gpt-4o-realtime-preview` or `gpt-4o-mini-realtime-preview` | Guide
- Update the `.env` file with the API key of the `gpt-4o-realtime-preview` (audio) model. You can find the key in the Azure AI Foundry portal under the deployment
- Make a copy of the `.env` file and place it in the respective module folder
- Move to the respective module folders to run and work on the workshop modules
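Once the `.env` values are in place, the realtime deployment is reached over a WebSocket. As a minimal sketch (the endpoint and deployment names below are placeholders; your `.env` supplies the real values), the connection URL for an Azure OpenAI realtime deployment follows the documented pattern `wss://<resource>.openai.azure.com/openai/realtime?...`:

```python
def realtime_ws_url(endpoint: str, deployment: str,
                    api_version: str = "2024-10-01-preview") -> str:
    """Build the WebSocket URL for an Azure OpenAI Realtime API deployment.

    `endpoint` is the resource endpoint from your .env (https://...),
    `deployment` is the name you gave the realtime model deployment.
    """
    # Strip the scheme and any trailing slash, then switch to wss://
    host = endpoint.replace("https://", "").rstrip("/")
    return (f"wss://{host}/openai/realtime"
            f"?api-version={api_version}&deployment={deployment}")

# Hypothetical values for illustration only:
url = realtime_ws_url("https://my-resource.openai.azure.com",
                      "gpt-4o-realtime-preview")
```

Pass the API key from your `.env` (e.g. in an `api-key` header) when opening the socket with your WebSocket client of choice.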
- Navigate to Azure and open your Azure AI Foundry resource
- Go to "Models + endpoints"
- Choose "Service Endpoints"
- Pick your AI resource, in this scenario: `Azure AI Speech`
- Copy the `Resource endpoint` and `Primary Key`
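The copied key is what authenticates REST calls against the Azure AI Speech resource; it goes in the `Ocp-Apim-Subscription-Key` header. A minimal sketch (the helper name is ours, not part of any SDK):

```python
def speech_headers(primary_key: str) -> dict:
    """Request headers for Azure AI Speech REST calls.

    Azure AI Speech authenticates with the resource key passed
    in the Ocp-Apim-Subscription-Key header.
    """
    return {
        "Ocp-Apim-Subscription-Key": primary_key,
        "Content-Type": "application/json",
    }

# Use with the Resource endpoint you copied, e.g.:
# requests.post(f"{resource_endpoint}/...", headers=speech_headers(key), ...)
headers = speech_headers("<your-primary-key>")
```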
Real-time Communication Fundamentals | Guide
The following SDKs and libraries can be used to integrate with the `gpt-4o-realtime-preview` API on Azure.
| SDK/Library | Description |
|---|---|
| `openai-python` | The official Python library for the (Azure) OpenAI API |
| `openai-dotnet` | The official .NET library for the (Azure) OpenAI API |
| `openai-java` | The official Java library for the (Azure) OpenAI API |
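Whichever SDK you choose, the realtime session is driven by JSON events over the WebSocket. For example, a client configures the session by sending a `session.update` event; the sketch below builds one (the instructions and voice values are placeholders):

```python
import json

def session_update(instructions: str, voice: str = "alloy") -> str:
    """Serialize a `session.update` client event for the Realtime API.

    The event type and field names follow the Realtime API event
    reference; the payload values here are illustrative only.
    """
    event = {
        "type": "session.update",
        "session": {
            "instructions": instructions,
            "voice": voice,
            # Let the server detect end of speech (voice activity detection)
            "turn_detection": {"type": "server_vad"},
        },
    }
    return json.dumps(event)

msg = session_update("You are a helpful customer service agent.")
```

The same event shape is what the workshop modules send when they register functions for function calling or tune turn detection.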
| Accelerator | Description |
|---|---|
| VoiceRAG (aisearch-openai-rag-audio) | A simple example implementation of the VoiceRAG pattern to power interactive voice generative AI experiences using RAG with Azure AI Search and Azure OpenAI's gpt-4o-realtime-preview model. |
| On The Road CoPilot | A minimal speech-to-structured output app built with Azure OpenAI Realtime API. |
Contributions welcome via pull requests.