Commit 4ac442e

samdenty, haydenbleasel, and lgrammel authored
feat(providers/revai): add transcribe (#5730) (#5807)
Co-authored-by: Hayden Bleasel <hello@haydenbleasel.com> Co-authored-by: Lars Grammel <lars.grammel@gmail.com>
1 parent 6d527aa commit 4ac442e

29 files changed (+1708 -346 lines changed)

‎.changeset/fair-cups-travel.md

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
---
'@ai-sdk/revai': patch
---

feat(providers/revai): add transcribe

‎content/docs/02-foundations/02-providers-and-models.mdx

Lines changed: 1 addition & 0 deletions
@@ -41,6 +41,7 @@ The AI SDK comes with a wide range of providers that you can use to interact wit
 - [Groq Provider](/providers/ai-sdk-providers/groq) (`@ai-sdk/groq`)
 - [Perplexity Provider](/providers/ai-sdk-providers/perplexity) (`@ai-sdk/perplexity`)
 - [ElevenLabs Provider](/providers/ai-sdk-providers/elevenlabs) (`@ai-sdk/elevenlabs`)
+- [Rev.ai Provider](/providers/ai-sdk-providers/revai) (`@ai-sdk/revai`)

 You can also use the [OpenAI Compatible provider](/providers/openai-compatible-providers) with OpenAI-compatible APIs:

‎content/docs/03-ai-sdk-core/36-transcription.mdx

Lines changed: 3 additions & 0 deletions
@@ -157,5 +157,8 @@ try {
 | [Azure OpenAI](/providers/ai-sdk-providers/azure#transcription-models) | `whisper-1` |
 | [Azure OpenAI](/providers/ai-sdk-providers/azure#transcription-models) | `gpt-4o-transcribe` |
 | [Azure OpenAI](/providers/ai-sdk-providers/azure#transcription-models) | `gpt-4o-mini-transcribe` |
+| [Rev.ai](/providers/ai-sdk-providers/revai#transcription-models) | `machine` |
+| [Rev.ai](/providers/ai-sdk-providers/revai#transcription-models) | `low_cost` |
+| [Rev.ai](/providers/ai-sdk-providers/revai#transcription-models) | `fusion` |

 Above are a small subset of the transcription models supported by the AI SDK providers. For more, see the respective provider documentation.
Lines changed: 202 additions & 0 deletions
@@ -0,0 +1,202 @@
---
title: Rev.ai
description: Learn how to use the Rev.ai provider for the AI SDK.
---

# Rev.ai Provider

The [Rev.ai](https://www.rev.ai/) provider contains transcription model support for the Rev.ai transcription API.

## Setup

The Rev.ai provider is available in the `@ai-sdk/revai` module. You can install it with:

<Tabs items={['pnpm', 'npm', 'yarn']}>
  <Tab>
    <Snippet text="pnpm add @ai-sdk/revai" dark />
  </Tab>
  <Tab>
    <Snippet text="npm install @ai-sdk/revai" dark />
  </Tab>
  <Tab>
    <Snippet text="yarn add @ai-sdk/revai" dark />
  </Tab>
</Tabs>

## Provider Instance

You can import the default provider instance `revai` from `@ai-sdk/revai`:

```ts
import { revai } from '@ai-sdk/revai';
```

If you need a customized setup, you can import `createRevai` from `@ai-sdk/revai` and create a provider instance with your settings:

```ts
import { createRevai } from '@ai-sdk/revai';

const revai = createRevai({
  // custom settings, e.g.
  fetch: customFetch,
});
```

You can use the following optional settings to customize the Rev.ai provider instance:

- **apiKey** _string_

  API key that is being sent using the `Authorization` header.
  It defaults to the `REVAI_API_KEY` environment variable.

- **headers** _Record&lt;string,string&gt;_

  Custom headers to include in the requests.

- **fetch** _(input: RequestInfo, init?: RequestInit) => Promise&lt;Response&gt;_

  Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation.
  Defaults to the global `fetch` function.
  You can use it as a middleware to intercept requests,
  or to provide a custom fetch implementation for e.g. testing.

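The `fetch` setting described above can act as request middleware. The following is a minimal sketch of that idea, assuming a runtime with the standard `URL`, `Request`, and `Response` globals (e.g. Node 18+). `FetchLike` and `withRequestLog` are hypothetical names used for illustration and are not part of `@ai-sdk/revai`.

```typescript
// Hypothetical helper (not part of @ai-sdk/revai): wraps a fetch-compatible
// function and records "METHOD URL" for each request before forwarding it.
type FetchLike = (
  input: string | URL | Request,
  init?: RequestInit,
) => Promise<Response>;

function withRequestLog(baseFetch: FetchLike, log: string[]): FetchLike {
  return (input, init) => {
    // Resolve the target URL from any of the three accepted input shapes.
    const url =
      typeof input === 'string'
        ? input
        : input instanceof URL
          ? input.href
          : input.url;
    // Simplification: the method is read from `init` only and falls back to GET.
    log.push(`${init?.method ?? 'GET'} ${url}`);
    return baseFetch(input, init);
  };
}
```

A wrapper like this could presumably be supplied as `createRevai({ fetch: withRequestLog(fetch, log) })`; in tests, a stub can be passed in place of the global `fetch`.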
## Transcription Models

You can create models that call the [Rev.ai transcription API](https://www.rev.ai/docs/api/transcription)
using the `.transcription()` factory method.

The first argument is the model id, e.g. `machine`.

```ts
const model = revai.transcription('machine');
```

You can also pass additional provider-specific options using the `providerOptions` argument. For example, supplying the input language in ISO 639-1 format (e.g. `en`), if known beforehand, can sometimes improve transcription performance.

```ts highlight="6"
import { experimental_transcribe as transcribe } from 'ai';
import { revai } from '@ai-sdk/revai';
import { readFile } from 'fs/promises';

const result = await transcribe({
  model: revai.transcription('machine'),
  audio: await readFile('audio.mp3'),
  providerOptions: { revai: { language: 'en' } },
});
```

The following provider options are available:

- **metadata** _string_

  Optional metadata string to associate with the job.

- **notification_config** _object_

  Optional configuration for a callback url to invoke when processing is complete.

  - **url** _string_ - Callback url to invoke when processing is complete.
  - **auth_headers** _object_ - Optional authorization headers, if needed to invoke the callback.

- **delete_after_seconds** _integer_

  Amount of time after job completion when job is auto-deleted.

- **verbatim** _boolean_

  Configures the transcriber to transcribe every syllable, including all false starts and disfluencies.

- **rush** _boolean_

  [HIPAA Unsupported] Only available for the human transcriber option. When set to true, your job is given higher priority.

- **skip_diarization** _boolean_

  Specify if speaker diarization will be skipped by the speech engine.

- **skip_postprocessing** _boolean_

  Only available for English and Spanish languages. User-supplied preference on whether to skip post-processing operations.

- **skip_punctuation** _boolean_

  Specify if "punct" type elements will be skipped by the speech engine.

- **remove_disfluencies** _boolean_

  When set to true, disfluencies (like 'ums' and 'uhs') will not appear in the transcript.

- **remove_atmospherics** _boolean_

  When set to true, atmospherics (like `<laugh>`, `<affirmative>`) will not appear in the transcript.

- **filter_profanity** _boolean_

  When enabled, profanities will be filtered by replacing characters with asterisks except for the first and last.

- **speaker_channels_count** _integer_

  Only available for English, Spanish and French languages. Specify the total number of unique speaker channels in the audio.

- **speakers_count** _integer_

  Only available for English, Spanish and French languages. Specify the total number of unique speakers in the audio.

- **diarization_type** _string_

  Specify diarization type. Possible values: "standard" (default), "premium".

- **custom_vocabulary_id** _string_

  Supply the id of a pre-completed custom vocabulary submitted through the Custom Vocabularies API.

- **custom_vocabularies** _Array_

  Specify a collection of custom vocabulary to be used for this job.

- **strict_custom_vocabulary** _boolean_

  If true, only exact phrases will be used as custom vocabulary.

- **summarization_config** _object_

  Specify summarization options.

  - **model** _string_ - Model type for summarization. Possible values: "standard" (default), "premium".
  - **type** _string_ - Summarization formatting type. Possible values: "paragraph" (default), "bullets".
  - **prompt** _string_ - Custom prompt for flexible summaries (mutually exclusive with type).

- **translation_config** _object_

  Specify translation options.

  - **target_languages** _Array_ - Array of target languages for translation.
  - **model** _string_ - Model type for translation. Possible values: "standard" (default), "premium".

- **language** _string_

  Language is provided as an ISO 639-1 language code. Default is "en".

- **forced_alignment** _boolean_

  When enabled, provides improved accuracy for per-word timestamps for a transcript.
  Default is `false`.

  Currently supported languages:

  - English (en, en-us, en-gb)
  - French (fr)
  - Italian (it)
  - German (de)
  - Spanish (es)

  Note: This option is not available in the low-cost environment.

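Two constraints in the option list above lend themselves to a quick check before submitting a job: `prompt` and `type` are mutually exclusive in `summarization_config`, and `forced_alignment` is limited to the listed languages. The sketch below illustrates such a check. `RevaiOptionsSketch` and `checkOptions` are hypothetical names covering only a subset of the options; they are not types or functions exported by `@ai-sdk/revai`, and whether the provider performs similar validation itself is not stated here.

```typescript
// Hypothetical subset of the provider options listed above (illustration only).
interface RevaiOptionsSketch {
  language?: string;
  forced_alignment?: boolean;
  summarization_config?: {
    model?: 'standard' | 'premium';
    type?: 'paragraph' | 'bullets';
    prompt?: string;
  };
}

// Language codes that the list above names as supported for forced alignment.
const FORCED_ALIGNMENT_LANGUAGES = new Set([
  'en', 'en-us', 'en-gb', 'fr', 'it', 'de', 'es',
]);

// Returns a list of human-readable problems; an empty list means no issue found.
function checkOptions(options: RevaiOptionsSketch): string[] {
  const problems: string[] = [];

  const summary = options.summarization_config;
  if (summary?.prompt !== undefined && summary.type !== undefined) {
    problems.push('summarization_config: prompt and type are mutually exclusive');
  }

  if (options.forced_alignment) {
    // The documented default language is "en".
    const language = (options.language ?? 'en').toLowerCase();
    if (!FORCED_ALIGNMENT_LANGUAGES.has(language)) {
      problems.push(`forced_alignment: language "${language}" is not in the supported list`);
    }
  }

  return problems;
}
```

If the returned list is empty, the same object could then be passed as `providerOptions: { revai: options }`.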
### Model Capabilities

| Model      | Transcription       | Duration            | Segments            | Language            |
| ---------- | ------------------- | ------------------- | ------------------- | ------------------- |
| `machine`  | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `human`    | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `low_cost` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `fusion`   | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |

‎examples/ai-core/package.json

Lines changed: 1 addition & 0 deletions
@@ -23,6 +23,7 @@
     "@ai-sdk/perplexity": "2.0.0-canary.8",
     "@ai-sdk/provider": "2.0.0-canary.7",
     "@ai-sdk/replicate": "1.0.0-canary.8",
+    "@ai-sdk/revai": "1.0.0-canary.0",
     "@ai-sdk/togetherai": "1.0.0-canary.8",
     "@ai-sdk/xai": "2.0.0-canary.8",
     "@ai-sdk/valibot": "1.0.0-canary.9",
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
import { revai } from '@ai-sdk/revai';
import { experimental_transcribe as transcribe } from 'ai';
import 'dotenv/config';
import { readFile } from 'fs/promises';

async function main() {
  const result = await transcribe({
    model: revai.transcription('machine'),
    audio: Buffer.from(await readFile('./data/galileo.mp3')).toString('base64'),
  });

  console.log('Text:', result.text);
  console.log('Duration:', result.durationInSeconds);
  console.log('Language:', result.language);
  console.log('Segments:', result.segments);
  console.log('Warnings:', result.warnings);
  console.log('Responses:', result.responses);
  console.log('Provider Metadata:', result.providerMetadata);
}

main().catch(console.error);
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
import { revai } from '@ai-sdk/revai';
import { experimental_transcribe as transcribe } from 'ai';
import 'dotenv/config';

async function main() {
  const result = await transcribe({
    model: revai.transcription('machine'),
    audio: new URL(
      'https://github.com/vercel/ai/raw/refs/heads/main/examples/ai-core/data/galileo.mp3',
    ),
  });

  console.log('Text:', result.text);
  console.log('Duration:', result.durationInSeconds);
  console.log('Language:', result.language);
  console.log('Segments:', result.segments);
  console.log('Warnings:', result.warnings);
  console.log('Responses:', result.responses);
  console.log('Provider Metadata:', result.providerMetadata);
}

main().catch(console.error);
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
import { revai } from '@ai-sdk/revai';
import { experimental_transcribe as transcribe } from 'ai';
import 'dotenv/config';
import { readFile } from 'fs/promises';

async function main() {
  const result = await transcribe({
    model: revai.transcription('machine'),
    audio: await readFile('data/galileo.mp3'),
  });

  console.log('Text:', result.text);
  console.log('Duration:', result.durationInSeconds);
  console.log('Language:', result.language);
  console.log('Segments:', result.segments);
  console.log('Warnings:', result.warnings);
  console.log('Responses:', result.responses);
  console.log('Provider Metadata:', result.providerMetadata);
}

main().catch(console.error);
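
The example scripts above log `result.segments` as raw objects. As a rough sketch of how such output might be made readable, the helper below renders segments as timestamped lines. It assumes each segment carries `text`, `startSecond`, and `endSecond` fields; that shape is an assumption here and is not shown in this diff. `SegmentSketch`, `formatTimestamp`, and `formatSegments` are hypothetical names.

```typescript
// Assumed segment shape (not confirmed by this diff).
interface SegmentSketch {
  text: string;
  startSecond: number;
  endSecond: number;
}

// Formats a number of seconds as zero-padded "mm:ss", dropping fractions.
function formatTimestamp(totalSeconds: number): string {
  const whole = Math.max(0, Math.floor(totalSeconds));
  const minutes = Math.floor(whole / 60);
  const seconds = whole % 60;
  return `${String(minutes).padStart(2, '0')}:${String(seconds).padStart(2, '0')}`;
}

// Produces one "[start - end] text" line per segment.
function formatSegments(segments: SegmentSketch[]): string[] {
  return segments.map(
    segment =>
      `[${formatTimestamp(segment.startSecond)} - ${formatTimestamp(segment.endSecond)}] ${segment.text.trim()}`,
  );
}
```

In the scripts above, this could replace the raw log with something like `formatSegments(result.segments).forEach(line => console.log(line))`, provided the segment shape matches the assumption.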

‎examples/ai-core/tsconfig.json

Lines changed: 6 additions & 0 deletions
@@ -37,9 +37,15 @@
     {
       "path": "../../packages/elevenlabs"
     },
+    {
+      "path": "../../packages/revai"
+    },
     {
       "path": "../../packages/cohere"
     },
+    {
+      "path": "../../packages/revai"
+    },
     {
       "path": "../../packages/deepinfra"
     },

‎packages/revai/README.md

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
# AI SDK - Rev.ai Provider

The **[Rev.ai provider](https://sdk.vercel.ai/providers/ai-sdk-providers/revai)** for the [AI SDK](https://sdk.vercel.ai/docs)
contains transcription model support for the Rev.ai transcription API.

## Setup

The Rev.ai provider is available in the `@ai-sdk/revai` module. You can install it with:

```bash
npm i @ai-sdk/revai
```

## Provider Instance

You can import the default provider instance `revai` from `@ai-sdk/revai`:

```ts
import { revai } from '@ai-sdk/revai';
```

## Example

```ts
import { revai } from '@ai-sdk/revai';
import { experimental_transcribe as transcribe } from 'ai';

const { text } = await transcribe({
  model: revai.transcription('machine'),
  audio: new URL(
    'https://github.com/vercel/ai/raw/refs/heads/main/examples/ai-core/data/galileo.mp3',
  ),
});
```

## Documentation

Please check out the **[Rev.ai provider documentation](https://sdk.vercel.ai/providers/ai-sdk-providers/revai)** for more information.
