Text-to-Speech n8n node powered by sherpa-onnx and the Kokoro TTS model.
🎯 Pure JavaScript/WebAssembly - No native binary dependencies!
⚡ High Performance - Singleton TTS instance for fast subsequent calls
🔒 Offline - Runs completely locally, no API calls needed
🐳 Docker Ready - Works in containerized n8n environments
- High-Quality Neural TTS using Kokoro-82M model
- Multiple Voices - 10+ speaker voices available
- Adjustable Speed - 0.5x to 2.0x speech speed control
- Multiple Output Formats - WAV or Raw PCM
- Binary Output - Audio data as n8n binary property
- Works in Docker - Pure WASM, no native dependencies
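
The singleton behavior mentioned above can be pictured with a minimal sketch. This is not the package's actual code: `getTts` and the `createTts` loader passed to it are hypothetical names that stand in for whatever initializes the sherpa-onnx WASM runtime and loads the Kokoro model.

```javascript
// Hypothetical sketch of a lazily created, shared TTS instance.
// `createTts` is an assumed async loader; it is not a real sherpa-onnx API.
let ttsPromise = null;

function getTts(createTts) {
  // Cache the promise rather than the resolved value, so concurrent
  // callers share one in-flight initialization instead of starting several.
  if (ttsPromise === null) {
    ttsPromise = createTts().catch((err) => {
      ttsPromise = null; // allow a retry after a failed load
      throw err;
    });
  }
  return ttsPromise;
}
```

With this shape, only the first call pays the model-loading cost; later calls in the same process reuse the already initialized instance.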
- Go to Settings > Community Nodes
- Enter `n8n-nodes-ttsbro`
- Click Install
| Tutorial | Link |
|---|---|
| How to Install TTS Bro Node in n8n | ![]() |
| How to Use the TTS Bro Node in n8n | ![]() |
| How to Build a Self-Hosted TTS API | ![]() |
Use the provided Dockerfile and docker-compose.yml:
# Download Kokoro model (304MB)
npm run download-model
# Build and run
docker-compose up --build

# Install the node
cd ~/.n8n/custom
npm install n8n-nodes-ttsbro
# Download the model
cd node_modules/n8n-nodes-ttsbro
npm run download-model
# Restart n8n

- Add the TTS Bro node to your workflow
- Configure:
  - Text: The text to convert to speech
  - Voice: Select a speaker voice (0-9)
  - Speed: Speech speed (default: 1.0)
  - Output Format: WAV or Raw PCM
  - Binary Property: Name for the output (default: "audio")
- Output is a binary audio file that can be:
  - Saved to disk
  - Uploaded to cloud storage
  - Sent via messaging apps
  - Played in browsers
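
To hand the audio to another system, a downstream step usually needs the raw bytes. The sketch below assumes n8n's default in-memory binary mode, where a binary property carries its payload as a base64 string in a `data` field; `binaryToBuffer` is a hypothetical helper, and the property name `audio` is the node's documented default.

```javascript
// Hypothetical helper: pull the audio bytes out of an n8n item.
// Assumes the binary entry stores base64 text in `data` (in-memory mode).
function binaryToBuffer(item, property = 'audio') {
  const entry = item.binary && item.binary[property];
  if (!entry) {
    throw new Error(`No binary property "${property}" on item`);
  }
  // Buffer is a Node.js global, so no import is needed here.
  return Buffer.from(entry.data, 'base64');
}
```

If n8n is configured to keep binary data on the filesystem instead, the `data` field may hold a reference rather than the payload, and this helper would not apply as written.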
| Property | Type | Default | Description |
|---|---|---|---|
| Text | string | - | Text to synthesize (required) |
| Voice | options | Voice 0 | Speaker voice selection |
| Speed | number | 1.0 | Speech speed (0.5-2.0) |
| Format | options | WAV | Output format (WAV/Raw PCM) |
| Binary Property | string | audio | Output property name |
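
As an illustration of how the properties in this table fit together, here is a minimal validation sketch. The defaults and the 0.5-2.0 speed range come from the table; the voice range 0-9 comes from the usage notes. Everything else is an assumption: `normalizeParams` is a hypothetical helper, `'pcm'` is a guessed identifier for the Raw PCM option, and clamping out-of-range speeds (rather than rejecting them) is a design choice made for this example only.

```javascript
// Limits taken from the documentation above; behavior is illustrative only.
const SPEED_MIN = 0.5;
const SPEED_MAX = 2.0;
const VOICE_MIN = 0;
const VOICE_MAX = 9;

function normalizeParams(input) {
  const {
    text,
    voice = 0,
    speed = 1.0,
    format = 'wav', // 'pcm' is an assumed identifier for Raw PCM
    binaryProperty = 'audio',
  } = input;

  if (typeof text !== 'string' || text.trim() === '') {
    throw new Error('Text is required');
  }
  const voiceId = Number(voice);
  if (!Number.isInteger(voiceId) || voiceId < VOICE_MIN || voiceId > VOICE_MAX) {
    throw new RangeError(`Voice must be an integer from ${VOICE_MIN} to ${VOICE_MAX}`);
  }
  const requestedSpeed = Number(speed);
  if (!Number.isFinite(requestedSpeed)) {
    throw new TypeError('Speed must be a number');
  }
  // Clamp instead of failing, so a slightly out-of-range value still works.
  const clampedSpeed = Math.min(SPEED_MAX, Math.max(SPEED_MIN, requestedSpeed));
  if (format !== 'wav' && format !== 'pcm') {
    throw new Error('Format must be "wav" or "pcm"');
  }
  return { text, voice: voiceId, speed: clampedSpeed, format, binaryProperty };
}
```

Calling `normalizeParams({ text: 'Hello world!' })` fills in every default from the table, while a speed of `3` is reduced to the documented maximum of `2.0`.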
{
"json": {
"text": "Hello world!",
"voice": 0,
"speed": 1.0,
"format": "wav",
"sampleRate": 24000,
"duration": 1.23,
"byteLength": 54382
},
"binary": {
"audio": { ... } // Binary audio data
}
}

- TTS Engine: sherpa-onnx via WebAssembly
- Model: Kokoro-82M (English, multi-voice)
- Sample Rate: 24000 Hz
- Bit Depth: 16-bit
- Channels: Mono
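
The audio parameters above are enough to relate Raw PCM output to WAV output. The following sketch is not taken from the package; `pcmToWav` and `pcmDurationSeconds` are hypothetical helpers. It wraps 16-bit little-endian mono PCM in a standard 44-byte RIFF/WAVE header and derives the duration from the byte count. Converting the engine's samples to 16-bit integers is assumed to have happened already.

```javascript
// Audio parameters from the technical details listed above.
const SAMPLE_RATE = 24000;
const BITS_PER_SAMPLE = 16;
const CHANNELS = 1;
const BLOCK_ALIGN = (CHANNELS * BITS_PER_SAMPLE) / 8; // bytes per sample frame

// Wrap raw PCM bytes (Uint8Array, 16-bit little-endian) in a WAV container.
function pcmToWav(pcm) {
  const header = new ArrayBuffer(44);
  const view = new DataView(header);
  const writeAscii = (offset, text) => {
    for (let i = 0; i < text.length; i += 1) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, 'RIFF');
  view.setUint32(4, 36 + pcm.byteLength, true); // file size minus 8 bytes
  writeAscii(8, 'WAVE');
  writeAscii(12, 'fmt ');
  view.setUint32(16, 16, true); // size of the fmt chunk
  view.setUint16(20, 1, true); // audio format 1 = linear PCM
  view.setUint16(22, CHANNELS, true);
  view.setUint32(24, SAMPLE_RATE, true);
  view.setUint32(28, SAMPLE_RATE * BLOCK_ALIGN, true); // byte rate
  view.setUint16(32, BLOCK_ALIGN, true);
  view.setUint16(34, BITS_PER_SAMPLE, true);
  writeAscii(36, 'data');
  view.setUint32(40, pcm.byteLength, true);

  const wav = new Uint8Array(44 + pcm.byteLength);
  wav.set(new Uint8Array(header), 0);
  wav.set(pcm, 44);
  return wav;
}

// Duration in seconds of a raw PCM payload with the parameters above.
function pcmDurationSeconds(pcmByteLength) {
  return pcmByteLength / (SAMPLE_RATE * BLOCK_ALIGN);
}
```

At these settings one second of audio occupies 48,000 bytes of PCM data, so a WAV file is always 44 bytes larger than the equivalent Raw PCM output.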
The Kokoro model is:
- 82 million parameters - Compact yet high quality
- Apache 2.0 licensed - Free for commercial use
- Multi-voice - Multiple speaker styles
- English focused - Optimized for English text
- Node.js >= 18
- n8n >= 1.0.0
- ~150MB disk space for model files
# Clone and install
git clone https://github.com/your-username/n8n-nodes-ttsbro.git
cd n8n-nodes-ttsbro
npm install
# Download model
npm run download-model
# Build
npm run build
# Run with n8n
npm run start

This project is licensed under the Apache License 2.0.

Important: This distribution includes components and models with different licenses. Please see the NOTICE file for full third-party attribution and license details.
| Component | License | Notes |
|---|---|---|
| sherpa-onnx | Apache 2.0 | TTS inference engine |
| Kokoro-82M | Apache 2.0 | TTS Model weights |
| ONNX Runtime | MIT | Neural network inference runtime |
| eSpeak NG | GPL v3 | Data/Phonemes used by the model |
Note on GPL Compatibility: The Kokoro model utilizes data derived from eSpeak NG (GPL v3). If you modify and redistribute the model files or this package, you must comply with the terms of the GPL v3 where applicable.
- sherpa-onnx - For the amazing WebAssembly TTS engine.
- hexgrad - For training and releasing the Kokoro model.
- n8n - For the workflow automation platform.


