Built a hands-free voice panel for the Copilot CLI (canvas) — feedback on mic/Web Speech support #200628
Replies: 1 comment
💬 Your Product Feedback Has Been Submitted 🎉 Thank you for taking the time to share your insights with us! Your feedback is invaluable as we build a better GitHub experience for all our users. Here's what you can expect moving forward ⏩
Where to look to see what's shipping 👀
What you can do in the meantime 💻
As a member of the GitHub community, your participation is essential. While we can't promise that every suggestion will be implemented, we want to emphasize that your feedback is instrumental in guiding our decisions and priorities. Thank you once again for your contribution to making GitHub even better! We're grateful for your ongoing support and collaboration in shaping the future of our platform. ⭐
🏷️ Discussion Type
Product Feedback
💬 Feature/Topic Area
Copilot CLI
Body
I built an open-source extension called Vox on the Copilot CLI + canvas surface (createCanvas / joinSession), and wanted to share it here as a bit of a show-and-tell plus some genuine product feedback about building voice UIs on this surface.
What it does: run /vox in a Copilot CLI session and a reactive orb opens in its own window. You speak your turn, the active session hears it, and the reply is read back to you. Voice in, voice out. It also renders inside the Copilot app's canvas panel. Extras: barge-in (Esc / tap the orb to interrupt), live captions, it reads your typed replies aloud too, and it routes across multiple live sessions.
The interesting constraint (and the feedback): the browser Web Speech APIs (SpeechRecognition / speechSynthesis) are the zero-dependency way to get speech-to-text + text-to-speech, but they don't work in Electron or most native webviews, which is where a canvas panel effectively runs. My workaround was to not use the webview for audio: I launch an installed Chromium in app mode (--app=, Chrome then Edge fallback) under a dedicated --user-data-dir profile that remembers the mic permission. Real Chrome under the hood → the speech APIs just work, and the whole thing stays pure JS with no build step and a one-line install on Win/macOS/Linux.
Things I'd love guidance or discussion on:
Repo (MIT): https://github.com/aasis21/vox
Demo/site: https://aasis21.github.io/vox/
I'm the author — happy to share implementation details, and very open to feedback on the approach.
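For readers who want a concrete picture of the launch workaround described above, here is a minimal Node.js sketch. It is not taken from the Vox source: the function names, the candidate install paths, and the example profile directory are all assumptions for illustration. Only the --app= and --user-data-dir flags and the Chrome-then-Edge order come from the post.

```javascript
// Illustrative sketch only; names and paths are assumptions, not Vox's code.
const { spawn } = require("node:child_process");
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical default install locations, Chrome first and Edge as fallback.
function browserCandidates(platform = process.platform) {
  if (platform === "win32") {
    const pf = process.env["ProgramFiles"] || "C:\\Program Files";
    const pf86 = process.env["ProgramFiles(x86)"] || "C:\\Program Files (x86)";
    return [
      path.win32.join(pf, "Google", "Chrome", "Application", "chrome.exe"),
      path.win32.join(pf86, "Microsoft", "Edge", "Application", "msedge.exe"),
    ];
  }
  if (platform === "darwin") {
    return [
      "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
      "/Applications/Microsoft Edge.app/Contents/MacOS/Microsoft Edge",
    ];
  }
  return ["/usr/bin/google-chrome", "/usr/bin/microsoft-edge"];
}

// Return the first candidate that exists on disk, or null if none do.
function findBrowser(candidates, exists = fs.existsSync) {
  return candidates.find((p) => exists(p)) || null;
}

// App mode opens a chromeless window; the dedicated profile directory is
// what lets the browser remember the microphone permission between runs.
function buildAppArgs(url, profileDir) {
  return [`--app=${url}`, `--user-data-dir=${profileDir}`];
}

// Launch the voice panel detached so it outlives the calling process.
function launchVoicePanel(
  url,
  profileDir = path.join(os.homedir(), ".voice-panel-profile-example")
) {
  const exe = findBrowser(browserCandidates());
  if (!exe) throw new Error("No Chrome or Edge installation found");
  const child = spawn(exe, buildAppArgs(url, profileDir), {
    detached: true,
    stdio: "ignore",
  });
  child.unref();
  return child;
}
```

Calling launchVoicePanel("http://localhost:3000/") would open that (hypothetical) local page in its own window. Keeping the argument building and browser lookup in separate pure functions makes the fallback order easy to test without actually spawning a browser.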