Stream

Technology, Information and Internet

Boulder, CO · 18,001 followers

Stream powers Chat, AI Moderation, Activity Feeds, Video & Audio for billions of global end-users.

About us

Stream helps apps build real-time experiences that scale. Our chat, moderation, video, audio, and activity feed APIs and SDKs are powered by a global edge network and enterprise-grade infrastructure. Our platform empowers developers with the flexibility and scalability they need to easily build rich conversations and engaging communities.

Website
https://getstream.io
Industry
Technology, Information and Internet
Company size
51–200 employees
Headquarters
Boulder, CO
Type
Privately held
Founded
2015
Specialties
Activity Streams, Newsfeeds, Cloud Hosting, and Big Data

Updates

  • Learn how to quickly build a realtime multimodal agent using Gemini 3.1 Flash and Stream’s Vision Agents SDK: https://lnkd.in/gcibeFVm

    From Google for Developers (3,854,827 followers):

    Build a real-time voice agent with Gemini 3.1 Flash Live and Stream's Vision Agents SDK. Stefan Blos's walkthrough takes you from early access to a fully orchestrated multi-step workflow. What's covered:
    ✨ Setting up the Vision Agents SDK with the Gemini plugin
    ✨ Defining tools for image generation and product search
    ✨ Building a video processor to analyze live frames via Next.js and WebSockets
    Grab Gemini API keys at Google AI Studio and explore the Vision Agents SDK from Stream to start building. Watch the full video: https://goo.gle/4m4GpgH

    • Large text reads "Build real-time apps with Gemini & the Vision Agents SDK" next to the Google AI Studio logo on a black background with scattered glowing dots. To the right, a window displays Python code for an object-capture processor, and a small inset in the bottom-right corner shows a smiling person.
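The video-processor step described in the walkthrough above boils down to sampling frames from a live stream at a fixed rate and handing each sample to an analysis callback. Here is a minimal, framework-agnostic sketch of that idea in plain Python; the `FrameSampler` class and its method names are illustrative, not the actual Vision Agents SDK API:

```python
import time

class FrameSampler:
    """Downsample a high-rate frame stream to a fixed analysis rate.

    Live video arrives at ~30 fps, but a vision model only needs a
    frame every `interval` seconds; everything else is dropped.
    """

    def __init__(self, interval: float, analyze):
        self.interval = interval      # seconds between analyzed frames
        self.analyze = analyze        # callback for sampled frames
        self._last = float("-inf")    # timestamp of last analyzed frame

    def on_frame(self, frame, now=None):
        """Called for every incoming frame; forwards at most 1/interval."""
        now = time.monotonic() if now is None else now
        if now - self._last >= self.interval:
            self._last = now
            self.analyze(frame)
            return True               # frame was sent for analysis
        return False                  # frame dropped


# Example: 30 fps input, 1 fps analysis -> roughly 1 in 30 frames kept.
analyzed = []
sampler = FrameSampler(interval=1.0, analyze=analyzed.append)
for i in range(90):                   # simulate 3 seconds of 30 fps video
    sampler.on_frame(f"frame-{i}", now=i / 30)
print(len(analyzed))                  # 3 frames reach the model
```

The same shape works whether frames arrive from a WebSocket handler or a WebRTC track callback: the hot path stays cheap, and only the sampled frames touch the model.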
  • TikTok Live Shopping set the bar. Real-time video. Live chat. Instant checkout, without leaving the stream. And it's not just TikTok: you can build the same experience into your own app. We put together a breakdown of how to do it using Stream, combining livestream video, chat, and in-stream commerce. If livestreams are on your roadmap, check it out: 🔗 https://lnkd.in/e7mEAtzq

  • This is why we build. Seeing developers go from stuck → shipping again is everything. Huge shoutout to the Cognivise journey here, especially pushing the boundaries of real-time AI with vision + speech. Keep building. 🚀

    From Vicky Kumar:

    Update on Cognivise: Building the English AI Coach

    A short time ago, I was dealing with personal issues and depression that completely blocked me from writing code. I had stopped developing entirely. But thankfully, being introduced to the Stream Vision AI SDK (https://lnkd.in/gwdp7yjz) by Kunal Kushwaha during the WeMakeDevs hackathon reignited my spark. I am finally building again. My ultimate goal remains the same: to create the most accurate real-time cognitive monitoring system for students and learners. Today, I'm excited to share my latest major addition to Cognivise.

    🗣 New Feature: The English AI Coach
    I wanted to see if I could build a conversational AI that doesn't just listen to your pronunciation, but literally watches your face as you speak. Here is what the English Coach is doing under the hood:
    🔹 Dual-AI Architecture: local browser processing (MediaPipe FaceMesh) for 30 fps physical tracking, synced with a cloud LLM (Gemini/Groq) for speech parsing and reasoning.
    🔹 Canvas Layer Multiplexing: four distinct visual feedback modes (Composite, Landmarks-only, Full HUD Analysis, Raw Frame), built by manipulating HTML5 canvas layers behind the live WebRTC stream.
    🔹 Real-time Synchronicity: overcame strict browser autoplay policies and synchronous call-stack limits so the AI's text-to-speech (TTS) responds naturally to the user's analyzed audio.

    🛠 What Was Actually Hard This Time
    Getting the Vision SDK to accurately track full movement is incredible. But translating that raw physical movement (every micro-expression and blink) into an accurate emotional state is intensely difficult. Right now, the system catches the physical layout perfectly, but the semantic "emotion tracking" is still not where I expect it to be. Sending frame-by-frame snapshot analysis over WebSockets without colliding with the synchronous audio pipelines required deep debugging of React state refs and Canvas requestAnimationFrame loops. Testing this in a remote location with an unstable internet connection makes debugging real-time video + WebSocket systems extremely painful. I am learning exactly why tracking every single frame matters, and how quickly it scales in complexity.

    ⚠️ Honest Status & What's Next
    Cognivise is not perfect yet. Capturing true "emotion" is a work in progress. But getting out of my slump was step one. I'm heading back to my hometown in Delhi soon, and I now have 3 brand new AI project ideas I plan to start sprinting on this month.

    Massive thank you to Kunal Kushwaha and the WeMakeDevs community for creating environments that push developers back into motion. The journey continues!
    🔗 GitHub: https://lnkd.in/gZvJnCfa
    #AI #VisionAgents #VisionPossible #MachineLearning #WebRTC #FastAPI #ReactJS #BuildInPublic #OpenSource #algsoch #MentalHealthInTech
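The "frame snapshots over WebSockets colliding with the audio pipeline" problem described above is a classic producer/consumer mismatch: a render loop produces frames faster than the network can ship them, so the fix is a bounded buffer that drops stale frames instead of blocking the loop. A language-neutral sketch in Python (the `FrameBuffer` class and its names are illustrative; the actual project does this with React refs and Canvas on the browser side):

```python
from collections import deque

class FrameBuffer:
    """Bounded buffer between a render loop and a slower network sender.

    The render loop calls push() every frame and must never block;
    when the sender falls behind, the oldest frames are discarded so
    the model always analyzes the most recent snapshot.
    """

    def __init__(self, capacity: int = 2):
        self._buf = deque(maxlen=capacity)  # old frames fall off the left
        self.dropped = 0

    def push(self, frame) -> None:
        """Producer side: O(1), never blocks the render loop."""
        if len(self._buf) == self._buf.maxlen:
            self.dropped += 1               # a stale frame gets overwritten
        self._buf.append(frame)

    def pop_latest(self):
        """Sender side: take the freshest frame, clear the backlog."""
        if not self._buf:
            return None
        frame = self._buf.pop()             # newest frame
        self._buf.clear()                   # everything older is stale
        return frame


# Render loop outpaces the sender: 10 frames pushed, sender wakes once.
buf = FrameBuffer(capacity=2)
for i in range(10):
    buf.push(i)
print(buf.pop_latest(), buf.dropped)        # latest frame 9, 8 dropped
```

Dropping on the producer side is what keeps a requestAnimationFrame-style loop smooth on an unstable connection: latency stays bounded because the queue can never grow.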

  • Real-time video for embedded devices is here! We just launched the Stream Video ESP32 SDK, an open-source C library that brings real-time video and audio directly to ESP32 microcontrollers. Resource-constrained devices powered by ESP32, like smart doorbells, security cameras, robots, and industrial systems, can join live video calls and publish streams, no intermediary server required.
    A few highlights:
    - Native WebRTC connection to Stream's SFU
    - H.264 video + Opus audio (with echo cancellation)
    - Runs entirely on a single ESP32-S3 or ESP32-P4 chip
    - Minimal API: go from zero to live in just 4 function calls
    - Install via ESP-IDF with a single dependency line
    We're just getting started. v0.1.0 focuses on publishing, with subscriptions, data channels, and more coming soon.
    Check it out 📌
    - GitHub: https://lnkd.in/eKGBAvq6
    - Stream Docs: https://lnkd.in/eMDjq3pW

  • Let’s build a vision + voice agent with the new Gemini 3.1 Flash Live model. Following this tutorial, you'll build a multimodal agent that helps you sell your used items! The new Google Gemini model uses native audio, so there's no conversion to or from text when you work with it, unlike with most models. In this demo, you'll build an agent that:
    - Understands text, audio, and video (1 FPS image input)
    - Responds in natural language (with optional transcripts)
    - Uses built-in tools like Google Search for real-time info
    - Enables server-side + client-side function calling
    - Maintains persona + accent stability across long sessions
    Check out the full tutorial on the Google for Developers YouTube channel!

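The "server-side function calling" bullet above follows a common pattern: the model emits a structured tool call (a name plus JSON arguments), the agent looks the name up in a registry, runs the function, and returns the serialized result to the model. A minimal registry sketch in plain Python; the `product_search` tool and the JSON shape are illustrative assumptions, not the Gemini Live wire format:

```python
import json

# Registry of tools the agent exposes to the model.
TOOLS = {}

def tool(fn):
    """Register a plain function as a model-callable tool."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def product_search(query: str, max_results: int = 3) -> list[str]:
    # Stand-in for a real search backend.
    catalog = ["used bike", "used bike helmet", "desk lamp"]
    return [item for item in catalog if query in item][:max_results]

def dispatch(call_json: str) -> str:
    """Execute one model-emitted tool call and serialize the result."""
    call = json.loads(call_json)            # {"name": ..., "args": {...}}
    fn = TOOLS[call["name"]]                # look the tool up by name
    return json.dumps({"result": fn(**call["args"])})

# The model asks for a product search; the agent runs it and replies.
reply = dispatch('{"name": "product_search", "args": {"query": "bike"}}')
print(reply)
```

Client-side function calling is the same loop with the registry living in the app instead of on the server; only where `dispatch` runs changes.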
  • Stream reposted this

    We’re happy to welcome Stream as a Gold Sponsor of #iOSKonf26 💙 This year’s theme, “AI & Human Sensations,” is about something we often overlook as engineers: not just what we build, but how it feels to use it. Stream works right at that intersection. Their tools for chat, video, and activity feeds help developers build real-time experiences that feel immediate, responsive, and human, qualities that matter even more in an AI-driven world. Looking forward to building and learning together this May in Skopje 🇲🇰 You can get your ticket: https://www.ioskonf.mk Learn more about Stream: https://getstream.io/

  • Stream reposted this

    From Amos Gyamfi:

    Generate music with Lyria 3 models in Vision Agents. Here is a 30-second music demo. You can generate full-length songs with verses, choruses, and bridges using the new lyria-3-pro-preview in the Gemini API.
    - Lyria 3 API: https://lnkd.in/deCnHcm8
    - Vision Agents demo: https://lnkd.in/dGd56VFQ
    - Integrate Lyria 3 as a custom Vision Agents plugin: https://lnkd.in/dvmMatRg
    #gemini #ai #music #llm

  • From Stream (18,001 followers):

    Gemini 3.1 Flash Live just dropped; check out our demo with it! This Google AI for Developers native audio model now comes with lower latency, stronger instruction following, and more reliable tool calling. Some highlights:
    - Higher task completion rates in real-world environments
    - Responds in natural language
    - Uses built-in tools like Google Search for real-time info
    - More reliable function calling
    - Better guardrails and better instruction following
    - Multiple languages
    - Lower latency than ever before!
