Together AI

Software Development

San Francisco, California · 86,760 followers

Accelerate inference, model shaping, and pre-training on a research-optimized platform.

About us

Together AI is the AI Native Cloud, purpose-built for AI engineers and researchers with a full suite of tooling across inference, model shaping, and pre-training. AI natives can use Together AI as a full-stack AI platform, from a high-performance inference engine built for reliable and fast scaling to on-demand GPU clusters and massive-scale AI factories. Together AI continuously pushes the frontier forward by productizing cutting-edge research from our world-leading AI systems research team. By combining research velocity with production-grade infrastructure, we enable companies to reliably scale AI-native applications as fast as the field evolves. Trusted by leading AI natives like Cursor, Decagon, Eleven Labs, AI21, Hedra, and Cartesia, as well as SaaS innovators such as Salesforce, Zoom, and Zomato, Together AI powers the next generation of AI-native applications.

Website
https://together.ai
Industry
Software Development
Company size
201-500 employees
Headquarters
San Francisco, California
Type
Privately Held
Founded
2022
Specialties
Artificial Intelligence, Cloud Computing, LLM, Open Source, and Decentralized Computing

Locations

  • Primary

    251 Rhode Island St

    Suite 205

    San Francisco, California 94103, US


Updates

  • Together AI serves the two fastest speech-to-text models measured by Artificial Analysis. NVIDIA Parakeet-TDT 0.6B v3 on Together AI can transcribe ~20 hours of speech in under 10 seconds. This deep dive from Sebastien Beurnier shows the systems work behind the leaderboard: TensorRT profiles, conditional CUDA graphs, evented I/O, shared memory, and Python GC control.

  • The best research labs are building what comes after static models. Congrats to Trajectory on the launch! Excited to have them training on the AI Native Cloud as they push the frontier on Continual Learning!

    Today, Michael Elabd, Arjun Karanam, and I are excited to announce our $15M raise for Trajectory. We are a research lab and product company building the platform for Continual Learning. Our platform unlocks the signal already sitting in product usage, so companies can continuously train large-scale agentic models that outperform the frontier. We’ve raised $15M from Conviction, Bessemer Venture Partners, Radical Ventures, Jeff Dean, Fei-Fei Li and more. We’re partnering with some of the best AI-native companies: Clay, Harvey, Decagon, Mercor, and Rogo to power their agentic systems, some of which we are already in production with. We’ve brought together a world class research team from DeepMind, OpenAI, Apple, Meta Superintelligence, Amazon AGI, Scale AI, and an elite product team from Stripe and Figma. AI will never again start on day one. Every correction, every retry, every edit will make products smarter. This is Continual Learning.

  • Together AI reposted this

    🚨 New ICML 2026 Paper! Learning To Discover at Test Time. For a few hundred dollars of compute, we made an open-source 120B model climb four hard scientific leaderboards.

    1. It wrote a GPU kernel 2× faster than the best human-written kernel on the GPUMode leaderboard.
    2. It beat DeepMind's AlphaEvolve on a 70-year-old open Erdős problem. An improvement 16× larger than AlphaEvolve's.
    3. It placed 1st-equivalent in two AtCoder Heuristic Contests against hundreds of expert humans, ahead of the previous best AI.
    4. It set new state-of-the-art on single-cell RNA-seq denoising.

    Same algorithm. Same open-source model (gpt-oss-120b). The method, TTT-Discover, is reinforcement learning, at inference, on a single problem. Accepted at ICML 2026. Paper, code, every result tied to its public verifier: 🔗 https://lnkd.in/dGSmjHyn

    With Mert Yuksekgonul, Daniel Koceja, Xinhao Li, Jed McCaleb, Xiaolong Wang, Jan Kautz, Yejin Choi, James Zou, Carlos Guestrin and Yu Sun. Stanford · NVIDIA · Astera Institute · UC San Diego · Together AI.

  • We’re excited to announce Qwen3.7-Max on Together AI 🚀 AI natives can now deploy Qwen’s flagship model for the agent era on Together Serverless Inference and benefit from reliable infrastructure for long-horizon coding, reasoning, and autonomous workflows. Highlights:

    → Long-horizon autonomy: maintained coherent execution across a 35-hour autonomous kernel optimization run
    → Agentic coding: leading Terminal-Bench 2.0-Terminus performance for terminal-based engineering workflows
    → General agent workflows: strong tool orchestration, office automation, and spreadsheet reasoning
    → 1M context: built for longer tasks, larger working sets, and persistent agent workflows

    Try it now: https://lnkd.in/gFRA-pbs

  • We’re excited to announce MiniMax Speech 2.8 Turbo on Together AI. AI natives can now deploy MiniMax’s enterprise TTS model on Together AI dedicated infrastructure for expressive, real-time voice agents. With MiniMax Speech 2.8 Turbo, teams get:

    → Sound Tags for laughter, breathing, sighs, gasps, and other vocal cues
    → 60% prosody improvement over Speech 2.6
    → Sub-250ms end-to-end latency with streaming support
    → 40+ language support for global voice applications

    Try it in voice finder: https://lnkd.in/g3vZdrj3

  • "One thing that we've been seeing recently is that inference benchmarks don't really match production workloads that well." - Dan Fu, VP of Kernels

    When you're running dozens of concurrent coding agents, each with 45k–200k token contexts, the benchmarks that matter are the ones that stress KV cache, scheduler limits, and throughput under real load. We ran those benchmarks. Our Inference Engine delivered:

    → 31% higher throughput than the next fastest open source engine
    → 2× better time-to-first-token at saturation
    → 76% lower cost per request compared to Claude Opus 4.6

    Read the full technical breakdown → https://lnkd.in/gEQxp8Sp

  • Congrats to the Cursor team on Composer 2.5 — a huge milestone for agentic coding models. Together AI, the AI Native Cloud, is proud to partner on this launch. Composer 2.5 is pushing the frontier for coding agents and turning heads for its speed and quality. Excited to keep building with the Cursor team!

    View organization page for Cursor

    319,314 followers

    Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model. Learn more about Composer 2.5: https://lnkd.in/esfiRv7F

