Satyajit Kumar explains how NVIDIA Parakeet 1.1B runs at under 200ms latency and up to 6M toks/sec decode speed with Cactus on mobile devices and Macs... yes, 6M! Demo it yourself by following the instructions here: https://lnkd.in/gWcQM3un
Cactus (YC S25)
Research Services
San Francisco, California 2,831 followers
Low-latency AI engine for mobile devices & wearables
About us
Low-latency AI engine for mobile devices & wearables.
Website: https://www.cactuscompute.com
Industry: Research Services
Company size: 2-10 employees
Headquarters: San Francisco, California
Type: Privately Held
Founded: 2025
Locations
Primary: San Francisco, California, US
London, GB
Updates
Noah Cylich walks us through using Liquid AI’s LFM2-24B-A2B model for coding locally on your Mac with OpenCode integration, powered by Cactus. Learn more: https://lnkd.in/gd2gX5db
Cactus (YC S25) reposted this
Today, we released our largest LFM2 model: LFM2-24B-A2B, a 24B Mixture-of-Experts model with 2.3B parameters active per token, built on our hybrid, hardware-aware LFM2 architecture. By activating only the most relevant parameters at runtime, LFM2-24B-A2B delivers large-model capability with fast, memory-efficient behavior in a 32GB, 2B-active footprint.
From day zero, we're making deploying LFM2-24B-A2B easy, with support from key partners:
☁️ Easily deploy in the cloud:
> Together AI: serverless production deployment. Get started here: https://lnkd.in/eWeaGSDg
> Modal: elastic GPU infrastructure with low-latency serving. Get started here: https://lnkd.in/ezDVbVid
💻 Easily deploy locally:
> AMD: optimized for CPU, GPU, and NPU on the Ryzen AI platform
> Intel Corporation: supported via OpenVINO across AI PCs and data centers. Learn more here: https://lnkd.in/eFpUGxda
> Qualcomm: optimized for AI PCs and high-end mobile
🧠 Access and run LFM2-24B-A2B across top platforms with:
> Cactus (YC S25): check out their guide to coding agents with LFM2-24B-A2B: https://lnkd.in/eQdB7XAj
> Ollama: download LFM2-24B-A2B: https://lnkd.in/euXqHeq6
> LM Studio: download LFM2-24B-A2B: https://lnkd.in/etU2rQtz
> Nexa AI: see LFM2-24B-A2B running on the Qualcomm Snapdragon® 8 Elite for Galaxy device (Samsung Galaxy S25 Ultra), powered by the Qualcomm Hexagon NPU: https://lnkd.in/euA-x4jp
With LFM2s and our ecosystem partners, deploying fast, scalable, efficient AI in production is easier than ever. Get started today.
Read more on our ecosystem of partners here: https://lnkd.in/emeBjdV9
All cloud fallback is FREE this February. Seriously. Make us regret this 😅
We just launched Hybrid Cloud inference and we're too excited for you to try it.
1. Go to www.cactuscompute.com
2. Sign up and create a key
3. Run unlimited on-device transcription and LLM inference with cloud fallback
Cactus Hybrid Cloud runs inference on-device by default, as always. If the on-device model struggles, it automatically hands off inference to the cloud.
Demo for yourself:
brew install cactus-compute/cactus/cactus
cactus transcribe
Cactus (YC S25) reposted this
Roman Shemet demos Cactus Hybrid Inference. Observe the latency & toks/sec, and use the provided commands to reproduce it!
v1.7 cost us blood, sweat and tears:
- No sense of work hours for Cactus Jacks.
- Piles of feedback from users on problems no one else is solving.
- Mentally fatigued everyone with my yelling :(
- Tight deadlines to submit 6 research papers.
- Slack threads were easily reaching 300 replies.
- Community members reaching out to ask if we're ok.
Shout out to the Cactus Pods from UCLA, UMichigan, UPenn, UCI, Yale, UWaterloo, Imperial, CU Boulder, etc., who all now officially co-maintain Cactus with Cactus (YC S25). They really picked up our slack!
Come build with us this Saturday at the Google DeepMind x Cactus Hackathon across multiple cities; 1,200 teams have already registered across San Francisco, UCL/London, MIT/Boston, UMaryland, NUS/Singapore and online.
Cactus (YC S25) reposted this
We are excited to partner with Cactus (YC S25) for our AI MakerSpace. This is a great opportunity to:
- Work with the Cactus v1 SDK and API to build production-grade AI applications.
- Access high-performance compute resources used by top-tier researchers.
#Partnership #AIMakerSpace
The AI Society is proud to partner with Cactus Compute, a YC-backed AI infrastructure company, to bring students hands-on experience building real-world AI systems.
Over 9 weeks, participants will:
• Build production-level AI projects
• Access modern AI infrastructure & SDKs
• Compete on weekly leaderboards
• Earn shareable achievement badges
• Learn from experienced mentors
This is more than a program: it's a launchpad.
Cactus (YC S25) reposted this
Cactus (YC S25) is synergising with Google DeepMind, AI Tinkerers and the AI Nexus Community to bring you a global multi-city hackathon: 👇🏾
I'd love to say that DeepMind agreed to this because of my overwhelming charisma, but the reality is that, having collaborated with 8 teams at Google and met a ton of the executives and researchers, the people there are nicer than you'd think. Being an annoyingly persistent person, I can promise this is only the first of more announcements to come. I bugged them so much that my email is probably blocked on their servers. But I did not get here in life by folding my arms while opportunities fly by.
Shout out to my new best friends:
- Amit Vadi (🐐 x ♾️)
- Jake Laes (🐐)
- James Unsworth (🐄)
- Hindy Rossignol & his MIT org
- UCLA's Bruin AI
- All involved host orgs at UCL, NUS, Maryland etc.
Learn more: https://luma.com/f0arqlwy
Cactus (YC S25) reposted this
Cactus (YC S25) v1.6 simplifies everything: 👇
1. Auto-RAG: when initializing Cactus, you can pass a .txt or .md file, or a directory containing multiple files, which will be automatically chunked and indexed using our memory-efficient Cactus Indexing and Cactus Rank algorithms.
2. Cloud Fallback: we designed confidence algorithms that let the model introspect while generating; if it is making an error, it can decide within a few milliseconds to return "cloud_fallback = true", in which case you should route the request to a frontier model.
3. Real-time transcription: Cactus now has APIs for running transcription models, with latency as low as 200ms on Whisper Small and 60ms on Moonshine.
4. Comprehensive response JSON: each prompt returns function calls (if any), as well as benchmarks, RAM usage, etc.
5. Support for C/C++, Rust, Python, React, Flutter, Kotlin and Swift.
Learn more: https://lnkd.in/dgct-Rb8
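The cloud-fallback flow in point 2 can be sketched as a small routing function around the response JSON. This is a hypothetical illustration: the field names (`cloud_fallback`, `ram_mb`, `toks_per_sec`) are assumptions based on the post, not the actual Cactus SDK schema.

```python
# Hypothetical sketch: route a Cactus-style response to a frontier model
# when the on-device model flags low confidence via cloud_fallback.

def route_response(response: dict) -> str:
    """Return 'cloud' when the on-device model requests a fallback."""
    if response.get("cloud_fallback", False):
        return "cloud"  # hand the prompt off to a frontier model
    return "on_device"  # keep serving the local result

# Example responses shaped like the "comprehensive response JSON" above
local_ok = {"text": "...", "cloud_fallback": False, "ram_mb": 512, "toks_per_sec": 42}
needs_cloud = {"text": "...", "cloud_fallback": True}

print(route_response(local_ok))     # on_device
print(route_response(needs_cloud))  # cloud
```

In a real app the "cloud" branch would call your chosen hosted model; the point is that the decision costs only a dictionary lookup on the device.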
Cactus (YC S25) reposted this
Cactus (YC S25) v1.6 has been released, and Cactus is now much more performant on cheaper devices. FunctionGemma & LFM2-350m are highly capable models for real-world agentic tasks on resource-constrained devices. Follow the repo: https://lnkd.in/e-XJqyT7
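An "agentic task" on a small model like FunctionGemma usually means the model emits a structured function call that the host app dispatches to a local tool. A minimal sketch, assuming a JSON call format; the `get_battery_level` tool and the output shape are hypothetical, not the real Cactus or FunctionGemma API.

```python
import json

# Hypothetical local tools the host app exposes to the model
TOOLS = {
    "get_battery_level": lambda: 87,  # stubbed device API for illustration
}

def dispatch(model_output: str):
    """Parse a JSON function call emitted by the model and run the matching tool."""
    call = json.loads(model_output)
    fn = TOOLS[call["name"]]
    return fn(**call.get("arguments", {}))

# Simulated model output requesting a tool call
print(dispatch('{"name": "get_battery_level"}'))  # 87
```

The result would then be fed back to the model as context for its next turn, which is the basic loop behind on-device agents.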