We’re running a 24-hour hackathon on June 19–20 in San Francisco in partnership with Cognition, Etched, and Anthropic.
There are three tracks to build in:
- Agents: Multi-step systems that take a goal and execute across tools, plans, and long horizons.
- Real-Time and Interactive: Systems that respond in the moment, across modalities.
- Talent Marketplace + Applied AI: Systems that measure what people and models can actually do, including autograders, rubrics, skill verification, and human-in-the-loop evaluation.
Our guest judges include Brendan Foody, Robert W., and Steven Hao.
We’re offering a $50k top prize with $100k+ in total awards. Every accepted team gets 8xH100s, Cognition API access, and Anthropic credits.
Come build with us. Apply by 6/12 at the link in the comments.
Mercor
Software Development
San Francisco, California
736,883 followers
We connect people with the leading AI labs and enterprises to provide the human expertise essential to AI development.
About us
Our vast talent network trains frontier AI models in the same way teachers teach students: by sharing knowledge, experience, and context that can't be captured in code alone. Today, more than 30,000 experts in our network collectively earn over $2 million a day.
- Website: mercor.com
- Industry: Software Development
- Company size: 51-200 employees
- Headquarters: San Francisco, California
- Type: Privately Held
- Founded: 2023
Locations
- Primary: San Francisco, California 94105, US
Updates
-
We tested Anthropic Claude Opus 4.8 on APEX-Agents and APEX-SWE ahead of yesterday’s release.
On APEX-SWE (High), our benchmark for real-world software engineering work, it takes the #1 spot at 45.3% Pass@1, nearly 4 points ahead of GPT-5.3 Codex. APEX-SWE tests two categories: Integration (building and connecting systems) and Observability (diagnosing and debugging). Opus 4.8 leads Observability by 10 percentage points.
On APEX-Agents (Max), our benchmark for long-horizon professional work, it places 2nd at 42.5% Pass@1, behind Gemini 3.5 Flash (49.6%) and ahead of GPT-5.5 (38.4%). Claude Opus 4.8 has improved 8.6 percentage points over Opus 4.7 (33.9%). In ~6 months, Opus models have improved from 18.3% to 42.5% on APEX-Agents.
Check out the leaderboards at the links in the comments.
-
Active contracts on Mercor grew 200x in under two years. The system that handles every hire, extension, transfer, and dismissal had to keep up.
In late July 2025, our engineering team rewrote the Contracts service in one week, with zero regressions in production. The new system was 75x more reliable, and engineering KTLO (keep-the-lights-on) work went from 60% of the team's time to near zero.
Pouria P. shares how we did it in our latest engineering blog: https://lnkd.in/gRapis6v
We're hiring engineers who want to work on problems like this. Check out our open roles at the link in the comments.
-
Mercor reposted this
Gemini 3.5 Flash ranks #1 on the APEX-Agents-AA benchmark, outperforming much larger models a whole size above it. https://lnkd.in/gb_BYH-k
-
Google DeepMind Gemini 3.5 Flash (High) is #1 on the APEX-Agents leaderboard: 49.6% Pass@1 across 480 tasks in investment banking, corporate law, and management consulting. That's the highest score we've recorded on APEX-Agents, and it comes from a Flash-tier model, not a flagship.
🥇 Gemini 3.5 Flash (High): 49.6%
🥈 GPT-5.5 (xHigh): 38.4%
🥉 GPT-5.4 (xHigh): 36.0%
It tops all three domain leaderboards too: investment banking at 57.0% (15 pp ahead of #2), management consulting at 55.5%, and corporate law at 36.4%.
77% of Flash's tool calls were code execution. Where other models used a mix of tools, Flash treated nearly every task as a coding problem. It averaged 33 minutes per task vs. 10 for other frontier models and used 2-3x the tokens.
Check out the full leaderboard at the link in the comments.
-
Mercor reposted this
I consistently receive emails from experts about how much more they enjoy working with Mercor than with competitors (names redacted). We pay out over $3M per day, paying experts over $100/hour on average, with industry-leading NPS. Building the best RL environments to push the frontier of AI requires building a platform that the best experts want to work on.
-
Training AI is the fastest-growing category of work right now. Most of the conversation around AI and jobs has focused on displacement. But we see it differently because a new type of work is already here. Our experts train AI systems to do economically valuable tasks across law, medicine, finance, consulting, software engineering, and more. Our expert network earns more than $2M per day, with an average pay rate above $100/hr. Find opportunities on Mercor at the link in the comments.
-
Mercor reposted this
In a world of AI slop, models that produce something beautiful are rare. But are they prized?
I’m excited to share my recently released paper, “Artificial Aesthetics,” which investigates whether people will pay more for beautiful outputs and what people define as a “beautiful output.” Alongside it is “The Furniture Turn,” which draws a historical connection between today’s LLM market and the American TV market of the 1960s.
We designed an incentive-compatible experiment with AI users and found that beauty remains in the eye of the beholder and that people gravitate toward utility over beauty in LLM output.
I am incredibly grateful to have been advised by Dr. Eric Maskin (2007 Nobel Laureate), Dr. Amartya Sen (1999 Nobel Laureate), Dr. Emma Rothschild (Director of Joint Centre for History and Economics at Harvard), and Dr. Ian Kumekawa (Harvard/MIT). Also, a special thank you to Mercor and Bertie Vidgen for your partnership in this research. It was a pleasure to apply my academic background in economics and history to my interests in AI and markets.
A couple of our most interesting figures are below. We’d love to hear your thoughts and comments! Links to the papers and datasets are in the comments.
-
Mercor reposted this
AI is rapidly moving from automating repetitive work to handling complex knowledge work. The pace of progress is accelerating faster than most people realize. In a fireside chat at Startup Grind with Felicis Managing Partner Sundeep Peechu, Mercor co-founder & co-CEO Adarsh Hiremath shares his perspective on how rapidly AI capabilities are compounding and what that unlocks for the future of work.