Agentic Coding Weekly

Software Development

Subscribe to get a free weekly email covering agentic coding tool and model updates. Join 500+ agentic engineers.

Overview

Get focused updates on agentic coding tools, models, and workflows. No hype, no gossip. Just what helps you ship. Written by Prashant Anand, an experienced ML Engineer (ex-Mercari) who has spent the last 6 years building and deploying production ML systems.

What to expect:
- Weekly issues every Monday covering what's new in tools and models, one workflow to try, and the best community posts from that week.
- Occasional deep dives into specific tools, patterns, and best practices.

Who this is for:
- Engineers who use Cursor, Claude Code, Copilot, or similar tools to write code professionally and want high-signal updates and tactics.
- Engineering managers and technical leaders who want to understand the current landscape well enough to guide their teams, make informed decisions, and evaluate team performance.

Website
https://www.agenticcodingweekly.com/
Industry
Software Development
Company size
1 employee
Headquarters
Tokyo
Type
Privately held

Updates

  • MiniMax has released M3, an open-weight model that pairs a 1M-token context window with native multimodal input and computer use. The headline architectural change is MSA (MiniMax Sparse Attention), which the lab says cuts per-token compute at 1M context to 1/20 of the previous generation, yielding 9× faster prefill and 15× faster decode. M3 supports a toggleable thinking mode (on for agentic and long-horizon work, off for latency-sensitive completion) at the same price. MiniMax says the weights and a technical report will be released within 10 days.

    On coding benchmarks:
    - SWE-Bench Pro: M3 scores 59.0%, ahead of GPT-5.5 (58.6) and Gemini 3.1 Pro (54.2), and ahead of the open-weight DeepSeek V4 Pro (55.4), GLM 5.1 Thinking (58.4), and Kimi K2.6 Thinking (58.6), but well behind Claude Opus 4.8 (69.2).
    - Terminal-Bench 2.1: M3 reaches 66.0, behind GPT-5.5 (78.2), Opus 4.8 (74.6), and Gemini 3.1 Pro (70.3).

    M3 leads the open-weight field, but GPT-5.5 and Opus 4.8 stay ahead on software-engineering tasks.

    MiniMax leans on long-horizon autonomy as the differentiator. In internal tests, M3 ran for roughly 24 hours optimizing an FP8 GEMM CUDA kernel across 147 benchmark submissions and 1,959 tool calls, lifting Hopper peak utilization from 7.6% to 71.3% (a 9.4× speedup). Most models stalled within 30 submissions, whereas M3's best result came on submission 145. It also ran for ~12 hours to reproduce an ICLR paper.

    API pricing is tiered by input length. At ≤512K input tokens it costs $0.60/M input, $2.40/M output, and $0.12/M cached reads (currently discounted 50% for 7 days to $0.30/$1.20/$0.06). Above 512K, prices double to $1.20/$4.80/$0.24.
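
To make the tiered M3 pricing concrete, here is a minimal sketch of a cost estimator using the list (non-discounted) prices quoted above. The function name `m3_cost` is hypothetical, and several details are assumptions rather than anything MiniMax has confirmed: that "512K" means 512,000 tokens, that the tier is picked per request from the uncached input token count, and that cached reads are billed separately on top of that input.

```shell
# Hypothetical helper: estimate the USD cost of one M3 API request.
# Usage: m3_cost INPUT_TOKENS OUTPUT_TOKENS [CACHED_READ_TOKENS]
# Assumptions (not confirmed by the source): 512K = 512,000 tokens, and the
# whole request is billed at the tier selected by INPUT_TOKENS.
m3_cost() {
  local input_tokens=$1 output_tokens=$2 cached_tokens=${3:-0}
  awk -v i="$input_tokens" -v o="$output_tokens" -v c="$cached_tokens" 'BEGIN {
    # List prices in USD per million tokens: input, output, cached reads.
    if (i <= 512000) { ri = 0.60; ro = 2.40; rc = 0.12 }
    else             { ri = 1.20; ro = 4.80; rc = 0.24 }
    printf "%.4f\n", (i * ri + o * ro + c * rc) / 1000000
  }'
}
```

For example, `m3_cost 100000 10000` estimates a request with 100K input and 10K output tokens in the lower tier, while `m3_cost 600000 10000` lands in the doubled tier. Halving the result approximates the temporary 50% discount.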

  • Agentic Coding Weekly reposted this

    Prashant Anand

    Writing Agentic Coding Weekly | Ex-Mercari | IIT Delhi

    Anthropic has released Claude Opus 4.8, an incremental upgrade over Opus 4.7 at the same price.

    On the benchmarks:
    - SWE-Bench Pro: 69.2%, up from Opus 4.7's 64.3%, ahead of GPT-5.5 (58.6%) and Gemini 3.1 Pro (54.2%).
    - Terminal-Bench 2.1: 74.6%, up from 4.7's 66.1% and above Gemini 3.1 Pro's 70.3%, but behind GPT-5.5 at 78.2% on the same Terminus-2 harness (and 83.4% on its own Codex CLI harness).

    For coding work the biggest change is honesty. Anthropic reports Opus 4.8 is about four times less likely than its predecessor to let flaws in its own code pass without comment, and more likely to flag uncertainty than claim progress it can't back up.

    Opus 4.8 defaults to "high" effort, which Anthropic says uses a similar token count to Opus 4.7's default while performing better. The "extra" level (xhigh in Claude Code) and "max" level trade more tokens for better results on hard or long-running async tasks.

    Pricing is unchanged from Opus 4.7 at $5 per million input tokens and $25 per million output, with fast mode at $10/$50. The model is available via the Claude API as claude-opus-4-8.

    Anthropic calls this a "modest but tangible" improvement.

  • Agentic Coding Weekly reposted this

    Prashant Anand

    Writing Agentic Coding Weekly | Ex-Mercari | IIT Delhi

    Agentic Coding updates from Google I/O

    ## Antigravity 2.0 and CLI
    - 2.0 is a parallel agent manager, similar to the Claude and Codex desktop apps
    - No IDE inside 2.0
    - Antigravity CLI (closed source) will replace Gemini CLI
    - Gemini CLI will stop working from June 18, 2026

    ## Gemini 3.5 Flash
    - Beats Gemini 3.1 Pro on almost all benchmarks
    - $1.5 / $9 per million input / output tokens
      - 3x more expensive than 3 Flash
      - Pretty close to 3.1 Pro pricing of $2 / $12

  • Agentic Coding Weekly reposted this

    Prashant Anand

    Writing Agentic Coding Weekly | Ex-Mercari | IIT Delhi

    Codex, Claude, and OpenCode all have slightly different ways to resume, update, and run in yolo mode. This annoys me to no end. But these 12 aliases help me stay sane:

    ```
    # codex
    alias co='codex'
    alias coy='codex --yolo'
    alias cor='codex --yolo resume'
    alias cou='codex update'

    # claude
    alias cl='claude'
    alias cly='claude --dangerously-skip-permissions'
    alias clr='claude --dangerously-skip-permissions --resume'
    alias clu='claude update'

    # opencode
    alias oc='opencode'
    alias ocy='opencode --agent yolo'
    alias ocr='opencode --prompt "/sessions"'
    alias ocu='opencode upgrade'
    ```
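
The same mapping could also live in one place instead of 12 aliases. The sketch below is a hypothetical dispatcher, `agent_cmd`, that prints the command for a given tool and action. The function name and the action names (`run`, `yolo`, `resume`, `update`) are made up for illustration, and the flags are copied verbatim from the aliases above rather than checked against each CLI.

```shell
# Hypothetical dispatcher: print the command line for a tool/action pair.
# Flags are copied from the aliases in the post and are not verified here.
# Usage: agent_cmd TOOL [ACTION]   (ACTION defaults to "run")
agent_cmd() {
  local tool=$1 action=${2:-run}
  case "$tool:$action" in
    codex:run)        echo 'codex' ;;
    codex:yolo)       echo 'codex --yolo' ;;
    codex:resume)     echo 'codex --yolo resume' ;;
    codex:update)     echo 'codex update' ;;
    claude:run)       echo 'claude' ;;
    claude:yolo)      echo 'claude --dangerously-skip-permissions' ;;
    claude:resume)    echo 'claude --dangerously-skip-permissions --resume' ;;
    claude:update)    echo 'claude update' ;;
    opencode:run)     echo 'opencode' ;;
    opencode:yolo)    echo 'opencode --agent yolo' ;;
    opencode:resume)  echo 'opencode --prompt "/sessions"' ;;
    opencode:update)  echo 'opencode upgrade' ;;
    *)
      # Unknown combinations fail loudly instead of running something unintended.
      echo "unknown tool/action: $tool $action" >&2
      return 1
      ;;
  esac
}
```

Printing the command rather than running it keeps the function easy to inspect. To execute the result, something like `eval "$(agent_cmd claude resume)"` would work, with the usual care around `eval`.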
