If 2025 was the year everyone decided to launch an AI agent, May 2026 is the moment the market started separating models with tools from real agent harnesses.
A harness is the operational layer around the model. It decides how the agent reads instructions, runs tools, asks for approval, manages sessions, persists work, resumes in the background, and reaches humans through code, chat, browser, or APIs. That layer is where practical value is won or lost.
Looking only at public updates published from May 12 through May 26, 2026, three harnesses stand out right now for very different reasons: OpenClaw, Hermes, and Claude Code. They are not converging on one product shape. They are defining three different answers to the same question: what should an agent harness actually optimize for?
Method note
This piece is intentionally time-boxed to the last two weeks.
No old roadmaps, no legacy narratives, and no references to articles published earlier in the year. The goal here is simple: what do the most recent releases and changelog entries say about the state of AI agent harnesses on May 26, 2026?
Fast Snapshot: Who Leads What Right Now?
| Harness | Current best identity | Most important recent signal | Why it matters in May 2026 | Tradeoff |
|---|---|---|---|---|
| OpenClaw | Operator operating system | Stable v2026.5.22 plus two beta releases on May 24 | Fastest visible momentum on cross-channel operations, plugins, onboarding, and performance | Broad surface area means more concepts to master |
| Hermes | Maximum-breadth open harness | v2026.5.16 ships 808 commits, 633 merged PRs, and major runtime changes | It is trying to be the most expansive self-hosted agent surface in the market | Breadth can become sprawl if you want a tighter opinionated workflow |
| Claude Code | Best code-first harness | Six releases from May 19 to May 23, with strong work on agents, permissions, and UX | Still the cleanest terminal-native choice when software delivery is the center of gravity | It is a coding harness first, not a broad operator platform |
My short take is this: OpenClaw has the most interesting operator trajectory, Hermes has the most ambitious surface area, and Claude Code remains the most disciplined product. There is no universal winner, because the category is finally branching into distinct use cases instead of pretending one harness should do everything equally well.
OpenClaw: The Strongest Current Bet for Real Operator Work
OpenClaw's latest stable release, 2026.5.22, followed by beta.1 and beta.2 on May 24, reads like a project that is hardening into a real operating layer for agent work.
The most important recent signals are not flashy model announcements. They are infrastructure signals. The May 22 release focuses heavily on gateway performance, plugin metadata reuse, lazy loading, control UI improvements, onboarding flow, and the internal economics of long-running systems. Those are the updates of a harness that expects people to leave it running, not just demo it.
What changed most recently
- Gateway performance work is everywhere, including cached plugin metadata, lazy-loaded handlers, and reduced startup overhead.
- A new meeting-notes plugin surface landed, with source-provider contracts and Discord voice as the first live source.
- Sub-agent bootstrap context was tightened, keeping persona and memory files out of delegated workers by default.
- The control UI gained search and pagination for session pickers, which matters once the system becomes a daily work surface.
- Embedding providers became a first-class capability, which points to a broader platform view beyond chat and tools alone.
- Classic onboarding now starts when bare
openclawruns without a config, which is a small but smart product move.
That combination makes OpenClaw feel less like a single-purpose CLI and more like an agent operating system for operators. It is especially strong if your real workflow spans Telegram, Discord, cron jobs, browser automation, plugins, approvals, and background follow-through.
The other reason OpenClaw looks strong right now is shipping rhythm. The project had a stable release and multiple betas inside a few days, while the repo itself remained highly active into May 26. That cadence suggests a team that is still aggressively tightening the core loop.
What OpenClaw is best at today
OpenClaw is best when the agent is not just editing code, but operating across surfaces. It wants to live in chats, touch files, call tools, schedule future work, and manage ongoing conversations. That makes it unusually well suited for founder-operator workflows, internal ops assistants, content systems, research pipelines, and any mixed human-plus-automation setup.
The tradeoff is that OpenClaw now has real depth. Depth is good, but it also means there is more to configure, more to understand, and more room for teams to underuse the platform if they only needed a pure coding assistant. If your world is almost entirely Git diffs and local terminal work, OpenClaw can be more system than you strictly need.
Hermes: The Most Ambitious Breadth Play in the Harness Market
Hermes Agent's v2026.5.16 foundation release is one of the biggest single harness updates shipped this month. The release notes are unusually revealing: 808 commits, 633 merged pull requests, 545 issues closed, and 215 contributors since the previous version. That is not a cosmetic release. That is a platform push.
The theme is clear. Hermes wants maximum surface area without forcing users to wire everything by hand. It is chasing the idea that one harness should be able to span providers, chats, tools, browsers, local runtimes, and even other agent clients.
What changed most recently
- xAI Grok via SuperGrok OAuth landed, including a jump to a 1M-token context window for grok-4.3 inside Hermes.
hermes proxycan expose OAuth-backed providers through an OpenAI-compatible local endpoint, which is a clever bridge into Codex, Aider, Cline, Continue, and custom clients.x_searchbecame a first-class X search tool, which expands native information gathering without extra setup.- Microsoft Teams support arrived end-to-end, from auth to webhook listener to outbound delivery.
- The install story got much better, with lighter default installs, more lazy-loaded dependencies, and straight PyPI packaging.
- Performance work is substantial, including roughly 19 seconds off launch time and much faster browser console evaluations.
- Session quality-of-life improved with live
/handoff, cross-session Claude prompt caching, and native clarify buttons on Telegram and Discord.
Hermes right now looks like the most expansive open harness in the field. If your taste runs toward one platform that can speak to many providers, many channels, and many runtime patterns, Hermes is hard to ignore.
I also think Hermes deserves credit for one very smart strategic move: the OpenAI-compatible local proxy. That is bigger than it first appears. It turns Hermes into a compatibility layer, not just a standalone harness. That means Hermes can win even when it is not the interface the human sees.
Where Hermes still feels risky
The same thing that makes Hermes exciting can also make it messy. There is a lot here, and “a lot” is not always the same thing as “clear.” For teams that want the broadest possible experimentation surface, that is fine. For teams that want a narrower opinionated workflow with fewer moving parts, Hermes can feel more like a powerful lab than a quiet product.
That is why I would frame Hermes as the best fit for ambitious self-hosters, agent experimenters, and teams that want one harness to absorb the whole frontier. It is not the simplest choice. It may be the widest one.
Claude Code: Still the Cleanest Coding Harness, and Maybe the Most Disciplined One
Claude Code's latest release, v2.1.150, says only “internal infrastructure improvements.” On its own, that would not make for much analysis. The real story is the release train around it. Between May 19 and May 23, Claude Code shipped a dense cluster of public updates:v2.1.149, v2.1.147, v2.1.145, and others in rapid succession.
Those releases reinforce Claude Code's current identity: it is not trying to be the broadest operator platform. It is trying to be the most reliable, humane, fast-moving code-native harness in the terminal.
What changed most recently
/usagenow breaks down what drives limits usage, including skills, subagents, plugins, and MCP server cost.- Diff navigation and rendering improved, including keyboard-scroll support and better large-edit performance.
- Background agents got stronger, with pinned background sessions, better resume flows, JSON listing via
claude agents --json, and improved wake behavior. - Permissions and sandboxing were tightened repeatedly, including PowerShell path handling, workspace boundary enforcement, and approval analysis fixes.
- The former
/simplifyflow became/code-review, which is a more honest and useful framing. - Plugin discovery matured, with richer details before installation and clearer marketplace browsing.
- Observability improved, including better OTEL span metadata for agents and tools.
That is a very strong signal that Claude Code understands its lane. The product is relentlessly focused on the stuff that actually matters when an agent helps ship software: permission safety, background continuity, diff ergonomics, tool transparency, and fast recovery from weird edge cases.
In other words, Claude Code still feels like the highest-signal choice for developers whose center of gravity is the codebase itself. It is cleaner than Hermes, narrower than OpenClaw, and stronger because of that narrowness.
Why it is not the overall category winner
The limitation is intentional. Claude Code is not trying to become your cross-channel operator shell, your Telegram-native automation layer, or your all-purpose assistant runtime. If you need those things, it is not the strongest fit. But if your question is, “what harness do I trust most inside a serious software workflow right now,” Claude Code remains extremely hard to beat.
The Real State of the Market: Three Different Products Are Emerging
The most interesting thing about these releases is not feature count. It is product divergence.
- OpenClaw is moving toward a multi-surface operator OS.
- Hermes is moving toward a maximally broad, highly compatible, self-hosted agent universe.
- Claude Code is moving toward the most refined code execution harness in the terminal.
That is healthy. The market no longer needs every project to pretend it serves the same user. A founder who lives in Telegram and scheduled follow-ups does not want the same harness as a staff engineer triaging diffs all day. A research-heavy tinkerer building cross-provider flows does not want the same tradeoffs as a product team standardizing one safe coding surface.
There is also a shared pattern across all three. Recent releases increasingly care about background continuity, approval surfaces, tooling transparency, performance, and session management. That is what maturity looks like in this category. The center of gravity has moved from “can the model use tools?” to “can the system stay usable after a few thousand real tasks?”
Comparison Chart: Which Harness Should You Choose Today?
| If your priority is... | Best current fit | Why | Current caution |
|---|---|---|---|
| Cross-channel operator workflows | OpenClaw | It already thinks in chats, tasks, tools, plugins, browser actions, scheduling, and follow-through | Feature breadth means setup discipline matters |
| Maximum experimental breadth | Hermes | The May 16 foundation release expands providers, channels, proxying, installability, and performance all at once | Broadness can reduce clarity for narrower teams |
| Codebase work and git-heavy delivery | Claude Code | Its recent releases keep sharpening background agents, diff ergonomics, permissions, and review flows | Not designed to be the broadest operator shell |
| One harness that doubles as a compatibility layer | Hermes | hermes proxy turns subscribed providers into OpenAI-compatible endpoints for other tools | Power comes with more moving parts |
Runner-Up Harnesses to Keep on the Radar
The top three are not the whole story. Two runner-ups look especially worth watching because their last-week release signals are stronger than the usual “early project” noise.
1. OpenAI Codex
Codex 0.133.0 shipped on May 21, followed almost immediately by several 0.134 alpha releases on May 22 and May 23. The stable release added goals by default, stronger permission profile management, more transparent plugin discovery, better remote-control readiness, and richer extension lifecycle hooks.
That is a serious set of primitives. Codex looks like a harness that is becoming more programmable and more controllable, not just more chatty. I do not put it ahead of Claude Code yet because the product identity still feels more in motion, and the rapid alpha cadence suggests a lot is still settling. But it absolutely belongs on the shortlist for teams who care about terminal-native coding agents with a strong extension story.
2. Gemini CLI
Gemini CLI 0.43.0 and the 0.44.0 preview both landed on May 22. The current signal is not just model branding. The recent changes show the team working on harness fundamentals: steering the model toward surgical edits, improving Auto Memory boundaries, fixing headless OAuth hangs, tightening sandbox behavior, repairing tool-completion races, and adding session export and import support.
Gemini CLI is still a little rougher around the edges than the leaders here, but the direction is credible. It is increasingly obvious that Google wants a real terminal agent product, not just a wrapper around Gemini prompts. If the ergonomics keep improving at this pace, it could become one of the more relevant developer-facing harnesses later this year.
My Actual Ranking, If You Need One
- OpenClaw for real operator workflows that span messaging, tools, scheduling, and execution surfaces.
- Claude Code for teams whose highest-value work is still software delivery inside a codebase.
- Hermes for users who want the widest self-hosted surface and are comfortable managing that scope.
- Codex as the runner-up most likely to climb if its control-plane story keeps improving.
- Gemini CLI as the runner-up with the clearest near-term upside if Google keeps tightening the product loop.
I would not treat that as a universal leaderboard. It is a May 26, 2026 read on harness quality relative to likely use cases. If you are buying for a pure engineering org, you could reasonably put Claude Code first. If you are building a large experimental agent lab, you could reasonably put Hermes first. But for the broadest mix of practical operator work today, I think OpenClaw has the sharpest current position.
Final Take
The agent harness market is finally getting more honest. We are moving past the stage where every project sells “an AI agent” as if that phrase means one thing. It does not.
On May 26, 2026, the state of the market looks like this: OpenClaw is the most compelling operator harness, Hermes is the broadest frontier harness, and Claude Code is still the cleanest coding harness.Codex and Gemini CLI are close enough to matter, but not yet strong enough to dislodge the leaders.
That is good news for buyers and builders. Real categories are forming. And once the categories become real, choosing the right harness gets much easier.
Primary Sources
- OpenClaw 2026.5.22 release notes (published May 24, 2026)
- OpenClaw 2026.5.24-beta.2 release notes (published May 24, 2026)
- Hermes Agent v2026.5.16 release notes (published May 16, 2026)
- Claude Code changelog, especially v2.1.145 through v2.1.150 (published May 19 to May 23, 2026)
- OpenAI Codex 0.133.0 release notes (published May 21, 2026)
- Gemini CLI 0.43.0 release notes (published May 22, 2026)

