pandō_ · blog

Claude Code and MCP servers, a year in: what survived

A year of MCP with Claude Code taught the opposite of 2025's lesson. By 2026 the hard part isn't adding servers — it's knowing which ones to throw away.

Run /context in a Claude Code session you set up months ago and forgot about. Near the top, before your files and the conversation, sits a list of tool definitions: GitHub, Slack, Sentry, a database server, a browser, two servers you added in November and can't reconstruct the reason for. Forty-odd tools, and in most sessions you'll call maybe three of them.

For much of 2025, every one of those tools loaded its full schema into context up front and kept paying that cost through the session. The numbers got ugly fast. In one widely cited measurement, three servers (GitHub, Slack, Sentry) consumed 143,000 of a 200,000-token window before the first user message. Most of the agent's working memory, gone before it read a line of your code.

A year ago that list felt like power. Now it's a common reason a setup feels slower, and somehow worse at the actual work, than it did the week it was installed with nothing connected.

The 2025 tutorials didn't prepare anyone for that. The hard problem with MCP and Claude Code stopped being how to add a server. It's now which ones to take back out.

The reflex that aged badly

In 2025 the advice was sound: Claude Code shipped blind to everything outside your files, and an MCP server gave it eyes. So we added eyes. All of them. Every "best MCP servers" post added three more, and since adding one is a single command, the pile only grew.

The cost showed up in two ways.

The obvious one is context, the schema problem above. The subtler one is selection. A model picks well among four tools and badly among four hundred. One reported benchmark put tool-selection accuracy at 49% with a large toolset loaded up front and 74% once tools were fetched on demand. The extra servers didn't make the agent more capable. Past a point they made it less, and that looked like the model getting dumber when nothing about the model had changed.

What survived

After a year, the config I keep is short, and I'd defend every line of it.

  • Context7, for live, version-correct docs. It tops the popularity lists for an unglamorous reason: it stops the model writing against an API that moved two releases ago. claude mcp add context7 -- npx -y @upstash/context7-mcp@latest
  • The remote GitHub server, for PRs, issues, and reviews from the terminal. Use the hosted endpoint, not the archived npm package: claude mcp add --transport http github https://api.githubcopilot.com/mcp/ --header "Authorization: Bearer YOUR_PAT". Header handling has shifted across Claude Code versions; if the flag is rejected, the add-json form works.
  • Playwright or Chrome DevTools, for a real browser the agent can drive or inspect, so it sees what rendered instead of guessing from the JSX. Both use accessibility snapshots, which costs far fewer tokens than screenshots. claude mcp add playwright -- npx -y @playwright/mcp@latest
  • A read-only database server, so the agent works from the real schema. That is what kills the migration referencing a column you renamed in March. @bytebase/dbhub works; the original reference Postgres server was archived. Keep the role read-only, and watch the token cost on a large schema.

One more category, with a caveat. When a change is structural, a rename across forty call sites or finding every caller of a method, grep-and-patch is where agents quietly break things, and AST-aware editing holds up better. That is the niche pandō fills, for TypeScript, JavaScript, Clojure, and ClojureScript: claude mcp add pando-ai -- npx -y pando-ai. We make it, so discount accordingly. It's also a younger and narrower category than docs, browser, or repo access, and I reach for it only when the edit is genuinely structural.

What I don't run is as telling. Anthropic archived more than half of its original reference servers, the old GitHub and Postgres ones included, so a good share of the "official server" guides online point at unmaintained code. The everything-bundles, the memory servers, the sequential-thinking server that crowns every list: for a frontier model in Claude Code, most of those buy ceremony, not sight. The question I ask before adding anything is whether it lets Claude reach something it otherwise can't. If the answer is no, it's weight.

The real fix was deferral

The change that mattered most arrived while everyone was still arguing about which servers to install.

Claude Code defers tool definitions now. It loads tool names and a short server instruction at startup, then pulls a tool's full schema only when a task needs it, which is the default unless you turn it off or run a setup where it can't apply. On a heavy configuration that cuts the startup token cost dramatically, and the reason a lot of people gave for keeping MCP servers off, that they ate the window, mostly went away.

OpenAI's Codex made the other call, leaning on the model instead of the client — how Codex made the opposite bet, and what it costs you.

A second shift goes further. Rather than calling tools one at a time through the model, the agent can write code that calls them, looping and filtering in a sandbox and handing back only what matters. Anthropic ships this as programmatic tool calling, and it goes after the other half of the bloat: the oversized results tools return.

The third shift isn't about MCP at all. Plenty of what people built servers for was never external data; it was instructions. A "server" that returns the same text on every call (a review checklist, your commit conventions, a deploy runbook) is better off as a skill, a markdown file Claude reads on demand. The working pattern this year runs the other way from last year's: a few servers for real reach, and thin skills for everything that's really just knowledge you want loaded when it's relevant.

The security reckoning

A year also delivered a security education.

Tool poisoning became the term of art, and it's prompt injection wearing a new coat. Claude Code vets a server's tool descriptions once, when you connect. The tool's responses, though, flow into context at runtime with no equivalent check, and the model treats them as trusted. Researchers used exactly that gap in public demonstrations: a poisoned GitHub issue that walked an agent into leaking private-repo contents, a WhatsApp server that siphoned message history through calls that looked routine. On the supply-chain side, a popular remote-connection helper, mcp-remote, shipped a critical RCE (CVE-2025-6514, CVSS 9.6) while sitting in hundreds of thousands of installs.

None of this is an argument for pulling MCP out. Install a server the way you'd add a dependency that can run shell commands and read your context, because that is what it is. Favor first-party servers. Give databases read-only roles. Don't connect a server you can't vouch for to a session that can see your secrets.

Is MCP still the future

Yes, and the tell is that it got boring.

A protocol turns into infrastructure when it slips out of the setup step and into the plumbing. MCP isn't Anthropic's alone anymore. OpenAI's API speaks it, Microsoft wired it into Copilot, Google's tools support it. When the major labs all adopt the same tool protocol, betting against it is the harder position.

What's fading is the ritual around it: the hand-wiring, the dozen servers, the claude mcp add reflex every 2025 post drilled in. The trajectory points at fewer visible tools, loaded on demand, increasingly called through code, with the orchestration pushed somewhere you stop watching it happen.

A year ago the move was to hand your agent everything. The work now is deciding what to take away.

FAQ

Do MCP servers slow down Claude Code? They can. Each connected server's tool definitions consume context, and a heavy setup once ate most of the window before any work began. Claude Code now defers tool loading by default, which largely fixes this, but unused servers still add overhead.

How many MCP servers should I install? Fewer than you think. Most developers get the bulk of the value from three to five, chosen to match their actual workflow. Tool-selection accuracy drops as the toolset grows, so adding servers you rarely use makes the agent worse, not better.

What's the difference between a skill and an MCP server? A skill teaches Claude how to do something and costs almost nothing until used. An MCP server gives Claude live access to a system it can't otherwise reach. If the thing you need is static instructions, it's a skill. If it's external data, it's a server.

Are MCP servers a security risk? They can be. Tool poisoning lets a malicious server inject instructions through its responses, which the model trusts. Use first-party servers, give them least privilege such as read-only database roles, and don't connect servers you can't vouch for to sensitive sessions.

Which MCP servers are worth keeping in 2026? The durable picks are Context7 for live documentation, GitHub for repo work, Playwright or Chrome DevTools for the browser, a read-only database server, and a structural code-editing server. Most others are situational or overhead.

Is MCP still worth using in 2026? Yes. MCP became a cross-vendor standard adopted far beyond Anthropic. The change is that it's turning into plumbing rather than something you fiddle with, which is what adoption looks like when it sticks.