MCP Server Performance: What 39.9 Million Requests Say About Language Choice
After reading TM Dev Lab's benchmark across 15 implementations, I think defaulting to Python for MCP servers is a mistake for production. Java and Go are in a different tier.
Everyone building AI agents defaults to Python for MCP servers. After working through the data — 39.9 million requests across 15 implementations — I don't think that default survives contact with production load.
What They Tested
Streamable HTTP transport (the current MCP standard), k6 load testing, 50 concurrent virtual users, Docker containers capped at 1 CPU and 1GB of memory. Four workload types: Fibonacci (CPU-bound), external fetch (I/O-bound), JSON transformation, and database simulation. Three independent test rounds. The four core implementations:
| Language | Avg latency | RPS | Memory | CPU under load |
|---|---|---|---|---|
| Java (Spring Boot) | 0.835ms | 1,624 | 226MB | 30% |
| Go | 0.855ms | 1,624 | 18MB | 28% |
| Node.js | 10.66ms | 559 | 110MB | 93% |
| Python (FastMCP) | 26.45ms | 292 | 98MB | 99% |
Zero errors across all requests. All four are reliable. The question is performance and what it costs you.
Two Tiers, Not Four
Java and Go are in one category. Node.js and Python are in another.
Java and Go both hit 1,624 RPS at sub-millisecond latency. They ran at 28–30% CPU under full load — headroom to scale. Go's throughput variability across three rounds was 0.5%, Java's was 0.7%. Tight.
Node.js peaked at 559 RPS and ran at 93% CPU. Python managed 292 RPS at 99% CPU. Both were saturated. No headroom. Python's variability was 9%, with an 8% throughput drop in round 2.
The gap between tiers isn't incremental. It's roughly 3x (Node.js) to 5.6x (Python) on RPS, and 13x to 32x on latency.
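Those multipliers fall straight out of the table. A quick sanity check from the rounded figures above (my arithmetic, not the benchmark's):

```python
# Tier gap, computed from the rounded table figures.
java_go_rps, node_rps, python_rps = 1624, 559, 292
java_lat, node_lat, python_lat = 0.835, 10.66, 26.45

print(f"RPS gap vs Node.js:  {java_go_rps / node_rps:.1f}x")    # 2.9x
print(f"RPS gap vs Python:   {java_go_rps / python_rps:.1f}x")  # 5.6x
print(f"Latency gap, Node:   {node_lat / java_lat:.1f}x")       # 12.8x
print(f"Latency gap, Python: {python_lat / java_lat:.1f}x")     # 31.7x
```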
The Go Case
Go and Java are statistically tied on throughput and latency. 0.835ms vs 0.855ms is noise. Where Go wins is memory: 18MB vs 226MB. That's 12.8x better memory efficiency at equivalent performance. In Kubernetes, more replicas per node. In cost-sensitive deployments, real money at scale. The benchmark measured 92.6 RPS/MB for Go vs 7.2 RPS/MB for Java.
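Recomputing the efficiency figure from the table's rounded values lands close to the benchmark's published numbers (92.6 and 7.2 RPS/MB, which use unrounded measurements):

```python
# Memory efficiency from the rounded table figures; the benchmark's
# published 92.6 RPS/MB for Go comes from unrounded measurements.
go_rps, go_mem = 1624, 18
java_rps, java_mem = 1624, 226

go_eff = go_rps / go_mem        # ~90.2 RPS/MB
java_eff = java_rps / java_mem  # ~7.2 RPS/MB
print(f"Go:    {go_eff:.1f} RPS/MB")
print(f"Java:  {java_eff:.1f} RPS/MB")
print(f"Ratio: {go_eff / java_eff:.1f}x")  # ~12.6x from rounded inputs
```

At equal throughput the ratio reduces to the memory ratio, which is why the efficiency gap tracks the 18MB-vs-226MB footprint so closely.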
I build AI agents with Spring AI and Spring Boot. The Java numbers are solid and I'm not switching stacks. But for a new MCP service with no existing Java infrastructure, the Go argument is hard to dismiss.
Python's Actual Numbers
I expected Python to be slower. I didn't expect 84x.
CPU-bound Fibonacci: Java at 0.37ms, Go at 0.39ms, Python at 30.83ms. I/O fetch: Go at 1.29ms, Python at 80.92ms — 61x slower. The CPU gap is the GIL. Python's Global Interpreter Lock means even threaded Python runs one thread at a time. FastMCP on single-worker uvicorn saturated at 99% CPU while delivering 292 RPS. Java and Go handled 1,624 RPS at 30% CPU.
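The GIL effect is easy to reproduce outside the benchmark. A minimal stdlib-only sketch (not the benchmark's harness; thread count and `n` are arbitrary) shows CPU-bound threads gaining nothing from concurrency:

```python
# Minimal GIL demonstration: CPU-bound work gets concurrency from
# threads, but not parallelism — four threads doing 4x the work take
# roughly 4x the wall-clock time.
import threading
import time

def fib(n: int) -> int:
    # Deliberately naive recursive Fibonacci, pure CPU work.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

def timed(workers: int, n: int = 27) -> float:
    # Run `workers` threads, each computing fib(n); return elapsed seconds.
    threads = [threading.Thread(target=fib, args=(n,)) for _ in range(workers)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

print(f"1 thread:  {timed(1):.2f}s")
print(f"4 threads: {timed(4):.2f}s")  # no speedup: the GIL serializes them
```

In Go or Java, the four workers would run on four cores and finish in roughly the same wall-clock time as one.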
Multi-worker uvicorn and uvloop help. But you're still fighting the GIL on CPU-bound work. And most people don't tune their MCP server setup past the default config.
What This Means Practically
For prototyping and internal tools: Python is fine. Iteration speed beats runtime performance when you're testing whether an MCP tool is worth building.
For production at real load: Python and Node.js will hit their ceiling. Node.js at 93% CPU saturation has no room for traffic spikes. Python at 26ms average adds latency to every tool call your agent makes.
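That per-call latency compounds across an agent loop. A back-of-envelope sketch, using the table's averages and a hypothetical count of ten sequential tool calls per task:

```python
# Latency tax per agent task (hypothetical call count; latencies
# are the benchmark averages from the table above).
python_ms, go_ms = 26.45, 0.855
calls = 10  # sequential MCP tool calls in one agent task

print(f"Python adds {python_ms * calls:.1f}ms per task")  # 264.5ms
print(f"Go adds     {go_ms * calls:.2f}ms per task")      # 8.55ms
```

A quarter second of pure transport overhead per task is invisible in a demo and very visible in a multi-step agent under load.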
Go is the rational production choice without existing Java infrastructure. Same performance, a fraction of the memory, best consistency numbers in the test. The tradeoff: smaller AI/ML library ecosystem than Python.
Java makes sense if you already run Java services. Spring AI integration is solid, the JVM handles the load well, and 226MB is a fine tradeoff when Spring AI is doing the heavy lifting.
Sources: TM Dev Lab Benchmark v2 · v1 baseline · GitHub: benchmark-mcp-servers