Short answer: MCP is a small protocol that lets an AI agent ask a trusted tool for structured context instead of guessing. For GraphQL, that means the agent stops hallucinating fields and starts writing queries that actually exist in your schema.
If you've ever watched Cursor or Claude Code write a GraphQL query against an API it has never seen, you know the problem. The agent pattern-matches from its training data, makes up a field called user.avatarUrl, and confidently hands you a query that fails at runtime. The fix isn't a smarter model. The fix is giving the model a tool that knows your schema.
What MCP actually is
MCP, the Model Context Protocol, is an open spec for how an AI client (Cursor, Claude Desktop, Codex, Windsurf, ChatGPT's desktop app) talks to a context server. The client handles the user and the model. The server handles the knowledge. They exchange JSON-RPC messages over stdio or HTTP.
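On the wire this is plain JSON-RPC 2.0. Here is a minimal sketch of the request a client sends to invoke a tool; the method name tools/call comes from the MCP spec, but the tool name and arguments are illustrative:

```python
import json

# A JSON-RPC 2.0 request an MCP client might send over stdio to invoke a tool.
# "tools/call" is the method the MCP spec defines for tool invocation;
# the tool name and arguments below are hypothetical.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_schema",
        "arguments": {"query": "invoice fields for billing screen"},
    },
}
print(json.dumps(request))
```

The server replies with a result keyed to the same id; everything else, from capability negotiation to transport, is handled the same way.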
There are three things an MCP server can expose:
- Tools: functions the agent can call, such as search_schema, introspect_type, or build_query.
- Resources: read-only data the agent can pull in, such as a specific type definition or a schema version.
- Prompts: pre-written instructions the user can invoke by name.
That's it. No new embedding format, no proprietary client. Whatever the agent is, if it speaks MCP, it can use your server.
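As a sketch, the whole server surface fits in a dispatch table. This toy version skips transport and capability negotiation entirely (a real server would use an MCP SDK over stdio or HTTP); the three method names are the ones the spec defines, but the tool, resource, and prompt contents are made up:

```python
# Toy dispatcher showing the three surfaces an MCP server exposes.
# The method names (tools/call, resources/read, prompts/get) are from the
# MCP spec; the registered entries are invented for illustration.

TOOLS = {
    "search_schema": lambda args: {"types": ["Invoice", "BillingPlan"]},  # stubbed result
}
RESOURCES = {
    "schema://Invoice": "type Invoice { id: ID! total: Int! }",
}
PROMPTS = {
    "explain-type": "Explain the following GraphQL type to a new teammate.",
}

def handle(method: str, params: dict):
    if method == "tools/call":
        return TOOLS[params["name"]](params.get("arguments", {}))
    if method == "resources/read":
        return RESOURCES[params["uri"]]
    if method == "prompts/get":
        return PROMPTS[params["name"]]
    raise ValueError(f"unknown method: {method}")

print(handle("tools/call", {"name": "search_schema", "arguments": {"query": "billing"}}))
```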
Why GraphQL specifically benefits
A REST API has a URL per endpoint, usually well documented, usually narrow. A GraphQL API is the opposite: one endpoint, thousands of possible shapes. Introspection helps, but dumping a 4 MB SDL into an agent's context is the wrong answer on three axes:
- It blows the context window long before the agent sees your actual question.
- It makes the agent read the same schema on every turn.
- It teaches nothing about which fields are relevant to the task at hand.
We've measured this on real GQLens workspaces. A mid-size federated graph has between 18,000 and 45,000 lines of SDL. Even Claude Opus with a 200k context starts degrading when you fill 30% of it with raw types. Accuracy on field-selection questions drops by roughly a third in our internal evals once SDL crosses 60k tokens.
MCP fixes this by flipping the direction. Instead of the agent reading the whole schema, the agent asks: "Which types represent a billing invoice?" The MCP server does vector search across type descriptions, returns the five most relevant types, and the agent uses those to build the query.
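Here is a minimal stand-in for that lookup. GQLens does real vector search over embeddings in Qdrant; plain word-overlap cosine is enough to show the shape of the tool, and the type descriptions below are invented:

```python
from collections import Counter
from math import sqrt

# Stand-in for search_schema: rank schema types by how well their descriptions
# match the agent's question. Real systems embed descriptions into a vector
# store; bag-of-words cosine similarity illustrates the same flow.
TYPE_DESCRIPTIONS = {
    "Invoice": "a billing invoice issued to a customer with line items and totals",
    "Subscription": "an active recurring subscription tied to a billing plan",
    "AvatarUpload": "profile picture upload metadata for a user account",
}

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def search_schema(query: str, k: int = 5) -> list[str]:
    q = Counter(query.lower().split())
    scored = [(cosine(q, Counter(desc.lower().split())), name)
              for name, desc in TYPE_DESCRIPTIONS.items()]
    return [name for score, name in sorted(scored, reverse=True) if score > 0][:k]

print(search_schema("which types represent a billing invoice"))
# → ['Invoice', 'Subscription', 'AvatarUpload']
```

The agent never sees the whole schema, only the ranked handful of types it asked about.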
But aren't there already GraphQL MCP servers?
Yes, a handful. Most of them solve a different problem. They wrap a GraphQL endpoint so the agent can POST operations to it: execution as a service. Useful if the agent already knows exactly what to send. Not useful when the agent is staring at a 30,000-line schema for the first time and has to guess which fields exist.
The install story is the other half of the gap. A typical existing GraphQL MCP expects you to clone a repo, edit a config, hand-wire the endpoint URL, paste in auth headers, and run a local process per graph. That works for one developer poking at one API. It breaks down the moment a team has three graphs, rotating tokens, or someone who doesn't live in the terminal.
GQLens inverts both pieces. The first problem the agent needs to solve on an unfamiliar API is "what does this graph look like?", not "how do I POST to it." So we specialise in schema understanding: introspection, chunking, embeddings, semantic lookup, field-level expansion, query compilation. Execution, when it needs to happen at all, is usually better handled by the application's own GraphQL client, behind its own auth, with its own observability.
Put simply: most existing GraphQL MCPs help the agent call your API. GQLens helps the agent understand it.
Three patterns that actually work
After a year of running GQLens against production graphs, these are the three patterns that separate agents that write correct queries from ones that keep hallucinating:
1. Semantic type lookup over full introspection
Don't give the agent __schema. Give it search_schema(query: "invoice fields for billing screen") that returns ranked types with descriptions. The difference in accuracy is larger than any prompt-engineering trick we've tested.
2. Scoped field expansion
When the agent picks a type, expose a second tool: describe_type(name: "Invoice") that returns fields, their types, deprecation notes, and (critically) examples of how other queries have selected from this type. The examples anchor the agent to patterns your team already uses.
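A sketch of what such a tool might return, with an invented Invoice type; the point is that deprecation notes and harvested example selections travel with the fields:

```python
# Stand-in for describe_type: field-level detail for one type, plus example
# selections harvested from existing queries. The Invoice shape is invented.
SCHEMA = {
    "Invoice": {
        "fields": {
            "id": "ID!",
            "total": "Money!",
            "dueDate": "Date",
            "legacyAmount": "Int",  # deprecated, see below
        },
        "deprecated": {"legacyAmount": "Use total instead."},
        "examples": ["{ id total dueDate }", "{ id customer { id name } }"],
    }
}

def describe_type(name: str) -> dict:
    t = SCHEMA[name]
    return {
        "name": name,
        "fields": [
            {"name": f, "type": ty, "deprecation": t["deprecated"].get(f)}
            for f, ty in t["fields"].items()
        ],
        "examples": t["examples"],
    }

print(describe_type("Invoice"))
```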
3. Query compilation, not query generation
The final step shouldn't be "here's a string, good luck parsing it." It should be a tool that takes a structured intent, compiles to GraphQL, validates against the schema, and returns a ready-to-execute document. Fewer parsing errors, fewer rounds of "oops, that field doesn't exist."
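A compiler for that structured intent can be sketched in a few lines. The schema map and intent shape here are illustrative, not GQLens internals; the key move is that unknown fields fail at compile time, before the agent ever sees a runtime error:

```python
# Sketch of build_query: compile a structured intent into a GraphQL document,
# rejecting fields the schema doesn't define. Schema and field names are
# invented; filter values are assumed to be enum literals for simplicity.
SCHEMA_FIELDS = {
    "subscriptions": {"id", "status", "currentPlan"},
    "currentPlan": {"id", "name"},
}

def compile_selection(parent: str, paths: list[str]) -> str:
    tree: dict = {}
    for path in paths:                      # "currentPlan.name" -> nested dict
        node = tree
        for part in path.split("."):
            node = node.setdefault(part, {})

    def render(parent: str, node: dict) -> str:
        parts = []
        for field, children in node.items():
            if field not in SCHEMA_FIELDS[parent]:
                raise ValueError(f"unknown field: {parent}.{field}")
            parts.append(field + (f" {{ {render(field, children)} }}" if children else ""))
        return " ".join(parts)

    return render(parent, tree)

def build_query(operation: str, root: str, filters: dict, select: list[str]) -> str:
    args = ", ".join(f"{k}: {v}" for k, v in filters.items())
    body = compile_selection(root, select)
    return f"{operation} {{ {root}({args}) {{ {body} }} }}"

print(build_query("query", "subscriptions", {"status": "ACTIVE"},
                  ["id", "currentPlan.name"]))
# → query { subscriptions(status: ACTIVE) { id currentPlan { name } } }
```

Asking the agent for user.avatarUrl here raises an error instead of producing a document that fails against the server.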
How GQLens implements this
Every GraphQL source you add to GQLens is introspected, chunked at the type boundary, and embedded into a Qdrant collection per workspace. The MCP server exposes three tools aligned with the patterns above: search_schema, describe_type, and build_query.
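Type-boundary chunking can be sketched with a regex; a production implementation would use a real SDL parser (graphql-core, for example), and the schema below is invented:

```python
import re

# Sketch of type-boundary chunking: split an SDL document so each braced
# definition (type, enum, input, interface) becomes one embeddable chunk.
# Well-formatted SDL only; nested braces and scalars need a real parser.
SDL = """
type Invoice {
  id: ID!
  total: Money!
}

enum SubscriptionStatus {
  ACTIVE
  CANCELED
}
"""

CHUNK = re.compile(r"^(?:type|enum|input|interface)\s+\w+\s*\{[^}]*\}", re.MULTILINE)

def chunk_sdl(sdl: str) -> list[str]:
    return CHUNK.findall(sdl)

for chunk in chunk_sdl(SDL):
    print(chunk.splitlines()[0])
```

Each chunk is then embedded on its own, so a search hit pulls back one type, not the whole document.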
A real tool call from a Cursor session last week looked like this:
User: "Show me all active subscriptions with their current plan name."
Agent calls: search_schema({ query: "active subscription current plan" })
→ returns Subscription, BillingPlan, SubscriptionStatus enum
Agent calls: describe_type({ name: "Subscription" })
→ returns fields: id, status, currentPlan { id name }, customer { id }
Agent calls: build_query({
operation: "query",
root: "subscriptions",
filter: { status: "ACTIVE" },
select: ["id", "currentPlan.name"]
})
→ returns compiled, validated query
Three tool calls. No hallucinated fields. Runs on the first try.
When MCP is the wrong tool
MCP is not a magic bridge. Don't reach for it when:
- The agent needs mutation side effects on your data. That belongs behind a proper API with auth, not an MCP tool.
- You need streaming partial results. MCP tool calls are request/response today: the agent gets the whole result or nothing.
- The knowledge is volatile and must be re-read every second. The protocol is not built for that.
It is the right tool when the context is large, structured, mostly read-only, and shared across multiple agents or users.
Next steps
If you run a GraphQL API and your team uses Cursor, Claude Code, or any MCP-compatible client, pick one graph to start. Put it behind an MCP server that speaks the three patterns above. Measure the rate of queries your agents write that compile on the first try, before and after.
That number, first-try compile rate, is the one to watch. It's the single clearest signal of whether an AI agent has real context on your API or is just guessing.
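If you want to compute it, the metric is just arithmetic over an attempt log; the log format here is invented:

```python
# First-try compile rate: of all queries an agent produced, what fraction
# compiled without edits on attempt one. Retries are deliberately excluded.
def first_try_compile_rate(attempts: list[dict]) -> float:
    firsts = [a for a in attempts if a["attempt"] == 1]
    if not firsts:
        return 0.0
    return sum(a["compiled"] for a in firsts) / len(firsts)

log = [
    {"query_id": "q1", "attempt": 1, "compiled": True},
    {"query_id": "q2", "attempt": 1, "compiled": False},
    {"query_id": "q2", "attempt": 2, "compiled": True},  # retry, not counted
    {"query_id": "q3", "attempt": 1, "compiled": True},
]
print(first_try_compile_rate(log))  # → 0.666...
```

Measure it for a week before adding the MCP server, then again after; the delta is your answer.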