What is an MCP server and how does it work?
An MCP server exposes your tools and data to AI clients over a standard protocol. We break down the architecture, transports, tools, and how a request actually flows end to end.
An MCP server is a program that exposes tools and data to AI clients over the Model Context Protocol. It sits between an AI assistant and your actual systems — receiving structured requests from the model, calling your API or database, and returning results the model can use. This guide breaks down what an MCP server contains, how a request flows through it, and what it takes to run one in production.
- An MCP server advertises a list of tools, then executes them when an AI client calls.
- It speaks JSON-RPC over a transport, usually stdio for local tools or HTTP/SSE for hosted ones.
- The server, not the AI client, holds the credentials to your upstream API.
- Running one in production means handling auth, scaling, logging, and rate limits.
- A hosted platform removes that operational burden: you point it at your API and get a URL.
What an MCP server actually does
At its core, an MCP server answers two kinds of questions from a client. First, "what can you do?" — the server returns a list of tools, each with a name, a human-readable description, and an input schema. Second, "please do this" — the client sends a tool name plus arguments, the server executes it, and returns the result. The AI model uses the descriptions to decide which tool to call and the schema to format the arguments correctly.
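The two exchanges can be sketched as JSON-RPC 2.0 messages. What follows is a minimal Python sketch rather than an SDK implementation. The method names tools/list and tools/call and the inputSchema field follow the MCP specification; the TOOLS registry, the handle_message function, the canned order data, and the limit argument are assumptions made for illustration.

```python
import json


def list_orders(limit=5):
    # Stand-in for a real upstream call; returns canned data (assumption).
    return [{"id": i, "status": "shipped"} for i in range(1, limit + 1)]


# Hypothetical tool registry: each tool has a description for the model,
# a JSON Schema for its arguments, and a handler the server runs.
TOOLS = {
    "listOrders": {
        "description": "List the user's most recent orders.",
        "inputSchema": {
            "type": "object",
            "properties": {"limit": {"type": "integer", "minimum": 1}},
        },
        "handler": list_orders,
    }
}


def error(message_id, code, text):
    return {"jsonrpc": "2.0", "id": message_id, "error": {"code": code, "message": text}}


def handle_message(message: dict) -> dict:
    """Answer one JSON-RPC request: 'what can you do?' or 'please do this'."""
    message_id = message.get("id")
    method = message.get("method")
    if method == "tools/list":
        # Advertise name, description, and schema; the handler stays private.
        tools = [
            {"name": name, "description": t["description"], "inputSchema": t["inputSchema"]}
            for name, t in TOOLS.items()
        ]
        result = {"tools": tools}
    elif method == "tools/call":
        params = message.get("params", {})
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            return error(message_id, -32602, "Unknown tool")
        data = tool["handler"](**params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": json.dumps(data)}]}
    else:
        return error(message_id, -32601, "Method not found")
    return {"jsonrpc": "2.0", "id": message_id, "result": result}
```

A client would first send a tools/list request, show the returned descriptions and schemas to the model, and then send tools/call with the name and arguments the model chose.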
The anatomy of a request
Here's the end-to-end path of a single tool call against a hosted MCP server:
- The user asks the AI client to do something ("list my most recent orders").
- The model picks a tool, say listOrders, and fills in its arguments.
- The client sends a JSON-RPC tools/call request to the server's URL.
- The server maps that tool to a real API endpoint (GET /orders), injecting auth.
- The upstream API responds; the server returns the result to the client.
- The model reads the result and writes a natural-language answer.
Note where the credentials live: the server holds the API key or OAuth token and adds it to the upstream call. The AI client never sees your secret — it only knows the public MCP URL.
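The mapping and credential-injection step might look like the following Python sketch. It only builds the upstream request and does not send it. The base URL https://api.example.com, the ORDERS_API_KEY environment variable, the TOOL_ROUTES table, and the use of a bearer token are all assumptions for illustration; a real server would use whatever auth scheme its upstream API requires.

```python
import os
import urllib.parse
import urllib.request

# Hypothetical mapping from MCP tool names to upstream REST endpoints.
TOOL_ROUTES = {"listOrders": ("GET", "/orders")}

# Placeholder base URL for the upstream API (assumption).
UPSTREAM_BASE_URL = "https://api.example.com"


def build_upstream_request(tool_name: str, arguments: dict) -> urllib.request.Request:
    """Translate a tool call into an authenticated upstream HTTP request."""
    method, path = TOOL_ROUTES[tool_name]
    query = urllib.parse.urlencode(arguments)
    url = UPSTREAM_BASE_URL + path + ("?" + query if query else "")
    # The secret is read on the server side only. It never appears in the
    # tools/call request or in the result returned to the AI client.
    api_key = os.environ["ORDERS_API_KEY"]
    return urllib.request.Request(
        url,
        method=method,
        headers={"Authorization": f"Bearer {api_key}"},
    )
```

Passing the returned object to urllib.request.urlopen would perform the call; the response body is what the server would wrap into the tool result.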
Transports: how clients reach the server
stdio (local)
The server runs as a subprocess on the same machine as the AI client and communicates over standard input and output. This suits personal, single-user tools, but it cannot serve many users at once or run as a shared cloud service.
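A stdio server is essentially a loop that reads one JSON-RPC message per line from standard input and writes one response per line to standard output. The Python sketch below illustrates that loop under simplifying assumptions: serve_stdio and echo_handler are hypothetical names, the handler only echoes the method it received, and in-memory streams stand in for the real subprocess pipes.

```python
import io
import json
import sys


def serve_stdio(handle, stdin=sys.stdin, stdout=sys.stdout):
    """Read newline-delimited JSON-RPC messages and write responses."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        response = handle(json.loads(line))
        if response is not None:  # notifications get no response
            stdout.write(json.dumps(response) + "\n")
            stdout.flush()  # the client is waiting on the pipe


def echo_handler(message):
    # Hypothetical handler: replies with the method name it received.
    if "id" not in message:
        return None  # a message without an id is a notification
    return {
        "jsonrpc": "2.0",
        "id": message["id"],
        "result": {"method": message["method"]},
    }


# Usage with in-memory streams instead of a real subprocess pipe.
fake_in = io.StringIO(
    '{"jsonrpc": "2.0", "method": "notifications/initialized"}\n'
    '{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}\n'
)
fake_out = io.StringIO()
serve_stdio(echo_handler, fake_in, fake_out)
```

Because the client starts the process and owns both pipes, there is no network listener and no URL in this mode.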
HTTP + SSE (hosted)
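The server listens at a network URL and streams responses using Server-Sent Events. This is how multi-user, always-on MCP servers work. Recent revisions of the MCP specification name the HTTP transport Streamable HTTP, which supersedes the original HTTP+SSE transport while still allowing SSE for streaming. Clients connect with a snippet like the one below, where the URL is the server's public endpoint. In this snippet the client launches mcp-remote through npx as a local bridge that forwards its stdio traffic to the remote URL: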
{
  "mcpServers": {
    "orders-api": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote@latest",
        "https://mcp.getcast.io/orders-api-cmpx12ab34"
      ]
    }
  }
}
What it takes to run one in production
Writing a basic server with an official SDK is straightforward. Operating one reliably is the part teams underestimate. A production MCP server needs to handle:
- Authentication to your upstream API — API keys, bearer tokens, or OAuth flows, stored securely.
- Tool selection — exposing only safe operations and hiding destructive ones.
- Scaling and uptime — staying available as more clients connect.
- Observability — logging every tool call so you can debug and audit.
- Rate limiting — protecting your API from runaway agent loops.
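Three of these concerns (tool selection, observability, and rate limiting) can be illustrated with a small wrapper around each tool call. This is a sketch under stated assumptions, not a production design: TokenBucket, guarded_call, ALLOWED_TOOLS, and the logger name are hypothetical, the limiter is a plain in-process token bucket, and a multi-instance deployment would need shared state instead.

```python
import logging
import time

logger = logging.getLogger("mcp.toolcalls")  # hypothetical logger name


class TokenBucket:
    """Allow bursts of up to `capacity` calls, refilled at `rate` tokens per second."""

    def __init__(self, capacity: float, rate: float, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock
        self.tokens = capacity
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Hypothetical allowlist: only read-only tools are exposed.
ALLOWED_TOOLS = {"listOrders"}


def guarded_call(tool_name, arguments, handler, bucket):
    """Apply tool selection, rate limiting, and logging around one tool call."""
    if tool_name not in ALLOWED_TOOLS:
        logger.warning("blocked tool=%s", tool_name)
        raise PermissionError(f"tool not exposed: {tool_name}")
    if not bucket.allow():
        logger.warning("rate limited tool=%s", tool_name)
        raise RuntimeError("rate limit exceeded")
    logger.info("call tool=%s args=%s", tool_name, arguments)
    return handler(**arguments)
```

Injecting the clock keeps the limiter testable without sleeping; a runaway agent loop exhausts the bucket and starts receiving errors instead of reaching the upstream API.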
Build it yourself or host it?
If you want full control and have the engineering time, the open-source SDKs let you build a server from scratch. If you'd rather skip the infrastructure, a hosted platform like Cast turns an OpenAPI spec into a running server: it generates the tools, manages auth encryption, provisions the endpoint, and records every call — so you focus on which capabilities to expose, not on keeping a service alive.
Turn your API into an MCP server
Upload an OpenAPI spec, configure auth, and get a live MCP endpoint in minutes — no infrastructure to manage.
Try Cast free
Frequently asked questions
What's the difference between an MCP server and an MCP client?
Can an MCP server connect to any API?
Is an MCP server the same as a REST API?
How do I keep an MCP server secure?