Build an MCP Server from Scratch:
A Complete Developer Guide

Protocol fundamentals · Tools · Resources · Prompts · HTTP transport · Production deploy

Who this is for: Developers who know LLMs can call tools but hit walls wiring each model to each API by hand.

What you get: A complete path from zero to a production-ready MCP Server: protocol background, FastMCP Hello World, five real tools with Pydantic schemas, Resources and Prompts, HTTP streamable transport with auth, Docker deploy, and a ChromaDB knowledge-base capstone.

Structure: evolution and architecture (sections 1–2), hands-on implementation (section 3), HTTP, debugging, and the six-step runbook (section 4), production hosting and 2026 ecosystem data (section 5).
01

Why MCP exists: Function Calling, Plugins, and the integration wall

LLMs are strong reasoners but weak executors. ChatGPT Plugins (March 2023) were an early fix built on a marketplace model: each plugin was a bespoke REST wrapper with its own manifest. Discovery was centralized in OpenAI's store, not runtime-dynamic.

OpenAI Function Calling followed in June 2023: define JSON schemas, let the model emit structured calls, and have your code execute them. It worked, but only inside OpenAI's API. Every other model needed a separate adapter. When Anthropic shipped Tool Use and LangChain popularized agent frameworks, the industry had three incompatible tool schemas and no portable server layer.

01

Function Calling era: Per-vendor schemas. GPT, Claude, and Gemini each expect different message shapes. Switch models → rewrite the tool bridge.

02

Plugin era: Centralized discovery in one store, not in your agent runtime. No standard for Resources or Prompts—only callable endpoints.

03

Framework era: LangChain Tools, CrewAI tools, and IDE-specific hooks each define their own tool objects. Definitions cannot travel across Cursor, VS Code, and Claude Desktop.

04

N×M cost: N AI clients × M backends = N×M integrations. Enterprise CRM teams maintain parallel adapter layers for every LLM vendor.

05

MCP answer (Nov 2024): Anthropic open-sourced a single protocol—JSON-RPC 2.0 over STDIO or HTTP—where Servers self-describe Tools, Resources, and Prompts. Write once; Cursor, Claude Desktop, ChatGPT, Gemini, and VS Code connect without rewriting.

Anthropic's design rationale mirrors USB-C: standardize the port, not every cable. MCP does not replace REST APIs; it wraps them behind a uniform discovery and invocation layer so AI clients stop caring which vendor or framework sits on the other side. By Q2 2026 governance had moved to the Agentic AI Foundation (AAIF) under the Linux Foundation, and OpenAI, Google, and Microsoft all ship native MCP client support.

Function Calling solved "how does one model call one function?" MCP solves "how does any AI client discover and invoke any tool server?"—the question every agent stack faces in 2026.

02

MCP architecture: Client/Server, three capabilities, and transport lifecycle

MCP runs on JSON-RPC 2.0 with bidirectional messaging. A Host (Cursor, Claude Desktop) embeds one or more MCP Clients. Each Client maintains a 1:1 session with an MCP Server that bridges to real systems.

Servers expose three capability types: Tools (executable actions with side effects), Resources (read-only data identified by URIs), and Prompts (reusable template strings the client can pre-fill). Core RPC methods include tools/list, tools/call, resources/list, resources/read, prompts/list, and prompts/get.

JSON-RPC 2.0
{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "query_database",
    "arguments": { "sql": "SELECT id, email FROM users LIMIT 10" }
  },
  "id": 42
}
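
To make the request above concrete, here is a minimal sketch of how a server might answer tools/list and tools/call. It is not how any SDK implements dispatch; the TOOLS registry, the echo tool, and the handle function are hypothetical names used only for illustration. The error codes are the standard JSON-RPC 2.0 ones.

```python
import json

# Hypothetical in-memory registry: tool name -> (description, input schema, handler).
TOOLS = {
    "echo": (
        "Return the text unchanged",
        {"type": "object", "properties": {"text": {"type": "string"}}, "required": ["text"]},
        lambda args: args["text"],
    ),
}

def handle(raw: str) -> str:
    """Answer one JSON-RPC 2.0 request string with one response string."""
    req = json.loads(raw)
    method, params, req_id = req.get("method"), req.get("params") or {}, req.get("id")

    def error(code: int, message: str) -> str:
        return json.dumps({"jsonrpc": "2.0", "id": req_id,
                           "error": {"code": code, "message": message}})

    if method == "tools/list":
        # Self-description: every tool ships its own JSON Schema.
        result = {"tools": [{"name": name, "description": desc, "inputSchema": schema}
                            for name, (desc, schema, _) in TOOLS.items()]}
    elif method == "tools/call":
        entry = TOOLS.get(params.get("name"))
        if entry is None:
            return error(-32602, "Unknown tool")  # JSON-RPC "Invalid params"
        _, _, fn = entry
        text = fn(params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}], "isError": False}
    else:
        return error(-32601, "Method not found")
    # The response echoes the request id so the client can match it up.
    return json.dumps({"jsonrpc": "2.0", "id": req_id, "result": result})
```

The client never hard-codes the tool list: it calls tools/list first, reads the schemas, and only then issues tools/call with matching arguments.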

STDIO transport: Host spawns Server as a subprocess; messages flow over stdin/stdout. Zero network deps, strong process isolation, ideal for local dev. Lifecycle: spawn → initialize handshake → capability negotiation → persistent session → subprocess exit on disconnect.
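
The STDIO lifecycle above rests on simple framing: each JSON-RPC message travels as one newline-terminated line. The sketch below shows that framing together with the first two messages a client sends. The helper names encode_frame and decode_frames, the clientInfo values, and the protocolVersion string are assumptions chosen for illustration; a real client would send whichever spec revision it supports.

```python
import json

def encode_frame(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so the frame stays on one line.
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")

def decode_frames(buffer: bytes) -> tuple[list[dict], bytes]:
    """Split a byte buffer into complete messages plus any trailing partial line."""
    *lines, rest = buffer.split(b"\n")
    return [json.loads(line) for line in lines if line.strip()], rest

# Step 1 of the handshake: the client proposes a protocol version and capabilities.
initialize = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2025-03-26",  # example value, not a recommendation
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.1.0"},  # hypothetical
    },
}

# Step 2: after the server replies, the client confirms with a notification (no id).
initialized = {"jsonrpc": "2.0", "method": "notifications/initialized"}
```

A host would write encode_frame(initialize) to the subprocess's stdin and feed whatever arrives on stdout through decode_frames, keeping the leftover bytes for the next read.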

HTTP + SSE / streamable-http: Server runs as a long-lived HTTP service. In the legacy HTTP+SSE transport, the Client opens an SSE stream for server-push events and sends tool calls via POST. Remote transport enables remote deploy, team-shared servers, and horizontal scaling, but it requires session affinity for SSE and proper auth at the edge. Streamable HTTP, introduced in the 2025-03-26 spec revision, is the recommended remote transport and replaces the legacy SSE-only pattern.

Dimension | OpenAI Function Calling | LangChain Tools | MCP
Scope | OpenAI API only | Python agent frameworks | Any MCP Host (Cursor, Claude, ChatGPT, VS Code)
Discovery | Static: dev hard-codes function list in API call | Static: tools registered in Python at startup | Dynamic: tools/list at session init
Self-description | JSON Schema in API payload | Pydantic / dict schemas in code | JSON Schema returned by Server; same for Resources and Prompts
Data access | No standard read-only layer | Custom retrievers per framework | Resources with URI scheme (file://, db://)
Reusable templates | System prompts only | PromptTemplate objects | Prompts capability with argument schemas
Transport | HTTPS to OpenAI | In-process Python calls | STDIO subprocess or HTTP streamable remote
Portability | Vendor-locked | Framework-locked | Write Server once; any client connects

03

Dev setup and implementation: FastMCP, Tools, Resources, and Prompts

Pick your SDK stack: Python with mcp + FastMCP for fastest iteration, or TypeScript with @modelcontextprotocol/sdk for Node-native HTTP servers. Recommended project layout:

project structure
my-mcp-server/
  pyproject.toml
  src/
    server.py          # FastMCP entry
    tools/             # tool modules
    resources/         # resource handlers
    prompts/           # prompt templates
  tests/
    test_tools.py
  Dockerfile

Install: pip install mcp or npm install @modelcontextprotocol/sdk. Debug with MCP Inspector (npx @modelcontextprotocol/inspector)—connects to STDIO or HTTP endpoints and shows live tools/list output. Wire into Cursor via Settings → MCP → add server config; Claude Desktop via claude_desktop_config.json.

Python · FastMCP Hello World
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("hello-mcp")

@mcp.tool()
def greet(name: str) -> str:
    return f"Hello, {name}!"

if __name__ == "__main__":
    mcp.run()

Tools with Pydantic input: FastMCP derives JSON Schema from type hints and Pydantic models automatically. Below: five practical tools—calculator, file I/O, HTTP fetch, DB query, and time—with async support and structured error returns.

Python · five tools
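from datetime import datetime
from pathlib import Path
from zoneinfo import ZoneInfo

import aiosqlite
import httpx
from pydantic import BaseModel, Field
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("utility-server")

# Assumption: only files under the working directory should be readable.
ROOT = Path.cwd().resolve()

class CalcInput(BaseModel):
    expression: str = Field(description="Math expression e.g. 2 + 3 * 4")

@mcp.tool()
def calculate(input: CalcInput) -> str:
    """Evaluate a basic arithmetic expression."""
    try:
        # Whitelist digits and operators so no names or attributes can appear.
        allowed = set("0123456789+-*/(). ")
        if not all(c in allowed for c in input.expression):
            return "Error: invalid characters in expression"
        # "**" passes the whitelist but lets a caller request enormous powers.
        if "**" in input.expression or len(input.expression) > 200:
            return "Error: expression too long or uses exponentiation"
        return str(eval(input.expression, {"__builtins__": {}}, {}))
    except Exception as e:
        return f"Error: {e}"

@mcp.tool()
def read_file(path: str) -> str:
    """Read up to 50,000 characters from a text file under ROOT."""
    try:
        target = Path(path).resolve()
        # resolve() collapses "../" and symlinks before the containment check.
        if not target.is_relative_to(ROOT):
            return "Error: path is outside the allowed directory"
        if not target.is_file():
            return f"Error: {path} is not a file"
        return target.read_text(encoding="utf-8", errors="replace")[:50000]
    except Exception as e:
        return f"Error: {e}"

@mcp.tool()
async def http_get(url: str) -> str:
    """Fetch a URL and return the first 10,000 characters of the body."""
    try:
        async with httpx.AsyncClient(timeout=15.0) as client:
            resp = await client.get(url)
            return resp.text[:10000]
    except Exception as e:
        return f"Error: {e}"

@mcp.tool()
async def db_query(sql: str) -> str:
    """Run a query against app.db and return up to 100 rows."""
    try:
        # mode=ro opens SQLite read-only, so a stray UPDATE or DROP fails.
        async with aiosqlite.connect("file:app.db?mode=ro", uri=True) as db:
            cursor = await db.execute(sql)
            rows = await cursor.fetchmany(100)
            return str(rows)
    except Exception as e:
        return f"Error: {e}"

@mcp.tool()
def current_time(tz: str = "UTC") -> str:
    """Return the current time in an IANA timezone such as Europe/Berlin."""
    try:
        return datetime.now(ZoneInfo(tz)).isoformat()
    except Exception as e:  # unknown or malformed timezone name
        return f"Error: {e}"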
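
To show what "derives JSON Schema from type hints" means in practice, here is a rough stdlib-only approximation. FastMCP itself goes through Pydantic and handles far more cases; schema_from_signature, the JSON_TYPES mapping, and the add example function are hypothetical names for this sketch only.

```python
import inspect
from typing import get_type_hints

# Rough mapping from Python annotations to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def schema_from_signature(fn) -> dict:
    """Build a JSON Schema 'object' describing fn's parameters."""
    hints = get_type_hints(fn)
    properties, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        # Unknown annotations fall back to "string" in this simplified sketch.
        properties[name] = {"type": JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the caller must supply it
        else:
            properties[name]["default"] = param.default
    return {"type": "object", "properties": properties, "required": required}

def add(a: int, b: int = 0) -> int:
    """Example function whose signature gets described."""
    return a + b
```

Calling schema_from_signature(add) yields an object schema with a marked as required and b carrying its default, which is roughly the shape a client sees in tools/list.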

Resources: Static URIs map to fixed files; dynamic URIs accept parameters. A filesystem resource server exposes project docs the agent can read without tool side effects.

Python · filesystem resources
from mcp.server.fastmcp import FastMCP
from pathlib import Path

mcp = FastMCP("fs-resources")
DOCS = Path("./docs")

@mcp.resource("docs://index")
def docs_index() -> str:
    files = [f.name for f in DOCS.glob("*.md")]
    return "\n".join(files)

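@mcp.resource("docs://{filename}")
def read_doc(filename: str) -> str:
    # Resolve first so "../" segments cannot escape the docs directory.
    root = DOCS.resolve()
    path = (root / filename).resolve()
    if not path.is_relative_to(root) or not path.is_file():
        return f"Error: {filename} not found"
    return path.read_text(encoding="utf-8")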
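
The difference between a static URI and a dynamic one comes down to template matching. The sketch below illustrates the idea with a regex; it is not FastMCP's matcher, and compile_template and match_uri are hypothetical helper names.

```python
import re
from typing import Optional

def compile_template(template: str) -> re.Pattern:
    """Turn 'docs://{filename}' into a regex with one named group per slot."""
    parts = re.split(r"\{(\w+)\}", template)
    # Even indexes hold literal text, odd indexes hold parameter names.
    pattern = "".join(
        re.escape(part) if i % 2 == 0 else f"(?P<{part}>[^/]+)"
        for i, part in enumerate(parts)
    )
    return re.compile(f"^{pattern}$")

def match_uri(template: str, uri: str) -> Optional[dict]:
    """Return the extracted parameters, or None when the URI does not fit."""
    m = compile_template(template).match(uri)
    return m.groupdict() if m else None
```

In this sketch a slot never matches a slash, so docs://a/b does not fit docs://{filename}. That is one reason to keep path checks inside the handler as well instead of trusting the URI shape alone.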

Prompts: Reusable templates with argument slots. A code-review prompt pre-fills context so the agent follows your team's review checklist.

Python · code review prompt
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("review-prompts")

@mcp.prompt()
def code_review(file_path: str, language: str) -> str:
    return f"""Review {file_path} ({language}) for:
1. Security vulnerabilities (injection, auth bypass)
2. Performance bottlenecks
3. Error handling gaps
4. Test coverage holes
Provide severity-rated findings with fix suggestions."""

Cursor config snippet: Add to .cursor/mcp.json: {"mcpServers": {"utility": {"command": "python", "args": ["src/server.py"]}}}. Claude Desktop uses the same shape in claude_desktop_config.json.

04

HTTP transport, debugging, and the six-step MCP Server runbook

Streamable HTTP transport is the 2026 default for remote MCP Servers. Run FastMCP with mcp.run(transport="streamable-http") or use the TypeScript SDK's StreamableHTTPServerTransport. Layer security at the edge:

01

Bearer tokens: Validate Authorization: Bearer <token> on every request. Rotate tokens via env vars, never hard-code.

02

API keys: Accept X-API-Key header for simpler client configs. Map keys to per-tenant rate limits.

03

CORS: Restrict Access-Control-Allow-Origin to known Host origins. Wildcard * on production MCP endpoints is a common misconfiguration.

04

Rate limiting: Cap requests per key at the reverse proxy (nginx, Cloudflare) or in-app middleware. MCP tool calls can trigger expensive downstream APIs.
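
The Bearer-token item above can be sketched with the standard library alone. This is a framework-agnostic check under stated assumptions, not an MCP SDK API: the MCP_BEARER_TOKEN environment variable and the function names are hypothetical, and in a real deployment the check would run in your HTTP middleware or reverse proxy.

```python
import hmac
import os
from typing import Mapping, Optional

def extract_bearer(headers: Mapping[str, str]) -> Optional[str]:
    """Pull the token out of an 'Authorization: Bearer <token>' header."""
    auth = headers.get("Authorization") or headers.get("authorization") or ""
    scheme, _, token = auth.partition(" ")
    token = token.strip()
    return token if scheme.lower() == "bearer" and token else None

def is_authorized(headers: Mapping[str, str], expected: Optional[str] = None) -> bool:
    """Compare the presented token with the configured one in constant time."""
    if expected is None:
        # Hypothetical variable name; load the secret from the environment.
        expected = os.environ.get("MCP_BEARER_TOKEN", "")
    token = extract_bearer(headers)
    if not expected or token is None:
        return False  # fail closed when the secret or the header is missing
    return hmac.compare_digest(token.encode(), expected.encode())
```

hmac.compare_digest avoids leaking how many leading characters matched, and failing closed on an empty secret prevents a missing env var from silently disabling auth.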

Debugging workflow: MCP Inspector first, then client logs, then unit tests. Inspector shows raw JSON-RPC, schema validation errors, and timing per call.

Python · pytest unit test
import pytest
from server import calculate, CalcInput

def test_calculate_valid():
    result = calculate(CalcInput(expression="2 + 3"))
    assert result == "5"

def test_calculate_invalid_chars():
    result = calculate(CalcInput(expression="import os"))
    assert result.startswith("Error:")

Error symptom | Likely cause | Fix
Server not listed in Cursor | Config path or command wrong | Verify mcp.json command/args; check subprocess starts without import errors
tools/list returns empty | Decorators not registered before run() | Import tool modules in server.py entry point
HTTP connection refused | Wrong port or transport mismatch | Match client transport to server mode (STDIO vs streamable-http)
CORS error in browser client | Missing or overly strict headers | Add allowed origins; enable credentials only when needed
Tool call timeout | Sync blocking in async context | Use async def tools or asyncio.to_thread for I/O
Invalid JSON Schema rejected | Pydantic model missing Field descriptions | Add Field(description=...) for agent-readable params
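
The "Tool call timeout" row is easiest to see in code. The sketch below moves a blocking call onto a worker thread with asyncio.to_thread so the event loop keeps serving other requests. slow_lookup is a hypothetical stand-in for any synchronous driver or SDK call.

```python
import asyncio
import time

def slow_lookup(key: str) -> str:
    """Stand-in for a blocking call, e.g. a synchronous database driver."""
    time.sleep(0.2)
    return f"value-for-{key}"

async def lookup_tool(key: str) -> str:
    # Run the blocking function in a worker thread so the event loop stays free.
    return await asyncio.to_thread(slow_lookup, key)

async def main() -> list[str]:
    # The three calls overlap instead of running back to back.
    return await asyncio.gather(*(lookup_tool(k) for k in "abc"))
```

Calling slow_lookup directly inside an async def tool would stall every other session on the same server for the full duration of the call.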

Six-step runbook from assessment to production-ready MCP Server:

01

Map integrations: List external systems (DB, Git, Slack, internal APIs). Count current per-model adapters and estimate rewrite cost on vendor switch.

02

Choose SDK and transport: Python FastMCP for STDIO prototyping; add streamable-http when teammates need remote access. TypeScript if your stack is Node.

03

Implement capabilities: Start with 2–3 Tools, one Resource URI scheme, one Prompt template. Validate schemas in MCP Inspector before wiring clients.

04

Wire clients: Add server to Cursor mcp.json and Claude Desktop config. Confirm tools/list on session start—no hard-coded tool arrays in prompts.

05

Test and harden: Unit tests per tool, integration tests via Inspector, structured error strings instead of raw tracebacks to the LLM.

06

Deploy 24/7 host: Dockerize, push to Railway/Render/Cloud Run or a dedicated VPS. For Apple Silicon inference and Xcode CI on the same node, use a cloud Mac Mini instead of a Linux VPS.
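
Step 05 asks for structured error strings instead of raw tracebacks. One way to get there, sketched under assumptions, is a small decorator applied beneath the tool registration. safe_tool and divide are hypothetical names, and the JSON shape is an illustration, not a format the MCP spec defines.

```python
import functools
import json
import logging

logger = logging.getLogger("mcp.tools")

def safe_tool(fn):
    """Wrap a sync tool so failures come back as a compact JSON error string."""
    @functools.wraps(fn)  # keeps the name and signature visible to introspection
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception as exc:
            # The full traceback goes to server logs; the model sees a summary.
            logger.exception("tool %s failed", fn.__name__)
            return json.dumps({
                "ok": False,
                "tool": fn.__name__,
                "error": type(exc).__name__,
                "message": str(exc),
            })
    return wrapper

@safe_tool
def divide(a: float, b: float) -> str:
    """Example tool: fails on division by zero."""
    return str(a / b)
```

The model gets a short, parseable message it can react to, while the stack trace stays in your logs where it cannot leak file paths or secrets into a conversation.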

05

Production deploy, capstone project, ecosystem, and hard numbers

Production paths: Package with Docker (python:3.12-slim base, non-root user, health check on HTTP port). Platform options:

A

Railway / Render: Push container, set env vars for API keys, expose HTTP port. Good for team-shared dev/staging servers.

B

AWS Lambda: Works for stateless tool calls only—avoid for persistent SSE sessions. Use API Gateway + short-lived handlers.

C

Google Cloud Run: Scales HTTP streamable servers; set min instances to 1 for session warmth.

D

VPS / bare metal: Full control for long-running agents, local embeddings, and multi-server stacks on one host.

Monitoring: Export request latency and tool-call counters to Prometheus; ship exceptions to Sentry with MCP method name tags. Pin SDK versions in pyproject.toml / package.json and tag Docker images per release—MCP spec revisions in 2026 still ship breaking transport changes.
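
As a dependency-free illustration of the counters described above, the sketch below tracks calls, errors, and latency per tool and renders them in the Prometheus text exposition format. In production you would more likely use an established client library; the metric names, the instrumented decorator, and the ping tool are all hypothetical.

```python
import time
from collections import defaultdict
from functools import wraps

CALLS = defaultdict(int)          # tool name -> total calls
ERRORS = defaultdict(int)         # tool name -> calls that raised
LATENCY_SUM = defaultdict(float)  # tool name -> cumulative seconds

def instrumented(fn):
    """Record call count, error count, and latency for a sync tool."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        except Exception:
            ERRORS[fn.__name__] += 1
            raise
        finally:
            # Runs on success and failure alike.
            CALLS[fn.__name__] += 1
            LATENCY_SUM[fn.__name__] += time.perf_counter() - start
    return wrapper

def render_metrics() -> str:
    """Render the counters as Prometheus text exposition lines."""
    lines = []
    for name in sorted(CALLS):
        lines.append(f'mcp_tool_calls_total{{tool="{name}"}} {CALLS[name]}')
        lines.append(f'mcp_tool_errors_total{{tool="{name}"}} {ERRORS[name]}')
        lines.append(f'mcp_tool_latency_seconds_sum{{tool="{name}"}} {LATENCY_SUM[name]:.6f}')
    return "\n".join(lines) + "\n"

@instrumented
def ping() -> str:
    """Example tool used to exercise the counters."""
    return "pong"
```

Serving render_metrics() from a /metrics route on the same HTTP port would let a Prometheus scraper collect per-tool numbers without touching tool code.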

Capstone: personal knowledge base MCP. Combine ChromaDB vector store, embedding model (local Ollama or API), and semantic search tool:

Python · ChromaDB search tool
import chromadb
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("knowledge-base")
client = chromadb.PersistentClient(path="./chroma_data")
collection = client.get_or_create_collection("notes")

@mcp.tool()
def semantic_search(query: str, n: int = 5) -> str:
    results = collection.query(query_texts=[query], n_results=n)
    docs = results.get("documents", [[]])[0]
    return "\n---\n".join(docs) if docs else "No matches found."

@mcp.tool()
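def ingest_note(text: str, source: str) -> str:
    # upsert replaces an existing note with the same id, so re-ingesting
    # a source updates it instead of colliding on a duplicate id.
    collection.upsert(documents=[text], metadatas=[{"source": source}], ids=[source])
    return f"Indexed: {source}"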

2026 ecosystem servers to study: mcp-server-filesystem (official reference), GitHub MCP (repo/PR/issue ops), Brave Search MCP (web retrieval), Postgres MCP (SQL with schema introspection), Slack MCP (channel messaging). Trends for H2 2026: OAuth 2.1 standardized auth, unified server registries, multi-tenant hosted MCP marketplaces, and Agent-to-Agent (A2A) orchestration sitting above MCP's tool layer.

A

10,000+ community servers: MCP server count exceeded ten thousand by mid-2026—each new server instantly reaches every compatible client.

B

38–55% integration cost drop: Enterprise teams report 38–55% lower AI integration spend after consolidating adapters behind MCP Servers (industry survey average, 2025–2026).

C

~1,000 exposed unauthorized servers: Security scans in early 2026 found roughly one thousand MCP endpoints on the public internet without auth—never deploy production Servers without Bearer tokens, API keys, and network isolation.

Heads up: Linux VPS nodes cannot run Xcode builds or Apple Silicon-optimized local embedding models. Laptop STDIO sessions die on sleep. AWS Lambda cannot hold persistent MCP SSE sessions. Each shortcut trades away something production agents need.

For teams running MCP Servers alongside Cursor agents, ChromaDB ingestion, and iOS CI on one always-on node, MESHLAUNCH cloud Mac Mini rental is usually the better production host: dedicated Apple Silicon, no sleep disconnects, flexible daily/weekly/monthly billing, and a stable home where tool definitions become portable team assets instead of per-developer laptop configs. See pricing for node specs.

FAQ

Python FastMCP is fastest for prototyping: decorator-based tools, automatic JSON Schema from type hints, and one-line STDIO launch. TypeScript @modelcontextprotocol/sdk suits Node.js teams or HTTP streamable servers on Express. Both are official SDKs—pick the language your team ships in. Cloud node specs on the pricing page.

Yes. STDIO servers run as subprocesses on any macOS host. HTTP streamable servers need a persistent process and stable network—cloud Mac Mini bare metal gives you Apple Silicon for local embedding inference, Xcode CI on the same node, and no laptop sleep killing stateful MCP sessions.

Start MCP Inspector against your server endpoint, verify tools/list returns expected schemas, then check Cursor MCP logs. Common causes: mismatched transport mode, invalid JSON Schema, subprocess crash on import, or CORS blocking HTTP servers. SSH tunnel and port setup in the help center.