Documentation
v2.2.0 • Production Ready
Getting Started
Platform Overview
MCP Prompt Optimizer Pro is a professional AI prompt engineering platform with three components: the Cloud Edition (full-featured web platform), the Chrome Extension (browser-based optimization on any AI site), and two MCP Servers — a cloud-connected CLI and a fully local air-gapped server.
Optimization Pipeline:
• Context Detection — Identifies your target AI context (code generation, image prompts, LLM chat, human communication) and tailors optimization accordingly.
• Sophistication Analysis — Measures prompt complexity to route to the right optimization tier.
• Three-Tier Routing — Rules-based (fast, sub-50ms), hybrid (rules + LLM), or full-LLM optimization, selected automatically per request.
• Parameter Preservation — Retains technical flags, style parameters, model settings, and structured values through optimization.
• Smart Clarification — Asks targeted follow-up questions when intent is ambiguous before optimizing.
Compatible MCP clients: Claude Desktop, Cursor, Windsurf, and any MCP-compatible client.
Quick Start
Follow these steps to connect to the Cloud Edition.
Install CLI
Install the MCP Prompt Optimizer globally via npm to make it available to your AI clients.
npm install -g mcp-prompt-optimizer
MCP Server: Local Edition (Air-Gapped)
1. Get a local API key from your dashboard at https://promptoptimizer.xyz/local-license.
2. Set the environment variable: export OPTIMIZER_API_KEY="sk-local-basic-your-key"
3. Install: npm install -g mcp-prompt-optimizer-local
4. Add to your MCP client config (e.g., Claude Desktop claude_desktop_config.json):
{
  "mcpServers": {
    "prompt-optimizer-local": {
      "command": "mcp-optimizer",
      "args": [],
      "env": {
        "OPTIMIZER_API_KEY": "sk-local-basic-your-key"
      }
    }
  }
}
5. Verify: mcp-optimizer --health
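If your client config already lists other MCP servers, step 4 amounts to merging one more entry into the existing "mcpServers" object. The sketch below illustrates that merge on an in-memory dictionary; the helper name add_local_optimizer and the "other-server" entry are hypothetical, and reading or writing the actual config file is left out.

```python
import copy
import json


def add_local_optimizer(config: dict, api_key: str) -> dict:
    """Return a copy of an MCP client config with the local server entry added."""
    merged = copy.deepcopy(config)  # leave the caller's config untouched
    servers = merged.setdefault("mcpServers", {})
    servers["prompt-optimizer-local"] = {
        "command": "mcp-optimizer",
        "args": [],
        "env": {"OPTIMIZER_API_KEY": api_key},
    }
    return merged


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "other", "args": []}}}
updated = add_local_optimizer(existing, "sk-local-basic-your-key")
print(json.dumps(updated, indent=2))
```

Working on a copy keeps existing server entries intact and makes it easy to review the result before saving it.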
Chrome Extension
The Chrome Extension brings optimization directly to any text input field in your browser. Optimize prompts on ChatGPT, Claude, Midjourney, or any AI platform without leaving your workflow.
Key Capabilities:
• Universal Integration — Works on any website with text inputs.
• Cloud-Powered — Connects to the Cloud Edition backend for full optimization capability.
• Model Selection — Choose from available AI models per optimization.
• One-Click Optimization — Floating ✨ button next to any input.
• Quota Tracking — Real-time usage stats in the extension popup.
Installation:
1. Chrome Web Store (Coming Soon).
2. Manual (Developer Mode): Download from GitHub, enable Developer Mode in chrome://extensions, then load unpacked.
Local MCP Server
mcp-prompt-optimizer-local is a fully local prompt optimization server. All 120+ optimization rules run on-device — no LLM calls, no data egress.
Key Properties:
• Completely local — no network calls during optimization.
• Air-gapped ready — works without internet after initial install.
• Binary validation — SHA256 checksums verified on install.
• Cross-platform — Windows x64, macOS (Intel + Apple Silicon), Linux x64/ARM64.
• Free tier — 5 daily optimizations on sk-local-basic-* keys; unlimited on sk-local-pro-*.
After installing with npm install -g mcp-prompt-optimizer-local, the binary is available as `mcp-optimizer`.
Check health and license:
mcp-optimizer --health
Run a quick test:
mcp-optimizer --test
Features
Templates
Create, manage, and share reusable prompt templates with full version history and runtime delivery.
Template Features:
• AI-aware filtering — Browse by context type (code, image, chat) and sophistication level.
• Version history — Every update creates a snapshot; restore any prior version with one call.
• Publish/draft workflow — draft → published → archived lifecycle, with development, staging, and production environment targets.
• Slug-based access — Auto-generated URL-safe slug (e.g., product-writer-a3f9c21b) for runtime delivery.
• Webhook notifications — HMAC-SHA256 signed POST fires to your endpoint on every update. Verify with X-Signature-256: sha256=<hex>.
• Variable interpolation — Use {{variable_name}} placeholders compiled at delivery time.
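A receiver for the webhook notifications described above could check the X-Signature-256 header roughly as follows. This is a sketch under stated assumptions: that the signature is an HMAC-SHA256 hex digest of the raw request body, keyed with a secret shared with the platform. The function name, the secret value, and the example payload are hypothetical.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, body: bytes, header_value: str) -> bool:
    """Check a header of the form 'sha256=<hex>' against the raw request body."""
    prefix = "sha256="
    if not header_value.startswith(prefix):
        return False
    received = header_value[len(prefix):]
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), received.encode())


# Hypothetical secret and payload, used only to exercise the check.
secret = b"example-webhook-secret"
body = b'{"event": "template.updated"}'
header = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()

print(verify_signature(secret, body, header))         # prints True
print(verify_signature(secret, body + b" ", header))  # prints False
```

Verify against the bytes exactly as received; re-serializing the JSON before hashing can change whitespace and break the comparison.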
Key endpoints:
• GET /api/v1/templates — list with filters.
• POST /api/v1/templates — create; slug auto-generated from title.
• GET /api/v1/templates/{id}/versions — version history.
• GET /api/v1/templates/{id}/versions/{n} — fetch a specific snapshot.
• POST /api/v1/templates/{id}/rollback/{n} — restore from snapshot.
• POST /api/v1/templates/{id}/publish — set state=published.
• POST /api/v1/templates/{id}/draft — set state=draft.
Context Engineer
The Context Engineer turns any goal, document, or workflow description into a full agentic scaffold — SOPs, skill packages, task graphs, and deployable orchestrator code.
What it generates:
• SOP Document — step-by-step standard operating procedure with decision trees and error handling.
• Skill Package — SKILL.md, reference docs, usage examples, helper scripts, and prompt templates, bundled as a deployable ZIP.
• Task Graph — DAG of SOP steps for visual workflow inspection and dependency tracking.
• Orchestrator Code — production-ready Python coordinator for multi-agent pipelines.
• Security Audit — OWASP Agentic Top 10 analysis against your generated scaffold.
• GCC Memory — long-horizon memory store with create, branch, commit, merge, and search operations.
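To illustrate what dependency tracking over a task graph can look like, the sketch below orders a small DAG of SOP steps so that every step runs after the steps it depends on. The step names and the mapping shape (step to list of prerequisites) are hypothetical; the actual /task-graph response format is not documented here.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each step maps to the steps it depends on.
task_graph = {
    "collect_inputs": [],
    "draft_sop": ["collect_inputs"],
    "security_audit": ["draft_sop"],
    "package_skill": ["draft_sop"],
    "deploy": ["security_audit", "package_skill"],
}

# static_order() yields prerequisites before the steps that need them
# and raises graphlib.CycleError if the graph is not a DAG.
order = list(TopologicalSorter(task_graph).static_order())
print(order)
```

Steps with no dependency between them, such as security_audit and package_skill here, may appear in either order.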
Exploration Mode (Enterprise):
Generates three independent SOP variants from different angles, then blends them into an optimized final document.
Key endpoints under `/api/v1/context-engineer`:
• POST /sop — generate SOP from goal + context.
• POST /skill-package — full deployable skill bundle.
• POST /task-graph — DAG visualization data.
• POST /orchestrator — Python orchestrator code.
• POST /security-audit — OWASP agentic security analysis.
• POST /harness-bundle — complete deployment ZIP.
Model Selection
Access and configure the model catalog used for optimization and evaluation.
Catalog endpoints:
• GET /api/v1/models — full catalog (cloud models via OpenRouter and direct providers).
• GET /api/v1/models/live/free — currently available free models, ranked.
• GET /api/v1/models/evaluation/free — free models suitable for evaluation.
User-level preference:
Set a preferred optimization model in your account settings. The backend resolves the model in this order: (1) per-request custom_model_id, (2) your saved user preference, (3) the system default.
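The resolution order above can be expressed as a simple fallback chain. This is a sketch of the documented precedence, not the backend's code; the function name and the "system-default-model" placeholder are assumptions.

```python
from typing import Optional

# Placeholder value; the real system default model is not documented here.
SYSTEM_DEFAULT_MODEL = "system-default-model"


def resolve_model(
    custom_model_id: Optional[str],
    user_preference: Optional[str],
    system_default: str = SYSTEM_DEFAULT_MODEL,
) -> str:
    """Pick the first available model: per-request, then saved preference, then default."""
    for candidate in (custom_model_id, user_preference):
        if candidate:
            return candidate
    return system_default


print(resolve_model("openai/gpt-4o-mini", "google/gemini-2.5-flash-lite"))
print(resolve_model(None, "google/gemini-2.5-flash-lite"))
print(resolve_model(None, None))
```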
Anonymous/free-tier:
Anonymous and free-tier accounts use a separate cost-free model. Paid tiers unlock the full catalog including frontier models.
Set preference:
PUT /api/v1/user/settings
{
  "preferred_optimization_model": "openai/gpt-4o-mini",
  "preferred_evaluation_model": "google/gemini-2.5-flash-lite"
}
Evaluation v2.0
Professional prompt evaluation with datasets, batch scoring, statistical comparison, and calibration.
Evaluation Workflow:
1. Build a dataset — create test cases manually or auto-generate from your optimization history.
2. Run evaluation — score prompts against test cases using LLM judges.
3. Compare — statistical comparison between two prompt variants.
4. Calibrate — align scoring to ground-truth labels.
Quick evaluate (no dataset required):
POST /api/v1/evaluations/quick-evaluate — stateless one-shot scoring with actionable LLM-generated feedback.
Dataset management:
• POST /api/v1/evaluations/datasets — create dataset.
• POST /api/v1/evaluations/datasets/{id}/test-cases — add test cases.
• POST /api/v1/evaluations/datasets/automate/from-history — auto-build from past optimizations.
Scoring and analysis:
• POST /api/v1/evaluations/evaluate/batch — batch evaluate against a dataset.
• POST /api/v1/evaluations/compare — statistically compare two prompts.
• POST /api/v1/evaluations/calibrate — calibrate model scoring to ground truth.
Value Hierarchy & IntentFrame
Shape how the optimizer prioritizes competing goals and communicates user intent.
Value Hierarchy:
Assign priority labels to optimization directives. Labels in order of precedence: NON_NEGOTIABLE, HIGH, MEDIUM, LOW. Directives are injected as a DIRECTIVES block in the LLM system prompt. High-priority labels also set a minimum routing tier, so critical constraints always receive full-LLM optimization.
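One way to picture this behavior is the sketch below, which orders directives by label precedence, renders them as a text block, and raises the routing tier when a high-priority label is present. The block layout, the tier names, and the choice of NON_NEGOTIABLE and HIGH as the labels that trigger the floor are all assumptions made for illustration.

```python
PRECEDENCE = ["NON_NEGOTIABLE", "HIGH", "MEDIUM", "LOW"]
# Assumption: these labels count as "high-priority" for tier routing.
FLOOR_LABELS = {"NON_NEGOTIABLE", "HIGH"}


def build_directives_block(value_hierarchy: list) -> str:
    """Render directives in precedence order (hypothetical block layout)."""
    ordered = sorted(value_hierarchy, key=lambda d: PRECEDENCE.index(d["label"]))
    lines = ["DIRECTIVES:"]
    lines += [f"- [{d['label']}] {d['directive']}" for d in ordered]
    return "\n".join(lines)


def apply_tier_floor(detected_tier: str, value_hierarchy: list) -> str:
    """Force full-LLM routing when any high-priority directive is present."""
    if any(d["label"] in FLOOR_LABELS for d in value_hierarchy):
        return "llm"
    return detected_tier


hierarchy = [
    {"label": "HIGH", "directive": "Keep under 150 words"},
    {"label": "NON_NEGOTIABLE", "directive": "Never use markdown formatting"},
]
print(build_directives_block(hierarchy))
print(apply_tier_floor("rules", hierarchy))
```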
IntentFrame:
A sub-model that captures your perspective, out-of-scope constraints, and success definition. Helps the optimizer understand context not explicit in the prompt text.
Example request body:
{
  "prompt": "Write a product announcement email",
  "value_hierarchy": [
    {"label": "NON_NEGOTIABLE", "directive": "Never use markdown formatting"},
    {"label": "HIGH", "directive": "Keep under 150 words"}
  ],
  "intent_frame": {
    "perspective": "B2B SaaS product manager",
    "out_of_scope": "Technical implementation details",
    "success_definition": "Gets approved and sent in under 5 minutes"
  }
}
API & Integration
API Reference
Direct API access for custom integrations. All endpoints require authentication unless noted.
Base URL: https://p01--project-optimizer--fvmrdk8m9k9j.code.run
Authentication:
Authorization: Bearer sk-your-api-key-here
Core optimization endpoints:
• POST /api/v1/optimize — optimize a prompt; returns optimized text + metadata.
• POST /api/v1/optimize/stream — SSE streaming; yields progress and final result.
• POST /api/v1/optimize/clarify/{request_id} — respond to a clarification question.
• GET /api/v1/optimize/quota — current quota usage and tier limits.
• POST /api/v1/optimize/feedback — submit quality feedback for an optimization result.
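A call to POST /api/v1/optimize might be assembled as in the sketch below, which uses only the Python standard library and the prompt and goals fields from the minimal request body documented in this section. The helper name is hypothetical, and the Content-Type header is an assumption. The request is built but not sent.

```python
import json
import urllib.request

BASE_URL = "https://p01--project-optimizer--fvmrdk8m9k9j.code.run"


def build_optimize_request(api_key: str, prompt: str, goals: list) -> urllib.request.Request:
    """Build (but do not send) an authenticated optimize request."""
    payload = json.dumps({"prompt": prompt, "goals": goals}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/api/v1/optimize",
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumed; not stated in the docs
        },
    )


req = build_optimize_request(
    "sk-your-api-key-here",
    "Write a product description for a wireless keyboard",
    ["clarity", "effectiveness"],
)
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req), then parse the JSON response body.
```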
Minimal request body:
{
  "prompt": "Write a product description for a wireless keyboard",
  "goals": ["clarity", "effectiveness"]
}
Error response format:
{
  "detail": "Quota exceeded",
  "error_code": "QUOTA_EXCEEDED",
  "retry_after": 3600
}
Prompt Delivery API
Deliver compiled prompt templates to any consumer at runtime using URL-safe slugs.
How it works:
1. Create a template in your library. A slug is auto-generated (e.g., product-writer-a3f9c21b).
2. Store the slug in your application or pipeline.
3. At runtime, fetch and compile the template with your variable values.
Fetch template + variable schema:
GET /api/v1/prompts/{slug}
Returns the template body and a list of required {{variable}} placeholders.
Compile with variables:
POST /api/v1/prompts/{slug}/compiled
Body: { "variables": { "product": "wireless keyboard", "tone": "professional" } }
Returns the fully interpolated prompt string. Missing required variables return a 422.
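To avoid a 422 at runtime, a client can check for missing variables before calling the compile endpoint. The sketch below is a client-side approximation using a regular expression; it is not the server's template engine and only recognizes simple {{variable_name}} placeholders. The function names and the example template are hypothetical.

```python
import re

# Matches simple placeholders such as {{product}} or {{ tone }}.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def required_variables(template_body: str) -> list:
    """Return placeholder names in first-seen order, without duplicates."""
    seen = []
    for name in PLACEHOLDER.findall(template_body):
        if name not in seen:
            seen.append(name)
    return seen


def missing_variables(template_body: str, variables: dict) -> list:
    """List placeholders that have no value in the supplied variables."""
    return [name for name in required_variables(template_body) if name not in variables]


template = "Write a {{tone}} description of the {{product}} for {{audience}}."
supplied = {"product": "wireless keyboard", "tone": "professional"}
print(missing_variables(template, supplied))  # prints ['audience']
```

In practice the list of required placeholders returned by GET /api/v1/prompts/{slug} is the better source of truth; the regular expression is only a fallback when working from the template text alone.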
Security:
Variable interpolation runs in a sandboxed Jinja2 environment. Block/comment tags are disabled to prevent CPU amplification. StrictUndefined surfaces missing variables as a structured error.
Upload & Batch
Upload prompt files for parsing, or run batch optimization on multiple prompts in a single request.
Upload and parse:
POST /api/v1/upload/parse
Upload a markdown, text, or blueprint document. The parser extracts prompt content and preserves structure. Whole-file mode (default) prevents fragmentation of multi-section documents.
Batch optimize:
POST /api/v1/upload/batch-optimize
Optimize multiple prompts in one call. Accepts parsed file output or an explicit list. Returns one optimized result per input.
Download results:
• POST /api/v1/download — download an optimization result as a file.
• POST /api/v1/download-simple — simplified download with minimal metadata.
Real-Time Streaming
The AG-UI (Antigravity UI) layer provides WebSocket-based real-time optimization with human-in-the-loop decision points.
Endpoints:
• GET /api/config — AG-UI feature flags and rollout status.
• WS /ws — WebSocket connection for streaming optimization sessions.
What streaming enables:
• Live context updates — see context detection and routing decisions as they happen.
• Human-in-the-loop — optimizer pauses at key decision points for your approval or redirect.
• Progressive rendering — optimized prompt streams in as it is generated.
SSE alternative:
POST /api/v1/optimize/stream uses server-sent events for streaming without a persistent WebSocket connection. Suitable for one-shot optimizations that need progress visibility.
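On the client side, consuming a server-sent events stream comes down to collecting "data:" lines until a blank line ends the event. The sketch below parses events from an iterable of text lines, following the standard SSE framing. The JSON payload shape, including the "type", "stage", and "optimized_prompt" fields, is an assumption for illustration and not the documented event schema.

```python
import json


def parse_sse(lines):
    """Yield the JSON payload of each complete event in an SSE line stream."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield json.loads("\n".join(data_parts))
                data_parts = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the data
            data_parts.append(value)
        # Other fields (event:, id:, retry:) and ":" comments are ignored here.


# Hypothetical stream contents, used only to exercise the parser.
sample = [
    'data: {"type": "progress", "stage": "context_detection"}\n',
    "\n",
    'data: {"type": "result", "optimized_prompt": "..."}\n',
    "\n",
]
events = list(parse_sse(sample))
print([event["type"] for event in events])  # prints ['progress', 'result']
```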
Feature gating:
Context streaming and human decision points are controlled by rollout flags. Check GET /api/config to see which features are active for your account tier.
Teams & Access
Teams
Teams allow multiple users to share a quota pool and collaborate with shared API keys.
Team management:
• POST /api/v1/teams — create a team.
• GET /api/v1/teams/{id}/overview — team stats, member count, and quota summary.
• GET /api/v1/teams/{id}/members — list members.
• DELETE /api/v1/teams/{id}/members/{user_id} — remove a member.
• PATCH /api/v1/teams/{id}/settings — update team name and configuration.
• GET /api/v1/teams/{id}/quota-status — current shared quota usage.
Invitations:
• POST /api/v1/teams/{id}/invitations — send an invitation by email.
• GET /api/v1/teams/invitations/{token} — look up an invitation.
• POST /api/v1/teams/invitations/{token}/accept — accept an invitation.
Team API keys:
• GET /api/v1/teams/{id}/api-keys — list team keys.
• POST /api/v1/teams/{id}/api-keys — create a team key.
• DELETE /api/v1/teams/{id}/api-keys/{key_id} — revoke a key.
Authentication & API Keys
API keys authenticate all requests. User keys are per-account; team keys are shared across a team.
User key management:
• GET /api/v1/api-keys — list your active keys.
• POST /api/v1/api-keys — generate a new key.
• DELETE /api/v1/api-keys/{key_id} — revoke a key.
Header format:
Authorization: Bearer sk-your-key-here
Both user keys and team keys are accepted in the same header. The backend resolves quota and permissions through a unified key service.
Quota status:
GET /api/v1/optimize/quota returns your current usage:
{
  "used": 42,
  "limit": 500,
  "tier": "pro",
  "reset_at": "2026-07-01T00:00:00Z"
}
Free tier:
New accounts start on the free tier with a monthly quota. The quota is checked before every optimization request, and requests that exceed it return a 429.
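A client can use the quota response and the error format from the API Reference to decide when to retry. The sketch below assumes that retry_after is expressed in seconds and that a 429 response carries the documented error body; the helper names and the 60-second fallback are hypothetical.

```python
from datetime import datetime, timezone


def remaining_quota(quota: dict) -> int:
    """Optimizations left in the current period, never below zero."""
    return max(quota["limit"] - quota["used"], 0)


def seconds_until_reset(quota: dict, now: datetime) -> int:
    """Seconds from `now` until the quota's reset_at timestamp."""
    reset_at = datetime.fromisoformat(quota["reset_at"].replace("Z", "+00:00"))
    return max(int((reset_at - now).total_seconds()), 0)


def retry_delay(status_code: int, error_body: dict, default: int = 60) -> int:
    """Seconds to wait before retrying; 0 when the request was not rate limited."""
    if status_code != 429:
        return 0
    # Assumption: retry_after is a number of seconds.
    return int(error_body.get("retry_after", default))


quota = {"used": 42, "limit": 500, "tier": "pro", "reset_at": "2026-07-01T00:00:00Z"}
error = {"detail": "Quota exceeded", "error_code": "QUOTA_EXCEEDED", "retry_after": 3600}

print(remaining_quota(quota))    # prints 458
print(retry_delay(429, error))   # prints 3600
print(seconds_until_reset(quota, datetime(2026, 6, 30, 23, 0, tzinfo=timezone.utc)))  # prints 3600
```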
Explore the Infrastructure
Ready to integrate the professional standard in prompt engineering? Start building in minutes with our comprehensive SDKs and CLI tools.