
v3 vision

Status: draft. This is a living document. Suggestions and pushback are welcome via GitHub Issues.

The one-liner

v3 reframes coolify-mcp from "RPC to a remote API" into a live, subscribable, scriptable surface on your infrastructure.

Concretely, v3 adopts three MCP primitives that v2 does not use: Resources, Tasks, and Prompts. It adds a transport option, streamable HTTP, that makes the multi-Coolify use case (#164) straightforward. It also adds tool annotations and outputSchema, which can ship as a v2.12 release independently of the rest.

What's wrong with v2.x

How v3 reshapes the architecture

Per-primitive design

Resources

Shape:

| URI | What |
| --- | --- |
| coolify://servers | List of servers |
| coolify://servers/{uuid} | Single server detail |
| coolify://applications | List of apps |
| coolify://applications/{uuid} | Single app detail |
| coolify://applications/{uuid}/deployments | Recent deployments for an app |
| coolify://deployments/{uuid} | Single deployment detail |
| coolify://projects | Projects |
| coolify://databases/{uuid} | Database detail |
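
To make the URI scheme concrete, here is a minimal sketch of how a server might route an incoming resource URI to a handler kind. The `Route` type, the `kind` labels, and `parseResourceUri` are hypothetical names chosen for illustration; only the URI shapes come from the table above.

```typescript
// Hypothetical route description: which kind of entity, and its uuid if any.
type Route = { kind: string; uuid?: string };

// One pattern per URI shape from the table. The kind labels are illustrative.
const patterns: Array<{ kind: string; regex: RegExp }> = [
  { kind: 'server-list', regex: /^coolify:\/\/servers$/ },
  { kind: 'server', regex: /^coolify:\/\/servers\/([^/]+)$/ },
  { kind: 'application-list', regex: /^coolify:\/\/applications$/ },
  { kind: 'application', regex: /^coolify:\/\/applications\/([^/]+)$/ },
  { kind: 'application-deployments', regex: /^coolify:\/\/applications\/([^/]+)\/deployments$/ },
  { kind: 'deployment', regex: /^coolify:\/\/deployments\/([^/]+)$/ },
  { kind: 'project-list', regex: /^coolify:\/\/projects$/ },
  { kind: 'database', regex: /^coolify:\/\/databases\/([^/]+)$/ },
];

// Returns the matching route, or null when the URI is not part of the scheme.
function parseResourceUri(uri: string): Route | null {
  for (const { kind, regex } of patterns) {
    const match = regex.exec(uri);
    if (match) {
      const uuid = match[1];
      return uuid !== undefined ? { kind, uuid } : { kind };
    }
  }
  return null;
}
```

Anchoring every pattern with `^` and `$` keeps `coolify://applications/{uuid}` from accidentally matching the longer `/deployments` form.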

Why this is a win:

  • Clients (especially Claude Desktop) can attach resources to context automatically. The user does not have to say "list my apps" because the app list is already in context.
  • Resources support subscribe + notification. The client can subscribe to coolify://applications/{uuid} and be notified when status changes. This fits patterns like "deploy this, then tell me when it's healthy."
  • Resources are application-controlled, not model-controlled. The model can browse them but cannot autonomously trigger side effects.
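
The subscribe-and-notify behaviour described above could look roughly like the following sketch. It assumes the server learns about status changes somehow (polling or a webhook) and calls `observe`; the `ResourceSubscriptions` class and its method names are hypothetical. The `notifications/resources/updated` method name is the one MCP uses for resource update notifications.

```typescript
// Shape of the notification sent to a subscribed client.
type ResourceUpdated = {
  method: 'notifications/resources/updated';
  params: { uri: string };
};

// Hypothetical in-memory tracker: remembers subscriptions and last-seen status.
class ResourceSubscriptions {
  private readonly subscribed = new Set<string>();
  private readonly lastStatus = new Map<string, string>();
  private readonly send: (n: ResourceUpdated) => void;

  constructor(send: (n: ResourceUpdated) => void) {
    this.send = send;
  }

  subscribe(uri: string): void {
    this.subscribed.add(uri);
  }

  unsubscribe(uri: string): void {
    this.subscribed.delete(uri);
  }

  // Record a freshly observed status. Notifies only when the status differs
  // from the last one seen (the first observation counts as a change) and
  // the URI currently has a subscriber.
  observe(uri: string, status: string): void {
    const previous = this.lastStatus.get(uri);
    this.lastStatus.set(uri, status);
    if (previous !== status && this.subscribed.has(uri)) {
      this.send({ method: 'notifications/resources/updated', params: { uri } });
    }
  }
}
```

The notification carries only the URI. The client is expected to re-read the resource to get the new state.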

Migration impact: list_* and get_* tools stay (some clients don't yet support resources). They become wrappers over the resource read. No removals.
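
As a rough illustration of "wrappers over the resource read", the sketch below has a `list_applications` handler delegate to a single resource reader. The `readResource` function and the sample data are stand-ins for the real Coolify API client, not part of the existing codebase.

```typescript
type Application = { uuid: string; name: string; status: string };
type TextResult = { content: Array<{ type: 'text'; text: string }> };

// Hypothetical sample data standing in for a Coolify API response.
const sampleApps: Application[] = [
  { uuid: 'app-1', name: 'web', status: 'running' },
  { uuid: 'app-2', name: 'worker', status: 'stopped' },
];

// Single source of truth: every read goes through the resource layer.
function readResource(uri: string): string {
  if (uri === 'coolify://applications') {
    return JSON.stringify(sampleApps);
  }
  throw new Error(`Unknown resource: ${uri}`);
}

// The legacy tool keeps its name and output, but contains no logic of its own.
function listApplicationsTool(): TextResult {
  return { content: [{ type: 'text', text: readResource('coolify://applications') }] };
}
```

With this split, clients without resource support keep calling the tool, and both paths return the same data.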

Tasks (SEP-1686)

Shape: the long-running tools get a taskSupport: 'optional' or 'required' declaration. Calling them returns a task ID immediately. The client polls or streams progress.

Candidates: deploy, redeploy_project, bulk_env_update, restart_project_apps, stop_all_apps, database_backup.
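
A minimal sketch of the "return a task ID immediately, then poll" flow, using an in-memory registry. The `TaskRegistry` class, its method names, and the reduced set of status values are assumptions for illustration. The Tasks proposal defines its own status vocabulary and wire format.

```typescript
// Illustrative subset of task states.
type TaskStatus = 'working' | 'completed' | 'failed';

type Task = {
  id: string;
  tool: string;
  status: TaskStatus;
  progress: number; // percent, 0..100
  message: string;
};

// Hypothetical in-memory registry of long-running tool calls.
class TaskRegistry {
  private readonly tasks = new Map<string, Task>();
  private nextId = 1;

  // Registers the work and returns its ID without waiting for completion.
  start(tool: string): string {
    const id = `task-${this.nextId++}`;
    this.tasks.set(id, { id, tool, status: 'working', progress: 0, message: 'queued' });
    return id;
  }

  // Called by the worker as the operation advances.
  report(id: string, progress: number, message: string): void {
    const task = this.mustGet(id);
    task.progress = progress;
    task.message = message;
    if (progress >= 100) task.status = 'completed';
  }

  fail(id: string, message: string): void {
    const task = this.mustGet(id);
    task.status = 'failed';
    task.message = message;
  }

  // Returns a snapshot copy so callers cannot mutate registry state.
  poll(id: string): Task {
    return { ...this.mustGet(id) };
  }

  private mustGet(id: string): Task {
    const task = this.tasks.get(id);
    if (!task) throw new Error(`Unknown task: ${id}`);
    return task;
  }
}
```

A deploy handler would call `start('deploy')`, hand the ID back to the client, and call `report` as the build progresses.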

Why this is a win:

  • The user can do other things while a deployment runs.
  • Progress updates ("building... 30%... 60%... healthy") become possible.
  • Retry semantics: a transient failure does not lose an hour of the user's time.

Migration impact: sync calls still work (declaring taskSupport as optional lets old clients ignore it). New clients get the streaming UX automatically.

Prompts

Candidates:

| Slash command | What it does |
| --- | --- |
| /diagnose-app | Interactive walkthrough of why an app is unhealthy |
| /audit-security | Lists env vars with weak secrets, expiring certs, unused tokens |
| /cleanup-stale-previews | Finds PR preview deployments older than N days, offers to delete |
| /setup-environment-clone | Duplicates prod into staging with sensible defaults |
| /promote-staging-to-prod | Env-var diff + deploy |
| /onboard-new-app | Interactive app creation with sensible defaults |

Why this is a win:

  • Prompts are user-invoked (slash commands in most clients). They surface as discoverable workflows in the client UI instead of requiring the user to phrase the right natural-language request.
  • They compose multiple tools into deterministic flows, reducing the "did the LLM remember to also..." cognitive load.

Migration impact: purely additive.
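
To show what a prompt amounts to in practice, here is a sketch of `/diagnose-app` as a function that expands an argument into a fixed set of instructions. The `diagnoseAppPrompt` name, the argument shape, and the wording of the steps are assumptions. The `messages` array with `role` and text `content` follows the general shape of an MCP prompt result.

```typescript
type PromptMessage = {
  role: 'user' | 'assistant';
  content: { type: 'text'; text: string };
};

// Hypothetical expansion of the /diagnose-app slash command.
function diagnoseAppPrompt(args: { uuid: string }): { messages: PromptMessage[] } {
  const steps = [
    `1. Read coolify://applications/${args.uuid} and report the current status.`,
    `2. Read coolify://applications/${args.uuid}/deployments and find the most recent failure, if any.`,
    '3. Summarise the likely cause and propose one next action.',
  ];
  return {
    messages: [
      {
        role: 'user',
        content: {
          type: 'text',
          text: `Diagnose why application ${args.uuid} is unhealthy.\n${steps.join('\n')}`,
        },
      },
    ],
  };
}
```

Because the steps are fixed in code, the flow is the same every time the command is invoked.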

Tool annotations (Tier-1 quick win)

Shape: add to every tool descriptor:

```typescript
this.tool('list_applications', '...', schema, handler, {
  title: 'Application: List',
  annotations: {
    readOnlyHint: true,
    openWorldHint: false, // it's our own controlled API
  },
});

this.tool('stop_all_apps', '...', schema, handler, {
  title: 'Stop All Applications (DESTRUCTIVE)',
  annotations: {
    destructiveHint: true,
    idempotentHint: true,
  },
});
```

| Annotation | What clients do with it |
| --- | --- |
| readOnlyHint: true | Skip confirmation prompts |
| destructiveHint: true | Strong confirmation required |
| idempotentHint: true | Safe to retry on transient failures |
| openWorldHint: false | Server has full knowledge of effects (our case) |

Migration impact: purely additive. Clients ignore annotations they don't understand.
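
For illustration, a client might turn these hints into a confirmation and retry policy along the following lines. This is a sketch of one plausible client behaviour, not a description of what any particular client does. `Policy` and `policyFor` are hypothetical names.

```typescript
// The four hints used in this document. All are optional.
type ToolAnnotations = {
  readOnlyHint?: boolean;
  destructiveHint?: boolean;
  idempotentHint?: boolean;
  openWorldHint?: boolean;
};

type Policy = {
  confirm: 'none' | 'standard' | 'strong';
  retryOnTransientFailure: boolean;
};

// Hypothetical mapping from hints to client behaviour, following the table above.
// destructiveHint wins over readOnlyHint if a tool (incorrectly) sets both.
function policyFor(annotations: ToolAnnotations = {}): Policy {
  const confirm = annotations.destructiveHint
    ? 'strong'
    : annotations.readOnlyHint
      ? 'none'
      : 'standard';
  return {
    confirm,
    retryOnTransientFailure: annotations.idempotentHint === true,
  };
}
```

A tool with no annotations falls back to standard confirmation and no automatic retry, which matches the "clients ignore annotations they don't understand" stance.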

outputSchema + structuredContent (Tier-1 quick win)

Shape: every list_* and get_* tool declares the response shape:

```typescript
this.tool('list_applications', '...', schema, handler, {
  outputSchema: {
    type: 'object',
    properties: {
      applications: {
        type: 'array',
        items: {
          type: 'object',
          properties: {
            uuid: { type: 'string' },
            name: { type: 'string' },
            status: { type: 'string' },
          },
          required: ['uuid', 'name', 'status'],
        },
      },
    },
  },
});
```

Responses include both content (text, for human readability) and structuredContent (typed JSON, for the LLM to chain reliably).

Why this is a win: the LLM no longer has to re-parse list_* output as text to find a uuid. It gets structuredContent.applications[0].uuid directly. This reduces malformed tool calls when chaining, which is currently a non-trivial source of failures.
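
A handler that returns both representations could be sketched as follows. The `toListResult` helper and the text layout are assumptions. The `structuredContent.applications` shape matches the outputSchema above.

```typescript
type Application = { uuid: string; name: string; status: string };

type ListApplicationsResult = {
  content: Array<{ type: 'text'; text: string }>;
  structuredContent: { applications: Application[] };
};

// Hypothetical helper: builds the human-readable text and the typed payload
// from the same data, so the two can never disagree.
function toListResult(applications: Application[]): ListApplicationsResult {
  const text = applications
    .map((app) => `${app.name} (${app.uuid}): ${app.status}`)
    .join('\n');
  return {
    content: [{ type: 'text', text }],
    structuredContent: { applications },
  };
}
```

A chained call can then read `structuredContent.applications[0].uuid` directly instead of extracting it from the text line.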

Streamable HTTP transport

Shape: the same coolify-mcp binary can run as coolify-mcp --transport=stdio (current default) or coolify-mcp --transport=http --port=8080. In HTTP mode, the MCP server listens for JSON-RPC over POST / and streams responses via SSE.
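
The flag handling could be as small as the sketch below. The `parseTransportArgs` function and the `TransportConfig` type are hypothetical, and treating 8080 as the default port is an assumption taken from the example command rather than a documented default.

```typescript
type TransportConfig =
  | { transport: 'stdio' }
  | { transport: 'http'; port: number };

// Hypothetical parser for --transport=<stdio|http> and --port=<n>.
// Defaults: stdio transport (the current behaviour), port 8080 (assumed).
function parseTransportArgs(argv: string[]): TransportConfig {
  let transport = 'stdio';
  let port = 8080;
  for (const arg of argv) {
    if (arg.startsWith('--transport=')) {
      transport = arg.slice('--transport='.length);
    } else if (arg.startsWith('--port=')) {
      port = Number(arg.slice('--port='.length));
    }
  }
  if (transport === 'stdio') return { transport: 'stdio' };
  if (transport !== 'http') throw new Error(`Unsupported transport: ${transport}`);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid port: ${port}`);
  }
  return { transport: 'http', port };
}
```

Running with no flags keeps today's stdio behaviour, so existing client configurations do not need to change.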

Why this is a win: a Coolify operator runs one coolify-mcp instance per Coolify, on the same machine, and exposes a URL gated by the existing Coolify auth. Any teammate connects with that URL, with no token sharing and no local process spawning. This resolves #164: each Coolify instance is just a URL.

v2 vs v3 side-by-side

| Concern | v2.x (today) | v3.x (proposed) |
| --- | --- | --- |
| Tool annotations | None | All tools annotated |
| Tool output shape | Text blobs | Text + typed structuredContent |
| Long-running ops | Block the transaction | Return task ID, stream progress |
| Coolify entity access | Per-call list_* / get_* | Subscribe-able resources |
| Workflows | Multi-tool conversations | Prompts as slash commands |
| Transport | stdio only | stdio + streamable HTTP |
| Multi-Coolify | Multiple processes | One URL per Coolify |
| State freshness | Snapshot per call | Push notifications via webhook bridge |
| Auth | Static token env var | Token or OAuth (when Coolify ships it) |

Phasing

Open questions

  1. OAuth vs token. Should v3 require OAuth from day one for the HTTP transport, or accept tokens too? Coolify's OAuth story is still maturing.
  2. Resource granularity. Should coolify://servers be a single subscribe target, or one per server? The latter scales better but is chattier.
  3. Tasks adoption pace. The Tasks primitive is still flagged experimental in the spec. Do we ship v3 with it, or wait for it to go stable?
  4. Backwards compat. v3 is a major bump, but how aggressive should it be about removing v2 tools? Recommendation: keep all v2 tools through the v3.x line, document them as "legacy, prefer resource X" where applicable, and remove them in v4.
  5. Webhook → notification bridge. Does this need a daemon mode (the MCP process running continuously rather than per-session)? That's a much bigger architecture change.

What's next

  1. Stu approves the broad shape (this doc)
  2. We open an issue per primitive, with the design quoted from here, for community discussion
  3. Pick a starting primitive, likely outputSchema + annotations as a Tier-1 quick win
  4. Land that in v2.12.x, validate the pattern
  5. Tier-2 primitives land progressively, v3.0.0 ships when Resources + Tasks + Prompts are all in