v3 vision

Status: draft. This is a living document. Suggestions and pushback are welcome via GitHub Issues.

The one-liner

v3 reframes coolify-mcp from "RPC to a remote API" into a live, subscribable, scriptable surface on your infrastructure.

Concretely, v3 adopts three MCP primitives that v2 does not use: Resources, Tasks, and Prompts. It adds a transport option, streamable HTTP, that makes the multi-Coolify use case (#164) straightforward. It also adds tool annotations and outputSchema, which can ship as a v2.12 release independently of the rest.

What's wrong with v2.x

How v3 reshapes the architecture

Per-primitive design

Resources

Shape:

URI	What
`coolify://servers`	List of servers
`coolify://servers/{uuid}`	Single server detail
`coolify://applications`	List of apps
`coolify://applications/{uuid}`	Single app detail
`coolify://applications/{uuid}/deployments`	Recent deployments for an app
`coolify://deployments/{uuid}`	Single deployment detail
`coolify://projects`	Projects
`coolify://databases/{uuid}`	Database detail
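To make the URI scheme concrete, here is a minimal sketch of how a server might route an incoming resource URI to one of the templates in the table. The function and type names (`matchResourceUri`, `ResourceMatch`) are hypothetical and not part of coolify-mcp; only the URI templates come from this document.

```typescript
// URI templates copied from the table above.
const RESOURCE_TEMPLATES = [
  'coolify://servers',
  'coolify://servers/{uuid}',
  'coolify://applications',
  'coolify://applications/{uuid}',
  'coolify://applications/{uuid}/deployments',
  'coolify://deployments/{uuid}',
  'coolify://projects',
  'coolify://databases/{uuid}',
] as const;

// Hypothetical result type: which template matched, plus extracted parameters.
interface ResourceMatch {
  template: string;
  params: Record<string, string>;
}

const SCHEME = 'coolify://';

// Split "coolify://a/b" into ["a", "b"]; return null for any other scheme.
function segments(uri: string): string[] | null {
  if (!uri.startsWith(SCHEME)) return null;
  return uri.slice(SCHEME.length).split('/');
}

// Return the first template whose segments line up with the URI, or null.
function matchResourceUri(uri: string): ResourceMatch | null {
  const actual = segments(uri);
  if (actual === null) return null;
  for (const template of RESOURCE_TEMPLATES) {
    const expected = segments(template);
    if (expected === null || expected.length !== actual.length) continue;
    const params: Record<string, string> = {};
    const ok = expected.every((exp, i) => {
      const act = actual[i] ?? '';
      if (exp.startsWith('{') && exp.endsWith('}')) {
        if (act === '') return false; // a placeholder needs a non-empty value
        params[exp.slice(1, -1)] = act;
        return true;
      }
      return exp === act; // literal segments must match exactly
    });
    if (ok) return { template, params };
  }
  return null;
}
```

For example, `matchResourceUri('coolify://applications/abc123/deployments')` matches the deployments template with `params.uuid` set to `abc123`, while a URI with a different scheme returns `null`.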

Why this is a win:

Clients (especially Claude Desktop) can attach resources to context automatically. The user does not have to say "list my apps" because the app list is already in context.
Resources support subscribe + notification. The client can subscribe to coolify://applications/{uuid} and be notified when status changes. This fits patterns like "deploy this, then tell me when it's healthy."
Resources are application-controlled, not model-controlled. The model can browse them but cannot autonomously trigger side effects.
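The subscribe-and-notify flow can be sketched as a small in-memory registry. This is an illustration under assumptions: the class name, the `observe` method, and the status strings are hypothetical. The notification method name `notifications/resources/updated` is the one the MCP spec defines for resource updates.

```typescript
// JSON-RPC notification sent to a client when a subscribed resource changes.
interface ResourceUpdatedNotification {
  jsonrpc: '2.0';
  method: 'notifications/resources/updated';
  params: { uri: string };
}

// Hypothetical registry; not coolify-mcp code.
class ResourceSubscriptions {
  private subscribed = new Set<string>();
  private lastStatus = new Map<string, string>();

  subscribe(uri: string): void {
    this.subscribed.add(uri);
  }

  unsubscribe(uri: string): void {
    this.subscribed.delete(uri);
  }

  // Record the latest observed status. Returns a notification only when the
  // URI is subscribed and the status differs from the previous observation
  // (a first observation counts as a change).
  observe(uri: string, status: string): ResourceUpdatedNotification | null {
    const previous = this.lastStatus.get(uri);
    this.lastStatus.set(uri, status);
    if (!this.subscribed.has(uri) || previous === status) return null;
    return {
      jsonrpc: '2.0',
      method: 'notifications/resources/updated',
      params: { uri },
    };
  }
}
```

The notification carries only the URI. The client re-reads the resource to get the new state, which keeps the push path small.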

Migration impact: list_* and get_* tools stay (some clients don't yet support resources). They become wrappers over the resource read. No removals.
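One way the wrapper idea could look, as a sketch only: a factory that builds a `get_*` style handler by delegating to the resource read. `makeGetTool` and `ResourceReader` are hypothetical names, and the reader is synchronous here to keep the example short; a real one would call the Coolify API asynchronously.

```typescript
// Hypothetical resource reader; stands in for the real resource read path.
type ResourceReader = (uri: string) => unknown;

interface TextToolResult {
  content: { type: 'text'; text: string }[];
}

type EntityKind = 'servers' | 'applications' | 'deployments' | 'databases';

// Build a get_* handler that reads coolify://{kind}/{uuid} and returns the
// payload as text, so the tool and the resource share one code path.
function makeGetTool(kind: EntityKind, read: ResourceReader) {
  return (args: { uuid: string }): TextToolResult => {
    const data = read(`coolify://${kind}/${args.uuid}`);
    return { content: [{ type: 'text', text: JSON.stringify(data) }] };
  };
}
```

Because both entry points go through the same read, a fix or schema change lands in the tool and the resource at the same time.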

Tasks (SEP-1686)

Shape: the long-running tools get a taskSupport: 'optional' or 'required' declaration. Calling them returns a task ID immediately. The client polls or streams progress.

Candidates: deploy, redeploy_project, bulk_env_update, restart_project_apps, stop_all_apps, database_backup.
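The call-then-poll shape can be sketched with a small in-memory task store. Everything here is an assumption for illustration: the `TaskStore` class, the `task-N` ID format, and the status names, which are a subset of what the Tasks proposal appears to define and should be checked against the spec before use.

```typescript
// Assumed status names; verify against the Tasks spec.
type TaskStatus = 'working' | 'completed' | 'failed' | 'cancelled';

interface TaskRecord {
  taskId: string;
  status: TaskStatus;
  message: string; // human-readable progress, e.g. "building... 30%"
}

// Hypothetical in-memory store; not coolify-mcp code.
class TaskStore {
  private tasks = new Map<string, TaskRecord>();
  private nextId = 1;

  // Called when a long-running tool is invoked. Returns immediately.
  create(): TaskRecord {
    const task: TaskRecord = {
      taskId: `task-${this.nextId++}`,
      status: 'working',
      message: '',
    };
    this.tasks.set(task.taskId, task);
    return { ...task };
  }

  // Called by the worker as the operation progresses.
  update(taskId: string, status: TaskStatus, message: string): void {
    const task = this.tasks.get(taskId);
    if (task === undefined) throw new Error(`unknown task: ${taskId}`);
    if (task.status !== 'working') return; // terminal states are final
    task.status = status;
    task.message = message;
  }

  // Called when the client polls. Returns a copy so callers cannot mutate state.
  get(taskId: string): TaskRecord | undefined {
    const task = this.tasks.get(taskId);
    return task === undefined ? undefined : { ...task };
  }
}
```

A `deploy` handler would call `create()`, hand the `taskId` back to the client, and let the background worker call `update()` as the deployment moves along.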

Why this is a win:

The user can do other things while a deployment runs.
Progress updates ("building... 30%... 60%... healthy") become possible.
Retry semantics: a transient failure does not lose an hour of the user's time.

Migration impact: sync calls still work (declaring taskSupport as optional lets old clients ignore it). New clients get the streaming UX automatically.

Prompts

Candidates:

Slash command	What it does
`/diagnose-app`	Interactive walkthrough of why an app is unhealthy
`/audit-security`	Lists env vars with weak secrets, expiring certs, unused tokens
`/cleanup-stale-previews`	Finds PR preview deployments older than N days, offers to delete
`/setup-environment-clone`	Duplicates prod into staging with sensible defaults
`/promote-staging-to-prod`	Env-var diff + deploy
`/onboard-new-app`	Interactive app creation with sensible defaults

Why this is a win:

Prompts are user-invoked (slash commands in most clients). They surface as discoverable workflows in the client UI instead of requiring the user to phrase the right natural-language request.
They compose multiple tools into deterministic flows, reducing the "did the LLM remember to also..." cognitive load.

Migration impact: purely additive.
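As a sketch of what one of these prompts might look like, here is `/cleanup-stale-previews` expressed as a prompt definition plus a render function. The definition follows the MCP prompt shape (name, description, arguments; messages with a role and text content). The `days` argument name, the default of 14, and the instruction wording are assumptions for illustration.

```typescript
interface PromptArgument {
  name: string;
  description: string;
  required: boolean;
}

interface PromptDefinition {
  name: string;
  description: string;
  arguments: PromptArgument[];
}

interface PromptMessage {
  role: 'user' | 'assistant';
  content: { type: 'text'; text: string };
}

// What the client would see when listing prompts.
const cleanupStalePreviews: PromptDefinition = {
  name: 'cleanup-stale-previews',
  description: 'Finds PR preview deployments older than N days, offers to delete',
  arguments: [
    { name: 'days', description: 'Age threshold in days', required: false },
  ],
};

// Prompt arguments arrive as strings. The default of 14 is an arbitrary
// illustrative choice, not a project decision.
function renderCleanupStalePreviews(args: { days?: string }): PromptMessage[] {
  const days = Number.parseInt(args.days ?? '14', 10);
  if (!Number.isInteger(days) || days <= 0) {
    throw new Error('days must be a positive integer');
  }
  const steps = [
    'List applications and their recent deployments.',
    `Identify PR preview deployments older than ${days} days.`,
    'Show the candidates and ask for confirmation before deleting anything.',
  ];
  return [{ role: 'user', content: { type: 'text', text: steps.join('\n') } }];
}
```

The fixed step list is where the determinism comes from: the model receives the same ordered instructions every time the slash command is invoked.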

Tool annotations (Tier-1 quick win)

Shape: add to every tool descriptor:

typescript

this.tool('list_applications', '...', schema, handler, {
  title: 'Application: List',
  annotations: {
    readOnlyHint: true,
    openWorldHint: false, // it's our own controlled API
  },
});

this.tool('stop_all_apps', '...', schema, handler, {
  title: 'Stop All Applications (DESTRUCTIVE)',
  annotations: {
    destructiveHint: true,
    idempotentHint: true,
  },
});

Annotation	What clients do with it
`readOnlyHint: true`	Skip confirmation prompts
`destructiveHint: true`	Strong confirmation required
`idempotentHint: true`	Safe to retry on transient failures
`openWorldHint: false`	Tool only touches a closed, known domain: our own Coolify API (our case)

Migration impact: purely additive. Clients ignore annotations they don't understand.
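To show how a client could act on these hints, here is a sketch of a confirmation policy. The function names and the three confirmation levels are hypothetical, and each client decides its own behavior. The fallback values follow the defaults the MCP spec gives for unannotated tools (`readOnlyHint` false, `destructiveHint` true, `idempotentHint` false).

```typescript
interface ToolAnnotations {
  readOnlyHint?: boolean;
  destructiveHint?: boolean;
  idempotentHint?: boolean;
  openWorldHint?: boolean;
}

// Hypothetical confirmation levels a client UI might use.
type Confirmation = 'none' | 'standard' | 'strong';

// Decide how much confirmation to ask for before running a tool.
function confirmationFor(a: ToolAnnotations = {}): Confirmation {
  const readOnly = a.readOnlyHint ?? false; // spec default
  const destructive = a.destructiveHint ?? true; // spec default
  if (readOnly) return 'none';
  return destructive ? 'strong' : 'standard';
}

// Decide whether a transient failure may be retried without asking.
function mayAutoRetry(a: ToolAnnotations = {}): boolean {
  return (a.readOnlyHint ?? false) || (a.idempotentHint ?? false);
}
```

Note the unannotated case: with no hints at all, this policy falls back to `'strong'`, which is one reason annotating read-only tools pays off quickly.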

`outputSchema` + `structuredContent` (Tier-1 quick win)

Shape: every list_* and get_* tool declares the response shape:

typescript

this.tool('list_applications', '...', schema, handler, {
  outputSchema: {
    type: 'object',
    properties: {
      applications: {
        type: 'array',
        items: {
          type: 'object',
          properties: {
            uuid: { type: 'string' },
            name: { type: 'string' },
            status: { type: 'string' },
          },
          required: ['uuid', 'name', 'status'],
        },
      },
    },
  },
});

Responses include both content (text, for human readability) and structuredContent (typed JSON, for the LLM to chain reliably).

Why this is a win: the LLM no longer has to re-parse list_* output as text to find a uuid. It gets structuredContent.applications[0].uuid directly. This reduces malformed tool calls when chaining, which is currently a non-trivial source of failures.
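A sketch of the dual response, assuming the `list_applications` schema above: the same payload is emitted once as serialized text in `content` and once as typed JSON in `structuredContent`. The helper names `toListApplicationsResult` and `firstUuid` are hypothetical.

```typescript
// Mirrors the item shape declared in the outputSchema example.
interface ApplicationSummary {
  uuid: string;
  name: string;
  status: string;
}

interface ListApplicationsResult {
  content: { type: 'text'; text: string }[];
  structuredContent: { applications: ApplicationSummary[] };
}

// Build a tool result carrying both the text form and the typed form.
function toListApplicationsResult(
  applications: ApplicationSummary[],
): ListApplicationsResult {
  const structuredContent = { applications };
  return {
    // Text copy for humans and for clients without structuredContent support.
    content: [{ type: 'text', text: JSON.stringify(structuredContent, null, 2) }],
    structuredContent,
  };
}

// Chaining step: read the uuid from the typed payload, no text parsing.
function firstUuid(result: ListApplicationsResult): string | undefined {
  return result.structuredContent.applications[0]?.uuid;
}
```

Serializing the same object into the text block keeps the two views consistent, so older clients see exactly what newer clients get as structured data.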

Streamable HTTP transport

Shape: the same coolify-mcp binary can run as coolify-mcp --transport=stdio (current default) or coolify-mcp --transport=http --port=8080. In HTTP mode, the MCP server accepts JSON-RPC messages over POST / and can stream responses via SSE.
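A minimal sketch of how the two flags might be parsed, assuming the `--flag=value` form shown above. `parseTransportArgs` and `TransportConfig` are hypothetical names, and treating 8080 as the default port is an assumption taken from the example, not a decided default.

```typescript
type TransportConfig =
  | { transport: 'stdio' }
  | { transport: 'http'; port: number };

// Parse --transport=... and --port=... from an argv-style array.
function parseTransportArgs(argv: string[]): TransportConfig {
  let transport = 'stdio'; // current default per the text above
  let port = 8080; // assumed default, taken from the example
  for (const arg of argv) {
    if (arg.startsWith('--transport=')) {
      transport = arg.slice('--transport='.length);
    } else if (arg.startsWith('--port=')) {
      port = Number(arg.slice('--port='.length));
    }
  }
  if (transport === 'stdio') return { transport: 'stdio' };
  if (transport !== 'http') throw new Error(`unknown transport: ${transport}`);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`invalid port: ${port}`);
  }
  return { transport: 'http', port };
}
```

Returning a discriminated union means the rest of the server cannot read a port in stdio mode by accident: the compiler only exposes `port` once `transport` has been narrowed to `'http'`.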

Why this is a win: a Coolify operator runs one coolify-mcp instance per Coolify, on the same machine, and exposes a URL gated by the existing Coolify auth. Any teammate connects with that URL. No token sharing, no local process spawning. This resolves #164 cleanly: each instance is just a URL.

v2 vs v3 side-by-side

Concern	v2.x (today)	v3.x (proposed)
Tool annotations	None	All tools annotated
Tool output shape	Text blobs	Text + typed `structuredContent`
Long-running ops	Block the call until done	Return task ID, stream progress
Coolify entity access	Per-call `list_` / `get_`	Subscribe-able resources
Workflows	Multi-tool conversations	Prompts as slash commands
Transport	stdio only	stdio + streamable HTTP
Multi-Coolify	Multiple processes	One URL per Coolify
State freshness	Snapshot per call	Push notifications via webhook bridge
Auth	Static token env var	Token or OAuth (when Coolify ships it)

Phasing

Open questions

OAuth vs token. Should v3 require OAuth from day one for the HTTP transport, or accept tokens too? Coolify's OAuth story is still maturing.
Resource granularity. Should coolify://servers be a single subscribe target, or one per server? The latter scales better but is chattier.
Tasks adoption pace. The Tasks primitive is still flagged experimental in the spec. Do we ship v3 with it, or wait for it to go stable?
Backwards compat. v3 is a major bump, but how aggressive should it be about removing v2 tools? Recommendation: keep all v2 tools through the v3.x line, document them as "legacy, prefer resource X" where applicable, and remove them in v4.
Webhook → notification bridge. Does this need a daemon mode (the MCP process running continuously rather than per-session)? That's a much bigger architecture change.

What's next

Stu approves the broad shape (this doc)
We open an issue per primitive, with the design quoted from here, for community discussion
Pick a starting primitive — likely outputSchema + annotations as a Tier-1 quick win
Land that in v2.12.x, validate the pattern
Tier-2 primitives land progressively; v3.0.0 ships when Resources + Tasks + Prompts are all in
