Cloud Executor Reference

Note: The CloudExecutor is a cloud-only component. It is NOT included in the npm package (@wundam/orchex). The npm package contains only the local MCP engine with local LLM executors. Attempting to use cloud mode in the npm package will display: "Cloud mode requires the orchex server. Visit orchex.dev for details."

Complete reference for the Orchex Cloud Executor — the client-side component for submitting and polling cloud orchestration jobs.

Overview

The CloudExecutor class implements the ExecutorStrategy interface and provides:

  1. Job Submission — POST jobs to the cloud server with retry logic
  2. Polling — Adaptive polling with exponential backoff
  3. Error Handling — Automatic retries for transient errors (429, 5xx)
  4. Timeout Management — Configurable timeouts with job cancellation

The cloud executor is used when you want to offload AI execution to the Orchex cloud server instead of running locally.


Class: `CloudExecutor`

// Cloud-only — not available in the npm package
import { CloudExecutor } from './cloud-executor.js';

new CloudExecutor(
  apiUrl: string,
  apiKey: string,
  config?: Partial<CloudExecutorConfig>
)

Constructor Parameters

Parameter Type Required Description
apiUrl string Yes Cloud server URL (e.g., https://api.orchex.dev)
apiKey string Yes API key for authentication (orchex_sk_...)
config Partial<CloudExecutorConfig> No Override default configuration

Configuration Options

interface CloudExecutorConfig {
  pollIntervalMs: number;      // Base polling interval (default: 1000)
  maxPollIntervalMs: number;   // Max polling interval after backoff (default: 10000)
  timeoutMs: number;           // Total job timeout (default: 660000 = 11 minutes)
  maxRetries: number;          // Max retries for transient errors (default: 5)
  retryBaseDelayMs: number;    // Base delay for exponential backoff (default: 1000)
  retryMaxDelayMs: number;     // Max delay for exponential backoff (default: 30000)
  jitterFactor: number;        // Jitter to prevent thundering herd (default: 0.1)
}
Option Default Description
pollIntervalMs 1000 Initial polling interval (1 second)
maxPollIntervalMs 10000 Maximum polling interval (10 seconds)
timeoutMs 660000 Total timeout (11 minutes — buffer over server's 10min)
maxRetries 5 Maximum retry attempts for transient errors
retryBaseDelayMs 1000 Base delay for exponential backoff
retryMaxDelayMs 30000 Maximum delay cap for backoff (30 seconds)
jitterFactor 0.1 Random jitter factor (±10%) to prevent thundering herd

Method: `execute()`

Execute a stream on the cloud server.

Signature

async execute(request: ExecutionRequest): Promise<ExecutionResult>

ExecutionRequest

interface ExecutionRequest {
  prompt: string;           // Full prompt for the AI
  model: string;            // Model to use (e.g., 'claude-sonnet-4-20250514')
  maxTokens: number;        // Maximum output tokens
  streamId: string;         // Stream identifier for tracking
  timeoutMs?: number;       // Per-request timeout override
  structuredPrompt?: StructuredPrompt;  // Optional: caching hints
}
Field Type Required Description
prompt string Yes Full prompt including context and instructions
model string Yes Anthropic model ID
maxTokens number Yes Maximum output tokens (typically 16384)
streamId string Yes Unique identifier for this stream
timeoutMs number No Override default timeout for this request
structuredPrompt object No Structured prompt with caching hints

ExecutionResult

interface ExecutionResult {
  success: boolean;         // Whether execution succeeded
  rawResponse: string;      // Raw AI response text
  artifact?: StreamArtifact; // Parsed artifact (file operations)
  tokensUsed: {
    input: number;          // Input tokens consumed
    output: number;         // Output tokens consumed
  };
  error?: string;           // Error message if failed
}
Field Type Description
success boolean true if execution completed successfully
rawResponse string Raw response text from the AI model
artifact StreamArtifact Parsed file operations (create, edit, delete)
tokensUsed.input number Input tokens consumed (for cost tracking)
tokensUsed.output number Output tokens consumed
error string Error message if success is false

Usage Example

Basic Usage

// Cloud-only — not available in the npm package
import { CloudExecutor } from './cloud-executor.js';

const executor = new CloudExecutor(
  'https://api.orchex.dev',
  process.env.ORCHEX_API_KEY!
);

const result = await executor.execute({
  prompt: 'Create a hello world function in src/hello.ts',
  model: 'claude-sonnet-4-20250514',
  maxTokens: 16384,
  streamId: 'hello-world'
});

if (result.success) {
  console.log('Files changed:', result.artifact?.filesChanged);
  console.log('Tokens used:', result.tokensUsed);
} else {
  console.error('Execution failed:', result.error);
}

With Custom Configuration

const executor = new CloudExecutor(
  'https://api.orchex.dev',
  process.env.ORCHEX_API_KEY!,
  {
    timeoutMs: 300000,        // 5 minute timeout
    maxRetries: 3,            // Fewer retries
    pollIntervalMs: 2000,     // Start polling at 2s
    maxPollIntervalMs: 5000,  // Cap polling at 5s
  }
);

With Per-Request Timeout

// Long-running task needs longer timeout
const result = await executor.execute({
  prompt: longComplexPrompt,
  model: 'claude-sonnet-4-20250514',
  maxTokens: 16384,
  streamId: 'complex-refactor',
  timeoutMs: 900000  // 15 minutes for this specific request
});

Execution Flow

┌─────────────────┐
│   Submit Job    │
│  POST /execute  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Poll Status   │◄──────────────┐
│  GET /job/:id   │               │
└────────┬────────┘               │
         │                        │
    ┌────┴────┐                   │
    │ Status? │                   │
    └────┬────┘                   │
         │                        │
    ┌────┴──────────┐             │
    │               │             │
    ▼               ▼             │
 pending /      completed /       │
 running        failed            │
    │               │             │
    │               ▼             │
    │          Return Result      │
    │                             │
    └──────(wait)─────────────────┘
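The flow above can be sketched as a polling loop with adaptive backoff. This is an illustrative sketch, not the actual CloudExecutor internals: `pollUntilDone` and `fetchStatus` are hypothetical names, and the status values mirror the diagram.

```typescript
// Illustrative polling loop: poll until the job reaches a terminal state,
// doubling the wait interval up to a cap while the job is pending/running.
async function pollUntilDone(
  jobId: string,
  fetchStatus: (id: string) => Promise<{ status: string }>,
  baseIntervalMs = 1000,
  maxIntervalMs = 10000
): Promise<{ status: string }> {
  let interval = baseIntervalMs;
  for (;;) {
    const job = await fetchStatus(jobId);
    // 'completed' and 'failed' are both terminal: return the result either way
    if (job.status === 'completed' || job.status === 'failed') return job;
    await new Promise((resolve) => setTimeout(resolve, interval));
    // Back off toward the maximum interval between polls
    interval = Math.min(interval * 2, maxIntervalMs);
  }
}
```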

Retry Behavior

The executor automatically retries on transient errors:

Status Code Behavior
429 Respects Retry-After header, backs off polling interval
500 Exponential backoff with jitter
502 Exponential backoff with jitter
503 Exponential backoff with jitter
504 Exponential backoff with jitter
400, 401, 403, 404 Immediate failure (non-retryable)

Exponential Backoff Formula

delay = min(baseDelay * 2^attempt, maxDelay) ± jitter

Example with defaults:

  • Attempt 0: 1000ms ± 100ms
  • Attempt 1: 2000ms ± 200ms
  • Attempt 2: 4000ms ± 400ms
  • Attempt 3: 8000ms ± 800ms
  • Attempt 4: 16000ms ± 1600ms
  • Attempt 5: 30000ms ± 3000ms (capped)
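The schedule above follows directly from the formula. A sketch of the computation with the default configuration values (the function name is illustrative, not the real implementation):

```typescript
// Sketch of the documented backoff formula:
// delay = min(baseDelay * 2^attempt, maxDelay) ± jitter
function backoffDelay(
  attempt: number,
  baseDelayMs = 1000,
  maxDelayMs = 30000,
  jitterFactor = 0.1
): number {
  const capped = Math.min(baseDelayMs * 2 ** attempt, maxDelayMs);
  // Random jitter in [-jitterFactor, +jitterFactor) of the delay,
  // spreading retries to avoid a thundering herd
  const jitter = capped * jitterFactor * (Math.random() * 2 - 1);
  return capped + jitter;
}
```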

Timeout Handling

When a job times out:

  1. Job Cancellation — The executor sends a cancel request to stop further token consumption
  2. Token Usage Retrieval — Fetches final token usage for cost tracking
  3. Error Return — Returns failure with timeout error message
// Timeout result structure
{
  success: false,
  rawResponse: '',
  tokensUsed: { input: 1234, output: 567 },  // Tokens used before timeout
  error: 'Cloud execution timed out after 660000ms (job cancelled)'
}
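Because a timed-out execution is returned as a failed result rather than thrown, a caller can retry with a larger budget. A minimal sketch, assuming the error-message check shown under Best Practices; the helper and its one-shot retry policy are illustrative, not part of the API:

```typescript
// Illustrative: retry a timed-out execution once with double the budget.
async function executeWithTimeoutRetry(
  executor: { execute(r: Record<string, unknown>): Promise<{ success: boolean; error?: string }> },
  request: { timeoutMs?: number; [key: string]: unknown }
): Promise<{ success: boolean; error?: string }> {
  const first = await executor.execute(request);
  // Only retry on timeout; other failures are returned as-is
  if (first.success || !first.error?.includes('timed out')) return first;
  const budget = (request.timeoutMs ?? 660000) * 2;
  return executor.execute({ ...request, timeoutMs: budget });
}
```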

Artifact Extraction

The executor extracts artifacts from the AI response by looking for:

```orchex-artifact
{
  "streamId": "hello-world",
  "status": "complete",
  "operations": [...],
  "filesChanged": ["src/hello.ts"],
  "summary": "Created hello world function"
}
```

If the server provides a pre-parsed artifact, that is used. Otherwise, the client extracts and validates the artifact from the raw response.
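The client-side fallback can be sketched as a fence match plus JSON validation. The function name is hypothetical and the real executor performs fuller artifact validation than this:

```typescript
// Avoids writing a literal triple-backtick fence inside this snippet
const FENCE = '`'.repeat(3);

// Illustrative fallback extraction: find the orchex-artifact block in the
// raw response text and parse its JSON body.
function extractArtifact(raw: string): unknown | null {
  const re = new RegExp(FENCE + 'orchex-artifact\\s*([\\s\\S]*?)' + FENCE);
  const match = raw.match(re);
  if (!match) return null;
  try {
    return JSON.parse(match[1]); // validate that the body is well-formed JSON
  } catch {
    return null; // malformed JSON is treated as "no artifact"
  }
}
```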


Cloud API Endpoints

The CloudExecutor communicates with these server endpoints:

POST /api/v1/execute

Submit a new job for execution.

Request:

{
  "streamId": "hello-world",
  "prompt": "Create a hello world function...",
  "model": "claude-sonnet-4-20250514",
  "maxTokens": 16384,
  "timeoutMs": 660000
}

Response:

{
  "jobId": "job_abc123def456"
}

GET /api/v1/job/:id

Get job status and result.

Response (pending/running):

{
  "status": "running"
}

Response (completed):

{
  "status": "completed",
  "output": "...",
  "artifact": {...},
  "tokensUsed": { "input": 1000, "output": 500 }
}

Response (failed):

{
  "status": "failed",
  "error": "Error message"
}

POST /api/v1/job/:id/cancel

Cancel a running job.

Response:

{
  "cancelled": true
}
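For debugging or building a custom client, the endpoints above can be called directly with `fetch`. This is a sketch under one stated assumption: the reference does not specify how the API key is sent, so the `Authorization: Bearer` header format here is an assumption, not confirmed behavior.

```typescript
const API_URL = 'https://api.orchex.dev';

// Submit a job via POST /api/v1/execute and return its jobId.
async function submitJob(apiKey: string, body: object): Promise<string> {
  const res = await fetch(`${API_URL}/api/v1/execute`, {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiKey}`, // assumed auth header format
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`submit failed: HTTP ${res.status}`);
  const { jobId } = (await res.json()) as { jobId: string };
  return jobId;
}

// Cancel a running job via POST /api/v1/job/:id/cancel.
async function cancelJob(apiKey: string, jobId: string): Promise<boolean> {
  const res = await fetch(`${API_URL}/api/v1/job/${jobId}/cancel`, {
    method: 'POST',
    headers: { Authorization: `Bearer ${apiKey}` }, // assumed auth header format
  });
  const { cancelled } = (await res.json()) as { cancelled: boolean };
  return cancelled;
}
```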

Best Practices

Configuration

Use appropriate timeouts for your workload

// Quick tasks: shorter timeout
{ timeoutMs: 120000 }  // 2 minutes

// Complex refactoring: longer timeout
{ timeoutMs: 600000 }  // 10 minutes

Reduce polling for batch jobs

// Batch processing: start with longer intervals
{
  pollIntervalMs: 5000,
  maxPollIntervalMs: 15000
}

Error Handling

Always check success status

const result = await executor.execute(request);
if (!result.success) {
  // Handle failure
  console.error(result.error);
  // Check tokensUsed for cost tracking even on failure
}

Handle specific error types

if (result.error?.includes('timed out')) {
  // Consider retrying with longer timeout
} else if (result.error?.includes('rate limit')) {
  // Back off and retry later
} else if (result.error?.includes('401')) {
  // Check API key
}

Cost Optimization

Track token usage

let totalTokens = { input: 0, output: 0 };

for (const stream of streams) {
  const result = await executor.execute(stream);
  totalTokens.input += result.tokensUsed.input;
  totalTokens.output += result.tokensUsed.output;
}

console.log(`Total tokens: ${totalTokens.input} in, ${totalTokens.output} out`);

Use structured prompts for caching (reduces costs by up to 90%)

const result = await executor.execute({
  ...request,
  structuredPrompt: {
    systemContent: '...',      // Highly cacheable
    projectContext: '...',     // Cacheable within orchestration
    streamContext: '...',      // Cacheable per stream
    taskContent: '...',        // Not cached
    fullPrompt: '...',
    cachingHints: [...]
  }
});


Last updated: February 2026 • Orchex v1.0.0-rc.1