Cloud Executor Reference
Note: The `CloudExecutor` is a cloud-only component. It is NOT included in the npm package (`@wundam/orchex`). The npm package contains only the local MCP engine with local LLM executors. Attempting to use cloud mode in the npm package displays: "Cloud mode requires the orchex server. Visit orchex.dev for details."
Complete reference for the Orchex Cloud Executor — the client-side component for submitting and polling cloud orchestration jobs.
Overview
The `CloudExecutor` class implements the `ExecutorStrategy` interface and provides:
- Job Submission — POST jobs to the cloud server with retry logic
- Polling — Adaptive polling with exponential backoff
- Error Handling — Automatic retries for transient errors (429, 5xx)
- Timeout Management — Configurable timeouts with job cancellation
The cloud executor is used when you want to offload AI execution to the Orchex cloud server instead of running locally.
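Conceptually, the submit-and-poll cycle reduces to the loop below (a sketch only; `runJob` and its helper parameters are illustrative names, not `CloudExecutor` internals):

```typescript
// Sketch of the submit-and-poll cycle. The real executor layers
// retry, backoff, and timeout handling on top of this loop.
type JobStatus =
  | { status: 'pending' | 'running' }
  | { status: 'completed'; output: string }
  | { status: 'failed'; error: string };

async function runJob(
  submit: () => Promise<string>,               // POST the job, returns a jobId
  poll: (jobId: string) => Promise<JobStatus>, // GET the job status
  pollIntervalMs = 1000,
): Promise<string> {
  const jobId = await submit();
  for (;;) {
    const job = await poll(jobId);
    if (job.status === 'completed') return job.output;
    if (job.status === 'failed') throw new Error(job.error);
    await new Promise((resolve) => setTimeout(resolve, pollIntervalMs));
  }
}
```

The real class also widens the polling interval adaptively rather than polling at a fixed rate.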
Class: `CloudExecutor`
```typescript
// Cloud-only — not available in the npm package
import { CloudExecutor } from './cloud-executor.js';

const executor = new CloudExecutor(
  apiUrl: string,
  apiKey: string,
  config?: Partial<CloudExecutorConfig>
);
```

Constructor Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| `apiUrl` | `string` | Yes | Cloud server URL (e.g., `https://api.orchex.dev`) |
| `apiKey` | `string` | Yes | API key for authentication (`orchex_sk_...`) |
| `config` | `Partial<CloudExecutorConfig>` | No | Override default configuration |
Configuration Options
```typescript
interface CloudExecutorConfig {
  pollIntervalMs: number;     // Base polling interval (default: 1000)
  maxPollIntervalMs: number;  // Max polling interval after backoff (default: 10000)
  timeoutMs: number;          // Total job timeout (default: 660000 = 11 minutes)
  maxRetries: number;         // Max retries for transient errors (default: 5)
  retryBaseDelayMs: number;   // Base delay for exponential backoff (default: 1000)
  retryMaxDelayMs: number;    // Max delay for exponential backoff (default: 30000)
  jitterFactor: number;       // Jitter to prevent thundering herd (default: 0.1)
}
```

| Option | Default | Description |
|---|---|---|
| `pollIntervalMs` | `1000` | Initial polling interval (1 second) |
| `maxPollIntervalMs` | `10000` | Maximum polling interval (10 seconds) |
| `timeoutMs` | `660000` | Total timeout (11 minutes — buffer over server's 10min) |
| `maxRetries` | `5` | Maximum retry attempts for transient errors |
| `retryBaseDelayMs` | `1000` | Base delay for exponential backoff |
| `retryMaxDelayMs` | `30000` | Maximum delay cap for backoff (30 seconds) |
| `jitterFactor` | `0.1` | Random jitter factor (±10%) to prevent thundering herd |
Method: `execute()`
Execute a stream on the cloud server.
Signature
```typescript
async execute(request: ExecutionRequest): Promise<ExecutionResult>
```

ExecutionRequest
```typescript
interface ExecutionRequest {
  prompt: string;       // Full prompt for the AI
  model: string;        // Model to use (e.g., 'claude-sonnet-4-20250514')
  maxTokens: number;    // Maximum output tokens
  streamId: string;     // Stream identifier for tracking
  timeoutMs?: number;   // Per-request timeout override
  structuredPrompt?: StructuredPrompt; // Optional: caching hints
}
```

| Field | Type | Required | Description |
|---|---|---|---|
| `prompt` | `string` | Yes | Full prompt including context and instructions |
| `model` | `string` | Yes | Anthropic model ID |
| `maxTokens` | `number` | Yes | Maximum output tokens (typically 16384) |
| `streamId` | `string` | Yes | Unique identifier for this stream |
| `timeoutMs` | `number` | No | Override default timeout for this request |
| `structuredPrompt` | `object` | No | Structured prompt with caching hints |
ExecutionResult
```typescript
interface ExecutionResult {
  success: boolean;          // Whether execution succeeded
  rawResponse: string;       // Raw AI response text
  artifact?: StreamArtifact; // Parsed artifact (file operations)
  tokensUsed: {
    input: number;           // Input tokens consumed
    output: number;          // Output tokens consumed
  };
  error?: string;            // Error message if failed
}
```

| Field | Type | Description |
|---|---|---|
| `success` | `boolean` | `true` if execution completed successfully |
| `rawResponse` | `string` | Raw response text from the AI model |
| `artifact` | `StreamArtifact` | Parsed file operations (create, edit, delete) |
| `tokensUsed.input` | `number` | Input tokens consumed (for cost tracking) |
| `tokensUsed.output` | `number` | Output tokens consumed |
| `error` | `string` | Error message if `success` is `false` |
Usage Example
Basic Usage
```typescript
// Cloud-only — not available in the npm package
import { CloudExecutor } from './cloud-executor.js';

const executor = new CloudExecutor(
  'https://api.orchex.dev',
  process.env.ORCHEX_API_KEY!
);

const result = await executor.execute({
  prompt: 'Create a hello world function in src/hello.ts',
  model: 'claude-sonnet-4-20250514',
  maxTokens: 16384,
  streamId: 'hello-world'
});

if (result.success) {
  console.log('Files changed:', result.artifact?.filesChanged);
  console.log('Tokens used:', result.tokensUsed);
} else {
  console.error('Execution failed:', result.error);
}
```

With Custom Configuration
```typescript
const executor = new CloudExecutor(
  'https://api.orchex.dev',
  process.env.ORCHEX_API_KEY!,
  {
    timeoutMs: 300000,       // 5 minute timeout
    maxRetries: 3,           // Fewer retries
    pollIntervalMs: 2000,    // Start polling at 2s
    maxPollIntervalMs: 5000, // Cap polling at 5s
  }
);
```

With Per-Request Timeout
```typescript
// Long-running task needs longer timeout
const result = await executor.execute({
  prompt: longComplexPrompt,
  model: 'claude-sonnet-4-20250514',
  maxTokens: 16384,
  streamId: 'complex-refactor',
  timeoutMs: 900000 // 15 minutes for this specific request
});
```

Execution Flow
```
┌─────────────────┐
│   Submit Job    │
│  POST /execute  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Poll Status   │◄──────────────┐
│  GET /job/:id   │               │
└────────┬────────┘               │
         │                        │
    ┌────┴────┐                   │
    │ Status? │                   │
    └────┬────┘                   │
         │                        │
 ┌───────┼───────┬────────┐       │
 │       │       │        │       │
 ▼       ▼       ▼        ▼       │
pending running completed failed  │
 │       │       │        │       │
 └───┬───┘       ├────────┘       │
     │           │                │
     │           ▼                │
     │     Return Result          │
     │                            │
     └─────────(wait)─────────────┘
```

Retry Behavior
The executor automatically retries on transient errors:
| Status Code | Behavior |
|---|---|
| `429` | Respects `Retry-After` header, backs off polling interval |
| `500` | Exponential backoff with jitter |
| `502` | Exponential backoff with jitter |
| `503` | Exponential backoff with jitter |
| `504` | Exponential backoff with jitter |
| `400`, `401`, `403`, `404` | Immediate failure (non-retryable) |
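The table above can be expressed as a small predicate (a sketch mirroring the table, not the actual `CloudExecutor` source):

```typescript
// Transient statuses are retried with backoff; client errors
// fail immediately. Mirrors the retry table above.
function isRetryable(status: number): boolean {
  if (status === 429) return true;              // rate limited: honor Retry-After
  return [500, 502, 503, 504].includes(status); // transient server errors
}
```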
Exponential Backoff Formula
```
delay = min(baseDelay * 2^attempt, maxDelay) ± jitter
```

Example with defaults:
- Attempt 0: 1000ms ± 100ms
- Attempt 1: 2000ms ± 200ms
- Attempt 2: 4000ms ± 400ms
- Attempt 3: 8000ms ± 800ms
- Attempt 4: 16000ms ± 1600ms
- Attempt 5: 30000ms ± 3000ms (capped)
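The formula and defaults above translate directly to code (a sketch; the function name `backoffDelay` is illustrative):

```typescript
// delay = min(baseDelay * 2^attempt, maxDelay) ± jitter
function backoffDelay(
  attempt: number,
  baseDelayMs = 1000,
  maxDelayMs = 30000,
  jitterFactor = 0.1,
): number {
  const capped = Math.min(baseDelayMs * 2 ** attempt, maxDelayMs);
  const jitter = capped * jitterFactor * (Math.random() * 2 - 1); // ±10%
  return capped + jitter;
}
```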
Timeout Handling
When a job times out:
- Job Cancellation — The executor sends a cancel request to stop further token consumption
- Token Usage Retrieval — Fetches final token usage for cost tracking
- Error Return — Returns failure with timeout error message
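The cancel-on-timeout pattern can be sketched with `Promise.race` (`withTimeout` and its `cancel` callback are illustrative names; the cancel endpoint is documented under Cloud API Endpoints):

```typescript
// Race the job against a timeout; on timeout, cancel server-side
// so the model stops consuming tokens. Illustrative helpers only.
async function withTimeout<T>(
  work: Promise<T>,
  timeoutMs: number,
  cancel: () => Promise<void>, // e.g. POST the job's cancel endpoint
): Promise<T> {
  let timer!: ReturnType<typeof setTimeout>;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(async () => {
      await cancel(); // stop further token consumption
      reject(new Error(`Cloud execution timed out after ${timeoutMs}ms (job cancelled)`));
    }, timeoutMs);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    clearTimeout(timer); // don't leave the timer running after success
  }
}
```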
```typescript
// Timeout result structure
{
  success: false,
  rawResponse: '',
  tokensUsed: { input: 1234, output: 567 }, // Tokens used before timeout
  error: 'Cloud execution timed out after 660000ms (job cancelled)'
}
```

Artifact Extraction
The executor extracts artifacts from the AI response by looking for:
````
```orchex-artifact
{
  "streamId": "hello-world",
  "status": "complete",
  "operations": [...],
  "filesChanged": ["src/hello.ts"],
  "summary": "Created hello world function"
}
```
````

If the server provides a pre-parsed artifact, that is used. Otherwise, the client extracts and validates the artifact from the raw response.
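Client-side extraction can be sketched as a fenced-block search (`extractArtifact` is an illustrative name; the real parser also validates the artifact schema):

```typescript
// Pull the first orchex-artifact fenced block out of the raw
// response and JSON-parse it. Sketch only.
const FENCE = '`'.repeat(3); // three backticks

function extractArtifact(rawResponse: string): unknown | null {
  const re = new RegExp(FENCE + 'orchex-artifact\\s*\\n([\\s\\S]*?)' + FENCE);
  const match = rawResponse.match(re);
  if (!match) return null;
  try {
    return JSON.parse(match[1]);
  } catch {
    return null; // malformed JSON: treat as missing
  }
}
```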
Cloud API Endpoints
The CloudExecutor communicates with these server endpoints:
POST /api/v1/execute
Submit a new job for execution.
Request:
```json
{
  "streamId": "hello-world",
  "prompt": "Create a hello world function...",
  "model": "claude-sonnet-4-20250514",
  "maxTokens": 16384,
  "timeoutMs": 660000
}
```

Response:

```json
{
  "jobId": "job_abc123def456"
}
```

GET /api/v1/job/:id
Get job status and result.
Response (pending/running):
```json
{
  "status": "running"
}
```

Response (completed):

```json
{
  "status": "completed",
  "output": "...",
  "artifact": {...},
  "tokensUsed": { "input": 1000, "output": 500 }
}
```

Response (failed):

```json
{
  "status": "failed",
  "error": "Error message"
}
```

POST /api/v1/job/:id/cancel

Cancel a running job.

Response:

```json
{
  "cancelled": true
}
```

Best Practices
Configuration
✓ Use appropriate timeouts for your workload
```typescript
// Quick tasks: shorter timeout
{ timeoutMs: 120000 } // 2 minutes

// Complex refactoring: longer timeout
{ timeoutMs: 600000 } // 10 minutes
```

✓ Reduce polling for batch jobs

```typescript
// Batch processing: start with longer intervals
{
  pollIntervalMs: 5000,
  maxPollIntervalMs: 15000
}
```

Error Handling
✓ Always check success status
```typescript
const result = await executor.execute(request);

if (!result.success) {
  // Handle failure
  console.error(result.error);
  // Check tokensUsed for cost tracking even on failure
}
```

✓ Handle specific error types

```typescript
if (result.error?.includes('timed out')) {
  // Consider retrying with longer timeout
} else if (result.error?.includes('rate limit')) {
  // Back off and retry later
} else if (result.error?.includes('401')) {
  // Check API key
}
```

Cost Optimization
✓ Track token usage
```typescript
let totalTokens = { input: 0, output: 0 };

for (const stream of streams) {
  const result = await executor.execute(stream);
  totalTokens.input += result.tokensUsed.input;
  totalTokens.output += result.tokensUsed.output;
}

console.log(`Total tokens: ${totalTokens.input} in, ${totalTokens.output} out`);
```

✓ Use structured prompts for caching (reduces costs by up to 90%)

```typescript
const result = await executor.execute({
  ...request,
  structuredPrompt: {
    systemContent: '...',  // Highly cacheable
    projectContext: '...', // Cacheable within orchestration
    streamContext: '...',  // Cacheable per stream
    taskContent: '...',    // Not cached
    fullPrompt: '...',
    cachingHints: [...]
  }
});
```

See Also
- API Overview — Complete API reference
- MCP Tools — MCP tool reference
- Stream Definitions — Stream schema reference
- Error Handling Guide — Error handling best practices
Last updated: February 2026 • Orchex v1.0.0-rc.1