Cloud Execution Guide
Note: Cloud execution features require an orchex.dev account and are NOT available in the local npm package (@wundam/orchex). The npm package provides local MCP orchestration using your own API keys (BYOK). See orchex.dev/pricing for cloud plans.
This guide helps you decide when to use cloud vs local execution, manage quotas and rate limits, and optimize costs.
When to Use Cloud vs Local
Use Cloud Execution When:
✅ Large or Complex Tasks
- Tasks with multiple streams (5+ concurrent operations)
- Long-running operations (>30 minutes)
- Tasks requiring high parallelism
- Complex workflows with many dependencies
✅ Team Collaboration
- Multiple team members need to view progress
- Sharing execution history and results
- Coordinating work across different time zones
- Reviewing team performance metrics
✅ Resource Constraints
- Limited local compute resources
- Poor local network connectivity
- Running on low-powered devices
- Need to preserve local battery life
✅ CI/CD Integration
- Automated deployments and testing
- Scheduled maintenance tasks
- Continuous integration workflows
- Infrastructure automation
Use Local Execution When:
✅ Quick Iterations
- Small, fast tasks (<5 minutes)
- Simple single-file changes
- Exploratory work and prototyping
- Testing manifest configurations
✅ Privacy & Security
- Working with sensitive code or data
- Compliance requirements for data locality
- Private repositories without cloud access
- Air-gapped or restricted environments
✅ Cost Sensitivity
- Frequent small tasks (per-execution costs add up on cloud)
- Budget constraints
- Personal/hobby projects
- Learning and experimentation
✅ Offline Work
- No internet connectivity
- Unreliable network conditions
- VPN or firewall restrictions
Decision Matrix
| Factor | Cloud | Local |
|---|---|---|
| Task Duration | >10 min | <10 min |
| Streams | 3+ | 1-2 |
| Team Size | 2+ | Solo |
| Budget | Available | Limited |
| Privacy | Standard | Sensitive |
| Network | Stable | Limited |
| Review Needs | High | Low |
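The matrix can be read as a simple tally: each factor votes for cloud or local, and the majority wins. The sketch below is one hypothetical way to encode that for the three numeric rows. The `recommend_target` function, its argument order, and the rule that sensitive data always forces local execution are illustrative assumptions, not Orchex behavior.

```shell
# Hypothetical helper (not part of the orchex CLI): tallies three rows of the
# decision matrix and prints "cloud" or "local".
# Usage: recommend_target <minutes> <streams> <team_size> <sensitive: yes|no>
recommend_target() (
  minutes=$1 streams=$2 team=$3 sensitive=$4

  # Assumption: sensitive code or data always stays local.
  if [ "$sensitive" = "yes" ]; then
    echo local
    exit 0
  fi

  votes=0
  [ "$minutes" -gt 10 ] && votes=$((votes + 1))   # Task Duration: >10 min
  [ "$streams" -ge 3 ] && votes=$((votes + 1))    # Streams: 3+
  [ "$team" -ge 2 ] && votes=$((votes + 1))       # Team Size: 2+

  # A majority of the three numeric factors decides.
  if [ "$votes" -ge 2 ]; then echo cloud; else echo local; fi
)
```

For example, `recommend_target 45 6 3 no` prints `cloud`, while the same call with `yes` as the last argument prints `local`.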
Quota Management
Understanding Your Quotas
Cloud quotas depend on your subscription tier. See orchex.dev/pricing for current limits. Summary:
- Local (Free): 5 streams, 2 waves, single provider, BYOK
- Pro ($19/mo): 100 runs/mo, 15 agents, 10 waves, 2 providers, learn, self-healing
- Team ($49/user/mo): 500 runs/mo, 25 agents, 25 waves, 3 providers, shared orchestrations
- Enterprise (Custom): Unlimited, self-hosted, SLA, dedicated support
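Because the tiers differ mainly in numeric limits, picking one can be reduced to a lookup. The sketch below uses the wave limits from the summary above (2, 10, and 25). `tier_for_waves` is a hypothetical name, not an orchex command, and the limits may change, so treat orchex.dev/pricing as authoritative.

```shell
# Hypothetical helper: prints the smallest tier whose wave limit (taken from
# the summary above) covers the number of waves a manifest needs.
tier_for_waves() (
  waves=$1
  if   [ "$waves" -le 2 ];  then echo local
  elif [ "$waves" -le 10 ]; then echo pro
  elif [ "$waves" -le 25 ]; then echo team
  else echo enterprise
  fi
)
```

A manifest that needs 11 waves would map to `team` under these numbers. The same pattern extends to the agent and run limits.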
Checking Quota Usage
# View current quota status
orchex cloud quota
# Example output:
# Executions: 45/100 (45%)
# This month: 45 executions
# Remaining: 55 executions
# Resets: Feb 28, 2026
Quota Best Practices
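The figures in the example quota output are related by simple arithmetic: 45 of 100 executions used is 45%, leaving 55. A minimal sketch of that calculation, plus a pre-flight check, follows. Both function names are hypothetical and neither is provided by the orchex CLI.

```shell
# Hypothetical helpers; neither is provided by the orchex CLI.

# Prints the whole-number percentage of the monthly quota already used.
quota_percent() (
  used=$1 limit=$2
  echo $(( used * 100 / limit ))
)

# Succeeds (exit status 0) when at least <needed> executions remain.
enough_quota() (
  remaining=$1 needed=$2
  [ "$remaining" -ge "$needed" ]
)
```

`quota_percent 45 100` prints `45`, matching the example output. Assuming the `.remaining` field of `orchex cloud quota --json` is a plain integer, a guard before a large submission might read `enough_quota "$(orchex cloud quota --json | jq '.remaining')" 10 && orchex execute -f manifest.yaml --cloud`.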
1. Monitor Usage Regularly
# Check usage before large operations
orchex cloud quota --json | jq '.remaining'
# Set up alerts (Pro/Team tiers)
orchex cloud quota alert --threshold 80
2. Batch Similar Tasks
# Instead of 10 small executions:
# Run 1 execution with 10 streams
streams:
  - id: update-component-1
    prompt: "..."
  - id: update-component-2
    prompt: "..."
  # ... 8 more
3. Use Local for Small Tasks
# Single file changes - use local
orchex execute -f manifest.yaml
# Large refactoring - use cloud
orchex execute -f manifest.yaml --cloud
4. Share Team Quota Wisely
# Coordinate with team members
orchex cloud usage --team
# Reserve quota for critical tasks
# Use local for experiments
Rate Limits
Current Rate Limits
See orchex.dev/pricing for current rate limits by tier.
Handling Rate Limits
1. Automatic Retry with Backoff
Orchex automatically retries rate-limited requests:
// Automatic behavior:
// - First retry: 1 second
// - Second retry: 2 seconds
// - Third retry: 4 seconds
// - Fail after 3 retries
2. Spread Out Submissions
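The retry schedule above (1, 2, then 4 seconds, then failure) is a doubling backoff. For scripts that pace their own resubmissions, the same schedule can be sketched as follows. `backoff_delay` and `retry_with_backoff` are hypothetical names, and this is not a description of how Orchex implements its internal retries.

```shell
# Hypothetical helpers mirroring the documented schedule: 1 s, 2 s, 4 s.

# Prints the delay in seconds before retry number <n> (1-based).
backoff_delay() (
  n=$1
  echo $(( 1 << (n - 1) ))
)

# Runs a command, retrying up to 3 times with doubling delays.
# Returns 0 on success, or the failing exit status after the last retry.
# Note: sets the shell variables "attempt" and "status" as a side effect.
retry_with_backoff() {
  attempt=0
  until "$@"; do
    status=$?
    attempt=$((attempt + 1))
    [ "$attempt" -gt 3 ] && return "$status"
    sleep "$(backoff_delay "$attempt")"
  done
}
```

Since Orchex already retries rate-limited requests on its own, a wrapper like this is only relevant for other commands in the same script, or for failures that the built-in retry does not cover.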
# Instead of submitting 5 executions at once:
for manifest in manifests/*.yaml; do
  orchex execute -f "$manifest" --cloud
  sleep 60  # Wait 1 minute between submissions
done
3. Use Stream Parallelism
# Prefer 1 execution with 10 streams
# over 10 executions with 1 stream each
streams:
  - id: task-1
    prompt: "..."
  - id: task-2
    prompt: "..."
# More streams = better parallelism
4. Monitor Rate Limit Headers
# Check rate limit status
orchex cloud limits
# Output:
# Rate Limit: 45/60 per minute
# Reset: 15 seconds
# Burst Available: 15 requests
Cost Optimization
Understanding Costs
Execution Pricing:
- Small execution (1-3 streams, <10 min): ~$0.50
- Medium execution (4-10 streams, 10-30 min): ~$2.00
- Large execution (10+ streams, 30+ min): ~$5.00+
Cost Drivers:
- Number of streams (parallelism)
- Execution duration
- Token usage (input + output)
- Storage for history and artifacts
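As a rough planning aid, the price bands above can be turned into a lookup on the two most visible drivers, streams and duration. This is only a sketch: `estimate_cost` is a hypothetical helper, it ignores token usage and storage, and it assumes that when the two inputs fall in different bands the larger band applies. The boundary values (exactly 10 streams, exactly 30 minutes) are treated as Medium.

```shell
# Hypothetical helper: maps streams and minutes to the approximate price
# bands listed above and prints the band's dollar figure.
# Usage: estimate_cost <streams> <minutes>
estimate_cost() (
  streams=$1 minutes=$2
  if [ "$streams" -gt 10 ] || [ "$minutes" -gt 30 ]; then
    echo "5.00"    # Large: ~$5.00 or more
  elif [ "$streams" -ge 4 ] || [ "$minutes" -ge 10 ]; then
    echo "2.00"    # Medium: ~$2.00
  else
    echo "0.50"    # Small: ~$0.50
  fi
)
```

For instance, `estimate_cost 5 120` prints `5.00`, because a two-hour run falls in the Large band even with only five streams.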
Optimization Strategies
1. Right-Size Your Streams
Too Many Small Streams (Expensive):
streams:
  - id: fix-typo-1
    prompt: "Fix typo in file1.ts"
  - id: fix-typo-2
    prompt: "Fix typo in file2.ts"
  - id: fix-typo-3
    prompt: "Fix typo in file3.ts"
# Cost: 3 streams × overhead = $$$
Better: Combined Stream (Cheaper):
streams:
  - id: fix-typos
    prompt: |
      Fix typos in the following files:
      - file1.ts
      - file2.ts
      - file3.ts
# Cost: 1 stream = $
2. Optimize Context Size
# Expensive: Include everything
context:
  - "**/*"  # Sends entire codebase
# Better: Only what's needed
context:
  - "src/components/**/*.ts"
  - "src/types/component.ts"
  - "package.json"
# Reduces token costs by 70-90%
3. Use Local for Iterations
# First attempt - local (free)
orchex execute -f manifest.yaml
# If successful, done!
# If needs tweaking, fix manifest locally
# Final run - cloud (for team/history)
orchex execute -f manifest.yaml --cloud
4. Leverage Caching
# Enable intelligent caching
settings:
  cache_context: true      # Reuse context across streams
  deduplicate_files: true  # Skip unchanged files
5. Set Budget Limits
# Set per-execution budget
orchex execute -f manifest.yaml --cloud --max-cost 5.00
# Set monthly budget (Pro/Team)
orchex cloud budget set 100.00
# Get budget alerts
orchex cloud budget alert --threshold 80
6. Clean Up Old Executions
# Delete old execution history (saves storage)
orchex cloud cleanup --older-than 90days
# Delete only failed executions, keeping successful ones
orchex cloud cleanup --failed-only
Cost Comparison Examples
Example 1: Simple Bug Fix
# Local: $0 (free)
# Cloud: $0.50
# Recommendation: Use local
streams:
  - id: fix-validation-bug
    prompt: "Fix the email validation regex"
Example 2: Feature Implementation
# Local: $0 (but ties up your machine for 30min)
# Cloud: $2.50 (parallel execution, team visibility)
# Recommendation: Use cloud
streams:
  - id: api-endpoint
    prompt: "Implement POST /api/users endpoint"
  - id: api-tests
    prompt: "Add tests for user endpoint"
  - id: api-docs
    prompt: "Update API documentation"
  - id: frontend-form
    prompt: "Create user registration form"
Example 3: Large Refactoring
# Local: $0 (but 2+ hours, blocks other work)
# Cloud: $5.00 (parallel, doesn't block local work)
# Recommendation: Use cloud
streams:
  - id: migrate-db
    prompt: "Migrate database schema"
  - id: update-models
    prompt: "Update all model files"
  - id: update-controllers
    prompt: "Update controller logic"
  - id: update-views
    prompt: "Update view components"
  - id: update-tests
    prompt: "Update test suites"
Monitoring and Alerts
Dashboard Monitoring
# Open cloud dashboard
orchex cloud dashboard
# View in terminal
orchex cloud status
Key Metrics to Watch:
- Execution success rate
- Average execution time
- Cost per execution
- Quota utilization
- Rate limit hits
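Two of these metrics are plain ratios that can be computed from counts you already have. The sketch below assumes the inputs come from your own records or from the dashboard. `success_rate` and `cost_per_execution` are hypothetical names, not orchex commands.

```shell
# Hypothetical helpers for two of the metrics above.

# Prints the success rate as a percentage with one decimal place.
# Usage: success_rate <succeeded> <total>
success_rate() (
  succeeded=$1 total=$2
  awk -v s="$succeeded" -v t="$total" 'BEGIN { printf "%.1f\n", s * 100 / t }'
)

# Prints the average cost per execution with two decimal places.
# Usage: cost_per_execution <total_cost> <executions>
cost_per_execution() (
  total_cost=$1 executions=$2
  awk -v c="$total_cost" -v n="$executions" 'BEGIN { printf "%.2f\n", c / n }'
)
```

With 42 successful runs out of 45, `success_rate 42 45` prints `93.3`. With $90.00 spent across those 45 runs, `cost_per_execution 90 45` prints `2.00`.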
Setting Up Alerts
# Quota alerts
orchex cloud alert quota --threshold 80 --email you@company.com
# Budget alerts
# The channel name is quoted so the shell does not treat "#" as a comment
orchex cloud alert budget --threshold 90 --slack '#orchex-alerts'
# Failure alerts
orchex cloud alert failures --consecutive 3
Hybrid Workflows
Best of Both Worlds
Combine local and cloud execution for optimal results:
Development Workflow:
# 1. Prototype locally (fast iteration)
orchex execute -f manifest.yaml
# 2. Refine manifest based on results
# Edit manifest.yaml
# 3. Final execution in cloud (team visibility)
orchex execute -f manifest.yaml --cloud
CI/CD Workflow:
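# .github/workflows/deploy.yml
name: Deploy Feature
on:
  pull_request:
    branches: [main]
    # "closed" is required for the merged check below to ever be true
    types: [opened, synchronize, reopened, closed]
jobs:
  local-validation:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Assumes the orchex CLI is already installed on the runner
      - name: Validate manifest
        run: orchex validate manifest.yaml
  cloud-execution:
    needs: local-validation
    runs-on: ubuntu-latest
    if: github.event.pull_request.merged == true
    steps:
      - uses: actions/checkout@v4
      - name: Execute in cloud
        run: orchex execute -f manifest.yaml --cloud
        env:
          ORCHEX_API_KEY: ${{ secrets.ORCHEX_API_KEY }}
Troubleshooting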
Quota Exceeded
# Error: Monthly quota exceeded
# Solution 1: Upgrade tier
orchex cloud upgrade --tier pro
# Solution 2: Use local execution
orchex execute -f manifest.yaml # No --cloud flag
# Solution 3: Wait for reset
orchex cloud quota # Check reset date
Rate Limited
# Error: Rate limit exceeded
# Solution 1: Wait and retry (automatic)
# Orchex retries automatically with backoff
# Solution 2: Reduce request rate
# Add delays between executions
# Solution 3: Upgrade tier for higher limits
orchex cloud upgrade --tier team
Unexpected Costs
# Review recent executions
orchex cloud usage --detailed
# Check cost breakdown
orchex cloud costs --execution <execution-id>
# Set budget limits
orchex cloud budget set 50.00
Summary
Quick Reference
Use Cloud When:
- Complex/large tasks
- Team collaboration needed
- CI/CD automation
- Resource constrained locally
Use Local When:
- Quick iterations
- Sensitive data
- Learning/testing
- Cost sensitive
Optimize Costs By:
- Right-sizing streams
- Minimizing context
- Using local for iterations
- Setting budget limits
- Cleaning up old data
Manage Quotas By:
- Monitoring regularly
- Batching similar tasks
- Coordinating team usage
- Using local for small tasks