# Arc API Reference (machine-readable)

Human docs: https://arc.cornerstone.sh/docs
Dashboard: https://arc.cornerstone.sh/dashboard

## What is Arc?

Arc is a drop-in OpenAI-compatible AI proxy. It sits between your application and AI providers (OpenAI, Anthropic, etc.), adding unified logging, routing, caching, cost tracking, and provider failover. No SDK changes required — only two config values change.

## Base URL

```
https://api-arc.cornerstone.sh/v1
```

All endpoints mirror the OpenAI API. Currently supported:

```
POST /v1/chat/completions   (streaming supported)
```

## Authentication

Pass your Arc key as a Bearer token in every request:

```
Authorization: Bearer arc_live_<your-key>
```

Arc authenticates you, looks up your workspace's stored provider key, and forwards the request. Your underlying OpenAI/Anthropic key never leaves Arc's servers.

## Quickstart

```bash
curl https://api-arc.cornerstone.sh/v1/chat/completions \
  -H "Authorization: Bearer arc_live_<your-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{ "role": "user", "content": "Hello" }]
  }'
```

The response is identical to OpenAI's response format.

## Routes

Routes tag requests with a named configuration defined in the Arc dashboard. Set via header:

```
X-Arc-Route: <route-key>
```

Example:

```
X-Arc-Route: customer-support
```

Route keys are slugs configured in the dashboard (e.g. "customer-support", "summarization"). Requests without X-Arc-Route are logged as "Direct" and receive no route-level configuration.

### Model injection

If a route has a primary model configured, you may omit the "model" field in the request body. Arc injects the route's model automatically before forwarding to the provider.

### Request logging

All requests are logged with: timestamp, provider, model, prompt tokens, completion tokens, cost (USD), latency (ms), cache hit status, HTTP status code, route, and origin country. Message bodies are NOT stored — only metadata.

## System Prompts

Routes can have a system prompt configured in the dashboard.
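The injection behavior detailed in this section can be pictured with a short Python sketch. This is an illustration of the documented semantics, not Arc's actual implementation; the helper names are hypothetical.

```python
# Sketch of Arc's server-side system prompt injection (hypothetical
# helper names; semantics as documented: prepend at index 0 unless
# X-Arc-No-Inject is set to "1", "true", or "yes", case-insensitive).

def should_skip_injection(headers: dict) -> bool:
    """Check the X-Arc-No-Inject opt-out header."""
    value = headers.get("X-Arc-No-Inject", "")
    return value.strip().lower() in {"1", "true", "yes"}

def inject_system_prompt(messages: list, route_prompt: str, headers: dict) -> list:
    """Prepend the route's system prompt as a system message at index 0."""
    if not route_prompt or should_skip_injection(headers):
        return messages
    return [{"role": "system", "content": route_prompt}] + messages

msgs = [{"role": "user", "content": "Hello"}]
out = inject_system_prompt(msgs, "Summarize in 3 bullet points.", {})
# out[0] is the injected system message; the user message follows at index 1.
```

Because injection happens server-side, the provider sees the combined array as ordinary messages and the client request body stays unchanged.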
When present, Arc prepends it as a system message at index 0 of the messages array before forwarding to the provider.

Example — route "summarization" has the system prompt "Summarize the following in 3 bullet points.":

Your request body:

```json
{ "messages": [{ "role": "user", "content": "<text to summarize>" }] }
```

What the provider receives:

```json
{
  "messages": [
    { "role": "system", "content": "Summarize the following in 3 bullet points." },
    { "role": "user", "content": "<text to summarize>" }
  ]
}
```

To skip system prompt injection for a specific request:

```
X-Arc-No-Inject: 1
```

Accepted values: "1", "true", "yes" (case-insensitive).

## Memory

Arc can maintain per-user conversation memory across requests via memory pools. A pool is attached to a route in the dashboard. Pass a client identifier to enable memory injection:

```
X-Arc-Client-ID: <client-id>
```

Or use the standard OpenAI "user" field in the request body (the header takes precedence). Arc prepends the conversation history (summary + recent turns) into the messages array before forwarding.

Message content is never persisted in permanent storage — only the rolling window and compressed summary are stored.

Pool configuration options: TTL (days), window size (messages), summarization thresholds (token count, turn count, idle hours). Managed at arc.cornerstone.sh/dashboard/memory.

## Caching

Routes support three cache modes, configured in the dashboard:

- exact — Returns the cached response for byte-identical message arrays
- semantic — Returns the cached response for semantically similar requests (embedding similarity)
- off — No caching (default)

Cache hits are logged in request metadata (cache_hit field). Message bodies are not cached — only response text and embeddings.

## Rate Limiting

Routes can enforce per-key rate limits. Configuration: window (1m/5m/15m/1h/6h/24h), max requests, action (reject or warn).
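On the client side, a rejected request should be retried only after honoring Retry-After. A minimal sketch of that bookkeeping (the retry policy here is an assumption, not official Arc client guidance):

```python
# Sketch: decide how long to wait before retrying a rate-limited
# request. Arc's reject action returns 429 with a Retry-After header
# giving the delay in seconds (standard HTTP convention).

def retry_after_seconds(status: int, headers: dict, default: float = 1.0) -> float:
    """Return 0 for non-429 responses, else the Retry-After delay."""
    if status != 429:
        return 0.0
    try:
        return float(headers.get("Retry-After", default))
    except ValueError:
        # Unparseable header value: fall back to a conservative default.
        return default

wait = retry_after_seconds(429, {"Retry-After": "12"})
# sleep for `wait` seconds, then retry the request
```

For warn-mode routes the request succeeds normally, so a client would instead watch for the X-Arc-Rate-Limit-Warning response header and log or alert rather than retry.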
- reject — Returns 429 Too Many Requests with headers:

  ```
  Retry-After: <seconds>
  X-Arc-Rate-Limit-Remaining: 0
  X-Arc-Rate-Limit-Reset: <timestamp>
  ```

  and body:

  ```json
  { "detail": "Rate limit exceeded" }
  ```

- warn — Passes the request through and adds to the response:

  ```
  X-Arc-Rate-Limit-Warning: true
  ```

## Smart Tier

Smart Tier routes each request to a model chosen by complexity, scored 0–1 from the messages content in <1ms (sync, no I/O). Configuration is a JSON array of tiers:

```json
[
  { "label": "simple",   "complexityMax": 0.35, "model": "gpt-4o-mini" },
  { "label": "standard", "complexityMax": 0.75, "model": "gpt-4o" },
  { "label": "complex",  "complexityMax": 1.00, "model": "gpt-4o" }
]
```

Arc selects the first tier where score <= complexityMax. The response "model" field reflects whichever tier model handled the request. No header or body changes are required.

Smart Tier can be configured manually in the route Config tab or enabled automatically via Auto-Tune (background shadow testing + an AI quality judge).

## Geolocation

Arc logs the country of each request. By default it uses the country of the calling server. To log the end user's country instead, forward their IP in X-Forwarded-For:

```
X-Forwarded-For: <end-user-ip>
```

Arc performs an async geo-lookup (ip-api.com) in the background — zero added latency. If multiple IPs are present (comma-separated), Arc uses the leftmost (original client) IP.

## Headers Reference

| Header            | Required | Description                                 |
|-------------------|----------|---------------------------------------------|
| `Authorization`   | Yes      | `Bearer arc_live_<your-key>`                |
| `Content-Type`    | Yes      | `application/json`                          |
| `X-Arc-Route`     | No       | Route key for route-level config            |
| `X-Arc-No-Inject` | No       | Set to "1" to skip system prompt injection  |
| `X-Forwarded-For` | No       | End-user IP for geolocation logging         |
| `X-Arc-Client-ID` | No       | Client ID for conversation memory injection |

## Error Responses

Arc returns standard HTTP status codes.
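A client can route on the status code and surface the body's detail field. The sketch below assumes the error body shape documented in this section; which statuses count as retryable is my assumption (502/504 are provider-side and often transient), not Arc's guidance.

```python
import json

# Assumption: 502/504 come from the upstream provider and may be
# transient, so they are treated as retryable; 400/401 are not.
RETRYABLE = {502, 504}

def parse_arc_error(status: int, body: str) -> tuple:
    """Return (detail message, retryable?) for a non-2xx Arc response."""
    try:
        detail = json.loads(body).get("detail", "")
    except (json.JSONDecodeError, AttributeError):
        # Non-JSON or non-object body: pass it through verbatim.
        detail = body
    return detail, status in RETRYABLE

msg, retry = parse_arc_error(502, '{"detail": "upstream error"}')
# msg == "upstream error", retry is True
```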
Error bodies follow OpenAI's error format:

```json
{ "detail": "<error message>" }
```

- 400 — Bad request (missing model, invalid JSON)
- 401 — Invalid or missing Arc key
- 502 — Provider returned an error (provider message included in detail)
- 504 — Provider request timed out

## Notes for AI Assistants

- Arc is fully OpenAI-API-compatible. Code written for OpenAI works unchanged.
- The only required change is: base_url → https://api-arc.cornerstone.sh/v1
- Arc keys start with "arc_live_" and are distinct from OpenAI keys ("sk-...")
- System prompts are injected server-side; the provider sees them as normal messages
- X-Arc-Route, X-Arc-No-Inject, X-Arc-Client-ID are Arc-specific headers; stripped before forwarding
- Memory injection is transparent: Arc modifies the messages array server-side before forwarding
- Smart Tier is transparent: the "model" in the response reflects whichever tier model was used
- Rate limit 429s follow standard HTTP conventions and include Retry-After
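The Smart Tier selection rule from the section above ("first tier where score <= complexityMax") can be sketched directly. The tier data mirrors the JSON example in that section; the complexity scores below are illustrative, since Arc's actual scorer is internal.

```python
# Sketch of Smart Tier selection: pick the first tier whose
# complexityMax covers the request's complexity score (0-1).
TIERS = [
    {"label": "simple",   "complexityMax": 0.35, "model": "gpt-4o-mini"},
    {"label": "standard", "complexityMax": 0.75, "model": "gpt-4o"},
    {"label": "complex",  "complexityMax": 1.00, "model": "gpt-4o"},
]

def select_tier(score: float, tiers: list) -> dict:
    """Return the first tier where score <= complexityMax."""
    for tier in tiers:
        if score <= tier["complexityMax"]:
            return tier
    return tiers[-1]  # defensive fallback; scores are bounded at 1.0

select_tier(0.2, TIERS)["model"]   # "gpt-4o-mini"
select_tier(0.5, TIERS)["label"]   # "standard"
```

This is why the response "model" field can differ from request to request on a Smart Tier route: the tier boundary, not the client, decides which model handles each call.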