All heavy intelligence work runs through one async endpoint pair:
- POST /intelligence/jobs: submit a job, get a jobId immediately.
- GET /intelligence/jobs/{id}: poll status; once succeeded, the response carries the result.

This page covers the lifecycle, the four supported kind values, the cache + dedup semantics, and the failure modes.
The intelligence pipeline does real work: text embeddings, multimodal vector search fan-out, relational joins, KMeans clustering, two LLM calls. Typical end-to-end latency is 5–20 seconds. The synchronous API path caps at ~29 seconds, so heavy work runs async — submit returns in under 500 ms, and the worker has a 5-minute budget.
intelligence_search / intelligence_discover take a free-text query. campaign_brief / llm_context take a structured campaign object — see the API Reference for full shapes.
Common input filters apply to all four:
- platforms: e.g. ["TIKTOK", "INSTAGRAM"]. Defaults to all.
- window: 24h | 7d | 30d | all. Defaults to all (no time filter).
- limit / topK: max items to return. Default 1000, hard cap 1000 in the MCP client (configurable in direct HTTP).
- expandQuery: opt in to LLM-backed query expansion (search-only).

A submit can return one of three things depending on dedup + cache state:
Always check the HTTP status (200 vs 202) and the optional fromCache / dedup flags before deciding whether to start polling.
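As a rough illustration of that check, the sketch below sorts a submit response into one of three outcomes. It assumes the response body is a JSON object carrying fromCache and dedup as top-level booleans, and that a deduplicated submit also answers with 202; the function name classify_submit and the outcome labels are hypothetical, not part of the API.

```python
from typing import Any, Mapping


def classify_submit(status_code: int, body: Mapping[str, Any]) -> str:
    """Classify a POST /intelligence/jobs response.

    Returns one of three hypothetical labels:
      "cached"  - result is inline, no polling needed
      "deduped" - joined an in-flight job, poll the returned jobId
      "queued"  - fresh job was enqueued, poll the returned jobId
    """
    # Cache hit: status 200 with fromCache set, result returned inline.
    if status_code == 200 and body.get("fromCache"):
        return "cached"
    # Dedup hit: an identical job is already queued or running.
    if body.get("dedup"):
        return "deduped"
    # Assumption: a newly accepted job answers 202 with a jobId.
    if status_code == 202:
        return "queued"
    raise ValueError(f"unexpected submit response: HTTP {status_code}")
```

Only the "deduped" and "queued" outcomes need a polling loop; a "cached" response can be consumed directly.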
Poll every 2–5 seconds. Status flows: queued → running → succeeded | failed.
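A minimal polling sketch for the status flow above. The HTTP call is injected as get_status so the loop stays independent of any particular client library; the status and error field names in the response body are assumptions, and poll_job is a hypothetical helper rather than part of any SDK. The 300-second default timeout mirrors the worker's 5-minute budget.

```python
import time
from typing import Any, Callable, Dict


def poll_job(
    get_status: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval_s: float = 3.0,    # within the suggested 2-5 second range
    timeout_s: float = 300.0,   # matches the 5-minute worker budget
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the job reaches succeeded or failed, or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        body = get_status(job_id)  # wraps GET /intelligence/jobs/{id}
        status = body.get("status")  # assumed field name
        if status == "succeeded":
            return body
        if status == "failed":
            raise RuntimeError(f"job {job_id} failed: {body.get('error')}")
        # queued or running: wait and try again unless we are out of time.
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {status} after {timeout_s}s")
        sleep(interval_s)
```

Passing sleep as a parameter keeps the loop easy to test and lets callers swap in their own backoff.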
Status response codes:
Submit computes inputHash = sha256(canonicalJson({ kind, input })) and checks for a recent successful job with the same hash. If one exists within the per-kind freshness window above, the cached result is returned inline (status 200, fromCache: true).
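To make the hash concrete, here is a sketch of how such a value could be computed client-side. The exact canonicalization the server uses is not specified on this page; this version assumes sorted object keys and compact separators, so its output may not match the server's inputHash byte for byte. The helper names canonical_json and input_hash are hypothetical.

```python
import hashlib
import json
from typing import Any, Mapping


def canonical_json(value: Any) -> str:
    # Assumption: "canonical" means sorted keys and no insignificant whitespace.
    return json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def input_hash(kind: str, job_input: Mapping[str, Any]) -> str:
    """Hex SHA-256 over the canonical JSON of {kind, input}."""
    payload = canonical_json({"kind": kind, "input": job_input})
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Under this sketch, two inputs that differ only in key order hash identically, while array order (for example in platforms) still changes the hash.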
To bypass the cache on a single call, set input.debug: true:
This forces a fresh worker run regardless of cache state. Use it sparingly — heavy kinds are expensive.
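A small sketch of building a submit body with the cache bypass set. It assumes the request body has the shape {kind, input}, as the hash definition suggests, and uses query as an illustrative field name for the free-text input; build_submit_body is a hypothetical helper.

```python
from typing import Any, Dict, Mapping


def build_submit_body(
    kind: str,
    job_input: Mapping[str, Any],
    bypass_cache: bool = False,
) -> Dict[str, Any]:
    """Build a POST /intelligence/jobs body, optionally forcing a fresh run."""
    body_input = dict(job_input)  # copy so the caller's input is not mutated
    if bypass_cache:
        body_input["debug"] = True  # input.debug: true skips the cache lookup
    return {"kind": kind, "input": body_input}
```

For example, build_submit_body("intelligence_search", {"query": "summer skincare"}, bypass_cache=True) yields an input object containing both the query and debug: true.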
Two concurrent identical submits will both hit the dedup check. The first creates a row and enqueues; the second observes the queued/running row and returns the same jobId with dedup: true. No second SQS message is sent — both pollers converge on the same result.
There’s a small race window where two submits arriving in the same millisecond can both miss the dedup check and both queue. In that case both jobs run independently and produce identical results — the cost is duplicate work, not data corruption. This is documented as a v1 accepted trade-off.
When a job’s result exceeds 300 KB, the worker writes it to S3 and stores a pointer in DynamoDB. The status endpoint rehydrates transparently — your client sees the full result inline either way.
If the S3 write fails, the worker writes a truncated preview envelope to DynamoDB and logs the spill failure server-side. This is rare; check for result.__truncated: true in your client if you want defensive handling.
Job rows are deleted from DynamoDB ~24 hours after completion. After that, GET /intelligence/jobs/{id} returns 410 { "error": "job result expired" }. Persist anything you need long-term on your side.
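The two defensive cases above (an expired job and a truncated preview) can be handled in one place when reading a status response. This sketch assumes the result sits under a top-level result key and that a truncated envelope carries __truncated: true inside it; JobExpired and extract_result are hypothetical names.

```python
from typing import Any, Mapping, Tuple


class JobExpired(Exception):
    """Raised when the job row is gone (HTTP 410)."""


def extract_result(status_code: int, body: Mapping[str, Any]) -> Tuple[Any, bool]:
    """Return (result, truncated) from a GET /intelligence/jobs/{id} response."""
    if status_code == 410:
        # Rows are deleted roughly 24 hours after completion.
        raise JobExpired(body.get("error", "job result expired"))
    result = body.get("result")
    # Assumption: a failed S3 spill leaves a preview flagged with __truncated.
    truncated = isinstance(result, dict) and result.get("__truncated") is True
    return result, truncated
```

A caller that gets truncated back as True might resubmit the job or fall back to a smaller limit, depending on how much of the result it needs.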
If polling-loop boilerplate is a pain — especially inside an AI coding agent — install the kinetk MCP server. It exposes three tools that wrap the same flow:
- create_context_job → POST /intelligence/jobs (handles cache hit, dedup, retries)
- get_context_job_status → cheap poll
- get_context_job_result → slim or verbose envelope

See MCP Installation.