Intelligence Jobs | KINETK Graph Service API

All heavy intelligence work runs through one async endpoint pair:

POST /intelligence/jobs — submit a job, get a jobId immediately.
GET /intelligence/jobs/{id} — poll status; once succeeded, the response carries the result.

This page covers the lifecycle, the two supported kind values, the cache + dedup semantics, and the failure modes.

Why async

The intelligence pipeline does real work: text embeddings, multimodal vector search fan-out, relational joins, KMeans clustering, two LLM calls. Typical end-to-end latency is 5–20 seconds. The synchronous API path caps at ~29 seconds, so heavy work runs async — submit returns in under 500 ms, and the worker has a 5-minute budget.

Supported kinds

`kind`	Input	Output	Cache window
`records`	`query` + filters	The matching content items	15 min
`insights`	`query` + filters (+ optional `includeIntelligenceSignals`)	Structured narrative/tag/theme signals	1 h

See the API Reference for full shapes.

Input filters apply to both the records and insights kinds:

platforms: e.g. ["TIKTOK", "INSTAGRAM"]. Defaults to all.
window: 7d | 30d | all. Required — there is no default; omitting it returns 400. Use all for no time filter. A 24h window is coming soon.
limit: the number of records to retrieve, 100–10000. Required — there is no default; omitting it returns 400, and a value above 10000 returns 400. Jobs are billed per record, so the limit is how you choose your spend (1000 is a sensible starting point).

Additionally, insights accepts an optional includeIntelligenceSignals boolean. Set it to true to also receive the LLM-written signals arrays; otherwise the result contains structured data only.
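The filter rules above can be checked client-side before spending a round trip on a 400. The following is a minimal sketch, not the service's own validation: the helper name build_job_request is hypothetical, and the request body shape ({ kind, input }) is an assumption inferred from the inputHash description on this page. It rejects any limit outside the documented 100–10000 range, although the page only states the server's behavior for values above 10000.

```python
from typing import Any, Dict, List, Optional

ALLOWED_KINDS = {"records", "insights"}
ALLOWED_WINDOWS = {"7d", "30d", "all"}  # "24h" is not available yet
MIN_LIMIT, MAX_LIMIT = 100, 10_000


def build_job_request(
    kind: str,
    query: str,
    window: str,
    limit: int,
    platforms: Optional[List[str]] = None,
    include_intelligence_signals: bool = False,
) -> Dict[str, Any]:
    """Validate the documented filters and return a submit body.

    The {"kind": ..., "input": {...}} layout is an assumption; check the
    API Reference for the authoritative shape.
    """
    if kind not in ALLOWED_KINDS:
        raise ValueError(f"unsupported kind: {kind!r}")
    if window not in ALLOWED_WINDOWS:
        raise ValueError(f"window must be one of {sorted(ALLOWED_WINDOWS)}")
    if not MIN_LIMIT <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between {MIN_LIMIT} and {MAX_LIMIT}")

    job_input: Dict[str, Any] = {"query": query, "window": window, "limit": limit}
    if platforms is not None:
        # Omitting platforms lets the service default to all platforms.
        job_input["platforms"] = list(platforms)
    if kind == "insights" and include_intelligence_signals:
        job_input["includeIntelligenceSignals"] = True
    return {"kind": kind, "input": job_input}
```

For example, build_job_request("records", "running shoes", "30d", 1000) yields a body with no platforms key, so the default of all platforms applies.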

Submit response shapes

A submit can return one of three things depending on dedup + cache state:

// 1. Fresh: new work queued, credits reserved at the estimate
HTTP 202
{ "jobId": "...", "status": "queued", "estimated_cost": 1.333, "statusUrl": "/intelligence/jobs/..." }

// 2. Dedup: identical input already running, same jobId
HTTP 202
{ "jobId": "...", "status": "queued", "dedup": true, "statusUrl": "..." }

// 3. Cache hit: succeeded result inline, no new run
HTTP 200
{ "jobId": "...", "status": "succeeded", "result": { /* full payload */ }, "fromCache": true, "charged": 1.2 }

Always check the HTTP status (200 vs 202) and the optional fromCache / dedup flags before deciding whether to start polling.
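One way to encode that check is a small classifier over the HTTP status and the parsed body. This is a sketch under the response shapes shown above; the function name classify_submit and its return labels are illustrative, not part of the API.

```python
def classify_submit(http_status: int, body: dict) -> str:
    """Map a submit response to "cache_hit", "dedup", or "fresh".

    Only "fresh" and "dedup" require polling; "cache_hit" already
    carries a finished job.
    """
    if http_status == 200 and body.get("status") == "succeeded":
        return "cache_hit"
    if http_status == 202:
        return "dedup" if body.get("dedup") else "fresh"
    # Anything else (for example 400 or 402) is not a submitted job.
    raise ValueError(f"unexpected submit response: HTTP {http_status}")
```

A caller would typically branch on the label: return the inline result for "cache_hit", and start polling statusUrl for the other two.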

Large results — presigned download URL

Responses from the job API have a size limit, so large results are not returned inline. When a job succeeds but its result is too large to return directly, the response omits the result field and instead includes a short-lived, presigned S3 URL. The client downloads the result directly from that URL.

{
  "jobId": "01931f7e-...",
  "kind": "records",
  "status": "succeeded",
  "resultStorage": "s3",
  "resultUrl": "https://...s3.amazonaws.com/jobs/<id>.json?X-Amz-Signature=...",
  "resultBytes": 8200431,
  "resultExpiresAt": "2026-06-11T05:37:09.000Z"
}

Client contract: if result is absent and resultStorage === "s3", GET the resultUrl to download the full JSON payload. The URL is valid for ~1 h (and never past the job’s own TTL — see resultExpiresAt); if it has expired, just poll this endpoint again for a fresh one.
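The client contract can be sketched as a single function that returns the payload whether it arrived inline or via S3. The name resolve_result and the injectable fetch parameter are illustrative choices made so the download step can be swapped out or tested; the default downloader uses only the Python standard library and sends no API key, on the assumption that the presigned URL carries its own authorization.

```python
import json
import urllib.request
from typing import Any, Callable


def _http_get_bytes(url: str) -> bytes:
    """Download a URL with a plain GET and return the raw body."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def resolve_result(body: dict, fetch: Callable[[str], bytes] = _http_get_bytes) -> Any:
    """Return the job result from a succeeded status response."""
    if body.get("status") != "succeeded":
        raise ValueError("job has not succeeded")
    if "result" in body:
        return body["result"]  # small result, returned inline
    if body.get("resultStorage") == "s3":
        # Large result: download the full JSON payload from the presigned URL.
        return json.loads(fetch(body["resultUrl"]))
    raise ValueError("succeeded response has neither result nor an s3 resultUrl")
```

If the download fails because the URL has expired, fetching the job status again and retrying with the fresh resultUrl follows the contract described above.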

Billing & costs

Jobs are paid in credits:

Submit reserves, completion settles. A fresh 202 includes estimated_cost — an upper bound computed from your requested limit and window, reserved at submit. The final charge is computed from the records actually returned (≤ the estimate) and the unused remainder is released. Cache hits (200) are charged at the full rate — the body’s charged field is the cost.
Insufficient balance → 402. The body includes credits_required, credits_available and a machine-readable recommendation: the largest affordable limit for your window, and (for records) per-window alternatives so you can trade freshness for volume.
Headers. Every billed response carries X-Kinetk-Credits-Used (this call) and X-Kinetk-Credits-Remaining (your balance).
insights is flat-priced: 0.5 credits per request for a fixed 3000-record scan. If the run matches fewer than 1500 records, the job fails (poll shows status: "failed" with a refine-your-query message) and the 0.5 credits are refunded in full.
Failed jobs are fully refunded. Any job that ends failed releases its entire reservation.
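The reserve-then-settle arithmetic can be illustrated with a short sketch. Both helper names are hypothetical, and the code assumes the two credit headers hold plain decimal strings, which this page does not confirm. The figures 1.333 and 1.2 are borrowed from the sample responses on this page purely for illustration.

```python
from decimal import Decimal
from typing import Mapping, Tuple


def released_credits(estimated_cost: str, charged: str) -> Decimal:
    """Credits handed back at settlement: reservation minus final charge."""
    estimate, final = Decimal(estimated_cost), Decimal(charged)
    if final > estimate:
        # The final charge is documented as never exceeding the estimate.
        raise ValueError("final charge exceeds the reserved estimate")
    return estimate - final


def credits_from_headers(headers: Mapping[str, str]) -> Tuple[Decimal, Decimal]:
    """Read (used, remaining) from a billed response's headers.

    Header names are matched case-insensitively. Values are assumed to be
    decimal strings.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    return (
        Decimal(lowered["x-kinetk-credits-used"]),
        Decimal(lowered["x-kinetk-credits-remaining"]),
    )


print(released_credits("1.333", "1.2"))  # prints 0.133
```

Decimal is used instead of float so that small credit amounts subtract exactly.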

Polling

$ curl -H "x-api-key: $API_KEY" "$API_BASE/intelligence/jobs/$JOB"

Poll every 2–5 seconds. Status flows: queued → running → succeeded | failed.

// running
{ "jobId": "...", "kind": "insights", "status": "running",
  "submittedAt": 1745859300000, "startedAt": 1745859302100 }

// succeeded
{ "jobId": "...", "kind": "insights", "status": "succeeded",
  "submittedAt": 1745859300000, "startedAt": 1745859302100, "completedAt": 1745859315400,
  "result": { /* signals payload; see API reference */ } }

// failed
{ "jobId": "...", "status": "failed",
  "error": "embedding service failed (403): permission denied",
  "submittedAt": ..., "startedAt": ..., "completedAt": ... }

Status response codes:

HTTP status	Body
`200`	Any job state (`queued` / `running` / `succeeded` / `failed`)
`404`	`{ "error": "job not found" }` — wrong `jobId`
`410`	`{ "error": "job result expired" }` — job is older than its TTL (~24 h after completion)
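The polling rules and response codes above fit into a short loop. This is a sketch, not an official client: fetch_status is a hypothetical callable that performs the GET and returns (HTTP status, parsed body), JobGone is an illustrative exception type, and the 330-second default timeout is an assumption derived from the worker's 5-minute budget plus some slack.

```python
import time
from typing import Callable, Tuple


class JobGone(Exception):
    """Raised when the job is unknown (404) or past its TTL (410)."""


def poll_job(
    fetch_status: Callable[[], Tuple[int, dict]],
    interval_s: float = 3.0,    # within the recommended 2-5 second range
    timeout_s: float = 330.0,   # assumed: 5-minute worker budget plus slack
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll until the job reaches succeeded or failed, then return the body."""
    deadline = clock() + timeout_s
    while True:
        code, body = fetch_status()
        if code in (404, 410):
            raise JobGone(body.get("error", f"HTTP {code}"))
        if code != 200:
            raise RuntimeError(f"unexpected status response: HTTP {code}")
        if body.get("status") in ("succeeded", "failed"):
            return body  # terminal state; the caller inspects result or error
        if clock() >= deadline:
            raise TimeoutError("job did not finish before the client timeout")
        sleep(interval_s)
```

Passing sleep and clock as parameters keeps the loop easy to exercise in tests without real delays.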

Cache semantics

Submit computes inputHash = sha256(canonicalJson({ kind, input })) and checks for a recent successful job with the same hash. If one exists within the per-kind freshness window above, the cached result is returned inline (status 200, fromCache: true) — charged at the full rate. To get fresher data, vary the input (e.g. a different window) or wait for the freshness window to lapse.
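To make the cache key concrete, here is a sketch of a hash built the same way. The service's canonicalJson is not specified on this page, so sorted keys with compact separators is an assumption, and hashes computed locally may not match the server's. The point is only to show that key order does not affect the hash while any change to a value does.

```python
import hashlib
import json


def input_hash(kind: str, job_input: dict) -> str:
    """sha256 over a canonical JSON encoding of {kind, input}.

    Canonical form here means sorted keys and no extra whitespace; this is
    an assumed stand-in for the service's canonicalJson.
    """
    canonical = json.dumps(
        {"kind": kind, "input": job_input},
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = input_hash("records", {"query": "shoes", "window": "7d", "limit": 1000})
b = input_hash("records", {"limit": 1000, "window": "7d", "query": "shoes"})
c = input_hash("records", {"query": "shoes", "window": "30d", "limit": 1000})
print(a == b, a == c)  # prints True False
```

Under this model, switching the window from 7d to 30d produces a different hash, which is why varying the input bypasses a cached result.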

Dedup semantics

Two concurrent identical submits will both hit the dedup check. The first creates a row and enqueues; the second observes the queued/running row and returns the same jobId with dedup: true. No second SQS message is sent — both pollers converge on the same result.

There is a small race window: two submits arriving in the same millisecond can both miss the dedup check and both queue. In that case both jobs run independently and produce identical results, so the cost is duplicate work, not data corruption. This is an accepted trade-off in v1.
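The dedup behavior can be pictured with a toy in-memory model. This is not the service's implementation: JobTable is a hypothetical stand-in for the job table and queue, and it runs single-threaded, so it shows only the normal path. In a real deployment the lookup and the write are separate steps, and that gap is where the race described above can occur.

```python
import uuid
from typing import Dict, List


class JobTable:
    """Toy stand-in for the job table plus queue (illustrative only)."""

    def __init__(self) -> None:
        self.active: Dict[str, str] = {}  # inputHash -> jobId of queued/running jobs
        self.enqueued: List[str] = []     # jobIds sent to the worker queue

    def submit(self, input_hash: str) -> dict:
        existing = self.active.get(input_hash)  # step 1: dedup check
        if existing is not None:
            # Identical input is already in flight: reuse its jobId, enqueue nothing.
            return {"jobId": existing, "status": "queued", "dedup": True}
        job_id = str(uuid.uuid4())
        self.active[input_hash] = job_id        # step 2: create the row
        self.enqueued.append(job_id)            # step 3: enqueue exactly once
        return {"jobId": job_id, "status": "queued"}


table = JobTable()
first = table.submit("abc123")
second = table.submit("abc123")
print(first["jobId"] == second["jobId"], len(table.enqueued))  # prints True 1
```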

TTL

Job rows are deleted from DynamoDB ~24 hours after completion. After that, GET /intelligence/jobs/{id} returns 410 { "error": "job result expired" }. Persist anything you need long-term on your side.

Using the MCP instead

If polling-loop boilerplate is a pain — especially inside an AI coding agent — install the kinetk MCP server. It exposes three tools that wrap the same flow:

create_context_job → POST /intelligence/jobs (handles cache hit, dedup, retries)
get_context_job_status → cheap poll
get_context_job_result → slim or verbose envelope

See MCP Installation.