Stackon

Ground every run in your own docs.

Paste in your ADRs, RFCs, and runbooks once. Stackon chunks and embeds them, then retrieves the most relevant pieces before each agent acts — so answers come from your team's decisions, not the model's guesses.

knowledge · retrieval · pgvector
query: policy on rate limiting third-party API calls? · top 6
0.83 · docs/adrs/004-rate-limiting.md

Third-party calls are capped at 60 rpm per token; retries use exponential backoff with a 3-attempt budget…

0.71 · rfc/auth-refresh.md

Refresh tokens rotate on every use; the prior token stays valid for a 30s grace window to absorb races…

0.58 · runbooks/oncall.md

On a 429 storm, shed load at the proxy before paging — the budget breach webhook fires automatically…

3 chunks → # Relevant team context · cited by source
ADR + RFC + runbook ingest · pgvector cosine retrieval · Grounded, cited by source

01

Ingest anything your team wrote down

Drop in an ADR, an RFC, a postmortem, a runbook, or any other text. Stackon splits it on paragraph boundaries into ~500-token chunks with a 50-token overlap, embeds each one with OpenAI text-embedding-3-small, and indexes the 1536-dim vectors in pgvector. Each chunk can carry a citation back to its source, so you can trace where an answer came from.
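
To make the chunking step concrete, here is a minimal sketch of paragraph-aware chunking with overlap. It is an illustration under stated assumptions, not Stackon's implementation: the function name `chunk_text` is hypothetical, and tokens are approximated by whitespace-separated words where a real pipeline would count tokens with the embedding model's tokenizer.

```python
from typing import List


def chunk_text(text: str, max_tokens: int = 500, overlap: int = 50) -> List[str]:
    """Pack paragraphs into chunks of at most max_tokens "tokens".

    Assumption: a token is a whitespace-separated word. Each new chunk
    starts with the last `overlap` tokens of the previous chunk.
    """
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be >= 0 and smaller than max_tokens")

    step = max_tokens - overlap
    paragraphs = [p.split() for p in text.split("\n\n") if p.strip()]

    chunks: List[List[str]] = []
    current: List[str] = []
    for words in paragraphs:
        # A paragraph larger than the budget is cut into pieces that still
        # fit next to the overlap carried over from the previous chunk.
        pieces = [words[i:i + step] for i in range(0, len(words), step)]
        for piece in pieces:
            if current and len(current) + len(piece) > max_tokens:
                chunks.append(current)
                # Carry the tail of the finished chunk into the next one.
                current = current[-overlap:] if overlap else []
            current = current + piece
    if current:
        chunks.append(current)
    return [" ".join(chunk) for chunk in chunks]
```

Splitting on blank lines keeps a paragraph intact whenever it fits in the budget, and the carried-over tail means a sentence that straddles a chunk boundary still appears in full in at least one chunk.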

02

Retrieved before the agent acts

Flip Use Knowledge on a canvas and every node queries your index before it runs. Stackon embeds the task, pulls up to six chunks above a 0.4 cosine-similarity floor, and prepends them to the system prompt as a Relevant team context block, instructing the agent to cite by source number when it leans on one.
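
The retrieval step described above can be sketched in a few lines. This is an in-memory illustration only: in a pgvector-backed system the similarity ranking would normally run in SQL. The names `retrieve` and `context_block`, the `(source, text, embedding)` tuple layout, the `[n]` numbering format, and treating the 0.4 floor as inclusive are all assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical record layout: (source, text, embedding)
Chunk = Tuple[str, str, Sequence[float]]
# A scored hit: (similarity, source, text)
Hit = Tuple[float, str, str]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec: Sequence[float], chunks: List[Chunk],
             top_k: int = 6, min_score: float = 0.4) -> List[Hit]:
    """Return at most top_k chunks scoring at or above min_score, best first."""
    scored = [(cosine(query_vec, emb), source, text) for source, text, emb in chunks]
    kept = [hit for hit in scored if hit[0] >= min_score]
    kept.sort(key=lambda hit: hit[0], reverse=True)
    return kept[:top_k]


def context_block(hits: List[Hit]) -> str:
    """Format hits as a numbered block to prepend to the system prompt."""
    lines = ["# Relevant team context"]
    for n, (score, source, text) in enumerate(hits, start=1):
        lines.append(f"[{n}] {source} (similarity {score:.2f})")
        lines.append(text)
    return "\n".join(lines)
```

Because the floor is applied before the top-k cut, a query with few good matches yields fewer than six chunks rather than padding the prompt with weak ones.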

canvas · pr-review · running
Planner · done
Coder · live
Reviewer · queued
agent.run · 3 spans · 2 / 3 nodes · streaming

03

Test retrieval before you trust it

A built-in search panel lets you query the index by hand and see exactly which chunks come back, each tagged with its similarity score and citation. No black box — you can read what the agent will read before you ever wire it into a run.

04

Trustworthy by construction

Sources and chunks are scoped to your team by row-level security, and every ingest is written to the compliance audit log. Each retrieval records the chunk IDs it used onto the run's trace span, so a grounded answer is auditable end to end — from the source you pasted to the span that consumed it.
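
As a rough picture of what recording chunk IDs on a span could look like, here is a small sketch. Everything in it is hypothetical: the `Span` class, the `record_retrieval` helper, and the attribute keys `retrieval.chunk_ids` and `retrieval.chunk_count` are illustrative names, not Stackon's trace schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Span:
    """A stand-in for a trace span: a name plus free-form attributes."""
    name: str
    attributes: Dict[str, Any] = field(default_factory=dict)


def record_retrieval(span: Span, chunk_ids: List[str]) -> None:
    """Attach the IDs of the chunks a retrieval used to the given span."""
    # Copy the list so later changes by the caller do not alter the trace.
    span.attributes["retrieval.chunk_ids"] = list(chunk_ids)
    span.attributes["retrieval.chunk_count"] = len(chunk_ids)


span = Span(name="agent.plan")
record_retrieval(span, ["chunk_a", "chunk_b"])
```

With the IDs stored on the span, an auditor can go from a run's trace to the exact chunks the agent saw, and from those chunks back to the ingested source.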

trace · run_8c4f · ok · 742ms · $0.0053
agent.plan · 742ms
tools.search_code · 86ms
llm.complete_refactor · 612ms
tools.edit_file · 78ms
evals.no_regression · 54ms
agent · llm · tool · eval · 5 spans · 3,007 tok

Embedding model: text-embedding-3-small · 1536-dim

Retrieval: pgvector cosine · top-6 · min 0.4

Chunking: ~500 tokens · 50 overlap

Speed plus trust — prove your agents got better this week.

Knowledge is one piece of Stackon, the observability-first workspace for teams running Claude and Codex. Start free and instrument your first run today.