Core Concepts
How the ingest → compile → query → generate pipeline works — the four stages of the Senso knowledge base.
For organizing your ingested sources into folders, controlling access, and managing versions, see Knowledge Base.
---
Ingestion
Ingestion is how raw sources enter your knowledge base. There are two ways to ingest:
1. File ingest — a two-step presigned-URL flow for PDFs, DOCX, TXT, and other file types
2. Raw text — ingest content directly from text or markdown via the API
Both paths end the same way: a background worker parses the raw source, splits it into chunks, and compiles it into your org's vector store for querying.
Ingesting files
POST /org/kb/upload accepts metadata for up to 10 files and returns a presigned S3 URL for each.

```python
import hashlib, os, requests

KEY = os.environ["SENSO_API_KEY"]
BASE = "https://apiv2.senso.ai/api/v1"
HEADERS = {"X-API-Key": KEY, "Content-Type": "application/json"}

file_bytes = open("lending-policy.pdf", "rb").read()
resp = requests.post(f"{BASE}/org/kb/upload", headers=HEADERS, json={
    "files": [{
        "filename": "lending-policy.pdf",
        "file_size_bytes": len(file_bytes),
        "content_type": "application/pdf",
        "content_hash_md5": hashlib.md5(file_bytes).hexdigest(),
    }]
})
result = resp.json()["results"][0]
print(result["content_id"], result["status"])
```

```bash
curl -X POST https://apiv2.senso.ai/api/v1/org/kb/upload \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "files": [
      {
        "filename": "lending-policy.pdf",
        "file_size_bytes": 245760,
        "content_type": "application/pdf",
        "content_hash_md5": "d41d8cd98f00b204e9800998ecf8427e"
      }
    ]
  }'
```

Each file in the request needs four fields:
| Field | Type | Description |
|---|---|---|
| filename | string | Original filename |
| file_size_bytes | integer | Must be >= 1 |
| content_type | string | MIME type (e.g. application/pdf) |
| content_hash_md5 | string | MD5 hex digest, exactly 32 characters |
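All four fields can be derived from the file itself. A minimal helper (the function name is our own, not part of the API) that builds the metadata object from a filename and the raw bytes:

```python
import hashlib
import mimetypes

def file_metadata(filename: str, data: bytes) -> dict:
    """Build the four required fields for a /org/kb/upload entry."""
    content_type, _ = mimetypes.guess_type(filename)
    return {
        "filename": filename,
        "file_size_bytes": len(data),
        "content_type": content_type or "application/octet-stream",
        "content_hash_md5": hashlib.md5(data).hexdigest(),  # 32-char hex digest
    }
```

Note that file_size_bytes must be >= 1 per the validation rules above, so an empty file will be rejected even though the helper will happily describe it.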
Include kb_folder_node_id in the request body to ingest files into a specific folder. See Knowledge Base for details on organizing sources.

The response tells you what happened to each file:
```json
{
  "summary": { "total": 1, "success": 1, "skipped": 0 },
  "results": [
    {
      "ingestion_run_id": "a1b2c3d4-...",
      "content_id": "e5f6a7b8-...",
      "filename": "lending-policy.pdf",
      "status": "upload_pending",
      "upload_url": "https://s3.amazonaws.com/...",
      "expires_in": 3600,
      "error": null,
      "existing_content_id": null
    }
  ]
}
```

| Status | Meaning |
|---|---|
| upload_pending | Ready — PUT the file to upload_url |
| conflict | Another ingestion run is active for this content |
| duplicate | Same file appeared twice in this request |
| invalid | Metadata failed validation |
Important: A 200 response from the ingest endpoint means your request was accepted — NOT that compilation is complete. The response will show status: "upload_pending". You must poll GET /org/content/{id} until processing_status is "complete" before querying.
Uploading to S3
PUT the raw source to the presigned URL. No API key needed — the URL is pre-authenticated:
```python
upload_url = result["upload_url"]
requests.put(upload_url, data=file_bytes)
```

```bash
curl -X PUT "https://s3.amazonaws.com/..." \
  --upload-file lending-policy.pdf
```

Once uploaded, a background worker compiles the raw source — parses it, splits it into chunks, generates vector embeddings, and indexes them. Poll GET /org/content/{id} until processing_status is "complete" before querying.
Ingesting raw text
If your raw source isn't in a file, you can ingest it directly with POST /org/kb/raw:
```python
resp = requests.post(f"{BASE}/org/kb/raw", headers=HEADERS, json={
    "title": "Lending Policy FAQ",
    "text": "# Lending Policy FAQ\n\nQ: What is the maximum LTV?...",
})
print(resp.json()["id"])
```

```bash
curl -X POST https://apiv2.senso.ai/api/v1/org/kb/raw \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"title": "Lending Policy FAQ", "text": "# Lending Policy FAQ\n\n..."}'
```

Updating existing sources
To replace a file with a new version, use PUT /org/kb/nodes/{id}/file — same presigned-URL flow, same response shape, just targets an existing document instead of creating a new one. The knowledge base recompiles automatically.
To update raw text, use PUT /org/kb/nodes/{id}/raw (full replace) or PATCH /org/kb/nodes/{id}/raw (partial update). See Knowledge Base for details.
---
Querying
Three endpoints for querying your compiled knowledge base — same underlying vector search, different output shapes. All take the same request body:
```json
{
  "query": "string (required)",
  "max_results": 5,
  "content_ids": ["uuid", "uuid"],
  "require_scoped_ids": false
}
```

| Field | Type | Description |
|---|---|---|
| query | string | The search query. Required. |
| max_results | integer | Maximum number of chunks to return. Default: 5, max: 20. |
| content_ids | uuid[] | Restrict query to specific content items. Omit to query all compiled knowledge. |
| require_scoped_ids | boolean | When true, only returns chunks from the specified content_ids. Default: false. |
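Since all three endpoints share this body, it can be convenient to build it in one place. A minimal sketch (the helper name is ours; the clamp to 1–20 follows the max_results limits in the table above):

```python
def build_query(query, max_results=5, content_ids=None, require_scoped_ids=False):
    """Build a request body for the /org/search* endpoints."""
    body = {
        "query": query,
        # Keep max_results inside the documented range (default 5, max 20).
        "max_results": max(1, min(int(max_results), 20)),
    }
    if content_ids:
        body["content_ids"] = list(content_ids)
        body["require_scoped_ids"] = bool(require_scoped_ids)
    return body
```

You would then pass the result as the json= argument of requests.post for any of the query endpoints.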
Full query — POST /org/search
Returns matching chunks plus an AI-generated answer grounded in your compiled knowledge base. Use this when you want a ready-made answer backed by sources.
```python
resp = requests.post(f"{BASE}/org/search", headers=HEADERS, json={
    "query": "What is the maximum LTV for a home equity loan?",
    "max_results": 5,
})
data = resp.json()
print(f"Answer: {data['answer']}")
for r in data["results"]:
    print(f"  [{r['score']:.2f}] {r['title']}: {r['chunk_text'][:80]}...")
```

```bash
curl -X POST https://apiv2.senso.ai/api/v1/org/search \
  -H "X-API-Key: $SENSO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the maximum LTV for a home equity loan?",
    "max_results": 5
  }'
```

Response:
```json
{
  "query": "What is the maximum LTV for a home equity loan?",
  "answer": "The maximum loan-to-value ratio for a home equity loan is 85%...",
  "results": [
    {
      "content_id": "uuid",
      "version_id": "uuid",
      "chunk_index": 0,
      "chunk_text": "Maximum LTV of 85% applies to primary residences...",
      "score": 0.94,
      "title": "Home Equity Loan Policy v3",
      "vector_id": "string"
    }
  ],
  "total_results": 12,
  "max_results": 5,
  "processing_time_ms": 847
}
```

Context query — POST /org/search/context
Same vector search, but skips AI answer generation. Returns raw chunks directly. Use this when you want to feed compiled context into your own LLM or agent pipeline.
```python
resp = requests.post(f"{BASE}/org/search/context", headers=HEADERS, json={
    "query": "home equity loan requirements",
    "max_results": 5,
})
data = resp.json()
for r in data["results"]:
    print(f"  [{r['score']:.2f}] {r['chunk_text'][:100]}...")
```

```bash
curl -X POST https://apiv2.senso.ai/api/v1/org/search/context \
  -H "X-API-Key: $SENSO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "home equity loan requirements", "max_results": 5}'
```

Content query — POST /org/search/content
Deduplicates by content item and returns only IDs and titles. Use this when you just need to know which ingested sources are relevant.
```python
resp = requests.post(f"{BASE}/org/search/content", headers=HEADERS, json={
    "query": "home equity loan requirements",
})
data = resp.json()
for c in data["contents"]:
    print(f"  {c['content_id']}: {c['title']}")
```

```bash
curl -X POST https://apiv2.senso.ai/api/v1/org/search/content \
  -H "X-API-Key: $SENSO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "home equity loan requirements"}'
```

Streaming query — POST /org/search/stream
Same query as /org/search, but streams the answer as Server-Sent Events. Answer tokens arrive first, then sources after the answer completes.
Event sequence:
1. event: token (repeated) — individual answer tokens as they are generated
2. event: sources — search result chunks and metadata
3. event: done — stream complete
```python
import sseclient  # pip install sseclient-py

resp = requests.post(f"{BASE}/org/search/stream", headers=HEADERS, json={
    "query": "What is the refund policy?",
}, stream=True)

client = sseclient.SSEClient(resp)
for event in client.events():
    if event.event == "token":
        print(event.data, end="", flush=True)
    elif event.event == "sources":
        print(f"\n\nSources: {event.data}")
    elif event.event == "done":
        break
```

```bash
curl -N -X POST https://apiv2.senso.ai/api/v1/org/search/stream \
  -H "X-API-Key: $SENSO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the refund policy?"}'
```

---
Content generation
Note: Content generation can take 30–90 seconds depending on knowledge base size and complexity. If you hit a 504 timeout, the generation may still be processing server-side. We recommend starting with querying (which responds in under 1 second) and adding generation once your core agent loop is working.
Content generation queries your compiled knowledge base to produce new verified content — articles, FAQs, social posts — shaped by content types and grounded in your ingested raw sources.
Brand kit
Before generating content, set up your org's brand kit with PUT /org/brand-kit. The guidelines field is free-form JSON — these canonical fields are used by the content engine during generation:
```python
resp = requests.put(f"{BASE}/org/brand-kit", headers=HEADERS, json={
    "guidelines": {
        "brand_name": "Your Company",
        "brand_domain": "yourcompany.com",
        "brand_description": "What your company does in one sentence",
        "voice_and_tone": "How you want the AI to write",
        "author_persona": "Who the AI should write as",
        "global_writing_rules": ["Rule 1", "Rule 2"],
    }
})
print(resp.json()["brand_kit_id"])
```

```bash
curl -X PUT https://apiv2.senso.ai/api/v1/org/brand-kit \
  -H "X-API-Key: $SENSO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "guidelines": {
      "brand_name": "Your Company",
      "brand_domain": "yourcompany.com",
      "brand_description": "What your company does in one sentence",
      "voice_and_tone": "How you want the AI to write",
      "author_persona": "Who the AI should write as",
      "global_writing_rules": ["Rule 1", "Rule 2"]
    }
  }'
```

The brand kit is org-wide — set it once and it applies to all content generation. See Brand Kit for the full reference.
Content types
A content type is a template that defines the kind of output you want. You create them with POST /org/content-types:
```python
resp = requests.post(f"{BASE}/org/content-types", headers=HEADERS, json={
    "name": "FAQ Article",
    "config": {
        "template": "A concise FAQ article under 800 words. One question, one clear answer.",
        "cta_text": "Contact support",
        "cta_destination": "https://yourcompany.com/support",
        "writing_rules": ["Use active voice", "Include one example"],
    }
})
print(resp.json()["content_type_id"])
```

```bash
curl -X POST https://apiv2.senso.ai/api/v1/org/content-types \
  -H "X-API-Key: $SENSO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "FAQ Article",
    "config": {
      "template": "A concise FAQ article under 800 words. One question, one clear answer.",
      "cta_text": "Contact support",
      "cta_destination": "https://yourcompany.com/support",
      "writing_rules": ["Use active voice", "Include one example"]
    }
  }'
```

The config field is free-form JSON — the canonical fields above (template, cta_text, cta_destination, writing_rules) are used by the content engine during generation. Content types are reusable across any number of generation calls. See Content Types for the full reference.
Prompts
A prompt in Senso is a question you want AI to answer well about your organization — things like "What are the current mortgage rates?" or "How do I open a business account?". These are the questions that drive content generation.
Create prompts with POST /org/prompts:
```python
resp = requests.post(f"{BASE}/org/prompts", headers=HEADERS, json={
    "question_text": "What are the current mortgage rates?",
    "type": "awareness",
})
prompt_id = resp.json()["prompt_id"]
```

```bash
curl -X POST https://apiv2.senso.ai/api/v1/org/prompts \
  -H "X-API-Key: $SENSO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "question_text": "What are the current mortgage rates?",
    "type": "awareness"
  }'
```

Types: awareness, consideration, decision, evaluation — reflecting where in the customer journey the question sits.
A note on naming: In the API, prompts are also called "GEO questions" (Generative Engine Optimization). You'll see geo_question_id in request/response fields — this is the same thing as a prompt ID. The content generation endpoints use geo_question_id because they originated from the GEO workflow, but conceptually it's just "the question you want answered."
Generating content
POST /org/content-generation/sample generates content for a specific question using a specific content type:
```python
resp = requests.post(f"{BASE}/org/content-generation/sample", headers=HEADERS, json={
    "geo_question_id": prompt_id,
    "content_type_id": content_type_id,
})
gen = resp.json()
print(f"Title: {gen['seo_title']}")
print(f"Slug: {gen['url_slug']}")
print(gen["raw_markdown"][:300])
```

```bash
curl -X POST https://apiv2.senso.ai/api/v1/org/content-generation/sample \
  -H "X-API-Key: $SENSO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "geo_question_id": "prompt-uuid",
    "content_type_id": "content-type-uuid"
  }'
```

The response is a complete content item ready for review:
```json
{
  "content_id": "uuid",
  "version_id": "uuid",
  "version_num": 1,
  "raw_markdown": "# Current Mortgage Rates\n\nBased on our latest rate sheet...",
  "seo_title": "Current Mortgage Rates - Your Credit Union",
  "url_slug": "current-mortgage-rates",
  "editorial_status": "draft",
  "publish_status": "skipped"
}
```

Publishing and drafting
Once content is generated, you can publish it or save it as a draft for review:
- POST /org/content-engine/publish — publishes to configured destinations (requires geo_question_id, raw_markdown, seo_title)
- POST /org/content-engine/draft — saves as a draft for editorial review

You can also trigger a full generation run across multiple prompts with POST /org/content-generation/run, optionally scoped to specific prompt IDs.
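Given the required publish fields listed above (geo_question_id, raw_markdown, seo_title), a small helper can map a /content-generation/sample result onto the publish body. This is a sketch with a name of our own choosing; the generation response does not include geo_question_id, so the caller passes along the prompt ID it generated from:

```python
def publish_payload(generation: dict, geo_question_id: str) -> dict:
    """Assemble the body for POST /org/content-engine/publish
    from a content-generation result."""
    missing = [k for k in ("raw_markdown", "seo_title") if not generation.get(k)]
    if missing:
        raise ValueError(f"generation result missing fields: {missing}")
    return {
        "geo_question_id": geo_question_id,
        "raw_markdown": generation["raw_markdown"],
        "seo_title": generation["seo_title"],
    }
```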
The generation pipeline
The typical workflow:
1. Ingest raw sources into your knowledge base
2. Set up your brand kit (voice, persona, writing rules)
3. Create content types (your output templates)
4. Create prompts (the questions you want answered)
5. Generate — the engine queries your compiled knowledge base and produces grounded content
6. Review drafts, edit if needed
7. Publish
---
Content lifecycle
Every content item — whether ingested or generated — has a lifecycle:
Ingested raw sources move through a compilation pipeline:
upload_pending → processing → complete
Poll GET /org/content/{id} until processing_status is "complete" before querying against it.
Generated content has an editorial workflow:
draft → review → published (or rejected)
Use GET /org/content/verification to list items awaiting review. Reject with POST /org/content/versions/{versionId}/reject, restore with POST /org/content/versions/{versionId}/restore.
---
Errors
| Status | Meaning |
|---|---|
| 400 | Bad request — missing required fields or validation failure |
| 401 | Unauthorized — missing or invalid API key |
| 402 | Insufficient credits or spend limit reached |
| 404 | Resource not found |
| 409 | Conflict — another operation is in progress for this resource |
| 422 | Unprocessable — valid request but can't be fulfilled (e.g. no publishers configured) |
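A minimal client-side classifier for these codes might look like the following. The retry choice for 409 is our assumption (another operation is in progress, so waiting and retrying is usually sensible); the API itself does not prescribe a retry policy:

```python
def classify_status(status_code: int) -> str:
    """Map a Senso API status code to a suggested client action."""
    if 200 <= status_code < 300:
        return "ok"
    if status_code == 409:   # conflict: another operation in progress
        return "retry"       # assumption: back off and try again
    if status_code == 402:   # out of credits or over spend limit
        return "add_credits"
    return "fail"            # 400/401/404/422: fix the request, key, or setup
```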
