Core Concepts
How the ingest → compile → query → generate pipeline works — the four stages of the Senso knowledge base.
For organizing your ingested sources into folders, controlling access, and managing versions, see Knowledge Base.
---
Ingestion
Ingestion is how raw sources enter your knowledge base. There are two ways to ingest:
1. File ingest — a two-step presigned-URL flow for PDFs, DOCX, TXT, and other file types
2. Raw text — ingest content directly from text or markdown via the API
Both paths end the same way: a background worker parses the raw source, splits it into chunks, and compiles it into your org's vector store for querying.
Ingesting files
POST /org/kb/upload accepts metadata for up to 10 files and returns a presigned S3 URL for each.

```python
import hashlib, os, requests

KEY = os.environ["SENSO_API_KEY"]
BASE = "https://apiv2.senso.ai/api/v1"
HEADERS = {"X-API-Key": KEY, "Content-Type": "application/json"}

file_bytes = open("lending-policy.pdf", "rb").read()
resp = requests.post(f"{BASE}/org/kb/upload", headers=HEADERS, json={
    "files": [{
        "filename": "lending-policy.pdf",
        "file_size_bytes": len(file_bytes),
        "content_type": "application/pdf",
        "content_hash_md5": hashlib.md5(file_bytes).hexdigest(),
    }]
})
result = resp.json()["results"][0]
print(result["content_id"], result["status"])
```

```bash
curl -X POST https://apiv2.senso.ai/api/v1/org/kb/upload \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "files": [
      {
        "filename": "lending-policy.pdf",
        "file_size_bytes": 245760,
        "content_type": "application/pdf",
        "content_hash_md5": "d41d8cd98f00b204e9800998ecf8427e"
      }
    ]
  }'
```

Each file in the request needs four fields:
| Field | Type | Description |
|---|---|---|
| filename | string | Original filename |
| file_size_bytes | integer | Must be >= 1 |
| content_type | string | MIME type (e.g. application/pdf) |
| content_hash_md5 | string | MD5 hex digest, exactly 32 characters |
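All four fields can be derived from the file itself. A minimal helper (the function name is our own, not part of the API) that builds the metadata object from a filename and the raw bytes:

```python
import hashlib
import mimetypes

def file_metadata(filename: str, data: bytes) -> dict:
    """Build the four required fields for a /org/kb/upload entry."""
    content_type, _ = mimetypes.guess_type(filename)
    return {
        "filename": filename,
        "file_size_bytes": len(data),
        "content_type": content_type or "application/octet-stream",
        "content_hash_md5": hashlib.md5(data).hexdigest(),  # 32-char hex digest
    }
```

Note that file_size_bytes must be >= 1 per the validation rules above, so an empty file will be rejected even though the helper will happily describe it.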
Include kb_folder_node_id in the request body to ingest files into a specific folder. See Knowledge Base for details on organizing sources.

The response tells you what happened to each file:
```json
{
  "summary": { "total": 1, "success": 1, "skipped": 0 },
  "results": [
    {
      "ingestion_run_id": "a1b2c3d4-...",
      "content_id": "e5f6a7b8-...",
      "filename": "lending-policy.pdf",
      "status": "upload_pending",
      "upload_url": "https://s3.amazonaws.com/...",
      "expires_in": 3600,
      "error": null,
      "existing_content_id": null
    }
  ]
}
```

| Status | Meaning |
|---|---|
| upload_pending | Ready — PUT the file to upload_url |
| conflict | Another ingestion run is active for this content |
| duplicate | Same file appeared twice in this request |
| invalid | Metadata failed validation |
Important: A 200 response from the ingest endpoint means your request was accepted — NOT that compilation is complete. The response will show status: "upload_pending". You must poll GET /org/content/{id} until processing_status is "complete" before querying.
Uploading to S3
PUT the raw source to the presigned URL. No API key needed — the URL is pre-authenticated:
```python
upload_url = result["upload_url"]
requests.put(upload_url, data=file_bytes)
```

```bash
curl -X PUT "https://s3.amazonaws.com/..." \
  --upload-file lending-policy.pdf
```

Once uploaded, a background worker compiles the raw source — parses it, splits it into chunks, generates vector embeddings, and indexes them. Poll GET /org/content/{id} until processing_status is "complete" before querying.
Ingesting raw text
If your raw source isn't in a file, you can ingest it directly with POST /org/kb/raw:
```python
resp = requests.post(f"{BASE}/org/kb/raw", headers=HEADERS, json={
    "title": "Lending Policy FAQ",
    "text": "# Lending Policy FAQ\n\nQ: What is the maximum LTV?...",
})
print(resp.json()["id"])
```

```bash
curl -X POST https://apiv2.senso.ai/api/v1/org/kb/raw \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"title": "Lending Policy FAQ", "text": "# Lending Policy FAQ\n\n..."}'
```

Updating existing sources
To replace a file with a new version, use PUT /org/kb/nodes/{id}/file — same presigned-URL flow, same response shape, just targets an existing document instead of creating a new one. The knowledge base recompiles automatically.
To update raw text, use PUT /org/kb/nodes/{id}/raw (full replace) or PATCH /org/kb/nodes/{id}/raw (partial update). See Knowledge Base for details.
---
Querying
Three endpoints for querying your compiled knowledge base — same underlying vector search, different output shapes. All take the same request body:
```json
{
  "query": "string (required)",
  "max_results": 5,
  "content_ids": ["uuid", "uuid"],
  "require_scoped_ids": false
}
```

| Field | Type | Description |
|---|---|---|
| query | string | The search query. Required. |
| max_results | integer | Maximum number of chunks to return. Default: 5, max: 20. |
| content_ids | uuid[] | Restrict query to specific content items. Omit to query all compiled knowledge. |
| require_scoped_ids | boolean | When true, only returns chunks from the specified content_ids. Default: false. |
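Since all three endpoints share this body, it can be convenient to build it in one place. A minimal sketch (the helper name is ours; the clamp to 1–20 follows the max_results limits in the table above):

```python
def build_query(query, max_results=5, content_ids=None, require_scoped_ids=False):
    """Build a request body for the /org/search* endpoints."""
    body = {
        "query": query,
        # Keep max_results inside the documented range (default 5, max 20).
        "max_results": max(1, min(int(max_results), 20)),
    }
    if content_ids:
        body["content_ids"] = list(content_ids)
        body["require_scoped_ids"] = bool(require_scoped_ids)
    return body
```

You would then pass the result as the json= argument of requests.post for any of the query endpoints.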
Full query — POST /org/search
Returns matching chunks plus an AI-generated answer grounded in your compiled knowledge base. Use this when you want a ready-made answer backed by sources.
```python
resp = requests.post(f"{BASE}/org/search", headers=HEADERS, json={
    "query": "What is the maximum LTV for a home equity loan?",
    "max_results": 5,
})
data = resp.json()
print(f"Answer: {data['answer']}")
for r in data["results"]:
    print(f"  [{r['score']:.2f}] {r['title']}: {r['chunk_text'][:80]}...")
```

```bash
curl -X POST https://apiv2.senso.ai/api/v1/org/search \
  -H "X-API-Key: $SENSO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the maximum LTV for a home equity loan?",
    "max_results": 5
  }'
```

Response:
```json
{
  "query": "What is the maximum LTV for a home equity loan?",
  "answer": "The maximum loan-to-value ratio for a home equity loan is 85%...",
  "results": [
    {
      "content_id": "uuid",
      "version_id": "uuid",
      "chunk_index": 0,
      "chunk_text": "Maximum LTV of 85% applies to primary residences...",
      "score": 0.94,
      "title": "Home Equity Loan Policy v3",
      "vector_id": "string"
    }
  ],
  "total_results": 12,
  "max_results": 5,
  "processing_time_ms": 847
}
```

Context query — POST /org/search/context
Same vector search, but skips AI answer generation. Returns raw chunks directly. Use this when you want to feed compiled context into your own LLM or agent pipeline.
```python
resp = requests.post(f"{BASE}/org/search/context", headers=HEADERS, json={
    "query": "home equity loan requirements",
    "max_results": 5,
})
data = resp.json()
for r in data["results"]:
    print(f"  [{r['score']:.2f}] {r['chunk_text'][:100]}...")
```

```bash
curl -X POST https://apiv2.senso.ai/api/v1/org/search/context \
  -H "X-API-Key: $SENSO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "home equity loan requirements", "max_results": 5}'
```

Content query — POST /org/search/content
Deduplicates by content item and returns only IDs and titles. Use this when you just need to know which ingested sources are relevant.
```python
resp = requests.post(f"{BASE}/org/search/content", headers=HEADERS, json={
    "query": "home equity loan requirements",
})
data = resp.json()
for c in data["contents"]:
    print(f"  {c['content_id']}: {c['title']}")
```

```bash
curl -X POST https://apiv2.senso.ai/api/v1/org/search/content \
  -H "X-API-Key: $SENSO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "home equity loan requirements"}'
```

Streaming query — POST /org/search/stream
Same query as /org/search, but streams the answer as Server-Sent Events. Answer tokens arrive first, then sources after the answer completes.
Event sequence:
1. event: token (repeated) — individual answer tokens as they are generated
2. event: sources — search result chunks and metadata
3. event: done — stream complete
```python
import sseclient  # pip install sseclient-py

resp = requests.post(f"{BASE}/org/search/stream", headers=HEADERS, json={
    "query": "What is the refund policy?",
}, stream=True)

client = sseclient.SSEClient(resp)
for event in client.events():
    if event.event == "token":
        print(event.data, end="", flush=True)
    elif event.event == "sources":
        print(f"\n\nSources: {event.data}")
    elif event.event == "done":
        break
```

```bash
curl -N -X POST https://apiv2.senso.ai/api/v1/org/search/stream \
  -H "X-API-Key: $SENSO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the refund policy?"}'
```

---
Content generation
Note: Content generation can take 30–90 seconds depending on knowledge base size and complexity. If you hit a 504 timeout, the generation may still be processing server-side. We recommend starting with querying (which responds in under 1 second) and adding generation once your core agent loop is working.
Content generation queries your compiled knowledge base to produce new verified content — articles, FAQs, social posts — shaped by content types and grounded in your ingested raw sources.
Brand kit
Before generating content, set up your org's brand kit with PUT /org/brand-kit. The guidelines field is free-form JSON — these canonical fields are used by the content engine during generation:
```python
resp = requests.put(f"{BASE}/org/brand-kit", headers=HEADERS, json={
    "guidelines": {
        "brand_name": "Your Company",
        "brand_domain": "yourcompany.com",
        "brand_description": "What your company does in one sentence",
        "voice_and_tone": "How you want the AI to write",
        "author_persona": "Who the AI should write as",
        "global_writing_rules": ["Rule 1", "Rule 2"],
    }
})
print(resp.json()["brand_kit_id"])
```

```bash
curl -X PUT https://apiv2.senso.ai/api/v1/org/brand-kit \
  -H "X-API-Key: $SENSO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "guidelines": {
      "brand_name": "Your Company",
      "brand_domain": "yourcompany.com",
      "brand_description": "What your company does in one sentence",
      "voice_and_tone": "How you want the AI to write",
      "author_persona": "Who the AI should write as",
      "global_writing_rules": ["Rule 1", "Rule 2"]
    }
  }'
```

The brand kit is org-wide — set it once and it applies to all content generation. See Brand Kit for the full reference.
Content types
A content type is a template that defines the kind of output you want. You create them with POST /org/content-types:
```python
resp = requests.post(f"{BASE}/org/content-types", headers=HEADERS, json={
    "name": "FAQ Article",
    "config": {
        "template": "A concise FAQ article under 800 words. One question, one clear answer.",
        "cta_text": "Contact support",
        "cta_destination": "https://yourcompany.com/support",
        "writing_rules": ["Use active voice", "Include one example"],
    }
})
print(resp.json()["content_type_id"])
```

```bash
curl -X POST https://apiv2.senso.ai/api/v1/org/content-types \
  -H "X-API-Key: $SENSO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "FAQ Article",
    "config": {
      "template": "A concise FAQ article under 800 words. One question, one clear answer.",
      "cta_text": "Contact support",
      "cta_destination": "https://yourcompany.com/support",
      "writing_rules": ["Use active voice", "Include one example"]
    }
  }'
```

The config field is free-form JSON — the canonical fields above (template, cta_text, cta_destination, writing_rules) are used by the content engine during generation. Content types are reusable across any number of generation calls. See Content Types for the full reference.
Prompts
A prompt in Senso is a question you want AI to answer well about your organization — things like "What are the current mortgage rates?" or "How do I open a business account?". These are the questions that drive content generation.
Create prompts with POST /org/prompts:
```python
resp = requests.post(f"{BASE}/org/prompts", headers=HEADERS, json={
    "question_text": "What are the current mortgage rates?",
    "type": "awareness",
})
prompt_id = resp.json()["prompt_id"]
```

```bash
curl -X POST https://apiv2.senso.ai/api/v1/org/prompts \
  -H "X-API-Key: $SENSO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "question_text": "What are the current mortgage rates?",
    "type": "awareness"
  }'
```

Types: awareness, consideration, decision, evaluation — reflecting where in the customer journey the question sits.
A note on naming: In the API, prompts are also called "GEO questions" (Generative Engine Optimization). You'll see geo_question_id in request/response fields — this is the same thing as a prompt ID. The content generation endpoints use geo_question_id because they originated from the GEO workflow, but conceptually it's just "the question you want answered."
Generating content
POST /org/content-generation/sample generates content for a specific question using a specific content type:
```python
resp = requests.post(f"{BASE}/org/content-generation/sample", headers=HEADERS, json={
    "geo_question_id": prompt_id,
    "content_type_id": content_type_id,
})
gen = resp.json()
print(f"Title: {gen['seo_title']}")
print(f"Slug: {gen['url_slug']}")
print(gen["raw_markdown"][:300])
```

```bash
curl -X POST https://apiv2.senso.ai/api/v1/org/content-generation/sample \
  -H "X-API-Key: $SENSO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "geo_question_id": "prompt-uuid",
    "content_type_id": "content-type-uuid"
  }'
```

The response is a complete content item ready for review:
```json
{
  "content_id": "uuid",
  "version_id": "uuid",
  "version_num": 1,
  "raw_markdown": "# Current Mortgage Rates\n\nBased on our latest rate sheet...",
  "seo_title": "Current Mortgage Rates - Your Credit Union",
  "url_slug": "current-mortgage-rates",
  "editorial_status": "draft",
  "publish_status": "skipped"
}
```

Publishing and drafting
Once content is generated, you can publish it or save it as a draft for review:
- POST /org/content-engine/publish — publishes to configured destinations (requires geo_question_id, raw_markdown, seo_title)
- POST /org/content-engine/draft — saves as a draft for editorial review

You can also trigger a full generation run across multiple prompts with POST /org/content-generation/run, optionally scoped to specific prompt IDs.
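Given the required publish fields listed above (geo_question_id, raw_markdown, seo_title), a small helper can map a /content-generation/sample result onto the publish body. This is a sketch with a name of our own choosing; the generation response does not include geo_question_id, so the caller passes along the prompt ID it generated from:

```python
def publish_payload(generation: dict, geo_question_id: str) -> dict:
    """Assemble the body for POST /org/content-engine/publish
    from a content-generation result."""
    missing = [k for k in ("raw_markdown", "seo_title") if not generation.get(k)]
    if missing:
        raise ValueError(f"generation result missing fields: {missing}")
    return {
        "geo_question_id": geo_question_id,
        "raw_markdown": generation["raw_markdown"],
        "seo_title": generation["seo_title"],
    }
```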
The generation pipeline
The typical workflow:
1. Ingest raw sources into your knowledge base
2. Set up your brand kit (voice, persona, writing rules)
3. Create content types (your output templates)
4. Create prompts (the questions you want answered)
5. Generate — the engine queries your compiled knowledge base and produces grounded content
6. Review drafts, edit if needed
7. Publish
---
Content lifecycle
Every content item — whether ingested or generated — has a lifecycle:
Ingested raw sources move through a compilation pipeline:
upload_pending → processing → complete
Poll GET /org/content/{id} until processing_status is "complete" before querying against it.
Generated content has an editorial workflow:
draft → review → published (or rejected)
Use GET /org/content/verification to list items awaiting review. Reject with POST /org/content/versions/{versionId}/reject, restore with POST /org/content/versions/{versionId}/restore.
---
Errors
| Status | Meaning |
|---|---|
| 400 | Bad request — missing required fields or validation failure |
| 401 | Unauthorized — missing or invalid API key |
| 402 | Insufficient credits or spend limit reached |
| 404 | Resource not found |
| 409 | Conflict — another operation is in progress for this resource |
| 422 | Unprocessable — valid request but can't be fulfilled (e.g. no publishers configured) |
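A minimal client-side classifier for these codes might look like the following. The retry choice for 409 is our assumption (another operation is in progress, so waiting and retrying is usually sensible); the API itself does not prescribe a retry policy:

```python
def classify_status(status_code: int) -> str:
    """Map a Senso API status code to a suggested client action."""
    if 200 <= status_code < 300:
        return "ok"
    if status_code == 409:   # conflict: another operation in progress
        return "retry"       # assumption: back off and try again
    if status_code == 402:   # out of credits or over spend limit
        return "add_credits"
    return "fail"            # 400/401/404/422: fix the request, key, or setup
```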
