
Core Concepts

How the ingest → compile → query → generate pipeline works — the four stages of the Senso knowledge base.

Everything in Senso follows one pipeline: ingest raw sources, compile them into a verified knowledge base, query the compiled knowledge, and generate grounded content from it.

For organizing your ingested sources into folders, controlling access, and managing versions, see Knowledge Base.

---

Ingestion

Ingestion is how raw sources enter your knowledge base. There are two ways to ingest:

1. File ingest — a two-step presigned-URL flow for PDFs, DOCX, TXT, and other file types
2. Raw text — ingest content directly from text or markdown via the API

Both paths end the same way: a background worker parses the raw source, splits it into chunks, and compiles it into your org's vector store for querying.

Ingesting files

POST /org/kb/upload accepts metadata for up to 10 files and returns a presigned S3 URL for each.

import hashlib, os, requests

KEY  = os.environ["SENSO_API_KEY"]
BASE = "https://apiv2.senso.ai/api/v1"
HEADERS = {"X-API-Key": KEY, "Content-Type": "application/json"}

with open("lending-policy.pdf", "rb") as f:
    file_bytes = f.read()

resp = requests.post(f"{BASE}/org/kb/upload", headers=HEADERS, json={
    "files": [{
        "filename":         "lending-policy.pdf",
        "file_size_bytes":  len(file_bytes),
        "content_type":     "application/pdf",
        "content_hash_md5": hashlib.md5(file_bytes).hexdigest(),
    }]
})
result = resp.json()["results"][0]
print(result["content_id"], result["status"])
curl -X POST https://apiv2.senso.ai/api/v1/org/kb/upload \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "files": [
      {
        "filename": "lending-policy.pdf",
        "file_size_bytes": 245760,
        "content_type": "application/pdf",
        "content_hash_md5": "d41d8cd98f00b204e9800998ecf8427e"
      }
    ]
  }'

Each file in the request needs four fields:

| Field | Type | Description |
| --- | --- | --- |
| filename | string | Original filename |
| file_size_bytes | integer | Must be >= 1 |
| content_type | string | MIME type (e.g. application/pdf) |
| content_hash_md5 | string | MD5 hex digest, exactly 32 characters |

You can also specify kb_folder_node_id in the request body to ingest files into a specific folder. See Knowledge Base for details on organizing sources.

The response tells you what happened to each file:

{
  "summary": { "total": 1, "success": 1, "skipped": 0 },
  "results": [
    {
      "ingestion_run_id": "a1b2c3d4-...",
      "content_id": "e5f6a7b8-...",
      "filename": "lending-policy.pdf",
      "status": "upload_pending",
      "upload_url": "https://s3.amazonaws.com/...",
      "expires_in": 3600,
      "error": null,
      "existing_content_id": null
    }
  ]
}

| Status | Meaning |
| --- | --- |
| upload_pending | Ready — PUT the file to upload_url |
| conflict | Another ingestion run is active for this content |
| duplicate | Same file appeared twice in this request |
| invalid | Metadata failed validation |

Important: A 200 response from the ingest endpoint means your request was accepted — NOT that compilation is complete. The response will show status: "upload_pending". You must poll GET /org/content/{id} until processing_status is "complete" before querying.

Uploading to S3

PUT the raw source to the presigned URL. No API key needed — the URL is pre-authenticated:

upload_url = result["upload_url"]
requests.put(upload_url, data=file_bytes)
curl -X PUT "https://s3.amazonaws.com/..." \
  --upload-file lending-policy.pdf

Once uploaded, a background worker compiles the raw source — parses it, splits it into chunks, generates vector embeddings, and indexes them. Poll GET /org/content/{id} until processing_status is "complete" before querying.
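
The polling step can be sketched as a small helper, assuming the GET /org/content/{id} response carries processing_status as shown above (the helper name and retry defaults are ours):

```python
import time
import requests

def poll_until_complete(base, headers, content_id, interval=5, timeout=300):
    """Poll GET /org/content/{id} until processing_status is 'complete'."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        resp = requests.get(f"{base}/org/content/{content_id}", headers=headers)
        resp.raise_for_status()
        if resp.json().get("processing_status") == "complete":
            return
        time.sleep(interval)
    raise TimeoutError(f"content {content_id} not compiled after {timeout}s")
```

Call it right after the S3 PUT; once it returns, the source is queryable.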

Ingesting raw text

If your raw source isn't in a file, you can ingest it directly with POST /org/kb/raw:

resp = requests.post(f"{BASE}/org/kb/raw", headers=HEADERS, json={
    "title": "Lending Policy FAQ",
    "text": "# Lending Policy FAQ\n\nQ: What is the maximum LTV?...",
})
print(resp.json()["id"])
curl -X POST https://apiv2.senso.ai/api/v1/org/kb/raw \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"title": "Lending Policy FAQ", "text": "# Lending Policy FAQ\n\n..."}'

Updating existing sources

To replace a file with a new version, use PUT /org/kb/nodes/{id}/file — same presigned-URL flow, same response shape, just targets an existing document instead of creating a new one. The knowledge base recompiles automatically.

To update raw text, use PUT /org/kb/nodes/{id}/raw (full replace) or PATCH /org/kb/nodes/{id}/raw (partial update). See Knowledge Base for details.
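
As a sketch, the update request can be assembled like this, assuming the PUT/PATCH body mirrors POST /org/kb/raw (the helper name and body shape are our assumptions, not confirmed API details):

```python
import os
import requests

def raw_update_request(base, node_id, title, text, partial=False):
    """Build method, URL, and body for updating a raw-text node.

    PUT /org/kb/nodes/{id}/raw fully replaces the node; PATCH updates
    it partially. The body shape is assumed to mirror POST /org/kb/raw.
    """
    method = "PATCH" if partial else "PUT"
    url = f"{base}/org/kb/nodes/{node_id}/raw"
    return method, url, {"title": title, "text": text}

if __name__ == "__main__":
    BASE = "https://apiv2.senso.ai/api/v1"
    HEADERS = {"X-API-Key": os.environ["SENSO_API_KEY"],
               "Content-Type": "application/json"}
    method, url, body = raw_update_request(
        BASE, "existing-node-uuid", "Lending Policy FAQ",
        "# Lending Policy FAQ\n\nQ: What is the maximum LTV?")
    resp = requests.request(method, url, headers=HEADERS, json=body)
    print(resp.status_code)
```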

---

Querying

Four endpoints query your compiled knowledge base — the same underlying vector search with different output shapes. All take the same request body:

{
  "query": "string (required)",
  "max_results": 5,
  "content_ids": ["uuid", "uuid"],
  "require_scoped_ids": false
}

| Field | Type | Description |
| --- | --- | --- |
| query | string | The search query. Required. |
| max_results | integer | Maximum number of chunks to return. Default: 5, max: 20. |
| content_ids | uuid[] | Restrict the query to specific content items. Omit to query all compiled knowledge. |
| require_scoped_ids | boolean | When true, only returns chunks from the specified content_ids. Default: false. |
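
For example, a context query scoped to two specific documents might look like this (the content IDs are placeholders):

```python
import os
import requests

KEY  = os.environ.get("SENSO_API_KEY", "")
BASE = "https://apiv2.senso.ai/api/v1"
HEADERS = {"X-API-Key": KEY, "Content-Type": "application/json"}

# Only chunks from these two content items can come back, because
# require_scoped_ids is true.
payload = {
    "query": "maximum LTV for home equity loans",
    "max_results": 10,
    "content_ids": ["first-content-uuid", "second-content-uuid"],
    "require_scoped_ids": True,
}

if __name__ == "__main__":
    resp = requests.post(f"{BASE}/org/search/context", headers=HEADERS, json=payload)
    for r in resp.json()["results"]:
        print(f"  [{r['score']:.2f}] {r['chunk_text'][:80]}...")
```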

Full query — POST /org/search

Returns matching chunks plus an AI-generated answer grounded in your compiled knowledge base. Use this when you want a ready-made answer backed by sources.

resp = requests.post(f"{BASE}/org/search", headers=HEADERS, json={
    "query": "What is the maximum LTV for a home equity loan?",
    "max_results": 5,
})
data = resp.json()
print(f"Answer: {data['answer']}")
for r in data["results"]:
    print(f"  [{r['score']:.2f}] {r['title']}: {r['chunk_text'][:80]}...")
curl -X POST https://apiv2.senso.ai/api/v1/org/search \
  -H "X-API-Key: $SENSO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the maximum LTV for a home equity loan?",
    "max_results": 5
  }'

Response:

{
  "query": "What is the maximum LTV for a home equity loan?",
  "answer": "The maximum loan-to-value ratio for a home equity loan is 85%...",
  "results": [
    {
      "content_id": "uuid",
      "version_id": "uuid",
      "chunk_index": 0,
      "chunk_text": "Maximum LTV of 85% applies to primary residences...",
      "score": 0.94,
      "title": "Home Equity Loan Policy v3",
      "vector_id": "string"
    }
  ],
  "total_results": 12,
  "max_results": 5,
  "processing_time_ms": 847
}

Context query — POST /org/search/context

Same vector search, but skips AI answer generation. Returns raw chunks directly. Use this when you want to feed compiled context into your own LLM or agent pipeline.

resp = requests.post(f"{BASE}/org/search/context", headers=HEADERS, json={
    "query": "home equity loan requirements",
    "max_results": 5,
})
data = resp.json()
for r in data["results"]:
    print(f"  [{r['score']:.2f}] {r['chunk_text'][:100]}...")
curl -X POST https://apiv2.senso.ai/api/v1/org/search/context \
  -H "X-API-Key: $SENSO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "home equity loan requirements", "max_results": 5}'
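
A common next step is packing the returned chunks into a prompt for your own model. A minimal sketch (the function name and formatting are ours; it expects items shaped like the results above):

```python
def build_llm_context(results, max_chars=4000):
    """Concatenate retrieved chunks into one context block, highest
    score first, stopping before max_chars is exceeded. Each result
    needs 'title', 'chunk_text', and 'score' keys, as returned by
    POST /org/search/context."""
    parts, used = [], 0
    for r in sorted(results, key=lambda r: r["score"], reverse=True):
        piece = f"[{r['title']}]\n{r['chunk_text']}"
        if used + len(piece) > max_chars:
            break
        parts.append(piece)
        used += len(piece) + 2  # account for the joining blank line
    return "\n\n".join(parts)
```

Highest-scoring chunks go first, so truncation drops the weakest evidence.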

Content query — POST /org/search/content

Deduplicates by content item and returns only IDs and titles. Use this when you just need to know which ingested sources are relevant.

resp = requests.post(f"{BASE}/org/search/content", headers=HEADERS, json={
    "query": "home equity loan requirements",
})
data = resp.json()
for c in data["contents"]:
    print(f"  {c['content_id']}: {c['title']}")
curl -X POST https://apiv2.senso.ai/api/v1/org/search/content \
  -H "X-API-Key: $SENSO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "home equity loan requirements"}'

Streaming query — POST /org/search/stream

Same query as /org/search, but streams the answer as Server-Sent Events. Answer tokens arrive first, then sources after the answer completes.

Event sequence:

1. event: token (repeated) — individual answer tokens as they are generated
2. event: sources — search result chunks and metadata
3. event: done — stream complete

import sseclient  # pip install sseclient-py

resp = requests.post(f"{BASE}/org/search/stream", headers=HEADERS, json={
    "query": "What is the refund policy?",
}, stream=True)

client = sseclient.SSEClient(resp)
for event in client.events():
    if event.event == "token":
        print(event.data, end="", flush=True)
    elif event.event == "sources":
        print(f"\n\nSources: {event.data}")
    elif event.event == "done":
        break
curl -N -X POST https://apiv2.senso.ai/api/v1/org/search/stream \
  -H "X-API-Key: $SENSO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the refund policy?"}'

---

Content generation

Note: Content generation can take 30–90 seconds depending on knowledge base size and complexity. If you hit a 504 timeout, the generation may still be processing server-side. We recommend starting with querying (which responds in under 1 second) and adding generation once your core agent loop is working.
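
If you call generation directly anyway, a client-side retry that treats 504 as still-working can help. A sketch, not an official pattern; the backoff schedule is arbitrary:

```python
import time
import requests

def generate_with_retry(url, headers, payload, attempts=3, timeout=120):
    """POST a generation request with a generous client-side timeout,
    backing off and retrying on 504 (the server may still be working)."""
    for attempt in range(attempts):
        try:
            resp = requests.post(url, headers=headers, json=payload, timeout=timeout)
        except requests.Timeout:
            resp = None
        if resp is not None and resp.status_code != 504:
            resp.raise_for_status()
            return resp.json()
        time.sleep(2 ** attempt * 5)  # 5s, 10s, 20s between attempts
    raise RuntimeError("generation did not complete after retries")
```

Pair this with a generous timeout; the server may finish the run even when an individual request times out.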

Content generation queries your compiled knowledge base to produce new verified content — articles, FAQs, social posts — shaped by content types and grounded in your ingested raw sources.

Brand kit

Before generating content, set up your org's brand kit with PUT /org/brand-kit. The guidelines field is free-form JSON — these canonical fields are used by the content engine during generation:

resp = requests.put(f"{BASE}/org/brand-kit", headers=HEADERS, json={
    "guidelines": {
        "brand_name":           "Your Company",
        "brand_domain":         "yourcompany.com",
        "brand_description":    "What your company does in one sentence",
        "voice_and_tone":       "How you want the AI to write",
        "author_persona":       "Who the AI should write as",
        "global_writing_rules": ["Rule 1", "Rule 2"],
    }
})
print(resp.json()["brand_kit_id"])
curl -X PUT https://apiv2.senso.ai/api/v1/org/brand-kit \
  -H "X-API-Key: $SENSO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "guidelines": {
      "brand_name": "Your Company",
      "brand_domain": "yourcompany.com",
      "brand_description": "What your company does in one sentence",
      "voice_and_tone": "How you want the AI to write",
      "author_persona": "Who the AI should write as",
      "global_writing_rules": ["Rule 1", "Rule 2"]
    }
  }'

The brand kit is org-wide — set it once and it applies to all content generation. See Brand Kit for the full reference.

Content types

A content type is a template that defines the kind of output you want. You create them with POST /org/content-types:

resp = requests.post(f"{BASE}/org/content-types", headers=HEADERS, json={
    "name": "FAQ Article",
    "config": {
        "template":       "A concise FAQ article under 800 words. One question, one clear answer.",
        "cta_text":       "Contact support",
        "cta_destination":"https://yourcompany.com/support",
        "writing_rules":  ["Use active voice", "Include one example"],
    }
})
print(resp.json()["content_type_id"])
curl -X POST https://apiv2.senso.ai/api/v1/org/content-types \
  -H "X-API-Key: $SENSO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "FAQ Article",
    "config": {
      "template": "A concise FAQ article under 800 words. One question, one clear answer.",
      "cta_text": "Contact support",
      "cta_destination": "https://yourcompany.com/support",
      "writing_rules": ["Use active voice", "Include one example"]
    }
  }'

The config field is free-form JSON — the canonical fields above (template, cta_text, cta_destination, writing_rules) are used by the content engine during generation. Content types are reusable across any number of generation calls. See Content Types for the full reference.

Prompts

A prompt in Senso is a question you want AI to answer well about your organization — things like "What are the current mortgage rates?" or "How do I open a business account?". These are the questions that drive content generation.

Create prompts with POST /org/prompts:

resp = requests.post(f"{BASE}/org/prompts", headers=HEADERS, json={
    "question_text": "What are the current mortgage rates?",
    "type": "awareness",
})
prompt_id = resp.json()["prompt_id"]
curl -X POST https://apiv2.senso.ai/api/v1/org/prompts \
  -H "X-API-Key: $SENSO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "question_text": "What are the current mortgage rates?",
    "type": "awareness"
  }'

Types: awareness, consideration, decision, evaluation — reflecting where in the customer journey the question sits.

A note on naming: In the API, prompts are also called "GEO questions" (Generative Engine Optimization). You'll see geo_question_id in request/response fields — this is the same thing as a prompt ID. The content generation endpoints use geo_question_id because they originated from the GEO workflow, but conceptually it's just "the question you want answered."

Generating content

POST /org/content-generation/sample generates content for a specific question using a specific content type:

resp = requests.post(f"{BASE}/org/content-generation/sample", headers=HEADERS, json={
    "geo_question_id": prompt_id,
    "content_type_id": content_type_id,
})
gen = resp.json()
print(f"Title: {gen['seo_title']}")
print(f"Slug:  {gen['url_slug']}")
print(gen["raw_markdown"][:300])
curl -X POST https://apiv2.senso.ai/api/v1/org/content-generation/sample \
  -H "X-API-Key: $SENSO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "geo_question_id": "prompt-uuid",
    "content_type_id": "content-type-uuid"
  }'

The response is a complete content item ready for review:

{
  "content_id": "uuid",
  "version_id": "uuid",
  "version_num": 1,
  "raw_markdown": "# Current Mortgage Rates\n\nBased on our latest rate sheet...",
  "seo_title": "Current Mortgage Rates - Your Credit Union",
  "url_slug": "current-mortgage-rates",
  "editorial_status": "draft",
  "publish_status": "skipped"
}

Publishing and drafting

Once content is generated, you can publish it or save it as a draft for review:

  • POST /org/content-engine/publish — publishes to configured destinations (requires geo_question_id, raw_markdown, seo_title)
  • POST /org/content-engine/draft — saves as a draft for editorial review

You can also trigger a full generation run across multiple prompts with POST /org/content-generation/run, optionally scoped to specific prompt IDs.
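
A minimal publish call might look like this; the three required fields come straight from the generation response shown earlier:

```python
import os
import requests

KEY  = os.environ.get("SENSO_API_KEY", "")
BASE = "https://apiv2.senso.ai/api/v1"
HEADERS = {"X-API-Key": KEY, "Content-Type": "application/json"}

# All three fields are required by POST /org/content-engine/publish.
payload = {
    "geo_question_id": "prompt-uuid",
    "raw_markdown": "# Current Mortgage Rates\n\nBased on our latest rate sheet...",
    "seo_title": "Current Mortgage Rates - Your Credit Union",
}

if __name__ == "__main__":
    resp = requests.post(f"{BASE}/org/content-engine/publish",
                         headers=HEADERS, json=payload)
    print(resp.status_code)  # 422 if no publishers are configured
```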

The generation pipeline

The typical workflow:

1. Ingest raw sources into your knowledge base
2. Set up your brand kit (voice, persona, writing rules)
3. Create content types (your output templates)
4. Create prompts (the questions you want answered)
5. Generate — the engine queries your compiled knowledge base and produces grounded content
6. Review drafts, edit if needed
7. Publish
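
The steps above can be sketched end to end. Error handling and the compile-status polling between steps 1 and 5 are omitted for brevity; IDs and field names follow the responses shown earlier:

```python
import os
import requests

KEY  = os.environ.get("SENSO_API_KEY", "")
BASE = "https://apiv2.senso.ai/api/v1"
HEADERS = {"X-API-Key": KEY, "Content-Type": "application/json"}

def run_pipeline():
    # 1. Ingest a raw source (in production, poll GET /org/content/{id}
    #    until processing_status is "complete" before generating)
    requests.post(f"{BASE}/org/kb/raw", headers=HEADERS, json={
        "title": "Lending Policy FAQ",
        "text": "# Lending Policy FAQ\n\nQ: What is the maximum LTV?",
    })

    # 2. Brand kit and 3. content type are set up once and reused
    requests.put(f"{BASE}/org/brand-kit", headers=HEADERS, json={
        "guidelines": {"brand_name": "Your Company"},
    })
    ct = requests.post(f"{BASE}/org/content-types", headers=HEADERS, json={
        "name": "FAQ Article",
        "config": {"template": "A concise FAQ article under 800 words."},
    }).json()

    # 4. The question you want answered
    prompt = requests.post(f"{BASE}/org/prompts", headers=HEADERS, json={
        "question_text": "What is the maximum LTV?",
        "type": "decision",
    }).json()

    # 5. Generate grounded content (steps 6-7: review, then publish)
    gen = requests.post(f"{BASE}/org/content-generation/sample", headers=HEADERS, json={
        "geo_question_id": prompt["prompt_id"],
        "content_type_id": ct["content_type_id"],
    }).json()
    return gen["raw_markdown"]

if __name__ == "__main__":
    print(run_pipeline()[:300])
```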

---

Content lifecycle

Every content item — whether ingested or generated — has a lifecycle:

Ingested raw sources move through a compilation pipeline:

upload_pending → processing → complete

Poll GET /org/content/{id} until processing_status is "complete" before querying against it.

Generated content has an editorial workflow:

draft → review → published (or rejected)

Use GET /org/content/verification to list items awaiting review. Reject with POST /org/content/versions/{versionId}/reject, restore with POST /org/content/versions/{versionId}/restore.

---

Errors

| Status | Meaning |
| --- | --- |
| 400 | Bad request — missing required fields or validation failure |
| 401 | Unauthorized — missing or invalid API key |
| 402 | Insufficient credits or spend limit reached |
| 404 | Resource not found |
| 409 | Conflict — another operation is in progress for this resource |
| 422 | Unprocessable — valid request but can't be fulfilled (e.g. no publishers configured) |