Platform Architecture

Enterprise-grade components for building complex AI systems

Multi-Agent Orchestration

Bot-to-bot routing with automatic detection. An orchestrator dispatches requests to specialist bots based on context. Cross-user support with revenue sharing.
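To make the dispatch step concrete, here is a minimal sketch of how an orchestrator might choose a specialist. It is an illustration and not the platform's actual code. The `Specialist` class, the `route` function, the toy two-dimensional vectors, and the 0.75 threshold are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Specialist:
    name: str
    # Hypothetical unit-length embedding of the bot's description.
    topic_vec: list[float]


def route(message_vec, specialists, threshold=0.75):
    """Return the best-matching specialist's name, or None to keep
    the request on the orchestrator. The threshold is an assumed value."""
    best_name, best_score = None, threshold
    for bot in specialists:
        # For unit-length vectors the dot product equals cosine similarity.
        score = sum(x * y for x, y in zip(message_vec, bot.topic_vec))
        if score >= best_score:
            best_name, best_score = bot.name, score
    return best_name


bots = [Specialist("billing", [1.0, 0.0]), Specialist("support", [0.0, 1.0])]
chosen = route([0.6, 0.8], bots)
```

Returning `None` when no specialist clears the threshold lets the orchestrator answer the request itself instead of forcing a weak match.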

RAG Pipeline

Retrieval-Augmented Generation on pgvector. Adaptive chunking (character, paragraph, sentence), HyDE for short queries, Jina cross-encoder reranking for multilingual precision.

30+ Models, 7 Providers

Claude, GPT-4, Gemini, Groq, Mistral, Zhipu, Qwen. Each bot picks its own model. Transparent credit-based pricing, or bring your own API key.

Knowledge Base & Documents

Upload PDF and DOCX files. LlamaParse parses them into semantic Markdown. Embeddings are generated with Jina v3 (768-dim, 89 languages) or Google text-embedding-004 and then indexed.

REST API & Widget SDK

40+ REST endpoints with JWT and API key auth. Embeddable JavaScript widget with domain whitelisting, 7 themes, custom CSS, and context variables.

Analytics & Monitoring

Dashboard with geolocation, device distribution, session history, and conversation summaries. 3-level RAG debug with cosine scores and chunk metadata.

How the Pipeline Works

From user message to response: what happens under the hood

Ingestion & Embedding

Upload documents (PDF, DOCX). LlamaParse extracts content into Markdown. Text is split into chunks (by character, paragraph, or sentence) and indexed as 768-dim vectors on pgvector.
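The three chunking modes can be sketched as follows. This is a simplified illustration: the function name `chunk_text`, the `max_chars` default, and the naive sentence regex are assumptions, and the platform's adaptive chunking is likely more involved.

```python
import re


def chunk_text(text: str, mode: str = "paragraph", max_chars: int = 500) -> list[str]:
    """Split text into chunks by character, paragraph, or sentence."""
    if mode == "character":
        # Fixed-size windows of max_chars characters.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
    if mode == "paragraph":
        # Paragraphs are separated by one or more blank lines.
        parts = re.split(r"\n\s*\n", text)
    elif mode == "sentence":
        # Naive split: break after '.', '!' or '?' followed by whitespace.
        parts = re.split(r"(?<=[.!?])\s+", text)
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Drop empty pieces and surrounding whitespace.
    return [p.strip() for p in parts if p.strip()]
```

Each resulting chunk would then be passed to the embedding model and stored as one vector row.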

Retrieval & Routing

The message is converted into an embedding and compared by cosine similarity against the indexed chunks. For short queries, HyDE generates a hypothetical document to improve matching. The Jina reranker then reorders the results. If specialist bots are connected, semantic routing activates them automatically.
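The similarity search and the short-query check can be sketched in plain Python. In production the search would run inside pgvector, so this in-memory version only illustrates the ranking logic. The names `cosine`, `top_k`, and `needs_hyde`, and the four-word cutoff, are assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec, chunks, k=3):
    """Rank (text, vector) pairs by similarity to the query, highest first."""
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def needs_hyde(query, min_words=4):
    """Assumed heuristic: treat queries under min_words words as short."""
    return len(query.split()) < min_words


chunks = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
results = top_k([1.0, 0.0], chunks, k=2)
```

When `needs_hyde` returns true, the hypothetical document generated by the model would be embedded and used as the query vector in place of the raw message.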

Generation & Response

Relevant chunks are injected into the system prompt. The model selected for the bot (for example Claude, GPT-4, Gemini, Mistral, or a model served by Groq) generates the response. Analytics logs the session, device, geolocation, and a conversation summary.
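A minimal sketch of the injection step, assuming the chunks arrive already ranked. The function name `build_system_prompt`, the `Context:` label, the numbered format, and the character budget are illustrative choices and not the platform's actual prompt template.

```python
def build_system_prompt(base_instructions, chunks, max_chars=4000):
    """Append ranked chunks to the base instructions until the
    character budget (an assumed limit) would be exceeded."""
    sections = []
    used = 0
    for i, chunk in enumerate(chunks, start=1):
        block = f"[{i}] {chunk}"
        if used + len(block) > max_chars:
            break  # Stop before exceeding the budget; lower-ranked chunks are dropped.
        sections.append(block)
        used += len(block)
    if not sections:
        return base_instructions
    return base_instructions + "\n\nContext:\n" + "\n".join(sections)


prompt = build_system_prompt("Be concise.", ["A", "B"])
```

Numbering the chunks makes it possible for the model to cite which passage supported an answer, which pairs naturally with chunk-level debug output.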

Available AI Models

7 providers, 30+ models. Each bot picks its own. Bring your API key or use platform credits.

Anthropic

Claude Opus, Sonnet, Haiku

OpenAI

GPT-4.1, GPT-4o, o1

Google

Gemini 2.5 Flash & Pro

Groq

Ultra-fast inference

Mistral

Large & Small, EU-hosted

Zhipu & Qwen

GLM and Qwen for APAC
