One engine. Every agent.

Not a chatbot — the platform that creates, tests, deploys and monitors AI agents. Everything hard is solved once in the engine. A new agent is a row in a table.

Orchestration · Evals & gates · Knowledge bases · Tools & MCP · Full tracing · Short & long-term memory · Scheduled jobs · Model, vendor & cloud agnostic

One engine, every agent — configured, not coded.

Why an engine

Every new agent used to be a new codebase. So we solved it once.

The trap

One agent after another

Every new agent meant hand-rolled tools, guardrails, evals and tracing — weeks per change.

The bet

Solve it once

Orchestration, RAG, tools, evals, memory and observability live once in the engine. Agents differ only by configuration.

A new agent is a row in a table
The agent lifecycle, managed end to end
1. Configure: model, versioned prompts, typed I/O schemas and memory, edited in the portal and stored as data.
2. Knowledge: upload sources, chunk and embed, upsert to the vector DB; scoped per agent, reindexed on schedule.
3. Tools & MCP: create MCP servers with typed tools, or connect existing ones, decoupled from the prompt.
4. Data · NL→SQL: the business DB behind NL→SQL, read under the end-user JWT with full row-level security.
5. Eval set: write or upload scenarios, each with a question, an expected answer, substring checks and a weighted rubric.
6. Evaluate & Gate: suites run with an LLM judge; compare prompts, models and configs. Promotion is blocked when the score falls below the threshold.
7. Deploy: compile-on-promote pins an immutable snapshot per environment (dev → staging → production).
8. Monitor: every request is traced in Laminar; tokens, latency and cost are rolled up in ClickHouse. Scheduled evals run on prod data.
9. Trace: every turn is a replayable span tree; sampled live chats are re-scored by a reference-free quality judge.
10. Audit: every mutation and verdict lands in an append-only, immutable log, with before/after captured and secrets masked.
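The eval-set and gate stages can be sketched in a few lines. This is a minimal illustration, not the engine's scoring code: the `Scenario` fields, the rule that a failed substring check zeroes the scenario, and the use of pre-computed judge scores in place of a live LLM judge are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Scenario:
    question: str
    expected: str                                      # reference answer a judge would see
    must_contain: list = field(default_factory=list)   # substring checks
    rubric: dict = field(default_factory=dict)         # criterion -> weight


def score_scenario(scenario: Scenario, answer: str, judge_scores: dict) -> float:
    """Return a score in [0, 1]; judge_scores maps criterion -> score in [0, 1]."""
    # Assumed policy: any failed substring check zeroes the scenario.
    if any(s.lower() not in answer.lower() for s in scenario.must_contain):
        return 0.0
    total_weight = sum(scenario.rubric.values())
    if total_weight == 0:
        return 1.0
    weighted = sum(w * judge_scores.get(c, 0.0) for c, w in scenario.rubric.items())
    return weighted / total_weight


def can_promote(scores: list, threshold: float) -> bool:
    # The gate: promotion is blocked when the suite mean is below the threshold.
    return bool(scores) and sum(scores) / len(scores) >= threshold


scenario = Scenario(
    question="Why did revenue dip in March?",
    expected="Fewer orders",
    must_contain=["March"],
    rubric={"accuracy": 3, "tone": 1},
)
score = score_scenario(
    scenario,
    "Revenue dipped in March because orders fell.",
    {"accuracy": 1.0, "tone": 0.5},
)
print(score, can_promote([score], threshold=0.9))  # 0.875 False
```

With a weight of 3 on accuracy and 1 on tone, the weighted mean is (3 × 1.0 + 1 × 0.5) / 4 = 0.875, which sits below the assumed 0.9 threshold, so promotion is refused.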
Stage 1 · Configure a profile
Model: flash-2.0 · temp 0.2
Prompt: v12 · versioned, decoupled
Input / output schema: typed · validated
Memory: session + long-term, scoped

Illustrative console — sample scores, real mechanism.
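To make "a new agent is a row in a table" concrete, here is a minimal sketch using an in-memory SQLite table. The table name `agent_profiles`, its columns and the helper functions are assumptions for illustration only, not the engine's actual schema; the sample values mirror the console above.

```python
import json
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    model: str
    temperature: float
    prompt_version: int
    memory: tuple  # e.g. ("session", "long_term")


def create_store() -> sqlite3.Connection:
    # Hypothetical schema: one row per agent, configuration stored as data.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE agent_profiles ("
        "name TEXT PRIMARY KEY, model TEXT, temperature REAL, "
        "prompt_version INTEGER, memory TEXT)"
    )
    return conn


def add_agent(conn: sqlite3.Connection, profile: Profile) -> None:
    # Creating an agent is an INSERT, not a new codebase.
    conn.execute(
        "INSERT INTO agent_profiles VALUES (?, ?, ?, ?, ?)",
        (profile.name, profile.model, profile.temperature,
         profile.prompt_version, json.dumps(list(profile.memory))),
    )


def load_profile(conn: sqlite3.Connection, name: str) -> Profile:
    # A runtime would read the active profile at request time.
    row = conn.execute(
        "SELECT name, model, temperature, prompt_version, memory "
        "FROM agent_profiles WHERE name = ?",
        (name,),
    ).fetchone()
    if row is None:
        raise KeyError(name)
    return Profile(row[0], row[1], row[2], row[3], tuple(json.loads(row[4])))


conn = create_store()
add_agent(conn, Profile("business", "flash-2.0", 0.2, 12, ("session", "long_term")))
print(load_profile(conn, "business"))
```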

Under the hood

Every conversation enters the same runtime.

One generic Runtime Agent, driven by the active profile. Channels differ — the path is the same.

Channels: every conversation enters the same runtime
  • Streaming Voice: STT / TTS
  • Web & Portal Chat: SSE
  • SMS: inbound
  • Outbound: campaigns
Four languages, full RTL — en · ar · tr · ru.
resolve & stream
tokens stream back over SSE
WAJ AI Engine: one generic Runtime Agent, profile-driven, with state, escalation and sub-agents built in.
Engine modules
Conversation Orchestrator
Drives the tool-calling loop for every agent — state, escalation and sub-agents, one generic runtime.
MCP Manager
Creates and connects MCP servers; enforces the per-profile tool whitelist.
Jobs Scheduler
Runs recurring background jobs — evals, KB refresh, translation, campaigns.
Memory
Session and long-term memory, scoped per agent — no leaks across conversations.
Observability
Captures every LLM call, tool run and retrieval as a replayable trace.
Knowledge / RAG
Bilingual embeddings and vector search, scoped and reindexed per agent.
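The per-profile tool whitelist mentioned under MCP Manager can be pictured as a check that runs before any tool is dispatched. The sketch below is an assumption about how such a check might look: the `TOOLS` registry, `call_tool` and `ToolNotAllowed` are hypothetical names, and real tools would sit behind MCP servers rather than in a local dict.

```python
from typing import Callable, Dict, FrozenSet


class ToolNotAllowed(Exception):
    """Raised when a profile asks for a tool outside its whitelist."""


# Hypothetical registry of typed tools, keyed by tool name.
TOOLS: Dict[str, Callable[..., str]] = {
    "business_performance": lambda period: f"report for {period}",
    "send_sms": lambda to, body: f"sent to {to}",
}


def call_tool(whitelist: FrozenSet[str], name: str, **kwargs) -> str:
    # Enforce the whitelist before dispatch, so a model that requests a
    # tool outside its profile gets a refusal instead of a call.
    if name not in whitelist:
        raise ToolNotAllowed(name)
    return TOOLS[name](**kwargs)


business_tools = frozenset({"business_performance"})
print(call_tool(business_tools, "business_performance", period="March"))
```

Keeping the whitelist on the profile, not in the prompt, means a tool can be granted or revoked by editing data.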
Engine integrations
MCP · Tools & actions
39+ typed tools and external MCP servers the agent can call, whitelisted per profile.
Vector DB · Knowledge / RAG
Bilingual embeddings for retrieval, scoped per agent and reindexed on schedule.
Laminar · Tracing
Every LLM call, tool run and retrieval traced as a replayable span tree.
ClickHouse · Analytics
Tokens, latency and cost rolled up per agent for monitoring and evals.
service-role & end-user JWT paths
rows return, RLS-filtered
Data plane: two isolated databases, the security & tenancy boundary
BUSINESS DB
Business data
End-user JWT; business reads under full row-level security via the user’s token.
ENGINE DB · STANDALONE
Engine data
All 22 config, eval, trace & chat tables. Service-role-only; the app enforces scoping.
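One way to read the two access paths is as a routing rule for credentials. The following is a sketch under stated assumptions: `DbCredential` and `credential_for` are hypothetical names, and in a real deployment the row-level filtering itself would be done by the database's RLS policies, not by this code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DbCredential:
    database: str  # "business" or "engine"
    role: str      # "end_user" or "service_role"
    token: str


def credential_for(database: str, user_jwt: Optional[str], service_key: str) -> DbCredential:
    if database == "business":
        # Business reads ride on the caller's own token, so the database's
        # row-level security decides what is visible.
        if not user_jwt:
            raise PermissionError("business reads require an end-user JWT")
        return DbCredential("business", "end_user", user_jwt)
    if database == "engine":
        # Engine tables are service-role-only; scoping is the app's job.
        return DbCredential("engine", "service_role", service_key)
    raise ValueError(f"unknown database: {database}")
```

The point of the split is that the service-role key never touches business data, so a bug in the engine cannot widen what an end user is allowed to read.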

Swap the model behind any agent in seconds and A/B it — no deploy.

Model-agnostic · Vendor-agnostic · Cloud-agnostic · Swap any layer, no lock-in
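A/B-ing the model behind an agent needs a stable way to assign traffic. The sketch below shows one common approach, hashing the session id into weighted buckets; it is an assumption about how such a split could work, and `pick_model` and the weights are hypothetical.

```python
import hashlib


def pick_model(session_id: str, variants: dict) -> str:
    """Deterministically map a session to a model by integer weight.

    variants maps model name -> weight, e.g. {"flash-2.0": 90, "sonnet": 10}.
    """
    total = sum(variants.values())
    if total <= 0:
        raise ValueError("variants need a positive total weight")
    # Hash the session id so the same session always lands on the same arm.
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % total
    # Sort for a stable order regardless of dict insertion order.
    for model, weight in sorted(variants.items()):
        if bucket < weight:
            return model
        bucket -= weight
    raise AssertionError("unreachable: bucket is always below total")


split = {"flash-2.0": 90, "sonnet": 10}
print(pick_model("session-123", split))
```

Because the weights live in the profile as data, changing the split is an edit, not a deploy.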
Traces, knowledge & jobs

See every conversation, all the way down.

It captures what agents do, keeps their knowledge fresh, and schedules the work that keeps quality up.

Full-depth tracing

trace 8f2c…a91d · agent: business · ✓ 3.42s · 2,650 tok
agent.run
load_profile
llm.generate
tool.business_performance
retrieval.search
llm.generate
  • Every step captured: each LLM call, tool run and function lands in a full trace tree.
  • Realtime & searchable: watch traces live; full-text search over spans.
  • Replay & compare: replay any step, swap prompt or model, compare.
  • Plain-language signals: describe a behaviour in plain words; track it across production.
  • SQL over everything: query traces, spans and costs with SQL; build dashboards.
  • Datasets from production: turn real conversations into eval datasets.
OpenTelemetry-native · one line to instrument · re-rendered inside the admin portal
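A span tree like the sample trace can be illustrated with the standard library alone. This sketch is not the Laminar or OpenTelemetry API: `Tracer`, `Span` and `render` are hypothetical names, and nesting every step directly under `agent.run` is an assumption about the sample's shape.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    name: str
    children: List["Span"] = field(default_factory=list)
    duration_s: float = 0.0


class Tracer:
    def __init__(self) -> None:
        self.root: Optional[Span] = None
        self._stack: List[Span] = []

    @contextmanager
    def span(self, name: str):
        node = Span(name)
        # Attach to the currently open span, or become the root.
        if self._stack:
            self._stack[-1].children.append(node)
        else:
            self.root = node
        self._stack.append(node)
        start = time.perf_counter()
        try:
            yield node
        finally:
            node.duration_s = time.perf_counter() - start
            self._stack.pop()


def render(span: Span, depth: int = 0) -> List[str]:
    # Indent each span by its depth to show the tree.
    lines = ["  " * depth + span.name]
    for child in span.children:
        lines.extend(render(child, depth + 1))
    return lines


tracer = Tracer()
with tracer.span("agent.run"):
    for step in ("load_profile", "llm.generate", "tool.business_performance",
                 "retrieval.search", "llm.generate"):
        with tracer.span(step):
            pass  # real work (LLM call, tool run, retrieval) would go here

print("\n".join(render(tracer.root)))
```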

Knowledge base management

Sources are managed per agent: ingest documents, search bilingually, refresh on a schedule.

Doc ingestion · Per-agent scoping · Bilingual semantic search · Scheduled refresh & reindex

Job orchestrator

Recurring work runs as tracked jobs — evals on prod data, KB updates, bulk translation, campaigns.

Scheduling · Tracked runs & history · Cooperative cancellation · Evals on prod data
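Cooperative cancellation means a job checks a flag at safe points and stops itself, instead of being killed mid-write. Here is a minimal sketch assuming a `threading.Event` as the flag; `run_job` and the document names are hypothetical.

```python
import threading
from typing import Callable, Iterable, List


def run_job(items: Iterable[str], work: Callable[[str], None],
            cancel: threading.Event) -> dict:
    """Process items one by one, checking the cancel flag between items."""
    done: List[str] = []
    for item in items:
        if cancel.is_set():
            # Stop at a safe point and record how far the run got.
            return {"status": "cancelled", "processed": done}
        work(item)
        done.append(item)
    return {"status": "completed", "processed": done}


cancel = threading.Event()


def translate(doc: str) -> None:
    if doc == "doc-2":
        cancel.set()  # simulate an operator cancelling mid-run


result = run_job(["doc-1", "doc-2", "doc-3"], translate, cancel)
print(result)  # {'status': 'cancelled', 'processed': ['doc-1', 'doc-2']}
```

The item in flight when the flag is set still finishes, which is what keeps the tracked run history consistent.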

Memory inspector

Inspect short-term memory for a single session, or long-term memory built up across a user's history — per agent, per user.

Session (short-term) · User (long-term) · Per-agent scoping · Inspectable in the portal
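Per-agent scoping of the two memory tiers can be sketched by keying every entry on the agent as well as the session or user. `MemoryStore` and its methods are hypothetical names, and an in-process dict stands in for whatever storage the engine actually uses.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class MemoryStore:
    def __init__(self) -> None:
        # Keys include the agent, so nothing is shared across agents.
        self._session: Dict[Tuple[str, str], List[str]] = defaultdict(list)
        self._user: Dict[Tuple[str, str], List[str]] = defaultdict(list)

    def remember_turn(self, agent: str, session_id: str, text: str) -> None:
        # Short-term: lives with a single session.
        self._session[(agent, session_id)].append(text)

    def remember_fact(self, agent: str, user_id: str, fact: str) -> None:
        # Long-term: accumulates across a user's history.
        self._user[(agent, user_id)].append(fact)

    def inspect(self, agent: str, session_id: str, user_id: str) -> dict:
        # .get avoids creating empty entries while inspecting.
        return {
            "session": list(self._session.get((agent, session_id), [])),
            "user": list(self._user.get((agent, user_id), [])),
        }


store = MemoryStore()
store.remember_turn("business", "sess-1", "asked about March revenue")
store.remember_fact("business", "user-7", "prefers weekly summaries")
print(store.inspect("business", "sess-1", "user-7"))
print(store.inspect("recommender", "sess-1", "user-7"))  # other agent sees nothing
```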
One workspace

Everything runs from one admin portal.

The portal is the source of truth. All configuration is stored as data, so an edited profile takes effect on the next request.

  • Edit a profile in the portal — the next request reflects it, no deploy.
  • Run pytest and eval suites from the portal, with per-test breakdowns.
  • Browse and replay sessions with tool calls, metadata and sparklines.
  • Promote per environment from /admin/env — behind the eval gate.
/admin/agents
Agent           Model       Env      Eval  Status
business        flash-2.0   prod     0.94  Live
recommender     flash-lite  prod     0.91  Live
translator      mini-4o     staging  0.89  Staging
clinical        sonnet      prod     0.96  Live
website_editor  flash-2.0   dev      -     Draft

Sample data — the real portal is server-rendered and superuser-gated.
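Promotion behind the eval gate, with an immutable snapshot pinned per environment, might look roughly like the sketch below. The `Environments` class, the canonical-JSON digest and the threshold value are assumptions chosen for illustration, not the engine's actual promote path.

```python
import hashlib
import json
from types import MappingProxyType


def compile_snapshot(profile: dict) -> dict:
    # Canonical JSON, so the same config always yields the same digest.
    blob = json.dumps(profile, sort_keys=True, separators=(",", ":"))
    return {"digest": hashlib.sha256(blob.encode("utf-8")).hexdigest(), "blob": blob}


class Environments:
    ORDER = ("dev", "staging", "production")

    def __init__(self) -> None:
        self._pinned = {}

    def promote(self, env: str, profile: dict, eval_score: float, threshold: float) -> str:
        if env not in self.ORDER:
            raise ValueError(env)
        if eval_score < threshold:
            raise PermissionError(f"eval gate: {eval_score} < {threshold}")
        snap = compile_snapshot(profile)
        self._pinned[env] = MappingProxyType(snap)  # read-only view of the snapshot
        return snap["digest"]

    def active(self, env: str) -> dict:
        # Decode a fresh copy each time; callers cannot alter the pinned blob.
        return json.loads(self._pinned[env]["blob"])


envs = Environments()
draft = {"model": "flash-2.0", "prompt_version": 12}
digest = envs.promote("staging", draft, eval_score=0.94, threshold=0.9)
draft["model"] = "sonnet"  # later edits to the draft do not touch the pinned snapshot
print(envs.active("staging")["model"])  # flash-2.0
```

Serialising at promote time is what makes the snapshot independent of the live, editable profile.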

Where teams point it

Any agent your teams dream up.

The same loop — configure, evaluate, gate, trace — ships internal copilots and customer-facing agents alike. A sample of what fits.

Bug-response coding agent

Picks up bug reports and error alerts, reproduces the issue and drafts the fix as a pull request for human review.

repo + tracker tools · MCP

Business analyst

“Why did revenue dip in March?” — answered with SQL over the warehouse, under the asker’s permissions.

NL→SQL · RLS

How-to & policy assistant

Answers procedure and policy questions from internal docs, with the source cited every time.

RAG · citations

HR helpdesk

Resolves leave, payroll and onboarding tickets; hands sensitive cases to a human with full context.

tickets · escalation

Risk assessment

Screens cases against policy checklists and drafts a scored assessment for sign-off.

rubric scoring · audit

Report automation

Compiles the weekly ops report from live data and delivers it on schedule.

scheduled jobs · charts

Support agent

Deflects routine questions across chat and voice; escalates to humans with full transcript and sentiment.

voice + chat · handoff

Outbound campaigns

Runs retention and collections calls with approved scripts — consent and guardrails baked in.

campaigns · dialer

Renewal reminders

Chases license, contract and subscription renewals before they lapse — polite, persistent, logged.

scheduled outbound

Customer data assistant

“How did my store do this week?” — customers query their own numbers, inline charts included.

NL→SQL · charts

Booking concierge

Answers services, prices and availability on each tenant’s site, in that tenant’s voice.

multi-tenant RAG

Lead qualification

Greets inbound leads, qualifies them and books the meeting with your team.

inbound sales · calendar

Stop rebuilding the plumbing.

See the engine on your use case — profiles, gates, traces and costs, in one walkthrough.

Config, not code · eval-gated promotion · full traces & cost analytics · vendor, model & cloud agnostic