Runtime architecture

AI assistant runtime for web, voice, and smart devices.

Assistant Core brings identity, prompt, model stack, voice pipeline, knowledge, memory, Tool/MCP, and device gateway into a single runtime, so assistants behave consistently across channels.

Web chat & browser voice
Voice WebSocket & MQTT
Knowledge, memory, Tool/MCP

Assistant Core runtime diagram connecting web, apps, smart devices, voice, memory, knowledge base, and tools

5 layers

Intelligence, voice, tool, integration, and operations

1 assistant

One shared configuration across channels

Realtime

Streaming chat, voice, and device events

What runtime layers does it cover?

Instead of wiring disconnected services together, product teams get one assistant runtime that can be configured, operated, and extended per tenant.

Assistant intelligence

Each assistant has its own identity, system prompt, model stack, knowledge base, memory, and domain.

Per-assistant prompt and persona
RAG over private knowledge
Long-term memory retrieved by context
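
As a rough illustration of how a per-assistant prompt, private knowledge, and long-term memory might come together, here is a minimal sketch. The `AssistantConfig` fields, the `retrieve` helper, and the word-overlap scoring are all assumptions made for this example; a real deployment would more likely use embedding-based retrieval, and none of these names come from Assistant Core itself.

```python
from dataclasses import dataclass, field


@dataclass
class AssistantConfig:
    """Hypothetical per-assistant settings; field names are illustrative only."""
    name: str
    domain: str
    system_prompt: str
    knowledge: list[str] = field(default_factory=list)  # private document snippets
    memory: list[str] = field(default_factory=list)     # long-term memory entries


def _overlap(query: str, text: str) -> int:
    """Count shared lowercase words; a toy stand-in for vector similarity."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def retrieve(query: str, items: list[str], k: int = 1) -> list[str]:
    """Return up to k items that share at least one word with the query."""
    scored = []
    for item in items:
        score = _overlap(query, item)
        if score > 0:
            scored.append((score, item))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored[:k]]


def build_prompt(cfg: AssistantConfig, user_message: str) -> str:
    """Combine persona, retrieved knowledge, and retrieved memory into one prompt."""
    parts = [cfg.system_prompt]
    for snippet in retrieve(user_message, cfg.knowledge):
        parts.append(f"Knowledge: {snippet}")
    for entry in retrieve(user_message, cfg.memory):
        parts.append(f"Memory: {entry}")
    parts.append(f"User: {user_message}")
    return "\n".join(parts)
```

Only the snippets relevant to the current message are added, which is the idea behind "retrieved by context": the persona stays fixed while the grounding changes per request.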

Voice runtime

A realtime voice pipeline handles microphone input through spoken output without pushing heavy reasoning onto devices.

VAD, ASR, LLM, and TTS in one flow
Browser voice for the web assistant
Speech-to-speech for screenless devices
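
The VAD, ASR, LLM, and TTS flow can be pictured as a chain of stages that runs entirely on the server side. The sketch below is an assumption-heavy toy: `is_speech` is a naive energy threshold rather than a real voice activity detector, and the `asr`, `llm`, and `tts` callables are placeholders for whatever engines a deployment actually uses.

```python
from typing import Callable, Iterable

Frame = bytes  # one chunk of raw audio; each byte is treated as a sample here


def is_speech(frame: Frame, threshold: int = 10) -> bool:
    """Toy VAD: a frame counts as speech if any sample reaches the threshold."""
    return bool(frame) and max(frame) >= threshold


def run_voice_turn(
    frames: Iterable[Frame],
    asr: Callable[[bytes], str],
    llm: Callable[[str], str],
    tts: Callable[[str], bytes],
) -> bytes:
    """Run one spoken turn: drop silence, transcribe, reason, synthesize."""
    voiced = [frame for frame in frames if is_speech(frame)]
    if not voiced:
        return b""  # nothing was said, so there is nothing to answer
    transcript = asr(b"".join(voiced))
    reply_text = llm(transcript)
    return tts(reply_text)
```

Because every stage is passed in as a function, the device only has to send frames and play back the returned audio, which matches the point about keeping heavy reasoning off the hardware.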

Tool/MCP execution

Assistants can call APIs, built-in tools, MCP servers, or device-side tools to take action.

Controlled tool calling
MCP server and endpoint connections
Device-side capabilities through the gateway
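
One way to read "controlled tool calling" is that each assistant carries an allowlist, and the runtime refuses any call outside it. The following sketch assumes a hypothetical `ToolRegistry` class; it does not reflect the actual Assistant Core or MCP interfaces, only the gating idea.

```python
from typing import Any, Callable


class ToolRegistry:
    """Hypothetical registry that gates tool calls by a per-assistant allowlist."""

    def __init__(self) -> None:
        self._tools: dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        """Make a tool available to the runtime under the given name."""
        self._tools[name] = fn

    def call(self, allowed: set[str], name: str, **kwargs: Any) -> Any:
        """Invoke a tool only if this assistant is allowed to use it."""
        if name not in allowed:
            raise PermissionError(f"tool {name!r} is not enabled for this assistant")
        if name not in self._tools:
            raise KeyError(f"tool {name!r} is not registered")
        return self._tools[name](**kwargs)


# Usage: the same registry serves two assistants with different allowlists.
registry = ToolRegistry()
registry.register("get_order_status", lambda order_id: f"order {order_id}: shipped")
registry.register("set_lamp", lambda on: "lamp on" if on else "lamp off")

support_tools = {"get_order_status"}
result = registry.call(support_tools, "get_order_status", order_id="A17")
```

The allowlist check runs before the lookup, so an assistant cannot learn which tools exist by probing names it has not been granted.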

Integration surface

The same assistant can appear in web, apps, embedded widgets, API clients, WebSocket voice, or IoT hardware.

Chat API for web and apps
Voice WebSocket for realtime audio
MQTT gateway for smart devices
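
To make "the same assistant on every surface" concrete, a runtime could normalize each channel's input into one event type before it reaches the assistant. Everything in this sketch is hypothetical: the chat payload keys, the MQTT topic layout `assistants/<assistant_id>/devices/<device_id>/utterance`, and the JSON body are assumptions for illustration, not the documented Assistant Core formats.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class InboundEvent:
    """Channel-independent view of one user message."""
    assistant_id: str
    channel: str
    text: str


def from_chat(payload: dict) -> InboundEvent:
    """Adapt a (hypothetical) Chat API request body."""
    return InboundEvent(payload["assistant_id"], "chat", payload["message"])


def from_mqtt(topic: str, payload: bytes) -> InboundEvent:
    """Adapt a (hypothetical) MQTT message from a smart device."""
    parts = topic.split("/")
    if len(parts) != 5 or parts[0] != "assistants" or parts[2] != "devices":
        raise ValueError(f"unexpected topic layout: {topic!r}")
    text = json.loads(payload)["text"]
    return InboundEvent(parts[1], "mqtt", text)


def handle(event: InboundEvent) -> str:
    """Placeholder for the shared assistant logic; it ignores the channel."""
    return f"[{event.assistant_id}] you said: {event.text}"
```

Since `handle` never inspects `channel`, a web user and a device user talking to the same assistant get the same behavior; only the adapters differ.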

Operations layer

Admins manage assistants, users, roles, devices, conversations, quotas, and operational state.

Per-assistant RBAC
Dashboard and conversation history
Device activation and owner binding
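
Per-assistant RBAC means a role is granted for a specific (user, assistant) pair rather than globally. The sketch below assumes three made-up roles and four made-up actions purely to show the shape of such a check; the real role names and permissions are not specified in this page.

```python
from enum import Enum


class Role(Enum):
    OWNER = "owner"
    EDITOR = "editor"
    VIEWER = "viewer"


# Hypothetical action names, chosen only for this example.
PERMISSIONS: dict[Role, set[str]] = {
    Role.OWNER: {"edit_prompt", "manage_devices", "view_conversations", "manage_roles"},
    Role.EDITOR: {"edit_prompt", "view_conversations"},
    Role.VIEWER: {"view_conversations"},
}


class AccessControl:
    """Stores one role per (user, assistant) pair and answers permission checks."""

    def __init__(self) -> None:
        self._grants: dict[tuple[str, str], Role] = {}

    def grant(self, user_id: str, assistant_id: str, role: Role) -> None:
        self._grants[(user_id, assistant_id)] = role

    def can(self, user_id: str, assistant_id: str, action: str) -> bool:
        """True only if the user holds a role on this assistant that allows the action."""
        role = self._grants.get((user_id, assistant_id))
        return role is not None and action in PERMISSIONS[role]
```

A user who owns one assistant has no implicit rights on another, because the lookup key always includes the assistant.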

Multi-tenant foundation

Each assistant is its own runtime with isolated domain, configuration, data, and access control.

Request-level tenant isolation
Dedicated domains and branding
Cached assistant context by domain
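
Resolving the tenant on every request and caching the assistant context by domain could look roughly like the sketch below. The `AssistantContextCache` class, its TTL policy, and the `loader` callback are assumptions for illustration; the page does not describe how the actual cache is keyed or expired. Host parsing here is deliberately simple and does not handle IPv6 literals.

```python
import time
from typing import Callable, Optional


class AssistantContextCache:
    """Hypothetical per-domain cache with a time-to-live."""

    def __init__(
        self,
        loader: Callable[[str], Optional[dict]],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader          # fetches context from storage, or None
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so expiry can be tested
        self._entries: dict[str, tuple[float, dict]] = {}

    @staticmethod
    def normalize(host: str) -> str:
        """Lowercase the Host header value and drop any port suffix."""
        return host.strip().lower().split(":")[0]

    def get(self, host: str) -> dict:
        """Return the assistant context for this request's domain."""
        domain = self.normalize(host)
        now = self._clock()
        cached = self._entries.get(domain)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        context = self._loader(domain)
        if context is None:
            raise LookupError(f"no assistant configured for {domain!r}")
        self._entries[domain] = (now, context)
        return context
```

Keying the cache on the normalized domain keeps one tenant's context from being served to another, and the TTL bounds how long a configuration change takes to appear.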

From one assistant to many touchpoints

The runtime is designed so teams can start on web, then expand to voice, devices, and automation without recreating the assistant.

Create assistant and prompt

Define the assistant name, URL, system prompt, model stack, and brand voice.

Add knowledge and memory

Upload documents, websites, or internal data so answers stay grounded and important context is remembered.

Connect Tool/MCP

Let the assistant call APIs, workflows, business data, or capabilities exposed by connected devices.

Deploy web, voice, and device

Open the web assistant, embed the widget, use Voice WebSocket, or pair hardware through the MQTT gateway.

Runtime FAQ

How is this different from a chatbot UI?

A chatbot UI usually covers only the conversation surface. The Assistant Core runtime also manages the prompt, models, voice pipeline, memory, knowledge base, Tool/MCP, device connections, roles, and operational observability.

Do smart devices need to run a local model?

Not necessarily. A device can keep only the microphone, speaker, sensors, or actuators while the cloud runtime handles ASR, LLM reasoning, memory, tools, and TTS.

Can one assistant be shared across web and devices?

Yes. That is the main goal: one assistant shares its prompt, knowledge, memory, tools, and policy across the web assistant, browser voice, embedded widgets, and smart devices.

Explore next

Use cases

See how the runtime applies to SaaS, businesses, creators, and smart devices.

Security

See how the runtime handles auth, RBAC, tenant isolation, and logging.

Technical blog

Read more about RAG, MQTT gateway, and multi-LLM architecture.