Runtime architecture

AI assistant runtime for web, voice, and smart devices.

Assistant Core brings identity, prompt, model stack, voice pipeline, knowledge, memory, Tool/MCP, and the device gateway into a single runtime so assistants behave consistently across channels.

  • Web chat & browser voice
  • Voice WebSocket & MQTT
  • Knowledge, memory, Tool/MCP

[Diagram: Assistant Core runtime connecting web, apps, smart devices, voice, memory, knowledge base, and tools]

  • 5 layers: Intelligence, voice, tool, integration, and operations
  • 1 assistant: One shared configuration across channels
  • Realtime: Streaming chat, voice, and device events

What runtime layers does it cover?

Instead of wiring disconnected services together, product teams get one assistant runtime that can be configured, operated, and extended per tenant.

Assistant intelligence

Each assistant has its own identity, system prompt, model stack, knowledge base, memory, and domain.

  • Per-assistant prompt and persona
  • RAG over private knowledge
  • Long-term memory retrieved by context
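
As a rough illustration, a per-assistant configuration could be modeled as a small immutable record. This is only a sketch: the class names (`AssistantConfig`, `ModelStack`), the field names, and every value below are hypothetical, since the page does not document Assistant Core's actual schema.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ModelStack:
    # Hypothetical model identifiers, not real Assistant Core values.
    llm: str
    asr: str
    tts: str


@dataclass(frozen=True)
class AssistantConfig:
    # One record per assistant: identity, prompt, models, knowledge, memory, domain.
    name: str
    domain: str
    system_prompt: str
    models: ModelStack
    knowledge_sources: Tuple[str, ...] = ()
    memory_enabled: bool = True


support_bot = AssistantConfig(
    name="Support Bot",
    domain="support.example.com",
    system_prompt="You are a concise, friendly support agent.",
    models=ModelStack(llm="example-llm", asr="example-asr", tts="example-tts"),
    knowledge_sources=("faq.pdf", "https://example.com/docs"),
)
```

Freezing the record is one way to make sure every channel reads the same configuration rather than a locally modified copy.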

Voice runtime

A realtime voice pipeline handles everything from microphone input to spoken output, without pushing heavy reasoning onto devices.

  • VAD, ASR, LLM, and TTS in one flow
  • Browser voice for the web assistant
  • Speech-to-speech for screenless devices
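
The VAD, ASR, LLM, TTS flow can be sketched as a chain of four stages. The version below is a simplified, non-streaming illustration: `run_voice_turn` and the stub stages are hypothetical names, and the stubs just pass text through as bytes instead of calling real speech engines.

```python
from typing import Callable, Optional

# Each stage is injected as a plain callable so real engines can be swapped in.
Vad = Callable[[bytes], bool]  # does this audio chunk contain speech?
Asr = Callable[[bytes], str]   # audio -> transcript
Llm = Callable[[str], str]     # transcript -> reply text
Tts = Callable[[str], bytes]   # reply text -> audio


def run_voice_turn(audio: bytes, vad: Vad, asr: Asr, llm: Llm, tts: Tts) -> Optional[bytes]:
    """Run one voice turn; return reply audio, or None if no speech was detected."""
    if not vad(audio):
        return None
    transcript = asr(audio)
    reply = llm(transcript)
    return tts(reply)


# Stub stages for illustration only: "audio" here is just UTF-8 text.
def stub_vad(audio: bytes) -> bool:
    return len(audio) > 0


def stub_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def stub_llm(text: str) -> str:
    return f"You said: {text}"


def stub_tts(text: str) -> bytes:
    return text.encode("utf-8")


reply_audio = run_voice_turn(b"hello", stub_vad, stub_asr, stub_llm, stub_tts)
silence = run_voice_turn(b"", stub_vad, stub_asr, stub_llm, stub_tts)
```

A production pipeline would presumably stream partial results between stages to keep latency low; this sketch only shows the order of the stages and the early exit when no speech is detected.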

Tool/MCP execution

Assistants can call APIs, built-in tools, MCP servers, or device-side tools to take action.

  • Controlled tool calling
  • MCP server and endpoint connections
  • Device-side capabilities through the gateway
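
One common way to keep tool calling controlled is a per-assistant allowlist that is checked before any tool runs. The sketch below assumes that approach; `ToolRegistry`, `ToolNotAllowed`, and the example tools are hypothetical and are not taken from Assistant Core or from any MCP library.

```python
from typing import Any, Callable, Dict, Set


class ToolNotAllowed(Exception):
    """Raised when an assistant requests a tool outside its allowlist."""


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self._tools[name] = fn

    def call(self, name: str, allowed: Set[str], **kwargs: Any) -> Any:
        # Reject the call before execution if the tool is not allowlisted.
        if name not in allowed:
            raise ToolNotAllowed(name)
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](**kwargs)


registry = ToolRegistry()
registry.register("get_weather", lambda city: f"Sunny in {city}")
registry.register("unlock_door", lambda door_id: f"Unlocked {door_id}")

# This assistant may read the weather but may not actuate devices.
allowed_tools = {"get_weather"}
result = registry.call("get_weather", allowed_tools, city="Hanoi")
```

Checking the allowlist before the registry lookup means a disallowed request fails the same way whether or not the tool exists.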

Integration surface

The same assistant can appear in web, apps, embedded widgets, API clients, WebSocket voice, or IoT hardware.

  • Chat API for web and apps
  • Voice WebSocket for realtime audio
  • MQTT gateway for smart devices

Operations layer

Admins manage assistants, users, roles, devices, conversations, quotas, and operational state.

  • Per-assistant RBAC
  • Dashboard and conversation history
  • Device activation and owner binding
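
Per-assistant RBAC implies that a role is granted for a specific user and assistant pair, not globally. A minimal sketch of that idea follows; the role names, permission names, and the `can` helper are all hypothetical.

```python
from typing import Dict, Set, Tuple

# Hypothetical roles and the permissions each one carries.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "owner": {"edit_prompt", "view_conversations", "manage_devices"},
    "operator": {"view_conversations", "manage_devices"},
    "viewer": {"view_conversations"},
}

# Grants are keyed by (user_id, assistant_id), so the same user
# can hold different roles on different assistants.
grants: Dict[Tuple[str, str], str] = {
    ("alice", "support-bot"): "owner",
    ("alice", "sales-bot"): "viewer",
}


def can(user_id: str, assistant_id: str, permission: str) -> bool:
    """Return True if the user's role on this assistant includes the permission."""
    role = grants.get((user_id, assistant_id))
    if role is None:
        return False  # no grant on this assistant means no access
    return permission in ROLE_PERMISSIONS.get(role, set())
```

Here `can("alice", "support-bot", "edit_prompt")` is true, while the same check against `"sales-bot"` is false because Alice is only a viewer there.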

Multi-tenant foundation

Each assistant is its own runtime, with an isolated domain, configuration, data, and access control.

  • Request-level tenant isolation
  • Dedicated domains and branding
  • Cached assistant context by domain
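
Request-level isolation with a domain-keyed cache might look like the sketch below: each request's host is mapped to an assistant context, and repeated lookups for the same domain are served from memory. The in-memory table, `resolve_assistant`, and the sample domains are assumptions for illustration; the only real API used is Python's `functools.lru_cache`.

```python
from functools import lru_cache
from typing import Dict, Optional, Tuple

# Stand-in for a database table mapping domains to assistants (hypothetical data).
_ASSISTANTS_BY_DOMAIN: Dict[str, Dict[str, str]] = {
    "support.example.com": {"assistant_id": "support-bot", "brand": "Acme Support"},
    "sales.example.com": {"assistant_id": "sales-bot", "brand": "Acme Sales"},
}

lookup_count = 0  # counts simulated database reads


@lru_cache(maxsize=1024)
def _load_context(domain: str) -> Optional[Tuple[Tuple[str, str], ...]]:
    global lookup_count
    lookup_count += 1
    record = _ASSISTANTS_BY_DOMAIN.get(domain)
    # Cache an immutable tuple so callers cannot mutate the cached entry.
    return tuple(sorted(record.items())) if record else None


def resolve_assistant(host_header: str) -> Dict[str, str]:
    """Map a request's Host header to its assistant context."""
    domain = host_header.split(":")[0].lower()  # strip port, normalize case
    cached = _load_context(domain)
    if cached is None:
        raise LookupError(f"no assistant for domain {domain!r}")
    return dict(cached)


first = resolve_assistant("Support.Example.com:443")
second = resolve_assistant("support.example.com")
```

Both calls normalize to the same domain, so the simulated database is read only once. A real deployment would also need cache invalidation when an assistant's configuration changes, which this sketch leaves out.
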

From one assistant to many touchpoints

The runtime is designed so teams can start on web, then expand to voice, devices, and automation without recreating the assistant.

01. Create assistant and prompt

Define the assistant name, URL, system prompt, model stack, and brand voice.

02. Add knowledge and memory

Add documents, websites, or internal data so answers stay grounded and important context is remembered.

03. Connect Tool/MCP

Let the assistant call APIs, workflows, business data, or capabilities exposed by connected devices.

04. Deploy web, voice, and device

Open the web assistant, embed the widget, use Voice WebSocket, or pair hardware through the MQTT gateway.

How is this different from a chatbot UI?

A chatbot UI usually covers only the conversation surface. The Assistant Core runtime also manages the prompt, models, voice pipeline, memory, knowledge base, Tool/MCP, device connections, roles, and operational observability.

Do smart devices need to run a local model?

Not necessarily. A device can keep the microphone, speaker, sensors, or actuators while the cloud runtime handles ASR, LLM reasoning, memory, tools, and TTS.

Can one assistant be shared across web and devices?

Yes. That is the main goal: one assistant shares its prompt, knowledge, memory, tools, and policy across the web assistant, browser voice, embedded widgets, and smart devices.