Add a New AI Provider
This is the current implementation guide for adding a provider to TomoriBot.
Read docs/pipelines/provider/ first if you need the architecture overview.
Use this guide when you are actually wiring a new provider into the codebase.
Mental Model
TomoriBot has a real provider abstraction for core chat behavior, but provider integration is not “drop a folder and you’re done”.
Today, adding a provider usually means all of the following:
- Add the provider implementation under src/providers/{providerName}/.
- Add static provider metadata in providerInfo.ts.
- Seed provider model inventory in the database.
- If the provider supports app-level runtime features beyond core chat, implement the relevant optional capability interfaces on the provider class.
Important rules:
- ProviderFactory auto-discovers provider classes.
- providerInfoRegistry auto-discovers src/providers/*/providerInfo.ts metadata and provider-owned feature implementation declarations.
- Text model defaults come from the database/cache, not from hardcoded model arrays in provider code.
- If a command only needs to know whether a provider supports a feature, use providerSupportsFeature().
- If a command needs provider-specific runtime execution, resolve a provider-owned capability instead of hardcoding provider === "google" style checks.
1. Decide Scope First
Before you write code, decide whether the new provider is:
- chat-only
- chat + tool calling
- full parity with other providers for images, embeddings, structured output, compaction, preset generation, or history extraction
Do not mark features as supported unless they work end-to-end in the app.
The source of truth for those app-level capabilities is ProviderInfo.featureSupport in src/types/provider/interfaces.ts.
2. Create the Provider Folder
Create:
- src/providers/{providerName}/providerInfo.ts
- src/providers/{providerName}/{providerName}Provider.ts
- src/providers/{providerName}/{providerName}StreamAdapter.ts
- src/providers/{providerName}/{providerName}ToolAdapter.ts
You can add extra helper files in the same folder as needed.
Use existing providers as references:
- src/providers/google/
- src/providers/openrouter/
- src/providers/novelai/
- src/providers/custom/
If the vendor exposes an OpenAI-style chat/completions API, read the OpenAI-Compatible Providers section below before cloning provider code.
3. Define Static Provider Metadata
Create providerInfo.ts and export a ProviderInfo object.
Example:
    import type { ProviderInfo } from "@/types/provider/interfaces";

    export const exampleProviderInfo: ProviderInfo = {
      name: "example",
      displayName: "Example AI",
      aliases: ["ex"],
      supportedModels: [],
      requiresApiKey: true,
      supportsStreaming: true,
      supportsFunctionCalling: true,
      supportsImages: true,
      supportsVideos: false,
      apiFamily: "openai-compatible",
      featureSupport: {
        imageGeneration: "none",
        videoGeneration: "none",
        embeddings: false,
        structuredOutput: false,
        presetGeneration: false,
        expressionInitialization: false,
        liveTokenCounting: false,
        conversationCompaction: false,
        historyExtraction: false,
      },
      supportedParams: ["temperature", "topP"] as const,
    };
Notes:
- supportedModels is usually [] because the database is the source of truth for model inventory.
- apiFamily should describe the underlying API surface, not the marketing name.
- featureSupport should reflect app-level support, not just raw vendor API capability.
- featureImplementations is optional. Use it only when this provider routes a shared runtime feature through an existing implementation key, such as imageGeneration, videoGeneration, or liveTokenCounting.
- usageCostMode: "none" is optional metadata for providers where /tool estimate cost should not report per-token usage charges.
4. Implement the Provider Class
Implement {ProviderName}Provider by extending BaseLLMProvider.
Required methods live in src/types/provider/interfaces.ts:
- getInfo()
- validateApiKey()
- formatErrorDescription()
- getTools()
- createConfig()
- streamToDiscord()
- getDefaultModel()
Recommended pattern:
- import and return the object from providerInfo.ts inside getInfo()
- keep provider-specific API key validation and user-facing error formatting inside the provider
- keep config conversion inside createConfig()
- keep streaming behavior inside the provider stream adapter
formatErrorDescription() is important. Provider-specific error formatting should stay inside the provider abstraction instead of being reimplemented in commands.
5. Implement Stream and Tool Adapters
Your stream adapter should normalize provider responses into TomoriBot’s streaming pipeline.
Your tool adapter should convert TomoriBot tool schemas and tool results into the provider’s function-calling format.
Use the existing adapters as the implementation pattern:
- src/providers/google/googleStreamAdapter.ts
- src/providers/google/googleToolAdapter.ts
- src/providers/openrouter/openrouterStreamAdapter.ts
- src/providers/openrouter/openrouterToolAdapter.ts
Important:
- provider tool conversion is not fully generic across vendors
- if you add nested tool-schema support or other serializer behavior, update all provider tool adapters consistently
- image/video placeholder contract: do not silently skip image or video parts when seesImages/seesVideos is false. The context pipeline may embed media parts even for non-vision primary models when a fallback model in the chain supports that media type. When your adapter encounters a part it cannot render, push a { type: "text", text: "[System: An image/video is attached to this message that this model cannot process.]" } entry so the model is still aware the media exists. See openaiCompatibleMessageBuilder.ts and openrouterStreamAdapter.ts for the reference pattern.
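The placeholder contract above can be sketched as follows. This is a minimal illustration, not the code in openaiCompatibleMessageBuilder.ts: the ContextPart and OutboundPart shapes and the convertParts name are hypothetical, and a real adapter would emit the vendor's own media part format instead of the generic "media" part used here.

```typescript
// Hypothetical part shapes for illustration; TomoriBot's real context types may differ.
type ContextPart =
  | { type: "text"; text: string }
  | { type: "image"; url: string }
  | { type: "video"; url: string };

// Hypothetical outbound shape; a real adapter would use the vendor's media part format.
type OutboundPart =
  | { type: "text"; text: string }
  | { type: "media"; kind: "image" | "video"; url: string };

const MEDIA_PLACEHOLDER =
  "[System: An image/video is attached to this message that this model cannot process.]";

// Convert parts for one model. Media the model cannot see becomes a text
// placeholder instead of being dropped.
function convertParts(
  parts: ContextPart[],
  caps: { seesImages: boolean; seesVideos: boolean },
): OutboundPart[] {
  const out: OutboundPart[] = [];
  for (const part of parts) {
    if (part.type === "text") {
      out.push({ type: "text", text: part.text });
      continue;
    }
    const supported = part.type === "image" ? caps.seesImages : caps.seesVideos;
    if (supported) {
      out.push({ type: "media", kind: part.type, url: part.url });
    } else {
      out.push({ type: "text", text: MEDIA_PLACEHOLDER });
    }
  }
  return out;
}
```

The output always has the same number of parts as the input, which makes the "never silently skip" rule easy to assert in a unit test.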
Reasoning output contract
If your provider exposes human-displayable reasoning, emit it through the normalized streaming contract instead of leaking it into visible reply text.
- Put displayable reasoning on ProcessedChunk.thoughts
- use kind: "summary" for concise provider-supplied summaries
- use kind: "raw" for raw but human-readable reasoning text
- if the provider hides reasoning inside tags such as <think>...</think>, strip it from visible output and surface the inner text as raw thoughts
Keep replay-only continuity fields internal:
- Gemini thoughtSignature
- OpenRouter reasoning_details
- DeepSeek continuation reasoning_content
- any vendor-specific hidden token block that is needed only for tool-call continuation
Those fields may still need to be preserved in provider-specific tool replay payloads, but they must not be exposed through thought-log output or normal Discord message flushing.
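As a rough sketch of the <think> tag rule, the helper below separates visible text from raw thoughts. The ThoughtEntry and SplitResult shapes and the splitThinkTags name are assumptions made for this example, not the real ProcessedChunk type. It also operates on a complete string; a streaming adapter would need extra buffering for tags that span chunk boundaries.

```typescript
// Hypothetical shapes; the real ProcessedChunk in TomoriBot may carry more fields.
interface ThoughtEntry {
  kind: "summary" | "raw";
  text: string;
}

interface SplitResult {
  visibleText: string;
  thoughts: ThoughtEntry[];
}

// Remove every complete <think>...</think> block from the text and
// return the inner text as "raw" thoughts.
function splitThinkTags(text: string): SplitResult {
  const thoughts: ThoughtEntry[] = [];
  const visible = text.replace(
    /<think>([\s\S]*?)<\/think>/g,
    (_match: string, inner: string) => {
      const trimmed = inner.trim();
      if (trimmed.length > 0) {
        thoughts.push({ kind: "raw", text: trimmed });
      }
      return "";
    },
  );
  return { visibleText: visible.trim(), thoughts };
}
```

For example, splitThinkTags("<think>plan A</think>Hello!") yields the visible text "Hello!" and a single raw thought containing "plan A".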
5.5 Register the MCP Tool Adapter
If your provider supports tool calling, you must register its tool adapter with
registerMCPAdapter() in src/events/clientReady/02_registerMCPs.ts.
Example:
    import { getExampleToolAdapter } from "../../providers/example/exampleToolAdapter";

    const exampleAdapter = getExampleToolAdapter();
    registerMCPAdapter(exampleAdapter);
    log.info("Registered Example tool adapter with MCP capabilities");
Why this matters:
- Tool definitions are built by getAvailableToolsWithMCP(), which queries the mcpManager directly. This works for any provider regardless of adapter registration.
- Tool execution goes through executeTool() → isMCPFunction(name, provider), which looks up the provider’s adapter in ToolRegistry.mcpAdapters.
- If the adapter is not registered, MCP tools (e.g. fetch, web-search) will appear in the LLM’s tool list but fail at execution time with “Tool not found in registry”. Built-in tools still work because they resolve through a separate code path.
This is easy to miss because the provider will appear to work for built-in tool calls. The failure only surfaces when the LLM calls an MCP-backed tool.
Guild MCP Tools
In addition to global MCP tools, each guild can register its own remote MCP servers
via /config mcp add. Your tool adapter’s getAllToolsIn*Format() method must
also inject guild MCP tools by calling getGuildMcpManager().getGuildMCPTools(serverId).
The pattern (used by Google, OpenRouter, and OpenAI-compatible adapters):
- After adding global MCP tools, check if (serverId && allowedMCPFunctions)
- Call guildMcpManager.getGuildMCPTools(serverId) to get per-guild CallableTools
- Filter declarations against allowedMCPFunctions (pre-approved by toolRegistry)
- Convert to your provider’s format and append
If this step is missing, guild-registered MCP tools will be discovered and logged but never sent to the LLM — the model won’t know they exist.
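The filter-and-append step can be illustrated with a small pure function. Everything here is a simplified assumption: ToolDeclaration stands in for whatever getGuildMCPTools(serverId) returns, allowedMCPFunctions is modeled as a Set of names, and the OpenAI-style tool wrapper is only one possible provider format.

```typescript
// Hypothetical minimal shapes; the real CallableTool and adapter types are richer.
interface ToolDeclaration {
  name: string;
  description: string;
  parameters: Record<string, unknown>;
}

interface OpenAIStyleTool {
  type: "function";
  function: ToolDeclaration;
}

// Append guild MCP tools to the already-built global tool list, keeping only
// declarations whose names were pre-approved.
function appendGuildTools(
  globalTools: OpenAIStyleTool[],
  guildDeclarations: ToolDeclaration[],
  serverId: string | undefined,
  allowedMCPFunctions: Set<string> | undefined,
): OpenAIStyleTool[] {
  // Mirrors the `if (serverId && allowedMCPFunctions)` guard described above.
  if (!serverId || !allowedMCPFunctions) {
    return globalTools;
  }
  const guildTools = guildDeclarations
    .filter((declaration) => allowedMCPFunctions.has(declaration.name))
    .map((declaration): OpenAIStyleTool => ({ type: "function", function: declaration }));
  return [...globalTools, ...guildTools];
}
```

Keeping the step as a pure function over already-fetched declarations makes it straightforward to test without a live guild MCP server.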
6. Verify Provider Metadata Discovery
ProviderFactory auto-discovers {providerName}Provider.ts, and providerInfoRegistry auto-discovers providerInfo.ts.
The metadata registry powers:
- alias normalization
- capability checks via providerSupportsFeature()
- temporary runtime execution routing via resolveProviderFeatureImplementation()
If providerInfo.ts is missing or does not export a valid ProviderInfo object, the provider may stream chat successfully but still behave as unsupported in feature-gated commands.
7. Implement Optional Runtime Capabilities When Needed
Some app features execute outside the main chat streaming path.
Current capability entry points:
- src/types/provider/featureInterfaces.ts
- src/utils/provider/providerCapabilityResolver.ts
- src/providers/utils/providerFeatureExecutors.ts
Current provider-owned examples include:
- embeddings
- structured output execution
- preset generation
- conversation compaction
- roleplay compaction
- expression initialization
- history extraction
Use this rule:
- If featureSupport.{feature} is false, no extra wiring is needed.
- If featureSupport.{feature} is true and the app executes that feature at runtime, implement the matching optional capability on the provider class.
- If a helper is truly generic, keep it in a shared utility. If it contains provider-specific HTTP calls, prompt shaping, or response parsing, keep it in src/providers/{providerName}/.
Do not scatter exact provider-name checks across commands. Put routing in the provider capability layer.
8. Seed Model Inventory
Models live in a typed catalog and are seeded into the database directly from
it at startup — there is no model seed .sql file. Add the provider’s rows to
src/db/seed/catalog/models.ts as named-field objects (every capability flag is
optional and defaults to false, so you only list the ones that are true). That
is the whole change — nothing to regenerate or compile.
    { provider: "google", codename: "gemini-2.5-flash", isDefault: true, hasTools: true, seesImages: true, seesVideos: true, seesYoutube: true, supportsStructoutput: true, desc: "Balanced model…", ja: "汎用…" },
How it runs: seedModelsFromCatalog() (in src/db/seed/catalog/modelSeed.ts) renders
each section into one INSERT … ON CONFLICT per table and executes it during
database initialization, before the persona, system prompt, and NovelAI preset
catalog seeders. Seeding is an idempotent upsert on every
startup, exactly as the old 01_models.sql was. To deprecate a model, set
isDeprecated: true on its row; to remove one, delete its row.
Tables currently modeled in the catalog:
- llmSections → llms for text/chat models
- imageSections → image_diffusion_models for native image generation models
- videoSections → video_generation_models for native video generation models
- embeddingSections → embedding_models for embedding providers/models
Use the tables that match the features you actually support.
Examples:
- a chat-only provider needs llms
- a provider with native image generation also needs image_diffusion_models
- a provider with embeddings also needs embedding_models
Do not hardcode default models in provider code when the app already resolves them from the database/cache.
Invariants (validated at startup and by bun run check-seed-catalogs)
seedModelsFromCatalog() validates the catalog and throws before any DB write
if an invariant is violated, so a malformed catalog fails fast on boot. The same
check is available offline via bun run check-seed-catalogs, which now validates every
typed seed catalog (models, personas, system prompts, and NovelAI presets):
- exactly one isDefault per provider, and that default is not deprecated;
- at least one isSmartest per provider in llms, with at least one not deprecated;
- unique (provider, codename) within each table.
The custom provider is exempt (its custom/bootstrap row is configured per-server).
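To make the invariants concrete, here is a minimal sketch of an equivalent check over llms rows. It is not the validator in modelSeed.ts: the CatalogRow shape reuses the field names from the catalog example, findLlmCatalogViolations is a hypothetical name, and it returns a list of messages rather than throwing so the behavior is easy to inspect.

```typescript
// Hypothetical row shape using the field names from the catalog example.
interface CatalogRow {
  provider: string;
  codename: string;
  isDefault?: boolean;
  isSmartest?: boolean;
  isDeprecated?: boolean;
}

// Collect human-readable violations of the three llms invariants.
function findLlmCatalogViolations(rows: CatalogRow[]): string[] {
  const errors: string[] = [];
  const seenKeys = new Set<string>();
  const byProvider = new Map<string, CatalogRow[]>();

  for (const row of rows) {
    // Invariant: unique (provider, codename) within the table.
    const key = `${row.provider}/${row.codename}`;
    if (seenKeys.has(key)) errors.push(`duplicate model ${key}`);
    seenKeys.add(key);
    const group = byProvider.get(row.provider) ?? [];
    group.push(row);
    byProvider.set(row.provider, group);
  }

  byProvider.forEach((group, provider) => {
    if (provider === "custom") return; // the custom provider is exempt
    // Invariant: exactly one default, and it must not be deprecated.
    const defaults = group.filter((row) => row.isDefault === true);
    if (defaults.length !== 1) {
      errors.push(`${provider}: expected exactly one isDefault, found ${defaults.length}`);
    } else if (defaults[0]?.isDeprecated === true) {
      errors.push(`${provider}: the default model is deprecated`);
    }
    // Invariant: at least one isSmartest row that is not deprecated.
    if (!group.some((row) => row.isSmartest === true && row.isDeprecated !== true)) {
      errors.push(`${provider}: needs at least one non-deprecated isSmartest model`);
    }
  });

  return errors;
}
```

A startup-style caller could throw when the returned array is non-empty, which matches the fail-fast behavior described above.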
8.5 Update User-Facing Help
When adding a provider, update the user-facing setup/help copy in the same change.
Minimum reminders:
- update /help api-key provider choices in src/commands/help/api-key.ts
- add localized /help api-key copy in both locale trees (src/locales/en-US/ and src/locales/ja/)
- review the /config params success embed strings in both locale trees
- keep those /config params provider lists accurate per parameter; do not add a provider unless that exact saved setting is wired through the provider runtime
- if the provider changes onboarding guidance, also review /help setup
- if the provider changes pricing guidance or model-tag expectations, review /help cost and any related help text
9. Keep New Logic Inside the Provider Layer
When integrating provider-specific behavior:
- prefer providerInfo.ts
- prefer provider class capability methods
- prefer providerCapabilityResolver.ts and thin shared wrappers such as providerFeatureExecutors.ts
- keep provider helpers inside src/providers/{providerName}/
- use shared provider utilities only for code that is genuinely cross-provider
Current structured-output examples:
- src/providers/google/googleStructuredOutput.ts
- src/providers/openrouter/openrouterStructuredOutput.ts
- src/providers/deepseek/deepseekStructuredOutput.ts
- src/providers/zai/zaiStructuredOutput.ts
Avoid adding new command-level checks like:
    if (providerName === "example") {
      // ...
    }
Prefer:
- providerSupportsFeature(providerName, "structuredOutput")
- resolveProviderCapability(providerName, "presetGeneration")
- provider-local code paths behind shared executor helpers
Literal provider names are still correct when the behavior is truly vendor-specific, such as an optional credential that only exists for one vendor integration.
10. Verify Vendor Capability Contracts
Before you mark a feature as supported, verify the vendor’s exact request/response contract instead of relying on the provider being “OpenAI-compatible”.
Checklist:
- tool calling:
  - confirm which seeded models actually support tools
  - confirm the assistant/tool loop shape matches TomoriBot’s runtime
- reasoning or thinking mode:
  - check for continuation-only fields that must be replayed within the same turn
  - decide explicitly how the thinking_level option in /config samplers behaves for the provider: map it, or document a deliberate no-op
  - DeepSeek example: preserve reasoning_content across tool sub-turns, but do not treat it as normal cross-turn chat history
  - Z.ai example: the thinking_level option in /config samplers maps to thinking: { type: "enabled" | "disabled" }; active thinking removes temperature / top_p / frequency_penalty / presence_penalty
- structured output:
  - determine whether the provider offers strict schema mode or only JSON-object mode
  - implement provider-owned callStructuredJSON() for the contract the vendor actually supports
  - only seed supports_structoutput = true after end-to-end validation
- assistant prefill or prefix completion:
  - if TomoriBot uses assistant prefills, verify whether the vendor requires a beta endpoint, a trailing assistant message, or a message flag such as prefix: true
- live cost estimation:
  - if /tool estimate cost should support the provider, add a minimal non-streaming token probe
  - parse prompt token usage from the vendor response
  - add env-backed pricing defaults and document where those defaults came from
- unsupported parameters:
  - check whether reasoning models ignore or reject parameters like temperature, top_p, or logprob settings
  - strip or avoid unsupported parameters inside the provider layer
Rule:
- vendor capability claims are not the same thing as TomoriBot app-level support
- ProviderInfo.featureSupport should describe the runtime TomoriBot has actually implemented
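The "strip unsupported parameters inside the provider layer" item can be sketched using the Z.ai behavior described in the checklist. The SamplerSettings shape and the buildRequestParams name are assumptions for illustration; only the thinking field and the four removed parameters come from the text above.

```typescript
// Hypothetical sampler settings shape; field names follow the Z.ai example.
interface SamplerSettings {
  temperature?: number;
  top_p?: number;
  frequency_penalty?: number;
  presence_penalty?: number;
}

// Parameters that the checklist says are removed while thinking is active.
const THINKING_BLOCKED_PARAMS = [
  "temperature",
  "top_p",
  "frequency_penalty",
  "presence_penalty",
] as const;

// Build vendor request parameters, dropping sampler fields that are not
// sent while thinking is enabled.
function buildRequestParams(
  samplers: SamplerSettings,
  thinkingEnabled: boolean,
): Record<string, unknown> {
  const params: Record<string, unknown> = {
    ...samplers,
    thinking: { type: thinkingEnabled ? "enabled" : "disabled" },
  };
  if (thinkingEnabled) {
    for (const key of THINKING_BLOCKED_PARAMS) {
      delete params[key];
    }
  }
  return params;
}
```

Keeping this policy in one provider-local function means commands never need to know which vendors reject which sampler fields.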
11. Capability Implementation Reminders
Use this as the last pass before you call a provider integration “done”.
Tool Calling
- seed llms.has_tools conservatively per model instead of assuming all chat models support tools
- implement or reuse the provider tool adapter and confirm schema serialization matches the vendor contract
- verify the full assistant -> tool call -> tool result -> assistant continuation loop, not just first-turn tool emission
- if the vendor has special reasoning-mode tool requirements, keep that logic in the provider layer
- if strict tool-schema mode exists, treat it as optional follow-up work unless the runtime is fully validated
Structured Output
- determine whether the vendor supports strict schema mode, JSON-object mode, or only prompt-steered JSON
- implement provider-owned callStructuredJSON() for the actual supported contract
- if the provider only guarantees JSON objects, inject the required prompt guidance and validate locally with Zod
- seed llms.supports_structoutput only for models validated end-to-end in TomoriBot
- if history extraction depends on structured output, do not enable featureSupport.historyExtraction until that path works
Thinking Level
- if the vendor has a verified request-side reasoning control, map the thinking_level option from /config samplers in the provider layer
- if the vendor only supports startup flags, GUI toggles, or backend-template-specific reasoning controls, do not invent a generic request field
- document the result in docs/subsystems/thinking-level.md and the provider notes
Live Cost Estimation
- if the provider should support /tool estimate cost, add a minimal non-streaming prompt-token probe
- parse prompt token usage from the vendor response instead of estimating locally when the API exposes usage
- keep pricing source explicit:
  - use live/cached provider pricing when available
  - otherwise use env-backed defaults with documented provenance
  - do not quietly hardcode marketing-page assumptions without making them overridable
- document any caveats such as cache-hit vs cache-miss pricing, account-default models, or unsupported token-count endpoints
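The pricing-source rules can be sketched as a small resolver. The environment variable name EXAMPLE_INPUT_USD_PER_MILLION, the placeholder default price, and both function names are invented for this example; real defaults should come from documented vendor pricing. The environment is passed in as a plain object so the sketch stays testable.

```typescript
// Hypothetical environment map; a caller would typically pass the process environment.
type Env = Record<string, string | undefined>;

// Placeholder value only, not a real vendor price.
const DEFAULT_INPUT_USD_PER_MILLION = 0.5;

interface ResolvedPrice {
  usdPerMillion: number;
  source: "live" | "env" | "default";
}

// Prefer live/cached pricing, then an env override, then the documented default.
function resolveInputPrice(env: Env, cachedPrice?: number): ResolvedPrice {
  if (cachedPrice !== undefined) {
    return { usdPerMillion: cachedPrice, source: "live" };
  }
  const raw = env.EXAMPLE_INPUT_USD_PER_MILLION; // hypothetical variable name
  const parsed = raw === undefined || raw.trim() === "" ? NaN : Number(raw);
  if (isFinite(parsed) && parsed >= 0) {
    return { usdPerMillion: parsed, source: "env" };
  }
  return { usdPerMillion: DEFAULT_INPUT_USD_PER_MILLION, source: "default" };
}

// Convert prompt tokens reported by the vendor into an estimated USD cost.
function estimatePromptCostUsd(promptTokens: number, usdPerMillion: number): number {
  return (promptTokens / 1000000) * usdPerMillion;
}
```

Returning the source alongside the price lets the command output state where the number came from, which keeps the pricing provenance explicit.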
Native Image Generation
- only implement this if the app already has a native image-generation path for the provider
- implement the provider-owned image-generation capability instead of faking it through text chat
- seed image_diffusion_models only for models that are actually wired and tested
- confirm /model image and any image-generation commands use the provider cleanly
- if the provider has no native image generation, leave featureSupport.imageGeneration = "none" and do not seed image rows
Embedding Models
- only implement embeddings if the provider has a real embedding API path wired into TomoriBot
- implement the provider-owned embedding capability and seed embedding_models
- verify any provider-specific dimensionality, batching, or input limits
- confirm embedding selection/config flows behave correctly for the provider
- if embeddings are not implemented, leave featureSupport.embeddings = false and do not seed embedding rows
Conversation Compaction
- create src/providers/{providerName}/compactGenerator.ts with two exports:
  - generateConversationSummary{Provider}(request): plain text POST to the provider chat endpoint
  - generateRoleplaySummary{Provider}(request): structured output POST using the provider’s callStructuredJSON() helper
- import buildRoleplaySchema() and CompactRoleplaySummarySchema from src/providers/utils/compactCommon.ts instead of duplicating them
- for plain text summary: omit response_format; just request a text completion using the provider’s chat completions endpoint
- for reasoning-model providers that reject temperature, guard the parameter before sending (e.g. DeepSeek-reasoner, ZAI GLM reasoning models)
- implement SupportsConversationCompaction on the provider class and wire both methods
- set featureSupport.conversationCompaction: true in providerInfo.ts
- if the provider needs image preprocessing before sending to the compaction endpoint (e.g. NVIDIA), apply it inside generateConversationSummary{Provider}()
Preset Generation
- create src/providers/{providerName}/presetGenerator.ts with one export:
  - generatePresetFromPrompt{Provider}(apiKey, params, locale, options): structured output POST with an optional tool-calling loop
- import buildPresetResponseSchema(), buildPresetPrompt(), buildToolErrorResult(), and the shared types (PresetContentPart, PresetMessage, PresetToolCall) from src/providers/utils/presetCommon.ts
- choose the right structured output mode for the vendor:
  - strict schema (json_schema response format): preferred when the vendor supports it (e.g. OpenRouter, NVIDIA); validate the response shape locally if needed
  - JSON object mode (json_object response format): required for vendors that only support JSON-mode (e.g. DeepSeek, ZAI); inject the JSON schema into the system prompt via a helper like build{Provider}PresetSystemPrompt() and run Zod validation locally
  - schema fallback (try strict, retry with JSON object on 400/422): useful when strict schema support is model-dependent (e.g. NVIDIA)
- implement the tool-calling loop:
  - build tool definitions via getPresetGenerationTools() (private helper that calls getAvailableToolsWithMCP() and isBraveSearchAvailable())
  - if the model returns tool_calls, execute each tool, append results, and loop up to options.maxToolRounds
  - use buildToolErrorResult() for failed tool executions
- for reasoning-model providers that reject temperature, guard the parameter before sending
- implement SupportsPresetGeneration on the provider class and wire generatePreset()
- set featureSupport.presetGeneration: true in providerInfo.ts
- if two providers share an endpoint family (e.g. ZAI and ZAI Coding), parameterize the generator by endpointUrl? and toolAdapter? so the secondary provider can delegate without code duplication
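The bounded tool-calling loop can be sketched as below. All types and names here (ToolCall, ModelTurn, LoopMessage, runToolLoop) are hypothetical stand-ins for the shared preset types, the model call and tool executor are injected so no vendor API is assumed, and the JSON error string is only a placeholder for what buildToolErrorResult() would produce.

```typescript
// Hypothetical stand-ins for the shared preset types.
interface ToolCall {
  id: string;
  name: string;
  args: Record<string, unknown>;
}

interface ModelTurn {
  content: string;
  toolCalls: ToolCall[];
}

interface LoopMessage {
  role: "user" | "assistant" | "tool";
  content: string;
  toolCallId?: string;
}

type CallModel = (messages: LoopMessage[]) => Promise<ModelTurn>;
type RunTool = (call: ToolCall) => Promise<string>;

// Call the model, execute any requested tools, append their results, and
// repeat for at most maxToolRounds tool rounds.
async function runToolLoop(
  messages: LoopMessage[],
  callModel: CallModel,
  runTool: RunTool,
  maxToolRounds: number,
): Promise<string> {
  const history = [...messages];
  for (let round = 0; ; round++) {
    const turn = await callModel(history);
    // Stop when the model is done or the round budget is exhausted.
    if (turn.toolCalls.length === 0 || round >= maxToolRounds) {
      return turn.content;
    }
    history.push({ role: "assistant", content: turn.content });
    for (const call of turn.toolCalls) {
      let result: string;
      try {
        result = await runTool(call);
      } catch (error) {
        // Placeholder for buildToolErrorResult(); the real helper's output may differ.
        result = JSON.stringify({ error: String(error) });
      }
      history.push({ role: "tool", content: result, toolCallId: call.id });
    }
  }
}
```

With maxToolRounds set to 3, this sketch makes at most four model calls: three that may request tools and one final call whose content is returned even if it asks for more tools.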
12. Test the Integration
Minimum test checklist:
- provider is auto-discovered at startup
- aliases resolve correctly
- /config provider add and /config provider switch validation work
- /config setup and provider-specific error formatting work
- /model text shows the provider’s seeded models
- normal chat streaming works
- tool calling works if supported
- structured output works if supported
- history extraction works if supported
- reasoning-mode tool continuation works if the vendor requires replay fields
- assistant-prefill behavior works if the provider supports native prefix completion
- /tool estimate cost works if featureSupport.liveTokenCounting = true
- unsupported features fail cleanly instead of throwing deep runtime errors
- feature-gated commands behave correctly for your featureSupport values
Validation commands:
    bun run check
    bun run lint
Run bun run check-locales only if you changed locale files or command metadata.
Common Mistakes
- Forgetting providerInfo.ts and trying to keep metadata only inside getInfo()
- Marking a feature as supported without wiring its shared executor path
- Marking a feature as supported without implementing the matching provider capability
- Assuming provider-folder auto-discovery also handles static metadata registration
- Hardcoding model lists in code instead of using the database-backed inventory
- Adding new provider checks directly in commands instead of the provider capability layer
- Treating vendor API capability as the same thing as app-level support
- Forgetting to register the tool adapter with registerMCPAdapter() in 02_registerMCPs.ts: MCP tools will be sent to the LLM but fail at execution time (see step 5.5)
Related Files
- docs/pipelines/provider/ (provider pipeline architecture reference)
- src/types/provider/interfaces.ts
- src/types/provider/featureInterfaces.ts
- src/utils/provider/providerFactory.ts
- src/utils/provider/providerInfoRegistry.ts
- src/utils/provider/providerCapabilityResolver.ts
- src/providers/utils/providerFeatureExecutors.ts
- src/events/clientReady/02_registerMCPs.ts
- src/db/seed/catalog/models.ts (typed model catalog; edit this)
- src/db/seed/catalog/types.ts (catalog input types)
- src/db/seed/catalog/modelSeed.ts (renders + validates the catalog and seeds it at startup)
- scripts/checks/checkSeedCatalogs.ts (bun run check-seed-catalogs: offline seed-catalog invariant check)
OpenAI-Compatible Providers
This section is the concrete refactor blueprint for adding multiple new providers that expose an OpenAI-style chat API.
Why This Exists
TomoriBot already has one practical OpenAI-compatible provider: src/providers/custom/. It also has another provider that speaks an OpenAI-like message shape but is not a good generic base: src/providers/openrouter/.
custom is the highest-ROI extraction starting point for vendors such as DeepSeek, Z.ai, and some NVIDIA NIM chat endpoints.
It is not the right base for:
- Vertex AI: auth/config is Google Cloud-oriented rather than simple API-key + base URL
- Codex CLI: better treated as a local tool/client integration than an LLMProvider
Current Reality
The useful split in TomoriBot is:
- ProviderInfo.featureSupport: app-level feature support
- llms rows: per-model chat/runtime capability flags (has_tools, sees_images, supports_structoutput)
- image_diffusion_models: native image generation inventory
- embedding_models: embedding inventory
A provider can be chat-only, chat + tools, chat + tools + vision, chat + embeddings, or chat + native image generation without pretending it supports everything.
Important existing behavior: setup/provider switching already tolerates providers with no image models or embedding models by storing NULL in server_model_configs.diffusion_model_id and server_model_configs.embedding_model_id.
Refactor Goal
The goal is not to make every new provider inherit custom directly. The goal is to extract a reusable OpenAI-compatible family layer from custom, then let multiple concrete providers consume it.
First target consumers: custom, deepseek, zai, nvidia.
Deferred or separate work: openrouter migration into the shared family, vertex provider, codex-cli integration.
Non-Goals
Do not try to solve all of this in the first extraction:
- OpenRouter capability probing and parameter-drop retry logic
- OpenRouter reasoning block preservation
- OpenRouter assistant-image role workaround
- Google/Vertex auth flows
- cross-provider native image generation routing
Recommended Extraction Boundary
Extract the stream/tool/message-format layer first. Do not start by forcing a single abstract provider class for everything.
The high-value shared pieces are:
- OpenAI-style message assembly from StructuredContextItem[]
- OpenAI-style tool schema conversion
- OpenAI-style SSE stream parsing
- streamed tool-call accumulation
- shared image-part conversion for providers that accept image_url
- shared sanitized request logging
- baseline OpenAI-compatible HTTP error parsing
Keep these provider-owned: providerInfo.ts, base URL, auth header shape, API key validation strategy, provider display name/aliases, provider-specific locale namespace, request parameter policy, feature flags in ProviderInfo.featureSupport, optional runtime capability implementations, and provider-specific structured-output/image-generation/embedding/cost helpers that make vendor-specific HTTP requests.
User-facing reminder: when adding a new provider in this family, also update /help api-key choices and localized provider instructions.
File Layout
Recommended new shared folder:
    src/providers/openaiCompatible/
      openaiCompatibleTypes.ts
      openaiCompatibleMessageBuilder.ts
      openaiCompatibleSse.ts
      openaiCompatibleErrorFormatter.ts
      openaiCompatibleStreamAdapter.ts
      openaiCompatibleToolAdapter.ts
Responsibility split:
- openaiCompatibleTypes.ts: shared chunk/tool types; provider-family options (providerName, endpointUrl, supportsVision, supportsVideos)
- openaiCompatibleMessageBuilder.ts: convert Tomori context into OpenAI chat messages, handle image parts, sanitized logging helper
- openaiCompatibleSse.ts: read SSE lines, parse data: payloads, normalize [DONE]
- openaiCompatibleErrorFormatter.ts: baseline HTTP/OpenAI-style error parsing, retryable vs non-retryable helpers
- openaiCompatibleStreamAdapter.ts: shared StreamProvider implementation, tool-call accumulation, finish-reason handling
- openaiCompatibleToolAdapter.ts: generic OpenAI function schema conversion, global and guild MCP tool injection
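As an illustration of the openaiCompatibleSse.ts responsibility, the sketch below parses a single SSE line. The SseEvent shape and parseSseLine name are assumptions for this example rather than the planned module API. It handles one already-split line, so the caller is still responsible for buffering partial lines from the network stream.

```typescript
// Hypothetical event shape for this sketch.
type SseEvent = { kind: "done" } | { kind: "chunk"; payload: unknown };

// Parse one SSE line. Returns null for lines that carry no data payload
// (blank keep-alives, comments, other fields) or that fail to parse.
function parseSseLine(line: string): SseEvent | null {
  const trimmed = line.trim();
  if (trimmed.indexOf("data:") !== 0) {
    return null;
  }
  const data = trimmed.slice("data:".length).trim();
  if (data === "") {
    return null;
  }
  // OpenAI-style streams end with a literal [DONE] sentinel instead of JSON.
  if (data === "[DONE]") {
    return { kind: "done" };
  }
  try {
    return { kind: "chunk", payload: JSON.parse(data) };
  } catch {
    return null; // malformed payload; a real adapter might log this
  }
}
```

Normalizing [DONE] into its own event kind keeps the stream adapter's finish handling separate from JSON chunk handling.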
Concrete providers then stay small:
    src/providers/custom/
      customProvider.ts
      customStreamAdapter.ts
      customToolAdapter.ts
      providerInfo.ts

    src/providers/deepseek/
      deepseekProvider.ts
      deepseekStreamAdapter.ts
      deepseekToolAdapter.ts
      providerInfo.ts

    src/providers/zai/
      zaiProvider.ts
      zaiStreamAdapter.ts
      zaiToolAdapter.ts
      providerInfo.ts
What To Extract From custom
Good first extraction candidates from customStreamAdapter.ts: OpenAI chunk type definitions, SSE parsing loop, message assembly, tool-call accumulation, finish-reason handling, sanitized request logging, shared image-to-image_url conversion.
Good first extraction candidates from customToolAdapter.ts: OpenAI function declaration shape, generic schema cloning, tools array conversion, MCP function export path, common tool-result formatting.
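Tool-call accumulation is the least obvious of these candidates, so here is a minimal sketch. It assumes the OpenAI-style streaming shape in which each delta carries an index and the function arguments arrive as string fragments; the interface and function names are hypothetical, and some vendors may deviate from this shape.

```typescript
// Assumed OpenAI-style streamed tool-call delta.
interface ToolCallDelta {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
}

interface AccumulatedToolCall {
  id: string;
  name: string;
  argumentsJson: string;
}

// Merge streamed deltas into complete tool calls, keyed by delta index.
function accumulateToolCalls(deltas: ToolCallDelta[]): AccumulatedToolCall[] {
  const byIndex: AccumulatedToolCall[] = [];
  for (const delta of deltas) {
    const current = byIndex[delta.index] ?? { id: "", name: "", argumentsJson: "" };
    if (delta.id) current.id = delta.id;
    if (delta.function?.name) current.name = delta.function.name;
    // Argument fragments are concatenated in arrival order.
    if (delta.function?.arguments) current.argumentsJson += delta.function.arguments;
    byIndex[delta.index] = current;
  }
  // Drop any gaps left by non-contiguous indexes.
  return byIndex.filter((call) => call !== undefined);
}
```

The accumulated argumentsJson should only be parsed once the stream reports a finish reason, because intermediate fragments are usually not valid JSON.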
What Not To Extract From openrouter Yet
src/providers/openrouter/ is useful as a reference but should stay separate in phase 1. It has provider-specific capability cache usage, parameter probe-drop retry logic, reasoning_details preservation, assistant-turn image role rewriting, and stricter request shaping for upstream vendors.
Capability Rules For New Providers
When a vendor lacks a feature, do not emulate support unless the app path is actually wired:
- No native image generation → featureSupport.imageGeneration = "none", no image_diffusion_models rows
- No embeddings → featureSupport.embeddings = false, no embedding_models rows
- Only some models support tools/vision/structured output → keep provider-level flags broad only when true in principle; gate at llms per-model flags
- Vendor API capability but no TomoriBot runtime path → keep featureSupport flag false
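The "broad provider flag, per-model gate" rule amounts to a logical AND of the two levels. The sketch below shows that combination; the ProviderFlags and ModelFlags shapes borrow field names from ProviderInfo and the llms columns mentioned in this guide, while effectiveCapabilities is a hypothetical helper rather than an existing TomoriBot function.

```typescript
// Provider-level flags, named after the ProviderInfo fields in this guide.
interface ProviderFlags {
  supportsFunctionCalling: boolean;
  supportsImages: boolean;
}

// Per-model flags, named after the llms columns in this guide.
interface ModelFlags {
  has_tools: boolean;
  sees_images: boolean;
}

// A capability is usable only when both the provider and the selected model allow it.
function effectiveCapabilities(
  provider: ProviderFlags,
  model: ModelFlags,
): { tools: boolean; vision: boolean } {
  return {
    tools: provider.supportsFunctionCalling && model.has_tools,
    vision: provider.supportsImages && model.sees_images,
  };
}
```

Under this rule a provider can advertise tool calling in principle while an individual seeded model still opts out through has_tools = false.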
Provider Mapping
DeepSeek: chat streaming, tool calling (where seeded), thinking-mode tool continuation if reasoning_content replay is wired, JSON structured output only if validated. Do not assume embeddings or native image generation.
Z.ai: chat streaming, tool calling, vision if seeded models support it, structured output if validated. Possible second pass: embeddings, native image generation.
NVIDIA NIM: curated text/chat models only; tool calling and vision on seeded rows only; structured output on validated NVIDIA subset; provider-owned embeddings via nv-embed-v1; native image generation via NVIDIA’s Stability endpoint. Keep treating NVIDIA as a curated catalog, not a blanket claim.
Vertex AI: do not place in this family. Even with Vertex’s OpenAI-compatible endpoint, the project needs a Google Cloud auth/config story that doesn’t fit TomoriBot’s current credential model.
Codex CLI: do not implement as an LLMProvider. If pursued: local tool integration, MCP server bridge, or separate OpenAI API provider for coding models.
Rollout Order
Phase 1: Extract Shared Family Helpers. Create src/providers/openaiCompatible/, move common helpers there, and keep behavior identical for custom.
Phase 2: Migrate custom. Make custom the first consumer; verify no behavior change.
Phase 3: Add deepseek. Provider folder, static providerInfo.ts, registry registration, llms seed rows, chat streaming, tool calling.
MVP constraints:
- chat streaming only through the shared layer
- tool calling only for models seeded with has_tools = true
- preserve reasoning_content only within the same tool loop turn
- featureSupport.nativeImageGeneration = false
- featureSupport.embeddings = false
- enable structuredOutput and historyExtraction only if validated end-to-end
- implement DeepSeek beta prefix completion for assistant prefills
- no image/embedding rows unless implemented
- if /tool estimate cost should support DeepSeek, add a minimal non-streaming prompt-token probe
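The reasoning_content constraint is the easiest of these to get subtly wrong, so here is a rough sketch of one way to enforce it before building a request. It assumes that "the same tool loop turn" means every message after the most recent user message. ChatMessage and pruneReasoningContent are hypothetical names, and TomoriBot's internal message types may differ.

```typescript
// Hypothetical message shape for illustration; not TomoriBot's actual type.
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
  reasoning_content?: string;
}

// Keep reasoning_content only on messages after the most recent user message
// (the current tool loop). Messages from earlier turns get a copy without the field.
function pruneReasoningContent(messages: ChatMessage[]): ChatMessage[] {
  let lastUserIndex = -1;
  for (let i = 0; i < messages.length; i++) {
    if (messages[i].role === "user") lastUserIndex = i;
  }
  return messages.map((message, i) => {
    if (i > lastUserIndex || message.reasoning_content === undefined) return message;
    const copy: ChatMessage = { ...message };
    delete copy.reasoning_content;
    return copy;
  });
}
```

The helper returns copies rather than mutating history, so the stored conversation can keep the reasoning text for logging even though it is not replayed to the vendor.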
Recommended provider-local files:
src/providers/deepseek/ providerInfo.ts deepseekProvider.ts deepseekStreamAdapter.ts deepseekToolAdapter.ts
Seed scope: one default general chat model; optionally one reasoning model; per-model flags set conservatively from validated behavior, not vendor marketing copy.
Phase 4: Add zai. Same pattern as deepseek; enable only features confirmed by seeded models and runtime wiring.
Phase 5: Add nvidia. Keep the supported model set small and curated; wire provider-owned embeddings and native image generation only when the exact NVIDIA endpoint contract is implemented.
Phase 6: Optional Embedding Decoupling. Allow /model embedding to choose from any seeded embedding provider; stop coupling embedding selection to the active chat provider.
Capability Checklist For Future OpenAI-Compatible Providers
When adding the next vendor in this family, verify these areas explicitly:
- Tool calls: confirm per-model support before seeding has_tools = true; verify the exact assistant/tool message format
- Structured output: confirm whether the provider supports strict schema mode or only JSON-object mode; implement prompt shaping + local validation for JSON-object-only providers
- Reasoning/thinking mode: check for replay-only fields like DeepSeek reasoning_content; preserve only where the vendor requires it; map thinking_level or document a deliberate no-op
- Assistant prefills: check whether native prefix completion requires a beta endpoint or a message flag like prefix: true
- Live cost estimation: prefer API-reported prompt token usage; use provider-specific pricing sources; document any cache-hit/miss caveats
- Image generation and embeddings: only seed rows if the app runtime path is actually implemented; leave feature flags off otherwise
- Media placeholder contract: when your stream adapter encounters an image or video part but seesImages/seesVideos is false, emit a text placeholder; never silently skip. Context may include media parts for fallback models even when the primary cannot render them.
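The media placeholder contract can be sketched as a small pre-conversion pass. This is an illustration under stated assumptions, not the adapter's real code. ContextPart, ModelVisionFlags, applyMediaPlaceholders, and the placeholder wording are all hypothetical; only the seesImages and seesVideos flag names come from this guide.

```typescript
// Hypothetical part and flag shapes; the real context types may differ.
type ContextPart =
  | { type: "text"; text: string }
  | { type: "image"; url: string }
  | { type: "video"; url: string };

interface ModelVisionFlags {
  seesImages: boolean;
  seesVideos: boolean;
}

// Replace media the model cannot render with a text placeholder instead of dropping it.
// Supported media is passed through untouched for vendor-specific conversion later.
function applyMediaPlaceholders(parts: ContextPart[], flags: ModelVisionFlags): ContextPart[] {
  return parts.map((part): ContextPart => {
    if (part.type === "image" && !flags.seesImages) {
      return { type: "text", text: "[image omitted: this model cannot view images]" };
    }
    if (part.type === "video" && !flags.seesVideos) {
      return { type: "text", text: "[video omitted: this model cannot view videos]" };
    }
    return part;
  });
}
```

Because the output always has the same number of parts as the input, the model still sees that media was present in the conversation even when it cannot render it.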
Acceptance Criteria For The Refactor
The extraction is successful when:
- custom still works without behavior regression
- new OpenAI-compatible providers can be added without cloning large stream/tool files
- unsupported provider features fail cleanly
- provider metadata remains the source of truth for app-level feature gating
- per-model DB flags remain the source of truth for chat-model capability gating
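The last three criteria can be read together as a two-layer gate that fails with a reason rather than an exception. The sketch below assumes the provider-level flag and the per-model flag are checked in that order, which this guide implies but does not spell out. GateResult and gateCapability are hypothetical names, not existing TomoriBot helpers.

```typescript
// Hypothetical result type: a refusal carries a reason instead of throwing.
type GateResult = { allowed: true } | { allowed: false; reason: string };

// providerFlag: app-level flag from provider metadata (for example a featureSupport entry).
// modelFlag: per-model DB flag (for example has_tools on an llms row).
function gateCapability(name: string, providerFlag: boolean, modelFlag: boolean): GateResult {
  if (!providerFlag) {
    return { allowed: false, reason: `provider does not support ${name}` };
  }
  if (!modelFlag) {
    return { allowed: false, reason: `selected model does not support ${name}` };
  }
  return { allowed: true };
}
```

Returning a reason string lets a command surface a clear message to the user instead of sending a request that the vendor would reject.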
The combined extraction + DeepSeek slice is successful when:
- custom still works
- deepseek is auto-discovered
- setup/switch/validation commands work
- chat streaming and tool calling work for seeded models
- structured output and history extraction work only for validated models
- deepseek-reasoner tool continuation preserves reasoning_content within the same turn
- assistant prefills use DeepSeek beta prefix completion
- /tool estimate cost returns a live estimate
Files To Keep In View
- src/providers/custom/customProvider.ts
- src/providers/custom/customStreamAdapter.ts
- src/providers/custom/customToolAdapter.ts
- src/providers/openrouter/openrouterStreamAdapter.ts
- src/utils/provider/providerInfoRegistry.ts
- src/utils/provider/providerCapabilityResolver.ts
- src/providers/utils/providerFeatureExecutors.ts
- src/commands/config/api-key/set.ts
- src/commands/model/image.ts
- src/commands/model/embedding.ts
- src/utils/db/repositories/LlmRepository.ts
Practical Recommendation
If you want the fastest route to shipping new vendors:
- Extract the shared OpenAI-compatible stream/tool/message layer.
- Migrate custom first.
- Add deepseek.
- Add zai.
- Add nvidia with a curated model inventory and provider-owned embedding/image helpers.
Do not block that work on Vertex AI or Codex CLI.
Handoff Scope
If you want another agent to implement this in one pass, give it this bounded scope:
- Implement Phase 1 and Phase 2.
- Implement only the DeepSeek MVP described in Phase 3.
- Do not start Z.ai, Vertex AI, or Codex CLI.
- Preserve current custom behavior.
- Run bun run check and bun run lint.
Recommended handoff prompt:
Implement Phase 1 and Phase 2 from the OpenAI-Compatible Providers section in docs/guides/adding-new-provider.md, then implement only the bounded DeepSeek MVP from the same guide.
Constraints:
- preserve current custom-provider behavior
- keep OpenRouter untouched unless a tiny shared extraction is unavoidable
- seed DeepSeek text models conservatively
- do not implement DeepSeek embeddings or native image generation
- do not start Z.ai, Vertex AI, or Codex CLI
Validation:
- bun run check
- bun run lint