AI Models

Agentbase supports multiple AI providers out of the box. Configure and manage models through the dashboard or API.

Supported Providers

OpenAI

Models: GPT-4, GPT-4 Turbo, GPT-3.5 Turbo

# Set your API key
OPENAI_API_KEY=sk-...

Anthropic

Models: Claude 3 Opus, Claude 3 Sonnet, Claude 3 Haiku

ANTHROPIC_API_KEY=sk-ant-...

Google Gemini

Models: Gemini Pro, Gemini Pro Vision

GEMINI_API_KEY=AI...

HuggingFace

Access thousands of open-source models via the HuggingFace Inference API.

HUGGINGFACE_API_KEY=hf_...

Configuration

Via Dashboard

  1. Navigate to Dashboard → AI Models
  2. Select your application
  3. Click + Add Model
  4. Choose provider and model
  5. Configure temperature, max tokens, and system prompt
  6. Save

Via API

# Create a model configuration
POST /api/model-configs
{
  "applicationId": "app-uuid",
  "provider": "openai",
  "modelId": "gpt-4",
  "displayName": "GPT-4 Production",
  "isDefault": true,
  "settings": {
    "temperature": 0.7,
    "maxTokens": 2048,
    "topP": 1,
    "systemPrompt": "You are a helpful assistant."
  }
}

Model Settings

Setting            Description                        Range
temperature        Controls randomness                0.0 - 2.0
maxTokens          Maximum response length            1 - model max
topP               Nucleus sampling threshold         0.0 - 1.0
frequencyPenalty   Reduces repetition                 -2.0 - 2.0
presencePenalty    Encourages topic diversity         -2.0 - 2.0
systemPrompt       System instruction for the model   text
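As a rough illustration of the ranges in this table, a small client-side validator might look like the sketch below. The helper name and the 4096-token fallback are hypothetical; Agentbase performs its own server-side validation, and the real maximum token count varies per model.

```python
# Hypothetical validator mirroring the documented setting ranges.
RANGES = {
    "temperature": (0.0, 2.0),
    "topP": (0.0, 1.0),
    "frequencyPenalty": (-2.0, 2.0),
    "presencePenalty": (-2.0, 2.0),
}

def validate_settings(settings, model_max_tokens=4096):
    """Return a list of human-readable problems (empty list means valid)."""
    problems = []
    for key, (lo, hi) in RANGES.items():
        if key in settings and not lo <= settings[key] <= hi:
            problems.append(f"{key}={settings[key]} outside [{lo}, {hi}]")
    max_tokens = settings.get("maxTokens")
    if max_tokens is not None and not 1 <= max_tokens <= model_max_tokens:
        problems.append(f"maxTokens={max_tokens} outside [1, {model_max_tokens}]")
    return problems

print(validate_settings({"temperature": 0.7, "maxTokens": 2048}))  # []
print(validate_settings({"temperature": 2.5}))
```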

A/B Testing

Model configurations support versioning for A/B testing:

  1. Create multiple versions of a model config
  2. Assign traffic percentages to each version
  3. Monitor metrics (latency, token usage, error rate)
  4. Promote the winning version

# Get config versions
GET /api/model-configs/:id/versions
 
# Create a new version
POST /api/model-configs/:id/versions
{
  "label": "Higher temperature variant",
  "settings": { "temperature": 0.9 },
  "trafficPercent": 20
}
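The traffic-percentage step above amounts to a weighted random pick per request. The following is an illustrative guess at that routing logic, not Agentbase's actual implementation:

```python
import random

# Each request picks a config version with probability proportional
# to its trafficPercent (illustrative sketch).
def pick_version(versions, rng=None):
    rng = rng or random
    weights = [v["trafficPercent"] for v in versions]
    return rng.choices(versions, weights=weights, k=1)[0]

versions = [
    {"label": "baseline", "trafficPercent": 80},
    {"label": "Higher temperature variant", "trafficPercent": 20},
]
rng = random.Random(42)  # seeded for reproducibility
counts = {v["label"]: 0 for v in versions}
for _ in range(10_000):
    counts[pick_version(versions, rng)["label"]] += 1
print(counts)  # roughly an 80/20 split across 10,000 requests
```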

API Key Management

Agentbase supports two modes for provider API keys, which can be used together:

Platform Keys (Paid Plans)

On Starter, Pro, and Enterprise plans the platform supplies the AI provider keys. Usage is metered and included in your plan quota. Overages are billed automatically via Stripe at a per-1,000-token rate based on your plan tier.

Plan         Included tokens/month   Overage (per 1K tokens)
Starter      100,000                 $0.001
Pro          500,000                 $0.0008
Enterprise   Unlimited               Custom
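The overage math works out as a simple per-1K-token rate applied to usage beyond the quota. A worked example using the Starter figures above (the helper name is illustrative):

```python
# Worked example of the overage billing described above.
def overage_cost(tokens_used, included, rate_per_1k):
    """Dollar cost for tokens beyond the plan's included monthly quota."""
    overage = max(0, tokens_used - included)
    return overage / 1000 * rate_per_1k

# Starter plan: 100,000 tokens included, $0.001 per 1K overage tokens.
print(overage_cost(150_000, 100_000, 0.001))  # 0.05 -> five cents of overage
print(overage_cost(90_000, 100_000, 0.001))   # 0.0  -> within quota, no charge
```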

BYOK — Bring Your Own Key (All Plans)

Any user on any plan (including Free) can add their own provider API keys. BYOK keys:

  • Are encrypted at rest with AES-256-GCM before being stored in the database.
  • Are never logged or returned in full — only a 4-character hint is shown in the UI.
  • Are transmitted to the AI service over the internal network only — they never reach the browser.
  • Are decrypted ephemerally per request and not cached across requests.
  • Bypass the monthly quota gate entirely — requests using BYOK keys do not count against your plan allowance.
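Conceptually, the at-rest encryption and UI hint described above could be implemented as in the sketch below, which uses the third-party `cryptography` package. The storage format (nonce prepended to ciphertext), the helper names, and the master-key handling are all assumptions for illustration; Agentbase's actual key-management code is not documented here.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_provider_key(api_key, master_key):
    """Encrypt a provider API key with AES-256-GCM; prepend the 12-byte nonce."""
    nonce = os.urandom(12)
    ciphertext = AESGCM(master_key).encrypt(nonce, api_key.encode(), None)
    return nonce + ciphertext

def decrypt_provider_key(blob, master_key):
    """Ephemeral decryption, e.g. performed once per outbound request."""
    nonce, ciphertext = blob[:12], blob[12:]
    return AESGCM(master_key).decrypt(nonce, ciphertext, None).decode()

def key_hint(api_key):
    """Only a short hint of the key is ever shown in the UI."""
    return "…" + api_key[-4:]

master = AESGCM.generate_key(bit_length=256)
blob = encrypt_provider_key("sk-test-1234abcd", master)
assert decrypt_provider_key(blob, master) == "sk-test-1234abcd"
print(key_hint("sk-test-1234abcd"))  # …abcd
```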

To add a key, go to Dashboard → Settings → AI Providers and paste your key for any provider. You can validate it immediately and remove it at any time.

# Save a provider key
PUT /api/provider-keys/openai
Authorization: Bearer <your-token>
{ "apiKey": "sk-..." }
 
# Validate a saved key (makes a live test call)
POST /api/provider-keys/openai/validate

See Provider Keys API for the full reference.

Provider Abstraction

The AI service uses an abstraction layer so switching providers is seamless:

# The provider registry handles initialization (platform keys)
ProviderRegistry.initialize(
    openai_key="sk-...",
    anthropic_key="sk-ant-...",
    gemini_key="AI...",
    huggingface_key="hf_...",
)
 
# Standard (platform-key) provider
provider = ProviderRegistry.get("openai")
response = await provider.chat(chat_request)
 
# Ephemeral BYOK provider — not cached, key never stored in registry
byok_provider = ProviderRegistry.get_ephemeral("openai", decrypted_key)
response = await byok_provider.chat(chat_request)
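The registry calls above imply a common interface that every provider adapter implements. The following is a hedged sketch of what that contract might look like; `AIProvider`, `ChatRequest`, and `EchoProvider` are illustrative names, not Agentbase's real classes:

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class ChatRequest:
    messages: list          # [{"role": ..., "content": ...}, ...]
    model: str
    temperature: float = 0.7
    max_tokens: int = 1024

class AIProvider(ABC):
    """Common contract a provider adapter would implement."""
    def __init__(self, api_key):
        self.api_key = api_key

    @abstractmethod
    async def chat(self, request):
        ...

class EchoProvider(AIProvider):
    """Toy adapter, used here only to show the shape of the interface."""
    async def chat(self, request):
        return f"[{request.model}] " + request.messages[-1]["content"]

req = ChatRequest(messages=[{"role": "user", "content": "hello"}], model="gpt-4")
print(asyncio.run(EchoProvider("sk-...").chat(req)))  # [gpt-4] hello
```

Because every adapter satisfies the same `chat` contract, swapping `"openai"` for `"anthropic"` in the registry lookup requires no changes at the call site.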

AI Service API

The Python AI microservice (http://localhost:8000) exposes these endpoints directly:

List Providers & Models

# All available providers and their configured models
GET http://localhost:8000/api/ai/providers
 
# Models for a specific provider
GET http://localhost:8000/api/ai/providers/openai/models

Chat (non-streaming)

POST http://localhost:8000/api/ai/conversations/{id}/messages
Content-Type: application/json
 
{
  "content": "What is the capital of France?",
  "provider": "openai",
  "model": "gpt-4",
  "temperature": 0.7,
  "max_tokens": 1024,
  "system_prompt": "You are a helpful assistant."
}

Chat (streaming via SSE)

POST http://localhost:8000/api/ai/conversations/{id}/stream
Content-Type: application/json
 
{
  "content": "Write a short poem about the sea.",
  "provider": "anthropic",
  "model": "claude-3-sonnet-20240229",
  "temperature": 0.8
}

Response is a stream of Server-Sent Events. Each event is a JSON object:

data: {"type": "chunk", "content": "The "}
data: {"type": "chunk", "content": "waves "}
data: {"type": "done", "fullResponse": "The waves crash..."}
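A client consuming this stream only needs to strip the `data: ` prefix and decode each JSON payload. A minimal stdlib-only parser is sketched below; the frame format is taken from the example above, and a real client would read these lines from the HTTP response stream (e.g. with `httpx`) rather than a list:

```python
import json

def parse_sse_events(lines):
    """Yield decoded JSON payloads from 'data: ...' SSE lines."""
    for line in lines:
        line = line.strip()
        if line.startswith("data: "):
            yield json.loads(line[len("data: "):])

raw = [
    'data: {"type": "chunk", "content": "The "}',
    'data: {"type": "chunk", "content": "waves "}',
    'data: {"type": "done", "fullResponse": "The waves crash..."}',
]
text = "".join(e["content"] for e in parse_sse_events(raw) if e["type"] == "chunk")
print(text)  # prints "The waves "
```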

In most cases you should use the core API (http://localhost:3001/api) for conversations — the AI service is an internal dependency. Use these endpoints directly only when building integrations that bypass the core API.