CodeepCodeep

AI Providers

Codeep supports all major AI providers. You can configure multiple providers and switch between them at any time using /provider or /settings.

Set your API key as an environment variable and Codeep will detect it automatically on startup, no configuration needed.

Supported Providers

ProviderEnv VariableDefault ModelProtocol
AnthropicANTHROPIC_API_KEYclaude-opus-4-8Anthropic
Custom (OpenAI-compatible)none requiredyour modelOpenAI-compatible
DeepSeekDEEPSEEK_API_KEYdeepseek-v4-proOpenAI-compatible
Google AIGOOGLE_API_KEYgemini-3.1-pro-previewOpenAI-compatible
Grok (xAI)XAI_API_KEYgrok-build-0.1OpenAI-compatible
Kimi (Kimi Code subscription)KIMI_CODE_API_KEYkimi-for-codingOpenAI-compatible
Kimi API (pay-per-use)MOONSHOT_API_KEYkimi-k2.7-codeOpenAI-compatible
Kimi ChinaMOONSHOT_CN_API_KEYkimi-k2.7-codeOpenAI-compatible
MiniMax (Coding Plan)MINIMAX_API_KEYMiniMax-M3OpenAI / Anthropic
MiniMax API (pay-per-use)MINIMAX_API_KEYMiniMax-M3OpenAI-compatible
MiniMax ChinaMINIMAX_CN_API_KEYMiniMax-M3OpenAI / Anthropic
ModelScope (free Qwen)MODELSCOPE_API_KEYQwen3-Coder-480BOpenAI-compatible
Ollama (local / remote)none requireddynamicOpenAI-compatible
OpenAIOPENAI_API_KEYgpt-5.5OpenAI-compatible
OpenRouter (100+ models)OPENROUTER_API_KEYanthropic/claude-opus-4OpenAI-compatible
Qwen (Coding Plan subscription)BAILIAN_CODING_PLAN_API_KEYqwen3-coder-plusOpenAI-compatible
Qwen API (pay-per-use)DASHSCOPE_API_KEYqwen3-coder-plusOpenAI-compatible
Qwen China (Coding Plan / API)BAILIAN_CODING_PLAN_CN_API_KEY / DASHSCOPE_CN_API_KEYqwen3-coder-plusOpenAI-compatible
Z.AI (GLM Coding Plan)ZAI_API_KEYglm-5.2OpenAI / Anthropic
Z.AI API (pay-per-use)ZAI_API_KEYglm-5.2OpenAI-compatible
Z.AI China (GLM Coding Plan)ZAI_CN_API_KEYglm-5.2OpenAI / Anthropic
Z.AI China API (pay-per-use)ZAI_CN_API_KEYglm-5.2OpenAI-compatible
Coding-plan subscriptions.Kimi (Kimi Code), Qwen (Coding Plan), Z.AI (GLM Coding Plan) and MiniMax all offer a flat-fee subscription you can drive directly — Codeep points at the plan's dedicated endpoint with your subscription key, so there are no per-token charges. Each also has a separate pay-per-use API entry. (Grok adds Kimi/Qwen- style coding models via a pay-per-use xAI key; its SuperGrok subscription isn't API-drivable.)

OpenRouter (Aggregator)

One API key, 100+ models from Anthropic, OpenAI, Google, Meta, Mistral, DeepSeek, Qwen, xAI and more. Useful when you want to experiment with multiple providers without managing separate keys, or want OpenRouter to auto-route to the best model for the task.

export OPENROUTER_API_KEY=sk-or-v1-... codeep /provider openrouter codeep /model # fetches the live 100+ model catalogue with pricing

Or, save your key once on the dashboard under Provider keys and sync to any machine with codeep account sync.

Codeep uses the cost OpenRouter reports per call (in usage.cost) — your dashboard sees the same figures as your OpenRouter invoice, no local pricing lookup needed.

Routing preferences

OpenRouter lets you bias which upstream provider its router picks for a given model (latency, cost, geography, privacy). Configure with:

/openrouter # show current preferences /openrouter prefer DeepInfra,Together # try these first in order /openrouter ignore OpenAI # never route through these /openrouter fallbacks on|off # allow fallback when preferred providers fail /openrouter privacy strict|allow # strict = data_collection: deny /openrouter clear # drop all preferences
Pick openrouter/auto as your model and OpenRouter chooses the best provider for each task. Combine with /openrouter prefer to bias the auto-router without locking it down.

Ollama (Local AI)

Run AI models fully locally — or on a remote server — without an API key. Codeep fetches your installed models automatically via the Ollama API.

Local setup

Install Ollama, pull a model, and start the server:

ollama pull qwen2.5-coder:7b ollama serve

Then in Codeep, select the provider and pick your model:

/provider → select "ollama" /model → pick from installed models (shows on-disk size) /model browse → curated catalog of coding models; pick one to pull /model rm <m> → remove a local model to reclaim disk

New to local models? /model browse lists recommended coding models (Qwen2.5 Coder, DeepSeek Coder V2, Llama 3.1, DeepSeek R1, …) with parameter sizes, rough VRAM, and an agent-mode hint — select one and Codeep pulls it for you.

Remote Ollama (different machine)

If Ollama runs on another machine (e.g. a home server, NAS, or Docker container), start it with OLLAMA_HOST=0.0.0.0 so it accepts external connections:

OLLAMA_HOST=0.0.0.0 ollama serve

Then in Codeep, open /settings and update the Ollama URL field to point to your server:

http://192.168.1.100:11434
Use /model after switching to the Ollama provider to see and select all models installed on your server. No manual configuration needed.

Agent mode and model size

Codeep's agent mode requires a model capable of reliable tool use and instruction following. With Ollama, use at least a 7B parameter model for agent tasks — for example qwen2.5-coder:7b or llama3.1:8b.

Models smaller than 7B (1B–3B) often fail to produce correctly formatted tool calls and may behave unexpectedly in agent mode. For small models, set Agent Mode to Manual or Off in /settings and use Codeep as a chat assistant instead.

Native API (beta)

By default Codeep talks to Ollama through its OpenAI-compatible /v1 endpoint, which ignores a couple of Ollama-specific options. Turn on Ollama Native API (beta) in /settings to use the native /api/chat endpoint instead, which honors:

num_ctx— the model uses its full context window instead of Ollama's small default (auto-detected from /api/show, or set ollamaNumCtx).
keep_alive — keeps the model resident between turns, avoiding reload latency (ollamaKeepAlive, default 30m).

It's marked betawhile it gets real-world coverage across more models and longer agent sessions — it's off by default, so nothing changes unless you opt in. Hit a problem? Please open an issue on GitHub — feedback decides when it becomes the default.

Custom (OpenAI-compatible endpoints)

Run your own model behind an OpenAI-compatible server — vLLM, LiteLLM, LM Studio, or text-generation-webui — and point Codeep straight at it. No commercial provider or API key required.

Pick Custom (OpenAI-compatible) in the welcome screen or with /provider, then set the endpoint under /settingsCustom Base URL (config key customBaseUrl). Use the full base, including /v1:

# ~/.codeep/config.json { "provider": "custom", "customBaseUrl": "http://100.88.112.5:8000/v1", "model": "qwen3-coder-30b" }

Then run /model to pick from the models your server advertises (fetched live from its /models endpoint). If your endpoint requires a key, set one with /login; otherwise leave it blank.

Prefer environment variables? The OpenAI provider honors the standard OPENAI_BASE_URL variable, so a proxy that serves gpt-* model names works with no config changes — just export OPENAI_BASE_URL and use the OpenAI provider.

Switching providers

Use the /provider command to interactively switch between configured providers. The new provider takes effect immediately for the next message.

Using multiple providers

You can configure API keys for multiple providers. Codeep stores all keys securely. Use /login to add a key for any provider without changing the active one.