AI Providers

Codeep supports all major AI providers. You can configure multiple providers and switch between them at any time using /provider or /settings.

★

Set your API key as an environment variable and Codeep will detect it automatically on startup, no configuration needed.

Supported Providers

Provider	Env Variable	Default Model	Protocol
`Anthropic`	ANTHROPIC_API_KEY	claude-opus-4-8	Anthropic
`Custom (OpenAI-compatible)`	none required	your model	OpenAI-compatible
`DeepSeek`	DEEPSEEK_API_KEY	deepseek-v4-pro	OpenAI-compatible
`Google AI`	GOOGLE_API_KEY	gemini-3.1-pro-preview	OpenAI-compatible
`Grok (xAI)`	XAI_API_KEY	grok-build-0.1	OpenAI-compatible
`Kimi (Kimi Code subscription)`	KIMI_CODE_API_KEY	kimi-for-coding	OpenAI-compatible
`Kimi API (pay-per-use)`	MOONSHOT_API_KEY	kimi-k2.7-code	OpenAI-compatible
`Kimi China`	MOONSHOT_CN_API_KEY	kimi-k2.7-code	OpenAI-compatible
`MiniMax (Coding Plan)`	MINIMAX_API_KEY	MiniMax-M3	OpenAI / Anthropic
`MiniMax API (pay-per-use)`	MINIMAX_API_KEY	MiniMax-M3	OpenAI-compatible
`MiniMax China`	MINIMAX_CN_API_KEY	MiniMax-M3	OpenAI / Anthropic
`ModelScope (free Qwen)`	MODELSCOPE_API_KEY	Qwen3-Coder-480B	OpenAI-compatible
`Ollama (local / remote)`	none required	dynamic	OpenAI-compatible
`OpenAI`	OPENAI_API_KEY	gpt-5.5	OpenAI-compatible
`OpenRouter (100+ models)`	OPENROUTER_API_KEY	anthropic/claude-opus-4	OpenAI-compatible
`Qwen (Coding Plan subscription)`	BAILIAN_CODING_PLAN_API_KEY	qwen3-coder-plus	OpenAI-compatible
`Qwen API (pay-per-use)`	DASHSCOPE_API_KEY	qwen3-coder-plus	OpenAI-compatible
`Qwen China (Coding Plan / API)`	BAILIAN_CODING_PLAN_CN_API_KEY / DASHSCOPE_CN_API_KEY	qwen3-coder-plus	OpenAI-compatible
`Z.AI (GLM Coding Plan)`	ZAI_API_KEY	glm-5.2	OpenAI / Anthropic
`Z.AI API (pay-per-use)`	ZAI_API_KEY	glm-5.2	OpenAI-compatible
`Z.AI China (GLM Coding Plan)`	ZAI_CN_API_KEY	glm-5.2	OpenAI / Anthropic
`Z.AI China API (pay-per-use)`	ZAI_CN_API_KEY	glm-5.2	OpenAI-compatible

★

Coding-plan subscriptions.Kimi (Kimi Code), Qwen (Coding Plan), Z.AI (GLM Coding Plan) and MiniMax all offer a flat-fee subscription you can drive directly — Codeep points at the plan's dedicated endpoint with your subscription key, so there are no per-token charges. Each also has a separate pay-per-use API entry. (Grok adds Kimi/Qwen- style coding models via a pay-per-use xAI key; its SuperGrok subscription isn't API-drivable.)

OpenRouter (Aggregator)

One API key, 100+ models from Anthropic, OpenAI, Google, Meta, Mistral, DeepSeek, Qwen, xAI and more. Useful when you want to experiment with multiple providers without managing separate keys, or want OpenRouter to auto-route to the best model for the task.

export OPENROUTER_API_KEY=sk-or-v1-...
codeep /provider openrouter
codeep /model                        # fetches the live 100+ model catalogue with pricing

Or, save your key once on the dashboard under Provider keys and sync to any machine with codeep account sync.

Codeep uses the cost OpenRouter reports per call (in usage.cost) — your dashboard sees the same figures as your OpenRouter invoice, no local pricing lookup needed.

Routing preferences

OpenRouter lets you bias which upstream provider its router picks for a given model (latency, cost, geography, privacy). Configure with:

/openrouter                      # show current preferences
/openrouter prefer DeepInfra,Together   # try these first in order
/openrouter ignore OpenAI               # never route through these
/openrouter fallbacks on|off            # allow fallback when preferred providers fail
/openrouter privacy strict|allow        # strict = data_collection: deny
/openrouter clear                       # drop all preferences

★

Pick openrouter/auto as your model and OpenRouter chooses the best provider for each task. Combine with /openrouter prefer to bias the auto-router without locking it down.

Ollama (Local AI)

Run AI models fully locally — or on a remote server — without an API key. Codeep fetches your installed models automatically via the Ollama API.

Local setup

Install Ollama, pull a model, and start the server:

ollama pull qwen2.5-coder:7b
ollama serve

Then in Codeep, select the provider and pick your model:

/provider     → select "ollama"
/model        → pick from installed models (shows on-disk size)
/model browse → curated catalog of coding models; pick one to pull
/model rm <m> → remove a local model to reclaim disk

New to local models? /model browse lists recommended coding models (Qwen2.5 Coder, DeepSeek Coder V2, Llama 3.1, DeepSeek R1, …) with parameter sizes, rough VRAM, and an agent-mode hint — select one and Codeep pulls it for you.

Remote Ollama (different machine)

If Ollama runs on another machine (e.g. a home server, NAS, or Docker container), start it with OLLAMA_HOST=0.0.0.0 so it accepts external connections:

OLLAMA_HOST=0.0.0.0 ollama serve

Then in Codeep, open /settings and update the Ollama URL field to point to your server:

http://192.168.1.100:11434

★

Use /model after switching to the Ollama provider to see and select all models installed on your server. No manual configuration needed.

Agent mode and model size

Codeep's agent mode requires a model capable of reliable tool use and instruction following. With Ollama, use at least a 7B parameter model for agent tasks — for example qwen2.5-coder:7b or llama3.1:8b.

⚠

Models smaller than 7B (1B–3B) often fail to produce correctly formatted tool calls and may behave unexpectedly in agent mode. For small models, set Agent Mode to Manual or Off in /settings and use Codeep as a chat assistant instead.

Native API (beta)

By default Codeep talks to Ollama through its OpenAI-compatible /v1 endpoint, which ignores a couple of Ollama-specific options. Turn on Ollama Native API (beta) in /settings to use the native /api/chat endpoint instead, which honors:

• num_ctx— the model uses its full context window instead of Ollama's small default (auto-detected from /api/show, or set ollamaNumCtx).
• keep_alive — keeps the model resident between turns, avoiding reload latency (ollamaKeepAlive, default 30m).

★

It's marked betawhile it gets real-world coverage across more models and longer agent sessions — it's off by default, so nothing changes unless you opt in. Hit a problem? Please open an issue on GitHub — feedback decides when it becomes the default.

Custom (OpenAI-compatible endpoints)

Run your own model behind an OpenAI-compatible server — vLLM, LiteLLM, LM Studio, or text-generation-webui — and point Codeep straight at it. No commercial provider or API key required.

Pick Custom (OpenAI-compatible) in the welcome screen or with /provider, then set the endpoint under /settings → Custom Base URL (config key customBaseUrl). Use the full base, including /v1:

# ~/.codeep/config.json
{
  "provider": "custom",
  "customBaseUrl": "http://100.88.112.5:8000/v1",
  "model": "qwen3-coder-30b"
}

Then run /model to pick from the models your server advertises (fetched live from its /models endpoint). If your endpoint requires a key, set one with /login; otherwise leave it blank.

★

Prefer environment variables? The OpenAI provider honors the standard OPENAI_BASE_URL variable, so a proxy that serves gpt-* model names works with no config changes — just export OPENAI_BASE_URL and use the OpenAI provider.

Switching providers

Use the /provider command to interactively switch between configured providers. The new provider takes effect immediately for the next message.

Using multiple providers

You can configure API keys for multiple providers. Codeep stores all keys securely. Use /login to add a key for any provider without changing the active one.

Configuration Project Rules