The lap CLI talks to a running LAP instance. If you haven’t deployed one yet, see Installation.
First time? Install the CLI:
git clone https://github.com/BerriAI/litellm-agent-platform.git
cd litellm-agent-platform/cli && npm install
ln -sf "$PWD/bin/lap.mjs" ~/.local/bin/lap

1. Log in

lap login
Paste your LAP URL and MASTER_KEY when prompted. Saved to ~/.lap/config.json.

2. Open an agent

lap
Pick your Gemini agent from the interactive list. lap spins up a Kubernetes-sandboxed pod running the Google Gemini CLI in --yolo mode, wraps it in a tmux session so it survives WebSocket reconnects (lap --resume <id> lands you on the same in-progress REPL), and attaches your local terminal to its TTY over a WebSocket.

At boot, the harness routes Gemini through your LiteLLM gateway by mapping LITELLM_API_KEY → GEMINI_API_KEY and LITELLM_API_BASE (with /gemini appended) → GOOGLE_GEMINI_BASE_URL. Gemini therefore talks to your LiteLLM proxy’s Gemini passthrough, and your proxy fans out to Google with whatever upstream credentials it’s configured with.

Press Ctrl-D to detach; the session stays alive for 24h.
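The boot-time mapping can be pictured as a few lines of shell. This is a minimal sketch under stated assumptions, not the platform's actual entrypoint: the function name map_litellm_env and the trailing-slash handling are illustrative, and only the four variable names come from the description above.

```shell
# Hypothetical sketch of the boot-time mapping; the real entrypoint may differ.
map_litellm_env() {
  # Refuse to continue if the platform-provided variables are missing.
  if [ -z "${LITELLM_API_KEY:-}" ] || [ -z "${LITELLM_API_BASE:-}" ]; then
    echo "LITELLM_API_KEY and LITELLM_API_BASE must be set" >&2
    return 1
  fi

  # Gemini CLI reads its key from GEMINI_API_KEY.
  export GEMINI_API_KEY="$LITELLM_API_KEY"

  # Strip one trailing slash (an assumption about input hygiene),
  # then append the /gemini passthrough prefix.
  export GOOGLE_GEMINI_BASE_URL="${LITELLM_API_BASE%/}/gemini"
}
```

Running map_litellm_env inside the pod and then printing GOOGLE_GEMINI_BASE_URL is one way to confirm which endpoint the CLI would be pointed at.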

Creating an agent

In the UI choose gemini from the Harness picker and pick a Gemini model, or via API:
curl -X POST $LAP_URL/api/v1/managed_agents/agents \
  -H "Authorization: Bearer $MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name":"my-gemini","harness_id":"gemini","model":"gemini/gemini-2.5-flash"}'
That’s it. No env vars to set: the platform’s LITELLM_API_KEY and LITELLM_API_BASE are wired through automatically.
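If you create agents from scripts, the request can be wrapped in a small helper. The function names gemini_agent_payload and create_gemini_agent are hypothetical, and the sketch assumes the name and model contain no characters that need JSON escaping; the endpoint, headers, and fields are the ones from the curl command above.

```shell
# Build the JSON body for the create-agent request.
# Assumes name ($1) and model ($2) need no JSON escaping.
gemini_agent_payload() {
  printf '{"name":"%s","harness_id":"gemini","model":"%s"}\n' "$1" "$2"
}

# POST the payload; LAP_URL and MASTER_KEY must be set in the environment.
create_gemini_agent() {
  curl -sS -X POST "$LAP_URL/api/v1/managed_agents/agents" \
    -H "Authorization: Bearer $MASTER_KEY" \
    -H "Content-Type: application/json" \
    -d "$(gemini_agent_payload "$1" "$2")"
}
```

For example, create_gemini_agent my-gemini gemini/gemini-2.5-flash sends the same body as the one-off command.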

Prerequisite: register a Gemini model on your LiteLLM proxy

Unlike claude-code, codex, and hermes (which speak OpenAI’s protocol and hit LiteLLM’s /v1/chat/completions), Gemini CLI speaks Google’s native Gemini API. LiteLLM exposes that as a passthrough route at LITELLM_API_BASE/gemini/v1beta/..., but the proxy needs at least one gemini/* model registered in its config.yaml with a real Google API key behind it. Example config.yaml entry:
model_list:
  - model_name: gemini/gemini-2.5-flash
    litellm_params:
      model: gemini/gemini-2.5-flash
      api_key: os.environ/GEMINI_API_KEY
  - model_name: gemini/gemini-2.5-pro
    litellm_params:
      model: gemini/gemini-2.5-pro
      api_key: os.environ/GEMINI_API_KEY
The platform’s LITELLM_API_KEY must also be a virtual key (starts with sk-), not the proxy master key. Mint one with:
curl -X POST $LITELLM_PROXY_URL/key/generate \
  -H "Authorization: Bearer $LITELLM_PROXY_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"models":["gemini/gemini-2.5-flash","gemini/gemini-2.5-pro"]}'
Verify the route is healthy before creating an agent:
curl -X POST $LITELLM_API_BASE/gemini/v1beta/models/gemini-2.5-flash:generateContent \
  -H "x-goog-api-key: $LITELLM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"contents":[{"parts":[{"text":"say hi"}]}]}'
If you get a 200, you’re done. If you get "no healthy deployments for this model", the Gemini model isn’t registered on the proxy yet.
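To script this check, for example in a deploy pipeline, it may help to reduce the response to its HTTP status code. The helper names gemini_route_url and gemini_route_status are hypothetical; the URL shape, header, and request body are the ones used in the curl command above.

```shell
# Build the passthrough URL from a proxy base ($1) and a model name ($2).
gemini_route_url() {
  printf '%s/gemini/v1beta/models/%s:generateContent\n' "${1%/}" "$2"
}

# Send a minimal generateContent request and print only the HTTP status code.
# Arguments: proxy base URL, model name, LiteLLM virtual key.
gemini_route_status() {
  curl -s -o /dev/null -w '%{http_code}' -X POST "$(gemini_route_url "$1" "$2")" \
    -H "x-goog-api-key: $3" \
    -H "Content-Type: application/json" \
    -d '{"contents":[{"parts":[{"text":"say hi"}]}]}'
}
```

A call such as gemini_route_status "$LITELLM_API_BASE" gemini-2.5-flash "$LITELLM_API_KEY" should print 200 when the route is healthy.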

Picking a model

Gemini CLI’s auto-default sometimes lands on a preview model your project doesn’t have access to. Pin a stable model inside the TUI:
/model gemini-2.5-flash
Models known to work broadly: gemini-2.5-flash, gemini-2.5-pro, gemini-2.0-flash.

Advanced: bypass the LiteLLM proxy

If your LiteLLM deployment doesn’t have a Gemini model registered and you need to get up and running fast, you can bring your own Google credentials. The harness detects either of the credential types below and uses it instead of the LiteLLM passthrough.

Option A: Bring a Gemini API key

Mint one at aistudio.google.com/app/apikey and pass it as an agent env var:
curl -X POST $LAP_URL/api/v1/managed_agents/agents \
  -H "Authorization: Bearer $MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name":"my-gemini",
    "harness_id":"gemini",
    "model":"gemini/gemini-2.5-flash",
    "env_vars":{"GEMINI_API_KEY":"AIzaSy…"}
  }'
The vault sidecar mints a stub for the key: the agent process only ever sees stub_…, and the vault swaps in the real value at the wire.

Option B: Vertex AI service account

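For Vertex AI access (different model catalog, no rate-limit surprises from AI Studio quotas), put your service account (SA) JSON in a secret gist and point the agent at it. Name the file key.json, because the entrypoint looks for /work/repo/key.json after the clone:
gh gist create /path/to/key.json -d "lap-gemini-sa"   # gists are secret unless --public is passed
# → https://gist.github.com/<you>/<gist-id>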

curl -X POST $LAP_URL/api/v1/managed_agents/agents \
  -H "Authorization: Bearer $MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name":"my-gemini",
    "harness_id":"gemini",
    "model":"gemini/gemini-2.5-flash",
    "repo_url":"https://gist.github.com/<you>/<gist-id>.git",
    "branch":"main",
    "env_vars":{"GITHUB_TOKEN":"<your gh token>"}
  }'
When /work/repo/key.json is present after the clone, the entrypoint auto-exports GOOGLE_APPLICATION_CREDENTIALS, GOOGLE_GENAI_USE_VERTEXAI=true, GOOGLE_CLOUD_PROJECT (extracted from the JSON), GOOGLE_CLOUD_LOCATION=us-central1, and GEMINI_CLI_TRUST_WORKSPACE=true. The CLI picks it all up automatically.
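The auto-export step can be sketched roughly as follows. This is an illustration under stated assumptions, not the platform's actual entrypoint: the function name export_vertex_env is hypothetical, and pulling project_id out with sed (rather than a JSON parser) is a simplification that relies on the standard service-account key layout.

```shell
# Hypothetical sketch of the Vertex AI auto-export; the real entrypoint may differ.
export_vertex_env() {
  key_file="${1:-/work/repo/key.json}"

  # Nothing to do when the cloned repo does not contain the key file.
  [ -f "$key_file" ] || return 1

  # Extract the "project_id" field from the service-account JSON.
  project_id=$(sed -n 's/.*"project_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$key_file" | head -n 1)
  [ -n "$project_id" ] || return 1

  # The variables listed in the text above.
  export GOOGLE_APPLICATION_CREDENTIALS="$key_file"
  export GOOGLE_GENAI_USE_VERTEXAI=true
  export GOOGLE_CLOUD_PROJECT="$project_id"
  export GOOGLE_CLOUD_LOCATION=us-central1
  export GEMINI_CLI_TRUST_WORKSPACE=true
}
```

Running the function against a local copy of the key file is a quick way to see which project the agent would end up using before you push the gist.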

Smoke-testing without the TUI

Set GEMINI_SELFTEST_PROMPT (any value) in the agent’s env_vars and the pod runs gemini -m gemini-2.5-flash -p "..." once at boot. The reply lands in pod stdout, readable via the diagnose endpoint without needing the WebSocket /tty flow:
curl $LAP_URL/api/v1/managed_agents/sessions/<sid>/diagnose \
  -H "Authorization: Bearer $MASTER_KEY" \
  | jq -r '.pod_logs_tail.text'
# →
# [selftest] running: gemini -m gemini-2.5-flash -p "Reply with exactly four words: hello from gemini cli"
# [selftest-begin]
# hello from gemini cli
# [selftest-end]
Useful when verifying fresh credentials end-to-end, or proving the harness works in CI without a browser.
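For a CI assertion, the reply can be cut out of the log tail by its markers. The helper name extract_selftest_reply is hypothetical, and the sketch assumes the [selftest-begin] and [selftest-end] markers appear in the log text as shown in the sample output above.

```shell
# Read log text on stdin and print only the lines strictly between
# the [selftest-begin] and [selftest-end] markers.
extract_selftest_reply() {
  awk '
    /\[selftest-end\]/   { inside = 0 }   # stop before printing the end marker
    inside               { print }        # print lines while inside the block
    /\[selftest-begin\]/ { inside = 1 }   # start after the begin marker
  '
}
```

Piping the output of the jq command above into extract_selftest_reply leaves just the model's reply, which a CI job can then compare against the expected text.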