This page answers the questions that come up most often when running LiteLLM Agent Platform. If you have a question that is not covered here, open an issue on GitHub or join the Discord.
A session stays alive until it has gone 24 hours without a message. If no message is sent to the session within that window, the reconciler reaps the pod and marks the session dead. The 24-hour clock resets on every message, so a session with ongoing activity can run indefinitely. If you detach from a TUI session with Ctrl-D, the session and its sandbox pod keep running; only the WebSocket connection closes.
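The inactivity rule above can be sketched as a small pure function. This is an illustration only, not the reconciler's code: the names INACTIVITY_TIMEOUT and is_reapable are hypothetical, and treating exactly 24 hours as expired is an assumption.

```python
from datetime import datetime, timedelta, timezone

# 24 hours, as described above. The constant name is hypothetical.
INACTIVITY_TIMEOUT = timedelta(hours=24)


def is_reapable(last_message_at: datetime, now: datetime) -> bool:
    """Return True once no message has arrived within the inactivity window.

    Assumption: a session idle for exactly 24 hours counts as expired.
    """
    return now - last_message_at >= INACTIVITY_TIMEOUT


now = datetime(2025, 1, 2, 12, 0, tzinfo=timezone.utc)
print(is_reapable(now - timedelta(hours=23), now))  # False
print(is_reapable(now - timedelta(hours=25), now))  # True
```

Because the check depends only on the time of the last message, every new message pushes the deadline out by a full 24 hours.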
Yes. Run lap <agent-name> again after detaching. The CLI looks up the most recent session for that agent and reconnects if one is still alive. Support for lap attach <session-id> (to reconnect to a specific session by ID) is planned but not yet available.
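The lookup described above might look roughly like the following sketch. The Session record and its fields are hypothetical, and the reading that the CLI takes the newest session first and then checks whether it is alive is an assumption; the actual CLI may instead pick the newest session that is still alive.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Session:
    # Hypothetical record; the platform's real session schema is not shown here.
    session_id: str
    agent_name: str
    created_at: datetime
    alive: bool


def session_to_reconnect(sessions: list[Session], agent_name: str) -> Optional[Session]:
    """Pick the newest session for the agent, and return it only if it is alive."""
    candidates = [s for s in sessions if s.agent_name == agent_name]
    if not candidates:
        return None
    latest = max(candidates, key=lambda s: s.created_at)
    return latest if latest.alive else None


sessions = [
    Session("s1", "docs-bot", datetime(2025, 1, 1, tzinfo=timezone.utc), alive=False),
    Session("s2", "docs-bot", datetime(2025, 1, 2, tzinfo=timezone.utc), alive=True),
]
found = session_to_reconnect(sessions, "docs-bot")
print(found.session_id if found else None)  # s2
```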
Credentials are never present in the sandbox environment as real values. Every sandbox pod runs a vault sidecar that intercepts all outbound HTTPS traffic via an in-process proxy. At startup, the sidecar replaces real credential values with stub placeholders (for example, GITHUB_TOKEN=stub_github_a8f1). On every outbound TLS connection, the vault swaps the stub back for the real value at the wire level. The agent process can run echo $GITHUB_TOKEN and only ever sees the stub; the real key is never accessible from inside the sandbox. This means agents can run with broad permissions without the risk of leaking credentials through logs, LLM context windows, or accidental output.
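The stub-and-swap idea can be modelled in a few lines. This is a toy in-memory sketch, not the vault sidecar's implementation: the StubVault class, the make_stub helper, the stub format, and the choice to rewrite header values are all assumptions made for illustration.

```python
import secrets


def make_stub(name: str) -> str:
    """Build a placeholder such as 'stub_github_token_a8f1' (format is illustrative)."""
    return f"stub_{name.lower()}_{secrets.token_hex(2)}"


class StubVault:
    """Toy model: the sandbox sees stubs, and only the vault maps them back."""

    def __init__(self, real_env: dict[str, str]) -> None:
        self._stub_to_real: dict[str, str] = {}
        self.sandbox_env: dict[str, str] = {}  # what the agent process would see
        for name, value in real_env.items():
            stub = make_stub(name)
            self._stub_to_real[stub] = value
            self.sandbox_env[name] = stub

    def rewrite_outbound(self, headers: dict[str, str]) -> dict[str, str]:
        """Swap any stub found in outbound header values for its real value."""
        rewritten = {}
        for key, value in headers.items():
            for stub, real in self._stub_to_real.items():
                value = value.replace(stub, real)
            rewritten[key] = value
        return rewritten


vault = StubVault({"GITHUB_TOKEN": "real-token-value"})
stub = vault.sandbox_env["GITHUB_TOKEN"]
sent = vault.rewrite_outbound({"Authorization": f"Bearer {stub}"})
print(sent["Authorization"])  # Bearer real-token-value
```

In this model, anything the agent logs or pastes into an LLM context contains only the stub, because the real value exists solely in the vault's private mapping.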
Enable the warm pool by setting WARM_POOL_SIZE to a value greater than zero. The default is 2, so the pool is already enabled unless you have changed it. With the warm pool active:
  • Cold start (no warm pod available): ~10–12 s, dominated by git clone and harness boot.
  • Warm start (pod claimed from the pool): ~1.8 s end-to-end.
The warm pool pre-provisions pods for the agents most recently used (within the last WARM_POOL_RECENT_AGENT_HOURS hours). After the first session on a new deployment, the worker tops up the pool so the next session is near-instant. To further reduce cold-start times, bake your agent’s repository into the harness image so the git clone step is skipped at boot.
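The selection and top-up behaviour described above can be sketched as follows. The function names are hypothetical, the recency window is passed in explicitly because this page does not state a default for WARM_POOL_RECENT_AGENT_HOURS, and whether the pool size applies per agent or across all agents is not specified here, so the deficit calculation is only an assumption.

```python
from datetime import datetime, timedelta, timezone


def recent_agents(last_used: dict[str, datetime], now: datetime, recent_hours: float) -> list[str]:
    """Return agents used within the last `recent_hours` hours, sorted by name."""
    cutoff = now - timedelta(hours=recent_hours)
    return sorted(name for name, used_at in last_used.items() if used_at >= cutoff)


def pods_to_add(pool_size: int, warm_pods: int) -> int:
    """Number of pods needed to bring a pool back up to its target size."""
    return max(pool_size - warm_pods, 0)


now = datetime(2025, 1, 2, 12, 0, tzinfo=timezone.utc)
last_used = {
    "docs-bot": now - timedelta(hours=3),
    "old-bot": now - timedelta(hours=72),
}
print(recent_agents(last_used, now, recent_hours=24))  # ['docs-bot']
print(pods_to_add(pool_size=2, warm_pods=1))  # 1
```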
Choose based on how you want to interact with the agent:
| Harness | Mode | Best for |
| --- | --- | --- |
| claude-code | TUI | Interactive terminal sessions; attaches your local terminal to the agent’s PTY |
| codex | TUI | Interactive terminal sessions with OpenAI Codex |
| opencode | API | Programmatic automation; send messages via the REST API |
| claude-agent-sdk | API | Programmatic automation using Anthropic’s Agent SDK |
TUI harnesses stream the agent’s terminal output to your local terminal over WebSocket. API harnesses expose a JSON message API that you call from code or the web UI.
Yes. All outbound traffic from the sandbox is routed through the vault sidecar proxy, which allows full internet access by default. To restrict which domains an agent can reach, set allow_out or deny_out on the agent definition. allow_out is an allowlist: only the listed domains are reachable. deny_out is a denylist: the listed domains are blocked while everything else is allowed.
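The allowlist and denylist semantics can be expressed as a single decision function. This is a sketch under stated assumptions rather than the proxy's actual logic: the function name is hypothetical, a listed domain is assumed to also cover its subdomains, and allow_out is assumed to take precedence if both lists are set, which this page does not specify.

```python
from typing import Optional


def is_egress_allowed(
    host: str,
    allow_out: Optional[list[str]] = None,
    deny_out: Optional[list[str]] = None,
) -> bool:
    """Decide whether a sandbox may reach `host` under the given lists."""
    host = host.lower().rstrip(".")

    def matches(domain: str) -> bool:
        # Assumption: "example.com" also covers "api.example.com".
        domain = domain.lower()
        return host == domain or host.endswith("." + domain)

    if allow_out:
        # Allowlist: only listed domains are reachable.
        return any(matches(d) for d in allow_out)
    if deny_out:
        # Denylist: listed domains are blocked, everything else is allowed.
        return not any(matches(d) for d in deny_out)
    # Neither list set: full internet access, the documented default.
    return True


print(is_egress_allowed("api.github.com", allow_out=["github.com"]))  # True
print(is_egress_allowed("example.com", allow_out=["github.com"]))  # False
print(is_egress_allowed("example.com", deny_out=["example.com"]))  # False
print(is_egress_allowed("pypi.org"))  # True
```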
The default local kind cluster configuration supports up to 100 concurrent live sessions. This limit comes from the NodePort range (30000–30099) configured by bin/kind-up.sh: each live session occupies one NodePort. For higher concurrency, switch the Service type from NodePort to ClusterIP and add an ingress controller with hostname-based routing. This removes the NodePort range cap. On EKS, concurrency is bounded only by cluster node capacity.
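The 100-session figure follows directly from the size of the port range. The sketch below shows the arithmetic and a naive first-free allocation; the constant names and the allocation strategy are hypothetical and are not taken from bin/kind-up.sh or the platform code.

```python
from typing import Optional

# Range quoted above for the default local kind cluster (inclusive on both ends).
NODEPORT_FIRST = 30000
NODEPORT_LAST = 30099

capacity = NODEPORT_LAST - NODEPORT_FIRST + 1
print(capacity)  # 100


def next_free_port(in_use: set[int]) -> Optional[int]:
    """Return the lowest unused NodePort, or None when the range is exhausted."""
    for port in range(NODEPORT_FIRST, NODEPORT_LAST + 1):
        if port not in in_use:
            return port
    return None


print(next_free_port({30000, 30001}))  # 30002
print(next_free_port(set(range(NODEPORT_FIRST, NODEPORT_LAST + 1))))  # None
```

Once every port in the range is taken, a new session has nowhere to bind, which is why hostname-based ingress routing lifts the cap.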
Credentials are never stored in the sandbox pod. They live only in the platform’s environment (or the litellm-env Kubernetes Secret in production) and are injected at the network level by the vault sidecar at egress time. The pod’s environment contains only stub placeholders. The real values travel on the wire inside the vault proxy’s TLS tunnel and are never written to disk, logged, or exposed to the agent process.