This page answers the questions that come up most often when running LiteLLM Agent Platform. If you have a question that is not covered here, open an issue on GitHub or join the Discord.

## Documentation Index
Fetch the complete documentation index at: https://docs.litellm-agent-platform.ai/llms.txt
Use this file to discover all available pages before exploring further.
## How long do sessions stay alive?
A session stays alive for 24 hours of message inactivity. If no message is sent to the session within that window, the reconciler reaps the pod and marks the session dead.

The 24-hour clock resets on every message, so a session with ongoing activity can run indefinitely. If you detach from a TUI session with Ctrl-D, the session and its sandbox pod keep running; only the WebSocket connection closes.
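The reaping rule above can be sketched as a simple idle-time check. This is an illustrative model, not the reconciler's actual code; the function name and timestamps are hypothetical.

```python
from datetime import datetime, timedelta

# A session is reaped once 24 hours pass with no new message; every
# message resets the clock, so an active session never expires.
SESSION_TTL = timedelta(hours=24)

def is_expired(last_message_at: datetime, now: datetime) -> bool:
    """True if the session has been idle past the TTL and should be reaped."""
    return now - last_message_at > SESSION_TTL

now = datetime(2024, 1, 2, 12, 0)
print(is_expired(datetime(2024, 1, 1, 11, 0), now))  # idle 25 h -> True
print(is_expired(datetime(2024, 1, 2, 0, 0), now))   # idle 12 h -> False
```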
## Can I reconnect to a session after detaching?
Yes. Run `lap <agent-name>` again after detaching. The CLI looks up the most recent session for that agent and reconnects if one is still alive.

Support for `lap attach <session-id>` (to reconnect to a specific session by ID) is planned but not yet available.
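The lookup the CLI performs can be modeled as "newest live session for this agent". A hypothetical sketch with illustrative field names (the real session records may differ):

```python
# Toy session store; in reality this lives in the platform's database.
sessions = [
    {"agent": "reviewer", "started_at": 100, "alive": True},
    {"agent": "reviewer", "started_at": 250, "alive": True},
    {"agent": "reviewer", "started_at": 300, "alive": False},  # already reaped
    {"agent": "builder",  "started_at": 400, "alive": True},
]

def most_recent_live(agent: str):
    """Return the newest still-alive session for an agent, or None."""
    live = [s for s in sessions if s["agent"] == agent and s["alive"]]
    return max(live, key=lambda s: s["started_at"], default=None)

print(most_recent_live("reviewer")["started_at"])  # 250: newest *live* session
print(most_recent_live("ghost"))                   # None: nothing to reconnect to
```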
## What happens to credentials in the sandbox?
Credentials are never present in the sandbox environment as real values. Every sandbox pod runs a vault sidecar that intercepts all outbound HTTPS traffic via an in-process proxy. At startup, the sidecar replaces real credential values with stub placeholders (for example, `GITHUB_TOKEN=stub_github_a8f1`). On every outbound TLS connection, the vault swaps the stub back for the real value at the wire level.

The agent process can call `echo $GITHUB_TOKEN` and only ever sees the stub; the real key is never accessible from inside the sandbox. This means agents can run with broad permissions without the risk of leaking credentials through logs, LLM context windows, or accidental output.
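The stub-swap idea can be illustrated in a few lines. This is a sketch of the concept, not the vault sidecar's implementation; the dictionary names are hypothetical.

```python
# Held outside the sandbox, only reachable by the vault proxy.
VAULT = {"stub_github_a8f1": "ghp_real_secret_value"}

# What the agent process sees: only the placeholder.
sandbox_env = {"GITHUB_TOKEN": "stub_github_a8f1"}

def rewrite_outbound(headers: dict) -> dict:
    """At egress, replace any stub placeholder with the real credential."""
    return {k: VAULT.get(v, v) for k, v in headers.items()}

wire = rewrite_outbound({"Authorization": sandbox_env["GITHUB_TOKEN"]})
print(sandbox_env["GITHUB_TOKEN"])  # stub_github_a8f1 (agent's view)
print(wire["Authorization"])        # ghp_real_secret_value (on the wire only)
```

Because the swap happens on the outbound connection, anything the agent prints, logs, or feeds into an LLM context can only ever contain the stub.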
## How do I speed up session start times?
Enable the warm pool by setting `WARM_POOL_SIZE` to a value greater than zero (the default is already 2).

With the warm pool active:

- Cold start (no warm pod available): ~10–12 s, dominated by `git clone` and harness boot.
- Warm start (pod claimed from the pool): ~1.8 s end-to-end.

Warm pods are kept only for recently active agents (within `WARM_POOL_RECENT_AGENT_HOURS` hours). After the first session on a new deployment, the worker tops up the pool so the next session is near-instant.

To further reduce cold-start times, bake your agent's repository into the harness image so the `git clone` step is skipped at boot.
## What harness should I use?
Choose based on how you want to interact with the agent:
TUI harnesses stream the agent’s terminal output to your local terminal over WebSocket. API harnesses expose a JSON message API that you call from code or the web UI.
| Harness | Mode | Best for |
|---|---|---|
| `claude-code` | TUI | Interactive terminal sessions; attaches your local terminal to the agent's PTY |
| `codex` | TUI | Interactive terminal sessions with OpenAI Codex |
| `opencode` | API | Programmatic automation; send messages via the REST API |
| `claude-agent-sdk` | API | Programmatic automation using Anthropic's Agent SDK |
## Can agents access the internet?
Yes. All outbound traffic from the sandbox is routed through the vault sidecar proxy, which allows full internet access by default.

To restrict which domains an agent can reach, set `allow_out` or `deny_out` on the agent definition. `allow_out` is an allowlist: only the listed domains are reachable. `deny_out` is a denylist: the listed domains are blocked while everything else is allowed.
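The `allow_out` / `deny_out` semantics described above can be sketched as a simple predicate. This is a conceptual model only; the real proxy's matching rules (wildcards, subdomains, precedence when both fields are set) may differ.

```python
def is_allowed(domain: str, allow_out=None, deny_out=None) -> bool:
    """Allowlist admits only listed domains; denylist blocks listed ones."""
    if allow_out is not None:
        return domain in allow_out          # everything else is blocked
    if deny_out is not None:
        return domain not in deny_out       # everything else is allowed
    return True                             # default: full internet access

print(is_allowed("api.github.com"))                                  # True
print(is_allowed("evil.example", allow_out=["api.github.com"]))      # False
print(is_allowed("api.github.com", deny_out=["evil.example"]))       # True
```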
## How many concurrent sessions can I run?
The default local kind cluster configuration supports up to 100 concurrent live sessions. This limit comes from the NodePort range (30000–30099) configured by `bin/kind-up.sh`; each live session occupies one NodePort.

For higher concurrency, switch the Service type from NodePort to ClusterIP and add an ingress controller with hostname-based routing. This removes the NodePort range cap. On EKS, concurrency is bounded only by cluster node capacity.
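The cap follows directly from the port arithmetic: the range is inclusive on both ends, so it holds exactly 100 ports, one per live session.

```python
# NodePort range configured by bin/kind-up.sh; both ends inclusive.
NODEPORT_MIN, NODEPORT_MAX = 30000, 30099

max_sessions = NODEPORT_MAX - NODEPORT_MIN + 1
print(max_sessions)  # 100 -> one NodePort per live session
```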
## Where are credentials stored?
Credentials are never stored in the sandbox pod. They live only in the platform's environment (or the `litellm-env` Kubernetes Secret in production) and are injected at the network level by the vault sidecar at egress time.

The pod's environment contains only stub placeholders. The real values travel on the wire inside the vault proxy's TLS tunnel and are never written to disk, logged, or exposed to the agent process.