!!! warning "Security advisory"

    Malicious litellm versions 1.82.7 and 1.82.8 were removed from PyPI (potential API key exfiltration). Uninstall them, rotate exposed credentials, and upgrade to a safe release (e.g. 1.82.9+ per upstream). Run `pip show litellm` to verify the installed version.

# LLM Providers

LLM providers supported by AgenticX.

AgenticX routes chat and tools through `BaseLLMProvider` implementations. First-party adapters cover the major Chinese cloud APIs; everything else can go through `LiteLLMProvider` (OpenAI-compatible endpoints or LiteLLM model IDs).


## Supported providers

| Provider | Primary Python class | Vision | Notes |
| --- | --- | --- | --- |
| OpenAI | `OpenAIProvider` (`LiteLLMProvider`) | Yes (model-dependent) | Default stack; vision on GPT-4o-class and similar |
| Anthropic | `AnthropicProvider` (`LiteLLMProvider`) | Yes (model-dependent) | Use the `anthropic/` model prefix when required |
| Ollama | `OllamaProvider` (`LiteLLMProvider`) | Model-dependent | Local; typically `ollama/<model>` |
| Google Gemini | `GeminiProvider` (`LiteLLMProvider`) | Yes (model-dependent) | LiteLLM `gemini/` IDs |
| Kimi / Moonshot | `KimiProvider`, `MoonshotProvider` | No (typical chat SKUs) | Dedicated HTTP adapter; long context |
| MiniMax | `MiniMaxProvider`, `MinimaxProvider` | No for M2* chat line | `openai/` prefix applied internally; M2 family has no image/audio input |
| VolcEngine Ark | `ArkLLMProvider`, `ArkProvider`, `VolcEngineProvider` | Model-dependent | ByteDance Doubao / Ark endpoints |
| Zhipu GLM | `ZhipuProvider`, `ZhiPuProvider` | Yes on GLM-4V-class | Dedicated adapter |
| Baidu Qianfan | `QianfanProvider`, `QianFanProvider` | Model-dependent | May require `secret_key` in config |
| Alibaba Bailian / Dashscope | `BailianProvider`, `DashscopeProvider` | Model-dependent | Qwen-VL and cousins when configured |
| SiliconFlow | `LiteLLMProvider` | Model-dependent | Point `base_url` at SiliconFlow's OpenAI-compatible API |
| LiteLLM (generic) | `LiteLLMProvider` | Model-dependent | Any LiteLLM-supported backend |
| Azure OpenAI | `LiteLLMProvider` | Model-dependent | `azure/` models; `api_version` + Azure key |
| DeepSeek | `LiteLLMProvider` | Model-dependent | Via LiteLLM routing |
| Groq | `LiteLLMProvider` | Model-dependent | Via LiteLLM `groq/` models |
| Mistral | `LiteLLMProvider` | Model-dependent | Via LiteLLM `mistral/` models |
| Together AI | `LiteLLMProvider` | Model-dependent | Via LiteLLM |
| xAI | `LiteLLMProvider` | Model-dependent | Via LiteLLM |

`ProviderResolver.PROVIDER_MAP` wires the config keys `openai`, `anthropic`, `ollama`, `zhipu`, `volcengine` / `ark`, `bailian`, `qianfan`, `kimi`, and `minimax` to the classes above.


## Usage

### OpenAI

```python
from agenticx.llms import OpenAIProvider

llm = OpenAIProvider(
    model="gpt-4o",
    api_key="sk-...",  # or rely on env / config
)
resp = llm.invoke("Summarize this in one line.")
```

### Anthropic

```python
from agenticx.llms import AnthropicProvider

llm = AnthropicProvider(
    model="anthropic/claude-sonnet-4-20250514",
    api_key="sk-ant-...",
)
resp = llm.invoke([{"role": "user", "content": "Hello"}])
```

### Ollama (local)

```python
from agenticx.llms import OllamaProvider

llm = OllamaProvider(
    model="ollama/qwen2.5:7b",
    base_url="http://127.0.0.1:11434",
)
resp = llm.invoke("Ping")
```

### MiniMax

```python
from agenticx.llms import MinimaxProvider

llm = MinimaxProvider(
    model="MiniMax-M2.5",
    api_key="...",
    # base_url defaults to https://api.minimax.chat/v1
)
resp = llm.invoke("Reply with OK.")
```

### LiteLLM

```python
from agenticx.llms import LiteLLMProvider

# Example: third-party OpenAI-compatible endpoint (e.g. SiliconFlow)
llm = LiteLLMProvider(
    model="openai/Qwen/Qwen2.5-7B-Instruct",
    api_key="...",
    base_url="https://api.siliconflow.cn/v1",
)
resp = llm.invoke("Hello")
```

## Auth profile rotation

`AuthProfileManager` (`agenticx.llms.auth_profile`) rotates multiple API keys (or logical profiles) for the same provider. It persists cooldown metadata to a JSON file (atomic write via `.tmp` then replace).

- `get_current()` — picks the next usable profile (available profiles first, ordered by `last_used`; cooling profiles queued by `cooldown_until`).
- `mark_success(profile_name)` — clears error state and cooldown timestamps for that profile.
- `mark_failure(profile_name, failure_type)` — increments `error_count`, sets `failure_type`, and applies exponential backoff to `cooldown_until`.
- `classify_failure(exc)` — maps exceptions to `billing`, `auth`, `rate_limit`, or `other` using message heuristics.
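
A minimal rotation loop might look like the sketch below. The `AuthProfileManager` constructor arguments and the `name` / `api_key` attributes on the returned profile are assumptions for illustration; check `agenticx.llms.auth_profile` for the exact signature.

```python
from pathlib import Path

from agenticx.llms import OpenAIProvider
from agenticx.llms.auth_profile import AuthProfileManager

llm = OpenAIProvider(model="gpt-4o-mini")

# Hypothetical constructor call; the real signature may differ.
manager = AuthProfileManager(
    persistence_path=Path("~/.agenticx/auth_profiles.json").expanduser(),
)

profile = manager.get_current()  # next usable profile
try:
    # invoke_with_profile injects the rotated key without rebuilding the provider.
    resp = llm.invoke_with_profile("Hello", api_key=profile.api_key)
    manager.mark_success(profile.name)
except Exception as exc:
    manager.mark_failure(profile.name, manager.classify_failure(exc))
```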

Backoff (implemented in `_compute_cooldown_ms`):

| Failure bucket | Base | Cap | Multiplier per step |
| --- | --- | --- | --- |
| `billing` | 5 hours | 24 hours | `2 ** min(error_count - 1, 10)` |
| `rate_limit`, `auth`, `other`, … | 60 seconds | 1 hour | `5 ** min(error_count - 1, 3)` |
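
As a worked example, here is an illustrative re-implementation of the rule above, assuming the multiplier scales the base and the result is clamped to the cap (the actual `_compute_cooldown_ms` may differ in detail):

```python
HOUR_MS = 60 * 60 * 1000

def cooldown_ms(failure_type: str, error_count: int) -> int:
    # Mirrors the table above: base * multiplier, clamped to the cap.
    if failure_type == "billing":
        base, cap = 5 * HOUR_MS, 24 * HOUR_MS
        step = 2 ** min(error_count - 1, 10)
    else:  # rate_limit, auth, other, ...
        base, cap = 60 * 1000, HOUR_MS
        step = 5 ** min(error_count - 1, 3)
    return min(base * step, cap)

assert cooldown_ms("billing", 1) == 5 * HOUR_MS        # first failure: 5 h
assert cooldown_ms("billing", 2) == 10 * HOUR_MS       # 5 h * 2
assert cooldown_ms("billing", 4) == 24 * HOUR_MS       # 5 h * 8 = 40 h, capped
assert cooldown_ms("rate_limit", 3) == 25 * 60 * 1000  # 60 s * 25 = 25 min
```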

`BaseLLMProvider.invoke_with_profile(messages, api_key=...)` forwards to `invoke(..., api_key=api_key)` so callers can inject the rotated secret without replacing the provider instance.

!!! tip

    Pass `persistence_path=Path("~/.agenticx/auth_profiles.json").expanduser()` if cooldowns must survive process restarts.


## Failover routing

`FailoverProvider` wraps two providers: a primary and a fallback. For each of `invoke`, `ainvoke`, `stream`, `astream`, and `stream_with_tools`, it tries the primary unless the primary is in cooldown.

- `failure_threshold` (default `3`) — consecutive primary failures before entering cooldown.
- `cooldown_duration` (default `60` seconds) — primary bypass window after the threshold is hit.
- A successful primary call resets the failure counter and clears the cooldown.

```python
from agenticx.llms import FailoverProvider, OpenAIProvider, AnthropicProvider

llm = FailoverProvider(
    primary=OpenAIProvider(model="gpt-4o", api_key="..."),
    fallback=AnthropicProvider(model="anthropic/claude-sonnet-4-20250514", api_key="..."),
    failure_threshold=3,
    cooldown_duration=120.0,
)
```

## Response cache

`ResponseCache` is an in-memory store keyed by a truncated SHA-256 hash of the string prompt. Entries carry a TTL (`ttl_seconds`, default `300`) and LRU eviction (`max_entries`, default `100`). It is not wired automatically into `LiteLLMProvider`; wrap calls yourself when you want cheaper dev loops.

```python
from agenticx.llms import OpenAIProvider, ResponseCache

llm = OpenAIProvider(model="gpt-4o-mini", api_key="...")
cache = ResponseCache(ttl_seconds=300, max_entries=100)

def cached_invoke(text: str):
    # Return the cached response when present; otherwise call the
    # model and store the result for identical prompts later.
    hit = cache.get(text)
    if hit is not None:
        return hit
    out = llm.invoke(text)
    cache.put(text, out)
    return out
```

`stats()` exposes hits, misses, size, and hit rate.


## Transcript sanitizer

Before each model call, `agent_runtime` runs `_sanitize_context_messages` over the chat history. The pipeline enforces valid assistant / tool message chains so upstream APIs never see orphaned `tool` rows or dangling `tool_calls`.

Behavior (simplified):

- `tool` messages are kept only when their `tool_call_id` appears on a preceding assistant `tool_calls` list that is fully satisfied by contiguous tool responses.
- Assistant messages with `tool_calls` are kept only when every call id has a matching tool response in history; otherwise `tool_calls` are stripped and the text content is preserved.

This reduces provider 400 errors from broken tool loops after edits, retries, or partial persistence.
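
For illustration, consider a history where an assistant message announces a tool call that never got a response. Message shapes follow the OpenAI chat format; the commented outcome is a sketch of the rules above, not a captured trace:

```python
history = [
    {"role": "user", "content": "What's the weather in Paris?"},
    {
        "role": "assistant",
        "content": "Let me check.",
        # No {"role": "tool", "tool_call_id": "call_1", ...} follows, so the
        # sanitizer strips this tool_calls list but keeps "Let me check."
        "tool_calls": [{
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather", "arguments": "{}"},
        }],
    },
    # An orphaned tool row (no preceding, fully satisfied tool_calls list)
    # is dropped entirely:
    {"role": "tool", "tool_call_id": "call_0", "content": "..."},
]
```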


## Vision and image input

Multimodal content is honored when the backend and model support it. MiniMax's M2 chat family (including M2, M2.1, M2.5, M2.7, and the `*-highspeed` SKUs; excluding IDs containing `vl` / `vision`) does not accept image or audio input, per vendor constraints.

Studio strips `image_inputs` for those models before the completion request. Prefer a vision-capable model if attachments must reach the LLM.
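
A hypothetical predicate mirroring that stripping rule (the real check lives inside Studio and may differ in detail):

```python
def is_text_only_minimax_m2(model_id: str) -> bool:
    # M2-family chat SKUs reject image/audio input; vision variants
    # (IDs containing "vl" or "vision") are exempt.
    mid = model_id.lower()
    return "minimax-m2" in mid and "vl" not in mid and "vision" not in mid

assert is_text_only_minimax_m2("MiniMax-M2.5")
assert is_text_only_minimax_m2("minimax-m2.1-highspeed")
assert not is_text_only_minimax_m2("MiniMax-VL-01")
```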

!!! warning "MiniMax M2 and attachments"

    Do not assume the model sees images when using `minimax-m2*` IDs. The framework removes image payloads for that family, so treat the turn as text-only unless you switch to another model.


## Provider configuration (environment variables)

The values below are the ones AgenticX's config loader commonly pairs with `providers.<name>` in `~/.agenticx/config.yaml`. LiteLLM may read additional variables depending on the model ID.

| Variable | Used for |
| --- | --- |
| `OPENAI_API_KEY` | OpenAI |
| `OPENAI_API_BASE` | OpenAI-compatible override (optional) |
| `ANTHROPIC_API_KEY` | Anthropic |
| `ANTHROPIC_API_BASE` | Anthropic base URL override (optional) |
| `ZHIPU_API_KEY` | Zhipu |
| `ARK_API_KEY` | VolcEngine Ark (`volcengine` / `ark` provider) |
| `VOLCENGINE_ACCESS_KEY`, `VOLCENGINE_SECRET_KEY` | Alternate Ark / VolcEngine auth paths |
| `DASHSCOPE_API_KEY` | Alibaba Bailian / Dashscope |
| `QIANFAN_ACCESS_KEY` | Baidu Qianfan (`secret_key` often set in YAML) |
| `MOONSHOT_API_KEY` | Kimi / Moonshot |
| `MINIMAX_API_KEY` | MiniMax |
| `AGX_MAX_TOOL_ROUNDS` | Runtime cap on tool rounds (global) |
| `AGX_CHROMIUM_QUIET` | Desktop Chromium log noise (optional) |

Ollama is usually configured with `base_url` in YAML (for example `http://localhost:11434`); keys are not required for local inference.

!!! tip

    Resolve providers in one line with `ProviderResolver.resolve()` when you want the merged file config instead of manual constructors.
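
A sketch of that one-liner, assuming `resolve()` accepts the provider config key and that `ProviderResolver` is importable from `agenticx.llms` (both unverified here):

```python
from agenticx.llms import ProviderResolver

# Builds the provider from providers.openai in ~/.agenticx/config.yaml,
# merged with matching environment variables.
llm = ProviderResolver.resolve("openai")
resp = llm.invoke("Hello")
```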