Protocol Translation Architecture

Enterprise Gateway uses an OpenAI-compatible chat completion shape as the internal pivot DTO (openai.ChatCompletionRequest / StreamChunk).

Why pivot

  • Channel relay and policy/quota already operate on pivot requests.
  • Cross-protocol conversion stays O(N) in the number of protocols: each protocol needs only one inbound (X→pivot) and one outbound (pivot→X) converter, instead of an N×N matrix of direct converters.
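
The scaling argument can be sketched in a few lines of Go. Everything here is hypothetical and only illustrates the idea: PivotRequest is a heavily reduced stand-in for the real pivot DTO, and the two toy wire formats ("model|prompt" and "prompt@model") are invented for the example. Adding a third format would mean adding one more Codec entry, not new pairwise converters.

```go
package main

import (
	"fmt"
	"strings"
)

// PivotRequest is a hypothetical, reduced stand-in for the pivot DTO.
type PivotRequest struct {
	Model  string
	Prompt string
}

// Codec bundles the two converters each protocol has to provide.
type Codec struct {
	Parse  func(wire string) (PivotRequest, error) // inbound: X → pivot
	Encode func(p PivotRequest) string             // outbound: pivot → X
}

// splitTwo splits a toy "left<sep>right" payload into its two halves.
func splitTwo(wire, sep string) (string, string, error) {
	left, right, ok := strings.Cut(wire, sep)
	if !ok {
		return "", "", fmt.Errorf("malformed wire payload %q", wire)
	}
	return left, right, nil
}

// codecs holds one entry per protocol; both formats are invented for this sketch.
var codecs = map[string]Codec{
	"pipe": { // toy format "model|prompt"
		Parse: func(w string) (PivotRequest, error) {
			m, p, err := splitTwo(w, "|")
			return PivotRequest{Model: m, Prompt: p}, err
		},
		Encode: func(p PivotRequest) string { return p.Model + "|" + p.Prompt },
	},
	"at": { // toy format "prompt@model"
		Parse: func(w string) (PivotRequest, error) {
			p, m, err := splitTwo(w, "@")
			return PivotRequest{Model: m, Prompt: p}, err
		},
		Encode: func(p PivotRequest) string { return p.Prompt + "@" + p.Model },
	},
}

// Translate converts between any two registered protocols by going
// through the pivot, so no direct from→to converter is ever written.
func Translate(from, to, wire string) (string, error) {
	src, ok := codecs[from]
	if !ok {
		return "", fmt.Errorf("unknown source protocol %q", from)
	}
	dst, ok := codecs[to]
	if !ok {
		return "", fmt.Errorf("unknown target protocol %q", to)
	}
	p, err := src.Parse(wire)
	if err != nil {
		return "", err
	}
	return dst.Encode(p), nil
}
```

For example, Translate("pipe", "at", "gpt-5|hello") goes through Parse for "pipe" and Encode for "at", and neither codec knows the other exists.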

Request path

Client wire format
  → inbound.Parse*()
  → transform.ResolveModel() (reasoning effort / thinking budget)
  → policy + quota
  → relay.Executor → adaptor (upstream wire)
  → pivot stream chunks
  → outbound encoder (client wire)
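
As a rough illustration of how these stages might compose, the sketch below wires them together as plain function values. It is not the gateway's actual code: the Pipeline type, its field names, ErrQuotaExceeded, and the reduced PivotRequest / PivotChunk types are all assumptions made for the example, and the stream is simplified to a slice where a real relay would presumably stream chunks incrementally.

```go
package main

import "errors"

// ErrQuotaExceeded is a hypothetical sentinel for a policy/quota rejection.
var ErrQuotaExceeded = errors.New("quota exceeded")

// PivotRequest and PivotChunk are reduced, hypothetical stand-ins for the pivot DTOs.
type PivotRequest struct {
	Model string
	Input string
}

type PivotChunk struct{ Delta string }

// Pipeline lists the stages in the order the request path runs them.
type Pipeline struct {
	Parse        func(wire string) (PivotRequest, error)      // inbound: client wire → pivot
	ResolveModel func(req PivotRequest) PivotRequest          // suffix rules, thinking modes
	Authorize    func(req PivotRequest) error                 // policy + quota
	Execute      func(req PivotRequest) ([]PivotChunk, error) // relay → adaptor → pivot chunks
	Encode       func(c PivotChunk) string                    // outbound: pivot → client wire
}

// Handle runs one request through every stage and stops at the first error,
// so a policy or quota rejection never reaches the upstream provider.
func (p Pipeline) Handle(wire string) ([]string, error) {
	req, err := p.Parse(wire)
	if err != nil {
		return nil, err
	}
	req = p.ResolveModel(req)
	if err := p.Authorize(req); err != nil {
		return nil, err
	}
	chunks, err := p.Execute(req)
	if err != nil {
		return nil, err
	}
	out := make([]string, 0, len(chunks))
	for _, c := range chunks {
		out = append(out, p.Encode(c))
	}
	return out, nil
}
```

Keeping policy and quota between model resolution and the relay means they see the resolved pivot request regardless of which client protocol it arrived in.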

Packages

Package             Role
internal/inbound    Claude / Gemini / Responses → pivot
internal/outbound   pivot → client SSE/JSON
internal/transform  Model suffix rules, tools mapping, thinking modes
internal/adaptor    pivot ↔ upstream provider APIs

Reasoning effort suffixes

Examples: gpt-5-high, o3-mini-low, claude-3-7-sonnet-thinking, gemini-2.5-flash-thinking-128.

Rules live in transform/reasoning_effort.go and inject reasoning_effort or thinkingBudget into pivot/upstream payloads.
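
A minimal sketch of what such suffix rules could look like is shown below. The rules are inferred only from the four example names above and may not match transform/reasoning_effort.go: the Resolved type, the ResolveModel signature, and the "medium" effort level are assumptions, and the unit of the numeric budget is not stated in the source.

```go
package main

import (
	"strconv"
	"strings"
)

// Resolved is a hypothetical result type: the upstream model name plus
// whatever reasoning settings were encoded in the requested name's suffix.
type Resolved struct {
	Model           string
	ReasoningEffort string // "" when no effort suffix was present
	Thinking        bool
	ThinkingBudget  int // 0 when no explicit budget was given
}

// ResolveModel strips a recognized suffix and reports the setting it implies.
// Names without a recognized suffix are returned unchanged.
func ResolveModel(name string) Resolved {
	const marker = "-thinking-"

	// "<base>-thinking-<N>": thinking mode with an explicit budget.
	if i := strings.LastIndex(name, marker); i > 0 {
		if n, err := strconv.Atoi(name[i+len(marker):]); err == nil && n > 0 {
			return Resolved{Model: name[:i], Thinking: true, ThinkingBudget: n}
		}
	}

	// "<base>-thinking": thinking mode with a provider default budget.
	if base := strings.TrimSuffix(name, "-thinking"); base != name && base != "" {
		return Resolved{Model: base, Thinking: true}
	}

	// "<base>-low|medium|high": reasoning effort ("medium" is assumed).
	for _, effort := range []string{"low", "medium", "high"} {
		if base := strings.TrimSuffix(name, "-"+effort); base != name && base != "" {
			return Resolved{Model: base, ReasoningEffort: effort}
		}
	}

	return Resolved{Model: name}
}
```

Under these assumed rules, gemini-2.5-flash-thinking-128 resolves to model gemini-2.5-flash with a budget of 128, and o3-mini-low resolves to o3-mini with effort "low". The budget form is checked first so that its numeric tail is not mistaken for part of the model name.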

Non-goals (this phase)

  • Realtime WebSocket
  • Full Responses previous_response_id chain
  • Gemini→OpenAI function calling parity

Made-with: Damon Li