LLM gateway
An LLM gateway is a layer that sits between your application and model providers to handle routing, caching, fallbacks, observability, and cost control.
An LLM gateway (or AI gateway) centralizes calls to one or more model providers. Instead of calling provider APIs directly, services call the gateway, which handles cross-cutting concerns: routing, caching, retries and fallbacks, rate limiting, logging, and cost tracking.
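To make the caching, retry, and fallback behavior concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any particular gateway product works: the `Gateway` class, the `Provider` callable type, and the two stand-in providers (`flaky_primary`, `stable_backup`) are all hypothetical names, and a real deployment would call provider SDKs over the network instead of local functions.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

# Hypothetical provider type: takes a prompt, returns completion text.
Provider = Callable[[str], str]


class Gateway:
    """Tries providers in order, retrying each, and caches successful results."""

    def __init__(self, providers: List[Tuple[str, Provider]], max_retries: int = 1):
        self.providers = providers          # ordered: first entry is preferred
        self.max_retries = max_retries      # extra attempts per provider
        self.cache: Dict[str, str] = {}     # prompt hash -> completion
        self.log: List[Tuple[str, str]] = []  # (source, outcome) for observability

    def complete(self, prompt: str) -> str:
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key in self.cache:
            self.log.append(("cache", "hit"))
            return self.cache[key]
        for name, call in self.providers:
            for _ in range(1 + self.max_retries):
                try:
                    result = call(prompt)
                except Exception:
                    # Record the failure, then retry or fall through to the next provider.
                    self.log.append((name, "error"))
                    continue
                self.log.append((name, "ok"))
                self.cache[key] = result
                return result
        raise RuntimeError("all providers failed")


# Stand-in providers (assumptions for the demo, not real APIs).
def flaky_primary(prompt: str) -> str:
    raise TimeoutError("simulated outage")


def stable_backup(prompt: str) -> str:
    return "echo: " + prompt


gateway = Gateway([("primary", flaky_primary), ("backup", stable_backup)])
first = gateway.complete("hello")   # primary fails twice, backup answers
second = gateway.complete("hello")  # served from the cache
```

Calling services only see `complete`; which provider answered, how many attempts it took, and whether the cache was used stay inside the gateway and show up in `gateway.log`.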
A gateway gives you a single place to change models, enforce policy, and see spend by route. That central view keeps cost and quality manageable as usage grows.
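One way to picture "a single place to change models" and "spend by route" is a route-to-model table plus a per-route cost counter. The sketch below is hypothetical throughout: the route names, model names, and per-token prices are made up for illustration and do not reflect any real provider's pricing.

```python
from collections import defaultdict
from typing import Dict

# Hypothetical prices in USD per 1,000 tokens; not real provider pricing.
PRICE_PER_1K: Dict[str, float] = {"small-model": 0.5, "large-model": 5.0}

# Which model each application route uses. Changing a model is one edit here.
ROUTE_MODEL: Dict[str, str] = {
    "/search": "small-model",
    "/summarize": "large-model",
}


class SpendTracker:
    """Accumulates estimated cost per route from token counts."""

    def __init__(self) -> None:
        self.spend_by_route: Dict[str, float] = defaultdict(float)

    def record(self, route: str, tokens: int) -> float:
        model = ROUTE_MODEL[route]
        cost = tokens / 1000 * PRICE_PER_1K[model]
        self.spend_by_route[route] += cost
        return cost


tracker = SpendTracker()
cost_a = tracker.record("/search", 2000)     # small-model
cost_b = tracker.record("/summarize", 500)   # large-model

# Switch /search to the larger model in one place; callers are unchanged.
ROUTE_MODEL["/search"] = "large-model"
cost_c = tracker.record("/search", 1000)
```

Because every request passes through the same table and counter, moving a route to a different model is a one-line change, and its effect on spend is visible per route in `tracker.spend_by_route`.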
Gateways can be self-hosted (for data residency and control) or managed; the right choice depends on your data and governance needs.