Hybrid Routing & Cost Optimization
Muxa can steer each request to different providers without requiring any IDE changes. Set MUXA_ROUTING_STRATEGY=hybrid and point local tools at http://localhost:8081. The proxy inspects every request, computes a complexity score, and decides whether to stay on the primary provider or pivot to the fallback.
Configuration recap
MUXA_ROUTING_STRATEGY=hybrid # or single
MUXA_PRIMARY_PROVIDER=openrouter # cheap/fast model
MUXA_FALLBACK_PROVIDER=anthropic # premium model
Hybrid mode requires both providers to be defined. The fallback must be a cloud provider (Ollama cannot be the fallback).
How complexity is calculated
Two request builders feed into the router:
| Path | Score formula |
|---|---|
/v1/responses (OpenAI wire API) |
score = round(promptLength / 400) + (requiresTools ? +2 : 0) |
/v1/messages (Anthropic wire API) |
score = messageCount + (toolUseCount * 2) + round(textLength / 500) |
promptLength/textLengthis the total character count of assistant + user content.requiresTools/toolUseCountis detected by scanningtool_use/function_callblocks.
The router marks a request as “complex” when score > 3 or when the request explicitly requires tools. Complex requests automatically route to the fallback provider. Simple requests stay on the primary provider.
Example scenarios
| Scenario | Details | Computed score | Route |
|---|---|---|---|
| Quick prompt | messages=[{user: "hello"}] |
round(5/400)=0 |
Primary |
| Medium edit | 2 messages, ~800 chars total | round(800/400)=2 |
Primary |
| Tool execution | Assistant contains tool_use |
round(len/400)+2 ≥ 3 |
Fallback |
| Large spec | 10 messages, ~2500 chars | round(2500/400)=6 |
Fallback |
Observability
- Every response includes routing headers:
x-muxa-provider,x-muxa-route,x-muxa-routing-score. /routing/statsshows the last 25 routing decisions.- Structured logs include
route.providerandroute.usedFallback.
Tuning strategies
The current thresholds are intentionally conservative (score > 3). To bias toward the fallback sooner, shorten prompts or add tool metadata so the score rises faster. To bias toward the primary, ensure requests stay under a few kilobytes and avoid unnecessary tool requests.
If you need a different policy, adjust computeComplexityScore in src/routing/index.js or wrap executeWithRouting with your own scoring logic—the proxy architecture keeps all changes server-side, so IDEs don't need new configuration.
Checklist for reliable hybrid routing
- Define both providers in
.envand setMUXA_ROUTING_STRATEGY=hybrid. - Run
npm startand verify/routing/statsshows entries for both providers. - Trigger a tool-heavy action (e.g.,
cat README.md) and confirmx-muxa-route=fallback. - Monitor costs with
/api/tokens/statsand/metricsto ensure savings match expectations.