engage-prod · Engage Inc. · Console
Billing cycle · 2 days remaining · $32.2K / $50.0K

Models & inference

Registry, deployment topology, and inference performance for every deployed and staged model.

Models in production: 5 (+2 in staging · 1 training)

Requests · 24h: 1.8M (+8.4% DoD)

Avg p50 latency: 418ms (−22ms vs 7d avg)

Avg p99 latency: 1.84s (+86ms vs 7d avg · within SLO)

All deployed and staged models

| Model | Base / host | Kind | Status | Throughput | p50 / p99 | Ctx |
|---|---|---|---|---|---|---|
| Philotic-1 v2.4.1 | self-hosted · 72B | Orchestrator | production | 86.4M tok/hr | 412ms / 1840ms | 128K |
| Claude Sonnet v4.6 | Anthropic · API | Reasoning | production | 42.0M tok/hr | 612ms / 2480ms | 200K |
| Cali-OEM v3.0 | Phil-1 ⊃ Caliber OEM corpus | Fine-tune | production | 31.2M tok/hr | 380ms / 1420ms | 64K |
| Cali-Insurance v2.1 | Phil-1 ⊃ insurance workflow | Fine-tune | production | 18.4M tok/hr | 396ms / 1560ms | 64K |
| Embed-Sovereign v1.4 | self-hosted · 768d | Embedding | production | 6.2M tok/hr | 38ms / 142ms | 8K |
| Llama v3.3 · 70B | open · self-hosted | Reasoning | staging | 2.4M tok/hr | 540ms / 1980ms | 128K |
| Banking-Arabic v1.0-rc | Phil-1 ⊃ MENA banking | Training | training | – | – | 64K |

Request latency histogram across all production models

p50 418ms · p95 1240ms · p99 1840ms

| Bucket | Requests |
|---|---|
| <200ms | 4.2K |
| 200–400ms | 12.5K |
| 400–600ms | 18.2K |
| 600–800ms | 14.3K |
| 800–1000ms | 9.6K |
| 1000–1500ms | 6.1K |
| 1500–2000ms | 2.8K |
| 2000–3000ms | 980 |
| 3000ms+ | 240 |
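The headline percentiles are presumably computed from raw request samples; from coarse buckets like these you can only interpolate an estimate, and here the estimate lands well above the exact values (the 18.2K-request 400–600ms bucket dominates the median). A minimal sketch, assuming linear interpolation within each bucket and an arbitrary 4000ms upper edge for the open-ended top bucket:

```python
# Estimate latency percentiles from coarse histogram buckets.
# Edges and counts are taken from the dashboard; the 4000ms upper
# edge for the open "3000ms+" bucket is an assumption.
BUCKETS = [  # (lo_ms, hi_ms, request_count)
    (0, 200, 4_200), (200, 400, 12_500), (400, 600, 18_200),
    (600, 800, 14_300), (800, 1_000, 9_600), (1_000, 1_500, 6_100),
    (1_500, 2_000, 2_800), (2_000, 3_000, 980), (3_000, 4_000, 240),
]

def percentile_from_buckets(p: float) -> float:
    """Linearly interpolate the p-th percentile (0-100) inside its bucket."""
    total = sum(n for _, _, n in BUCKETS)
    target = total * p / 100
    seen = 0
    for lo, hi, n in BUCKETS:
        if seen + n >= target:
            return lo + (hi - lo) * (target - seen) / n
        seen += n
    return BUCKETS[-1][1]

for p in (50, 95, 99):
    # Bucket interpolation overestimates vs the exact p50 418ms above.
    print(f"p{p} ≈ {percentile_from_buckets(p):.0f}ms")
```

The gap between the interpolated p50 (~595ms) and the exact 418ms shown is the usual cost of 200ms-wide buckets: the estimator has no idea where requests cluster inside a bucket.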

Active traffic split

- Philotic-1 v2.4.1 · 68%
- Cali-OEM v3.0 · 18%
- Cali-Insurance v2.1 · 9%
- Claude Sonnet 4.6 · 5%

Failover ready: Llama 3.3 70B

Last canary: 2026-04-26 · 0 regressions
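A traffic split like the one above amounts to weighted random routing. The model names and weights below come from the dashboard; the `route` helper and its use of `random.choices` are illustrative assumptions, not the gateway's actual implementation:

```python
import random

# Production traffic split from the dashboard (weights sum to 100).
SPLIT = {
    "Philotic-1 v2.4.1": 68,
    "Cali-OEM v3.0": 18,
    "Cali-Insurance v2.1": 9,
    "Claude Sonnet 4.6": 5,
}

def route(rng: random.Random) -> str:
    """Pick a model for one request, proportional to its split weight."""
    return rng.choices(list(SPLIT), weights=list(SPLIT.values()), k=1)[0]

# Over many requests the empirical split converges on the configured one.
rng = random.Random(0)
counts = {m: 0 for m in SPLIT}
for _ in range(100_000):
    counts[route(rng)] += 1
```

Real gateways typically add stickiness (hash the session key rather than rolling fresh randomness per request) so a conversation stays pinned to one model mid-canary.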

[Chart: Inference tokens per day, Apr 5 – Apr 26 · y-axis 0–100.0M tokens/day]

p50 latency against $/MTok · shape encodes model kind, hover for detail

[Scatter chart: x-axis $/MTok ($0.00–$3.50), y-axis p50 latency (0–600ms)]
| Model | Load (rps / cap) | Error rate |
|---|---|---|
| Philotic-1 v2.4.1 | 184 / 250 | 0.04% |
| Claude Sonnet v4.6 | 96 / 250 | 0.08% |
| Cali-OEM v3.0 | 142 / 250 | 0.10% |
| Cali-Insurance v2.1 | 78 / 250 | 0.11% |
| Embed-Sovereign v1.4 | 1240 / 250 | 0.01% |
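A simple utilization check over the load figures above would flag Embed-Sovereign, whose 1240 rps against a 250 rps cap reads as nearly 5x over; whether that cap actually applies to embedding traffic is unclear from the dashboard. A minimal sketch, where the 0.8 warning threshold and the `utilization_report` helper are arbitrary assumptions:

```python
# Per-model load vs capacity, taken from the dashboard cards.
LOAD = {  # model: (current_rps, rps_cap)
    "Philotic-1 v2.4.1": (184, 250),
    "Claude Sonnet v4.6": (96, 250),
    "Cali-OEM v3.0": (142, 250),
    "Cali-Insurance v2.1": (78, 250),
    "Embed-Sovereign v1.4": (1240, 250),
}

def utilization_report(warn_at: float = 0.8) -> dict[str, float]:
    """Return each model's utilization ratio; print a line for hot ones."""
    report = {}
    for model, (rps, cap) in LOAD.items():
        ratio = rps / cap
        report[model] = ratio
        if ratio >= warn_at:
            print(f"WARN {model}: {rps}/{cap} rps ({ratio:.0%})")
    return report
```

Run as-is this warns on Embed-Sovereign only; the other four production models sit between 31% and 74% of their caps.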