Multi-Level LLM + SLM Routing
The 8-factor decision engine. Every turn is scored across privacy, complexity, domain match, urgency, cost, reasoning depth, context, and clarity. PHI-bearing turns route to the on-prem SLM; complex reasoning routes to a frontier LLM. Same engine, healthcare vocabulary.
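The scoring described above can be sketched as a small weighted-factor router. The weights, the escalation threshold, and the hard PHI override below are illustrative assumptions for demonstration, not HCSC's production values.

```python
# Illustrative sketch of the 8-factor routing decision.
# Factor names come from the description above; the weights, threshold,
# and hard PHI override are assumptions, not production values.

FACTORS = ["privacy", "complexity", "domain_match", "urgency",
           "cost", "reasoning_depth", "context", "clarity"]

def route(scores: dict) -> str:
    """Return 'slm' or 'llm' for one scored turn (each factor in 0.0-1.0)."""
    # Hard rule: any PHI keeps the turn on-prem, regardless of other factors.
    if scores["privacy"] >= 0.5:
        return "slm"
    # Otherwise escalate when weighted need for reasoning depth and
    # complexity outweighs the SLM's domain fit and cost advantage.
    weights = {"complexity": 0.3, "reasoning_depth": 0.3, "domain_match": -0.2,
               "urgency": 0.05, "cost": -0.1, "context": 0.1,
               "privacy": 0.0, "clarity": -0.05}
    escalation = sum(weights[f] * scores[f] for f in FACTORS)
    return "llm" if escalation > 0.15 else "slm"

# A PHI-bearing eligibility lookup stays local; a complex clinical
# reasoning turn with no PHI escalates to the frontier LLM.
phi_turn = {f: 0.2 for f in FACTORS} | {"privacy": 0.9}
complex_turn = {f: 0.2 for f in FACTORS} | {"privacy": 0.0,
                                            "complexity": 0.9,
                                            "reasoning_depth": 0.9}
```

The hard privacy override mirrors the policy stated above: PHI never competes with the other seven factors, it simply pins the turn to the on-prem SLM.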
Bounteous Fine-Tuned HCSC SLM
A small language model (Mistral-7B) fine-tuned via LoRA on HCSC-shaped policy content — Summary Plan Descriptions, Evidence of Coverage, provider manuals, denial-code references (CARC / RARC), prior-auth criteria, and appeals procedures. Quantised to GGUF, it runs on-prem via llama.cpp, so PHI never leaves the HCSC environment.
Best at: eligibility lookups · benefit / copay / deductible questions · denial-code translation · routine claim status · prior-auth status · appeals-process explanations · plain-language re-framing of policy text. For multi-step clinical reasoning, peer-to-peer prep, or open-ended interpretation, the router escalates to a frontier LLM after PHI redaction.
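The PHI-redaction step that precedes escalation could look like the minimal sketch below. The patterns (SSN, date of birth, a hypothetical member-ID format) are illustrative assumptions; a production redactor would use a vetted PHI detection model rather than a handful of regexes.

```python
import re

# Illustrative redaction pass applied before a turn leaves the HCSC
# environment for the frontier LLM. Pattern set and member-ID format
# are assumptions for demonstration, not the production pipeline.
PHI_PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b(?:DOB[:\s]*)?\d{2}/\d{2}/\d{4}\b"), "[DOB]"),
    (re.compile(r"\b[A-Z]{3}\d{9}\b"), "[MEMBER_ID]"),  # hypothetical format
]

def redact(text: str) -> str:
    """Replace recognised PHI spans with typed placeholder tokens."""
    for pattern, token in PHI_PATTERNS:
        text = pattern.sub(token, text)
    return text
```

Typed placeholders (rather than blanking) let the frontier LLM still reason about *what kind* of detail was removed, which keeps escalated answers coherent.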
In this prototype the SLM responses are pre-canned to keep the demo fully static. The same fine-tuning pipeline produces a live local inference service for the production phase.