LOKI
Desktop-Native Voice Assistant with Hybrid NLU
| Metric | Value |
|---|---|
| NER F1-Score | 99.77% |
| Precision | 99.77% |
| Samples | 1389 |
Overview
LOKI is a privacy-first, offline-capable desktop voice assistant that outperforms cloud-dependent alternatives in latency and system integration. It features a novel hybrid NLU engine that combines a fast embedding-based classifier for common commands with a local LLM fallback for complex intent understanding.
Architecture
┌─────────────────────────────────────────────────────────────┐
│                  LOKI Hybrid NLU Pipeline                   │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────┐                                            │
│  │ Voice Input │                                            │
│  └──────┬──────┘                                            │
│         │ (Faster-Whisper)                                  │
│         ▼                                                   │
│  ┌─────────────┐   Confidence > 0.6?      ┌────────────────┐│
│  │ Transcript  │ ───────────┬──────────▶  │ FastClassifier ││
│  └─────────────┘            │             │  (Embeddings)  ││
│                             │             └───────┬────────┘│
│                             │                     │         │
│                             ▼                     │         │
│                    ┌────────────────┐             │         │
│                    │  LLM Fallback  │             │         │
│                    │  (Ollama/Phi)  │             │         │
│                    └────────┬───────┘             │         │
│                             │                     │         │
│                             ▼                     ▼         │
│                ┌────────────────────────────────────┐       │
│                │     Intent & Parameter Merger      │       │
│                └────────────────────────────────────┘       │
└─────────────────────────────────────────────────────────────┘
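The routing step in the pipeline can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not LOKI's actual code: the 0.6 threshold comes from the diagram, while `NLUResult`, `route`, the toy classifiers, and the merge policy (LLM intent wins, fast-path parameters are kept unless overridden) are hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

CONFIDENCE_THRESHOLD = 0.6  # threshold shown in the pipeline diagram


@dataclass
class NLUResult:
    """Hypothetical container for one classification result."""
    intent: str
    confidence: float
    params: Dict[str, str] = field(default_factory=dict)
    source: str = "fast"  # which path produced the intent


def route(
    transcript: str,
    fast_classifier: Callable[[str], NLUResult],
    llm_fallback: Callable[[str], NLUResult],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> NLUResult:
    """Try the fast path first; call the LLM only on low confidence."""
    fast = fast_classifier(transcript)
    if fast.confidence > threshold:
        return fast
    slow = llm_fallback(transcript)
    # Assumed merge policy: the LLM's intent wins, and parameters found
    # by the fast path are kept unless the LLM overrides them.
    merged = {**fast.params, **slow.params}
    return NLUResult(slow.intent, slow.confidence, merged, source="llm")


# Toy stand-ins for the embedding classifier and the local LLM.
def toy_fast(text: str) -> NLUResult:
    if "volume" in text.lower():
        return NLUResult("set_volume", 0.92, {"level": "50"})
    return NLUResult("unknown", 0.20)


def toy_llm(text: str) -> NLUResult:
    return NLUResult("general_query", 0.75, {"query": text}, source="llm")


if __name__ == "__main__":
    print(route("set volume to 50", toy_fast, toy_llm))
    print(route("what should I cook tonight", toy_fast, toy_llm))
```

In this sketch the slow path is only ever invoked when the fast classifier is unsure, which is what keeps common commands on the low-latency path.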
Technical Decisions
| Decision | Trade-off | Outcome |
|---|---|---|
| Hybrid NLU Architecture | Complexity vs Latency/Accuracy | Used `FastClassifier` (embeddings) for common commands (<60 ms) and `LLMClassifier` (Ollama) only for complex queries, balancing speed and flexibility. |
| Synthetic Data Generation | Realism vs Training Volume | Generated 1389 labeled sentences to train the CRF model, achieving 99.77% F1-score on parameter extraction without expensive manual labeling. |
| Agent-Based Dispatch | Monolithic vs Modular | Decoupled NLU from execution. New capabilities (e.g., Volume Control, Calculator) can be added as independent agents without retraining the core model. |
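
The agent-based dispatch row can be illustrated with a small registry pattern. The sketch below is an assumption about how such decoupling might look, not the project's real interface: `register`, `dispatch`, `AGENTS`, and both example agents are hypothetical names, and the agents only return strings rather than touching the system.

```python
import operator
from typing import Callable, Dict

Agent = Callable[[Dict[str, str]], str]

# Hypothetical registry mapping an intent name to the agent that handles it.
AGENTS: Dict[str, Agent] = {}


def register(intent: str) -> Callable[[Agent], Agent]:
    """Decorator that adds an agent to the registry under an intent name."""
    def decorator(fn: Agent) -> Agent:
        AGENTS[intent] = fn
        return fn
    return decorator


@register("set_volume")
def volume_agent(params: Dict[str, str]) -> str:
    # Clamp to 0..100; a real agent would call an OS audio API here.
    level = max(0, min(100, int(params.get("level", "50"))))
    return f"Volume set to {level}%"


OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


@register("calculate")
def calculator_agent(params: Dict[str, str]) -> str:
    a, b, op = float(params["a"]), float(params["b"]), params["op"]
    if op not in OPS:
        return f"Unsupported operator '{op}'"
    if op == "/" and b == 0:
        return "Cannot divide by zero"
    return f"Result: {OPS[op](a, b):g}"


def dispatch(intent: str, params: Dict[str, str]) -> str:
    """Look up the agent for an intent; the NLU layer never imports agents."""
    agent = AGENTS.get(intent)
    if agent is None:
        return f"No agent registered for intent '{intent}'"
    return agent(params)


if __name__ == "__main__":
    print(dispatch("set_volume", {"level": "130"}))
    print(dispatch("calculate", {"a": "6", "op": "*", "b": "7"}))
```

Adding a capability in this sketch means writing one decorated function; the dispatcher and the NLU output format stay unchanged, which mirrors the "no retraining of the core model" outcome in the table.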