Architecture Overview
QuantumVerifi is designed as a set of stateless services backed by managed datastores. This architecture supports horizontal scaling, zero-downtime deployments, and multi-region operation.
Component Diagram
            ┌──────────────────────┐
            │    Load Balancer     │
            │   (Ingress / ALB)    │
            └──────┬───────┬───────┘
                   │       │
         ┌─────────┘       └───────────┐
         ▼                             ▼
┌─────────────────┐           ┌─────────────────┐
│  Web Frontend   │           │   API Server    │
│    (Next.js)    │           │   (Go / Gin)    │
│    Port 3000    │           │    Port 8080    │
└─────────────────┘           └────────┬────────┘
                                       │
                      ┌────────────────┼────────────────┐
                      ▼                ▼                ▼
              ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
              │    Worker    │ │  PostgreSQL  │ │    Redis     │
              │  (Temporal)  │ │              │ │              │
              └───────┬──────┘ └──────────────┘ └──────────────┘
                      │
       ┌──────────────┼──────────────┐
       ▼              ▼              ▼
┌──────────────┐ ┌──────────┐ ┌──────────────┐
│ LLM Gateway  │ │ Sandbox  │ │    Object    │
│  (LiteLLM)   │ │   Pods   │ │   Storage    │
└──────────────┘ └──────────┘ └──────────────┘
Service Details
API Server
The API server handles all HTTP requests — analysis creation, result retrieval, billing, authentication, and SSE progress streaming.
- Stateless — all state is in PostgreSQL and Redis
- Horizontally scalable — run 2+ replicas behind a load balancer
- Authentication — JWT-based with configurable identity provider
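The statelessness property can be sketched in plain `net/http` (rather than Gin, to keep the example dependency-free): every request is authenticated from its bearer token alone, so any replica can serve it. Real JWT verification (signature, expiry, issuer) would use a JWT library and the configured identity provider's keys; the header parsing below is only illustrative.

```go
// Illustrative sketch: a stateless auth middleware. No per-request state
// lives in the process, so replicas are interchangeable behind the LB.
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// bearerToken extracts the token from an "Authorization: Bearer ..." header
// value. Returns "" if the header is missing or malformed.
func bearerToken(header string) string {
	const prefix = "Bearer "
	if !strings.HasPrefix(header, prefix) {
		return ""
	}
	return strings.TrimPrefix(header, prefix)
}

// requireAuth wraps a handler, rejecting requests without a bearer token.
// Real code would verify the JWT's signature and claims here.
func requireAuth(next http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if bearerToken(r.Header.Get("Authorization")) == "" {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next(w, r)
	}
}

func main() {
	fmt.Println(bearerToken("Bearer abc123")) // abc123
	fmt.Println(bearerToken("") == "")        // true
}
```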
Worker
Workers execute analysis workflows using durable orchestration. Each analysis is a multi-step workflow that survives infrastructure restarts.
- Auto-scaled — scale based on queue depth
- Heartbeat monitoring — long-running analyses send periodic heartbeats
- Retry logic — failed steps are retried with exponential backoff
Web Frontend
The Next.js frontend provides the dashboard, analysis viewer, settings, and billing UI.
- Server-side rendering — fast initial page loads
- Real-time updates — SSE streaming for analysis progress
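The progress stream uses the standard Server-Sent Events wire format: an optional `event:` name, a `data:` line, and a blank-line terminator, flushed after each message. The sketch below shows the framing only; the event name and JSON payload shape are illustrative, not the product's actual schema.

```go
// Minimal sketch of SSE framing as consumed by the frontend's EventSource.
package main

import "fmt"

// sseEvent renders one Server-Sent Events frame: event name, data line,
// and the blank line that terminates the frame.
func sseEvent(name, data string) string {
	return fmt.Sprintf("event: %s\ndata: %s\n\n", name, data)
}

func main() {
	fmt.Print(sseEvent("progress", `{"step":"generating","pct":40}`))
}
```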
LLM Gateway
The gateway routes LLM requests to configured providers with automatic failover:
- Primary — Azure OpenAI (GPT-4.1 family)
- Secondary — Anthropic (Claude Sonnet / Haiku)
- Tertiary — Ollama (self-hosted, for air-gapped deployments)
Rate limiting, response caching, and cost tracking are handled at this layer.
Sandbox Execution
Tests run in ephemeral containers with per-language runtimes:
| Runtime | Languages | Pre-installed tools |
|---|---|---|
| Node.js 20 | JavaScript, TypeScript | Jest, Vitest, Playwright |
| Python 3.12 | Python | Pytest, coverage |
| Go 1.25 | Go | go test |
| Java 21 | Java, Kotlin | Maven, JUnit |
| Rust 1.83 | Rust | cargo test |
| Universal | Multi-language | Node + Python + Go |
Sandboxes are created on demand, run for the duration of the test, and are destroyed immediately after — no persistent state, no shared resources.
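Runtime selection from the table above might look like the sketch below, falling back to the Universal image for anything not listed. The image names are hypothetical placeholders, not the product's actual registry paths.

```go
// Hypothetical mapping from detected language to sandbox runtime image.
package main

import "fmt"

// runtimeImage picks a per-language runtime; unknown languages get the
// multi-language Universal image.
func runtimeImage(lang string) string {
	images := map[string]string{
		"javascript": "sandbox/node:20",
		"typescript": "sandbox/node:20",
		"python":     "sandbox/python:3.12",
		"go":         "sandbox/go:1.25",
		"java":       "sandbox/java:21",
		"kotlin":     "sandbox/java:21",
		"rust":       "sandbox/rust:1.83",
	}
	if img, ok := images[lang]; ok {
		return img
	}
	return "sandbox/universal"
}

func main() {
	fmt.Println(runtimeImage("typescript")) // sandbox/node:20
	fmt.Println(runtimeImage("elixir"))     // sandbox/universal
}
```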
Data Flow
1. User submits an analysis via the API or dashboard
2. API server creates a workflow and enqueues it
3. Worker picks up the workflow and clones the repo
4. Worker calls the LLM Gateway for test generation
5. Generated tests are sent to a sandbox for execution
6. Results are stored in PostgreSQL + Object Storage
7. Evidence chain events are hashed and linked
8. SSE events stream progress back to the frontend
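The evidence-chain step can be sketched as a SHA-256 hash chain: each event's hash covers its payload plus the previous event's hash, so tampering with any earlier event changes every later link. The event encoding below is a simplified assumption.

```go
// Sketch of hash-linked evidence events under an assumed string encoding.
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// chainHash links an event payload to the previous event's hash.
func chainHash(prevHash, payload string) string {
	sum := sha256.Sum256([]byte(prevHash + payload))
	return hex.EncodeToString(sum[:])
}

func main() {
	h1 := chainHash("", "analysis.created")
	h2 := chainHash(h1, "tests.generated")
	h3 := chainHash(h2, "tests.passed")
	// Recomputing from the same events reproduces h3; altering any earlier
	// payload changes every subsequent hash.
	fmt.Println(len(h3)) // 64
}
```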
Scaling Guidelines
| Component | Scaling trigger | Recommended |
|---|---|---|
| API Server | Request rate | 2-4 replicas |
| Worker | Queue depth | 2-8 replicas (auto-scaled) |
| Web Frontend | Traffic | 2-3 replicas |
| PostgreSQL | Connection count | Managed service with read replicas |
| Redis | Memory usage | Managed service |
| Object Storage | Storage volume | Managed S3-compatible service |
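The worker row's queue-depth trigger could reduce to a sizing rule like the one below: replicas grow with queue depth and are clamped to the recommended 2-8 range. The per-replica throughput figure is an illustrative assumption, not a measured number.

```go
// Hedged sketch of a queue-depth autoscaling rule for the worker pool.
package main

import "fmt"

// desiredWorkers sizes the pool assuming each replica can drain perReplica
// queued analyses, clamped to [minReplicas, maxReplicas].
func desiredWorkers(queueDepth, perReplica, minReplicas, maxReplicas int) int {
	n := (queueDepth + perReplica - 1) / perReplica // ceiling division
	if n < minReplicas {
		return minReplicas
	}
	if n > maxReplicas {
		return maxReplicas
	}
	return n
}

func main() {
	fmt.Println(desiredWorkers(0, 5, 2, 8))   // 2 (floor)
	fmt.Println(desiredWorkers(23, 5, 2, 8))  // 5
	fmt.Println(desiredWorkers(100, 5, 2, 8)) // 8 (cap)
}
```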
Network Requirements
| Source | Destination | Port | Purpose |
|---|---|---|---|
| Web | API Server | 8080 | API requests |
| API Server | PostgreSQL | 5432 | Data storage |
| API Server | Redis | 6379 | Queues and cache |
| Worker | LLM Gateway | 4000 | AI generation |
| Worker | Sandbox | Dynamic | Test execution |
| LLM Gateway | LLM Provider | 443 | LLM API calls |
| Sandbox | Object Storage | 9000 | Artifact upload |