ADR-001: The Capability Architecture
AI-LSC v3.0 — Ankh of Jah
This is the single architectural definition for AI-LSC. Every module, every template, every resolver path either implements something defined here or it does not belong.
Status
Accepted. Adopted as the foundational architecture for v3.0 (Ankh of Jah) and all subsequent releases. The agentic execution layer is deferred to v4.0.
1. Context
AI-LSC did not begin as an architecture. It began as a question:
"Can I stop manually juggling a dozen AI tools on a Linux machine?"
v1 answered: yes, with a monolithic script. v2 answered: yes, with a modular registry and layers. v3 answers a different question entirely:
"Can a system understand AI infrastructure well enough to deploy, validate, diagnose, and reproduce it — without the operator thinking about individual tools?"
The shift is from tool-first to system-first. Earlier development asked "how do we add support for X?" Current development asks "where does X belong in the architecture?" That is not a cosmetic change. It is a phase change.
Three releases revealed a consistent pattern: the same architectural verbs kept reappearing across unrelated features. Install, verify, configure, launch, monitor, export, diagnose, reproduce. Every tool needed them. Every stack needed them. Every container needed them. The repetition was not a failure to abstract — it was evidence of an abstraction waiting to be named.
This document names it.
2. The Foundational Object: Capability
Every system has one concept that, if removed, causes the entire structure to collapse. For AI-LSC, that concept is Capability.
A Capability is a named, validated unit of infrastructure that a machine either possesses or does not. It is not a tool. It is not a process. It is not a package. It is a statement about the machine.
"Inference" — this machine can run LLM inference.
"Vector Store" — this machine can store and query embeddings.
"Monitoring" — this machine can observe its own services.
"GPU Compute" — this machine has CUDA/cuDNN available.
Capabilities are discovered, not declared. A tool provides capabilities. A template requires capabilities. A pipeline consumes capabilities. A container exports capabilities. A dashboard reports capabilities. A skill extends capabilities. Monitoring validates capabilities.
Every subsystem points at Capability. No subsystem points at Tool directly except the Registry, which maps tools to the capabilities they provide.
This single inversion eliminates most of the coupling in the application:
      Tool ──provides──► Capability ◄──requires── Template
                           ▲
  Pipeline ──consumes──────┘
                           ▲
 Container ──exports───────┘
                           ▲
 Dashboard ──reports───────┘
                           ▲
     Skill ──extends───────┘
                           ▲
Monitoring ──validates─────┘
Swap Ollama for vLLM. Swap Grafana for another observability stack. Swap Qdrant for Milvus. Everything above the Registry layer does not notice. The capability model remains stable even when implementations evolve, technologies are replaced, or entirely new categories of AI software emerge.
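A minimal sketch of the inversion, assuming a Registry that can be reduced to a tool-to-capabilities mapping. `ToolEntry`, `REGISTRY`, `machine_capabilities`, and `satisfies` are hypothetical names used for illustration; the real Registry schema is not shown in this ADR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolEntry:
    """Hypothetical Registry entry: a tool and the capabilities it provides."""
    tool_id: str
    provides: frozenset[str]


# Illustrative entries only, not the project's actual metadata.
REGISTRY = [
    ToolEntry("ollama", frozenset({"inference"})),
    ToolEntry("vllm", frozenset({"inference"})),
    ToolEntry("qdrant", frozenset({"vector_database"})),
    ToolEntry("milvus", frozenset({"vector_database"})),
]


def machine_capabilities(installed: set[str]) -> set[str]:
    """Derive the machine's capability set from its installed tools."""
    caps: set[str] = set()
    for entry in REGISTRY:
        if entry.tool_id in installed:
            caps |= entry.provides
    return caps


def satisfies(required: set[str], installed: set[str]) -> bool:
    """A template is satisfied when every required capability is present."""
    return required <= machine_capabilities(installed)


required = {"inference", "vector_database"}
print(satisfies(required, {"ollama", "qdrant"}))  # True
print(satisfies(required, {"vllm", "milvus"}))    # True: providers swapped, same capabilities
print(satisfies(required, {"ollama"}))            # False: no vector_database provider
```

Only `REGISTRY` mentions tool names. Everything that calls `satisfies` speaks in capabilities, which is the property the diagram describes.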
3. The Architecture Pipeline
AI-LSC is not an installer. It is a pipeline from intent to infrastructure.
┌──────────────────────────────────────────────────────────────────┐
│ USER INTENT │
│ │
│ "I want a Research Workstation" │
│ "I want a RAG Server" │
│ "I want a GPU Inference Cluster" │
│ "I want a Coding Assistant" │
└────────────────────────┬─────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ TEMPLATE (Recipe) │
│ Desired Architecture │
│ │
│ Research Workstation │ RAG Appliance │ Inference Node │
└────────────────────────┬─────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ RESOLVER │
│ Infrastructure Planning │
│ │
│ • Detect hardware • Detect OS │
│ • Detect installed sw • Detect conflicts │
│ • Expand dependencies • Select implementations │
│ • Produce execution plan │
└────────────────────────┬─────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ REGISTRY │
│ Individual Components │
│ │
│ Every tool knows: Install · Update · Verify · Launch │
│ Health · Configure · Container · Export │
└────────────────────────┬─────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ RUNTIME │
│ │
│ Native · Podman · Docker · LXC · Cluster · Remote │
└──────────────────────────────────────────────────────────────────┘
The Resolver is the brain. It is the only component that translates between the declarative world of templates and the imperative world of package managers, container runtimes, and service launchers. No other component performs this translation. This constraint ensures that adding a new runtime target (say, Kubernetes) requires changes only in the Registry (new tool entries) and Runtime (new executor), never in templates or pipelines.
4. Stack Recipes (Templates as Intent)
4.1 What a Template Is
A template is infrastructure intent, not an install script. It declares what the operator wants the machine to become. It does not duplicate install logic, configuration logic, or launch logic — the Registry already owns all of that.
The current template format is a flat list of tool IDs. This is functional but insufficient for the capability architecture. The evolved format — the Stack Recipe — declares capabilities, roles, connections, and startup semantics:
# Stack Recipe: evolved template format (v4.0 target)
stack:
  name: Claude Memory Assistant
  version: "1.0"
  maturity: official            # official | community | local | frozen

capabilities:
  required:
    - inference                 # needs an LLM engine
    - vector_database           # needs embedding storage
    - relational_database       # needs structured storage
    - web_interface             # needs a browser-accessible UI
  optional:
    - monitoring
    - automation

# Component keys are the names that connections and startup refer to.
components:
  inference:
    engine: ollama
    model: llama3
  vector_database:
    vectordb: qdrant
    embedding_model: nomic-embed-text
  relational_database:
    engine: postgres
  ui:
    provider: open_webui

connections:
  - from: inference
    to: vector_database
    protocol: embedding
  - from: inference
    to: relational_database
    protocol: session_store
  - from: ui
    to: inference
    protocol: openai_compat

startup:
  order:
    - relational_database
    - vector_database
    - inference
    - ui
  health_wait:
    - relational_database       # UI waits until DB is accepting connections
    - vector_database
    - inference

health:
  checks:
    - capability: inference
      probe: GET /api/tags
    - capability: vector_database
      probe: GET /collections
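A recipe of this shape can be checked for internal consistency before the Resolver plans anything. The sketch below is one possible check, not the project's validator: `validate_recipe` is a hypothetical function, the recipe is assumed to be already parsed into a dict, and the rule that the `from` side of a connection depends on the `to` side is an assumption drawn from the startup order in the example.

```python
def validate_recipe(recipe: dict) -> list[str]:
    """Return human-readable problems; an empty list means the recipe is consistent."""
    problems: list[str] = []
    components = set(recipe.get("components", {}))
    connections = recipe.get("connections", [])
    order = recipe.get("startup", {}).get("order", [])

    # Every connection endpoint and startup entry must name a declared component.
    for conn in connections:
        for end in ("from", "to"):
            if conn[end] not in components:
                problems.append(f"connection {end}={conn[end]!r} is not a declared component")
    for name in order:
        if name not in components:
            problems.append(f"startup entry {name!r} is not a declared component")

    # Assumption: "from" depends on "to", so "to" must start first.
    position = {name: i for i, name in enumerate(order)}
    for conn in connections:
        src, dst = conn["from"], conn["to"]
        if src in position and dst in position and position[src] < position[dst]:
            problems.append(f"{src!r} starts before its dependency {dst!r}")
    return problems


recipe = {
    "components": {"inference": {}, "vector_database": {}, "ui": {}},
    "connections": [
        {"from": "inference", "to": "vector_database", "protocol": "embedding"},
        {"from": "ui", "to": "inference", "protocol": "openai_compat"},
    ],
    "startup": {"order": ["vector_database", "inference", "ui"]},
}
print(validate_recipe(recipe))  # []
```

A connection that points at an undeclared name, or a startup order that launches a component before the thing it connects to, shows up as a message in the returned list.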
4.2 What a Template Is Not
A template does not contain:
- Installation commands (the Registry knows how to install)
- File paths (the Resolver knows the layout)
- Port assignments (conflict detection is automatic)
- OS-specific logic (the Resolver handles this)
- Dependency installation order beyond what startup.order declares
A template also does not need to hardcode implementations. Any provider named in a recipe, as in the 4.1 example, is a preference hint that the Resolver may override. A fully portable template specifies roles only:
components:
  vector_database:
    role: vector_store          # NOT "qdrant"
The Resolver maps vector_store to whatever provider is installed or available. On one machine that is Qdrant. On another it is Milvus. On a third the Resolver recommends Chroma. The template never changes.
4.3 Template Maturity
Templates have a maturity level that signals trust and intent:
| Level | Meaning | Use Case |
|---|---|---|
| Official | Maintained by the AI-LSC project | Curated reference stacks |
| Community | Shared by users, reviewed | Experimentation, collaboration |
| Local | Created by the operator | Personal workflows, one-off stacks |
| Frozen | Exact snapshot of a validated environment | Reproducibility, CI/CD, audit |
A Frozen template pins every version, every config hash, and every capability signature. Deploying a Frozen template on a different machine is intended to reproduce the same environment: identical versions, identical config hashes, identical capability signatures. Hardware and host-driver differences remain outside what the template can pin. This is the mechanism for long-term reproducibility: not containerization alone, but declarative infrastructure with verified provenance.
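One way to make "pins every version, every config hash" checkable is to hash a canonical form of the pin set and compare it on the target machine. This is a sketch under assumptions: `fingerprint` and `drift` are hypothetical helpers, the pin layout is invented for illustration, and the version strings and hashes are placeholders rather than real release data.

```python
import hashlib
import json


def fingerprint(pins: dict) -> str:
    """Hash a pinned environment description into a stable SHA-256 hex digest."""
    # Sorted keys and fixed separators make the digest independent of key order.
    canonical = json.dumps(pins, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def drift(frozen: dict, observed: dict) -> list[str]:
    """List the tools whose observed pin differs from the frozen one."""
    tools = frozen.keys() | observed.keys()
    return sorted(t for t in tools if frozen.get(t) != observed.get(t))


# Placeholder pins, not real versions or hashes.
frozen = {
    "ollama": {"version": "1.0.0", "config_sha256": "aaa"},
    "qdrant": {"version": "2.0.0", "config_sha256": "bbb"},
}
observed = {
    "qdrant": {"config_sha256": "bbb", "version": "2.0.0"},
    "ollama": {"version": "1.0.1", "config_sha256": "aaa"},
}

print(fingerprint(frozen) == fingerprint(observed))  # False
print(drift(frozen, observed))                       # ['ollama']
```

The single digest answers "is this the frozen environment?" cheaply, and `drift` names the components responsible when it is not.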
5. Role-Based Resolution
The critical distinction between AI-LSC and every other "AI launcher" is that templates specify roles, not implementations.
A role is a capability category with multiple possible providers:
Role: Inference Engine
Providers: Ollama · llama.cpp · vLLM · TensorRT-LLM · LM Studio
Role: Vector Database
Providers: Qdrant · Chroma · Milvus · Weaviate · FAISS
Role: LLM Gateway
Providers: LiteLLM · OpenRouter · Local proxy
Role: Monitoring
Providers: Grafana + Prometheus · Glances · Netdata
Role: Agent Frontend
Providers: Open WebUI · LibreChat · AnythingLLM · Continue
The Resolver performs role resolution in this order:
1. Already installed? Use what is present.
2. Compatible with hardware? Select the best fit (GPU → CUDA-aware provider).
3. Template preference? Honor explicit provider hints.
4. Fallback chain. Try each candidate in order.
5. Recommend. If nothing installs cleanly, report what is needed.
This means a single template shared between two machines can resolve to completely different toolsets:
"Research Workstation" template
Laptop (CPU-only):
→ llama.cpp (CPU inference)
→ LiteLLM (gateway)
→ Chroma (lightweight vector store)
→ Open WebUI (interface)
Desktop (RTX 4090):
→ Ollama (CUDA inference)
→ vLLM (high-throughput serving)
→ Qdrant (production vector store)
→ LibreChat (multi-provider interface)
Same template. Different reality. The Resolver is what makes that work.
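The resolution order can be sketched as a single function per role. This is an interpretation, not the Resolver's actual code: `Provider`, `Host`, and `resolve_role` are hypothetical, hardware compatibility is reduced to one `needs_gpu` flag, and the flags assigned to each provider below are illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Provider:
    name: str
    needs_gpu: bool = False


@dataclass(frozen=True)
class Host:
    has_gpu: bool
    installed: frozenset[str]


def resolve_role(candidates: list[Provider], host: Host,
                 hint: Optional[str] = None) -> Optional[str]:
    """Pick a provider for one role, following the resolution order above."""
    # 1. Already installed: reuse what is present.
    for p in candidates:
        if p.name in host.installed:
            return p.name
    # 2. Hardware compatibility: drop GPU-only providers on CPU-only hosts
    #    and rank GPU-aware providers first on GPU hosts (the sort is stable).
    usable = [p for p in candidates if host.has_gpu or not p.needs_gpu]
    usable.sort(key=lambda p: p.needs_gpu != host.has_gpu)
    # 3. Template preference: honor an explicit hint if it is still usable.
    for p in usable:
        if p.name == hint:
            return p.name
    # 4. Fallback chain: first remaining candidate in ranked order.
    if usable:
        return usable[0].name
    # 5. Nothing fits: the caller reports what is needed.
    return None


INFERENCE = [Provider("llama.cpp"), Provider("ollama"), Provider("vllm", needs_gpu=True)]

print(resolve_role(INFERENCE, Host(False, frozenset())))            # llama.cpp
print(resolve_role(INFERENCE, Host(True, frozenset())))             # vllm
print(resolve_role(INFERENCE, Host(True, frozenset({"ollama"}))))   # ollama
```

The same candidate list yields different answers on different hosts, and a `None` result is the signal to move to the "Recommend" step.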
6. Component Connections
Installing tools side-by-side is not an architecture. Understanding how they interact is.
The Stack Recipe format includes a connections section that declares relationships between components. These are not just documentation: they are inputs to the Stack Doctor (Section 12) and the Resolver's validation engine.
A connection declaration:
connections:
  - from: ui                    # Open WebUI
    to: inference               # Ollama
    protocol: openai_compat     # Expects OpenAI-compatible API
  - from: ui
    to: vector_database
    protocol: embedding         # Needs embedding endpoint
The Resolver uses connections to:
- Validate that protocols are compatible (OpenAI-compat ↔ OpenAI-compat).
- Detect likely misconfigurations (OLLAMA_HOST=localhost when UI is remote).
- Generate connection-specific health checks.
- Produce diagnostic suggestions when connections fail.
This is dependency injection for infrastructure. The template declares the graph. The Resolver validates the graph. The Runtime instantiates the graph.
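Protocol validation over the declared graph might look like the sketch below. The `SERVES` table is an assumption invented for this example: it records which protocols each component can serve, something the Registry would have to supply in practice. `check_connections` is likewise a hypothetical name.

```python
# Hypothetical table: which protocols each component can serve.
SERVES = {
    "inference": {"openai_compat"},
    "vector_database": {"embedding"},
}


def check_connections(connections: list[dict]) -> list[str]:
    """Return one error per connection whose target does not serve the protocol."""
    errors = []
    for conn in connections:
        offered = SERVES.get(conn["to"], set())
        if conn["protocol"] not in offered:
            errors.append(
                f"{conn['from']} -> {conn['to']}: target does not serve "
                f"{conn['protocol']!r} (serves {sorted(offered)})"
            )
    return errors


good = [
    {"from": "ui", "to": "inference", "protocol": "openai_compat"},
    {"from": "ui", "to": "vector_database", "protocol": "embedding"},
]
bad = [{"from": "ui", "to": "vector_database", "protocol": "openai_compat"}]

print(check_connections(good))  # []
for message in check_connections(bad):
    print(message)
```

Because the check runs on the declared graph, a mismatch can be reported at planning time, before any service is started.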
7. The 13-Layer Model
AI-LSC organizes all AI infrastructure into 13 layers. Each layer represents a category of capability. Tools register into one (sometimes two) layers. Templates reference layers instead of individual tools when expressing broad requirements.
Layer 1 Host Platform — OS, kernel, filesystem, base packages
Layer 2 Development Env — Python, Rust, Node.js, Go, build tools
Layer 3 GPU Runtime — CUDA, cuDNN, ROCm, Vulkan compute
Layer 4 Inference Engines — Ollama, llama.cpp, vLLM, TensorRT-LLM
Layer 5 Distributed Runtime — Ray, Kubeflow, cluster schedulers
Layer 6 AI Endpoints — LiteLLM, model routers, API gateways
Layer 7 Data & Knowledge — PostgreSQL, MariaDB, data pipelines
Layer 8 Knowledge Management — Qdrant, Chroma, Milvus, vector stores
Layer 9 Automation & Execution — n8n, Airflow, task schedulers
Layer 10 Observability — Prometheus, Grafana, Glances, logging
Layer 11 Intelligent Routing — Fabric, Hermes, agent dispatchers
Layer 12 User Interfaces — Open WebUI, LibreChat, AnythingLLM
Layer 13 Containers — Podman, Docker, LXC, export targets
A template can express requirements by layer:
capabilities:
  layers:
    - Inference Engines         # Layer 4
    - AI Endpoints              # Layer 6
    - Knowledge Management      # Layer 8
    - User Interfaces           # Layer 12
The Resolver fills in everything else. If the template needs inference (Layer 4) and the host has no GPU (Layer 3), the Resolver knows to recommend CPU-only providers and skip CUDA-dependent tools automatically.
Stress Test
The 13-layer model must accommodate any AI project without forcing it. A non-exhaustive validation set:
| Project | Natural Layer Fit |
|---|---|
| Open WebUI | 12 (User Interfaces) |
| LiteLLM | 6 (AI Endpoints) |
| Qdrant | 8 (Knowledge Management) |
| Ollama | 4 (Inference Engines) |
| vLLM | 4 (Inference Engines) |
| ComfyUI | 12 (User Interfaces) |
| Flowise | 12 (User Interfaces) |
| n8n | 9 (Automation & Execution) |
| Prometheus | 10 (Observability) |
| Ray | 5 (Distributed Runtime) |
| Langflow | 12 (User Interfaces) |
| Chroma | 8 (Knowledge Management) |
| Milvus | 8 (Knowledge Management) |
| llama.cpp | 4 (Inference Engines) |
| TensorRT-LLM | 4 (Inference Engines) |
| OpenHands | 12 (User Interfaces) |
| Aider | 2 (Development Env) |
| Continue | 2 (Development Env) |
| Kubeflow | 5 (Distributed Runtime) |
| Kafka | 7 (Data & Knowledge) |
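Every project in the validation set can be assigned a single primary layer, and none requires special casing. The model appears to generalize well across this set, with the caveat that a few placements (Flowise and Langflow under User Interfaces, for example) are judgment calls of the kind that two-layer registration exists to absorb.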
8. Skills as Derived Capabilities
Skills are not file lookups. They are capability queries.
The old model: "Does this Python file exist in the skills directory?" The new model: "Does this machine currently possess this capability?"
Skills derive from deployed, validated infrastructure:
Template: Research Workstation
│
▼ Deployed
│
▼ Verified
│
▼ Registered as Capabilities
│
▼ Skills become available:
│
├── "Local RAG" (has: inference + vector_database + web_interface)
├── "Python AI" (has: development + inference)
├── "Vision" (has: inference + multimodal_model)
├── "Speech" (has: inference + whisper + tts)
└── "Distributed Inference" (has: inference + distributed_runtime)
A skill definition references capabilities, not tools:
skill:
  name: Local RAG
  requires:
    capabilities: [inference, vector_database, web_interface]
  optional:
    capabilities: [monitoring, relational_database]
  description: >
    End-to-end retrieval-augmented generation using local models.
    Available when the machine has an inference engine, a vector store,
    and a web interface, regardless of which specific tools provide them.
This means installing a new tool that provides an existing capability can silently unlock skills the operator never explicitly configured. Replace Qdrant with Milvus and every RAG skill still works, because the capability did not change — only the provider did.
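Treating a skill as a capability query reduces availability to a subset test. A minimal sketch, assuming skills are stored as sets of required capability names; `SKILLS` and `available_skills` are hypothetical, and the entries are taken from the examples in this section.

```python
# Required capabilities per skill, as named in this section's examples.
SKILLS = {
    "Local RAG": {"inference", "vector_database", "web_interface"},
    "Python AI": {"development", "inference"},
    "Distributed Inference": {"inference", "distributed_runtime"},
}


def available_skills(healthy_capabilities: set[str]) -> list[str]:
    """A skill is available when all of its required capabilities are present."""
    return sorted(
        name for name, required in SKILLS.items()
        if required <= healthy_capabilities
    )


machine = {"inference", "vector_database", "web_interface"}
print(available_skills(machine))                    # ['Local RAG']
print(available_skills(machine | {"development"}))  # ['Local RAG', 'Python AI']
```

Nothing in the query mentions Qdrant or Milvus, so swapping the provider behind `vector_database` leaves the result unchanged.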
9. Pipelines Consume Capabilities
A pipeline is a directed graph of capability requirements. It never names a tool. It names what it needs:
Pipeline: Document RAG
[Source] → [Chunking] → [Embedding] → [Vector Store] → [Retriever] → [LLM] → [Output]
Each node is a capability. The Resolver maps each node to a tool at runtime:
Embedding:
→ nomic-embed-text (via Ollama)
or
→ bge-small (via llama.cpp)
Vector Store:
→ Qdrant
or
→ Chroma
LLM:
→ Ollama (llama3)
or
→ vLLM (deepseek-coder-33b)
The pipeline graph never changes when implementations change. This is what makes pipelines portable across machines, containers, and clusters.
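The binding step can be sketched as a lookup from each pipeline node to the providers a given machine offers. `PIPELINE`, `bind`, and the two provider maps are hypothetical; the node names are simplified, and choosing the first available provider stands in for whatever ranking the Resolver would apply.

```python
# The pipeline names capabilities only; this list is the same on every machine.
PIPELINE = ["embedding", "vector_store", "llm"]


def bind(pipeline: list[str], providers: dict[str, list[str]]) -> dict[str, str]:
    """Map each capability node to the first available provider on this machine."""
    plan = {}
    for node in pipeline:
        options = providers.get(node, [])
        if not options:
            raise LookupError(f"no provider for capability {node!r}")
        plan[node] = options[0]
    return plan


laptop = {
    "embedding": ["bge-small (via llama.cpp)"],
    "vector_store": ["Chroma"],
    "llm": ["Ollama (llama3)"],
}
server = {
    "embedding": ["nomic-embed-text (via Ollama)"],
    "vector_store": ["Qdrant"],
    "llm": ["vLLM (deepseek-coder-33b)"],
}

print(bind(PIPELINE, laptop))
print(bind(PIPELINE, server))
```

`PIPELINE` is identical in both calls. Only the provider map, which describes the machine, differs.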
10. Container Export as Capability Export
A container image is not a bag of tools. It is a frozen capability set.
When an operator exports a Research Workstation to Podman, the exported image carries a capability manifest alongside the filesystem layers:
Research_Workstation_v1.0
Capabilities:
✓ Inference (Ollama, llama3)
✓ GPU Compute (CUDA 12.4, cuDNN 9.1)
✓ Vector Database (Qdrant)
✓ LLM Gateway (LiteLLM)
✓ Web Interface (Open WebUI)
✓ Monitoring (Prometheus + Grafana)
✓ Relational Database (PostgreSQL)
Stack Recipe: embedded (frozen)
Template: Research Workstation v1.0
Exported: 2026-06-28
Architecture: x86_64
When another machine imports this image, AI-LSC reads the manifest and immediately knows what the container provides, with no scanning, no probing, and no guessing. The capabilities are declared in the manifest and were verified when the image was exported.
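A manifest like the one above could be carried as a small JSON document. The sketch below assumes that encoding; the ADR does not specify the on-disk format, and `CapabilityManifest` with its field names is hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class CapabilityManifest:
    """Hypothetical manifest shape: image name, architecture, capability -> provider."""
    name: str
    architecture: str
    capabilities: dict[str, str]

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "CapabilityManifest":
        return cls(**json.loads(text))

    def provides(self, required: set[str]) -> bool:
        """True when every required capability is declared in the manifest."""
        return required <= set(self.capabilities)


exported = CapabilityManifest(
    name="Research_Workstation_v1.0",
    architecture="x86_64",
    capabilities={
        "inference": "Ollama, llama3",
        "vector_database": "Qdrant",
        "web_interface": "Open WebUI",
    },
)

# The importing side needs only the manifest text, not the image contents.
imported = CapabilityManifest.from_json(exported.to_json())
print(imported.provides({"inference", "vector_database"}))  # True
print(imported.provides({"monitoring"}))                    # False
```

The importing side answers capability questions from the manifest alone, which is what makes the import immediate.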
Export targets are format-agnostic:
Recipe → Resolver → Generate Deployment
├── Podman Quadlet
├── Docker Compose
├── LXC Config
└── Kubernetes YAML (future)
The recipe never changes. Only the exporter changes.
11. Dashboards Report Capability Health
The dashboard does not display process status. It displays infrastructure health.
┌──────────────────────────────────────────────────────┐
│ Research Workstation ████████ 92%│
│ │
│ Host Platform ✓ │
│ Development Env ✓ │
│ GPU Runtime ⚠ CUDA Update Available │
│ Inference Engines ✓ Ollama · llama3 │
│ AI Endpoints ✓ LiteLLM :4000 │
│ Data & Knowledge ✓ PostgreSQL :5432 │
│ Knowledge Management ✓ Qdrant :6333 │
│ Automation — │
│ Observability ✓ Grafana · Prometheus │
│ Intelligent Routing ✓ Fabric │
│ User Interfaces ✓ Open WebUI :8080 │
│ Containers 2 specialist images │
│ │
│ Templates: 7 installed Skills: 12 available │
└──────────────────────────────────────────────────────┘
Each row is a capability, not a tool. The status reflects whether the machine possesses that capability in a healthy state, regardless of which tool provides it. If the operator swaps Grafana for Netdata, the Observability row still shows the same status — because the capability did not change.
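How a row's status and the headline percentage are computed is not specified in this ADR. The sketch below shows one plausible rule, labeled as an assumption: a capability is healthy if any of its providers is healthy, and the score is the share of present capabilities that are healthy. `capability_status` and `overall_score` are hypothetical names.

```python
def capability_status(provider_health: dict[str, bool]) -> str:
    """Status of one capability row, independent of which tools provide it."""
    if not provider_health:
        return "absent"
    # Assumption: one healthy provider is enough for the capability to be healthy.
    return "healthy" if any(provider_health.values()) else "degraded"


def overall_score(rows: dict[str, str]) -> int:
    """Percentage of present capabilities that are healthy (assumed formula)."""
    present = [status for status in rows.values() if status != "absent"]
    if not present:
        return 0
    return round(100 * sum(s == "healthy" for s in present) / len(present))


rows = {
    "Observability": capability_status({"grafana": True, "prometheus": True}),
    "Inference Engines": capability_status({"ollama": True}),
    "GPU Runtime": capability_status({"cuda": False}),
    "Automation": capability_status({}),
}
print(rows["Observability"])  # healthy
print(overall_score(rows))    # 67

# Swapping the provider does not change the row.
print(capability_status({"netdata": True}) == rows["Observability"])  # True
```

The provider names appear only as dictionary keys that the status function ignores, so the row reads the same after a swap.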
12. Stack Doctor
The Stack Doctor is a reasoning engine, not a log viewer. It understands relationships between components and can diagnose problems that span multiple tools.
Example diagnosis:
DIAGNOSIS: Open WebUI cannot reach Ollama
REASON: OLLAMA_HOST is set to 127.0.0.1, so Ollama listens only on
the loopback interface, but Open WebUI runs on a different
host and connects to port 11434 over the network.
Connection is refused.
RECOMMENDATION:
Option A: Set OLLAMA_HOST=0.0.0.0 in Ollama environment
Option B: Run Open WebUI on the same host as Ollama
Option C: Route through LiteLLM proxy
Example conflict detection:
DIAGNOSIS: Port conflict detected
LiteLLM wants port 4000 ✓ (available)
vLLM wants port 8000 ✗ (occupied by TensorRT-LLM)
RECOMMENDATION:
Move vLLM to port 8001
or
Disable TensorRT-LLM if not needed
The Stack Doctor uses the connection graph from the Stack Recipe to trace problems across component boundaries. It does not just check if a process is running — it checks if the capability chain is intact from end to end.
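The port-conflict check is the simplest of these diagnoses to sketch. The function below is illustrative only: `find_port_conflicts` is a hypothetical name, the inputs are plain dicts rather than live socket probes, and "suggest the next free port" is an assumed policy.

```python
def find_port_conflicts(wanted: dict[str, int],
                        occupied: dict[int, str]) -> list[tuple[str, int, str, int]]:
    """Return (service, wanted_port, current_holder, suggested_port) per conflict."""
    owner = dict(occupied)  # port -> holder, extended as ports are assigned
    findings = []
    for service, port in wanted.items():
        if port in owner:
            # Assumed policy: suggest the next port nobody holds.
            candidate = port + 1
            while candidate in owner:
                candidate += 1
            findings.append((service, port, owner[port], candidate))
            owner[candidate] = service
        else:
            owner[port] = service
    return findings


wanted = {"litellm": 4000, "vllm": 8000}
occupied = {8000: "tensorrt-llm"}

for service, port, holder, suggestion in find_port_conflicts(wanted, occupied):
    print(f"{service} wants port {port}, occupied by {holder}; move to {suggestion}")
# vllm wants port 8000, occupied by tensorrt-llm; move to 8001
```

The recommendation targets the service whose port is actually taken, and suggested ports are reserved as they are handed out so two services are never sent to the same one.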
13. Operator Workflows
13.1 Missions
Complex deployments are presented as Missions, not wizards. A Mission is a named, scoped objective with a clear completion state:
┌──────────────────────────────────────────────────────┐
│ MISSION: Build Coding Assistant │
│ │
│ Estimated effort: 8 minutes │
│ Status: Planning... │
│ │
│ [✓] Validate host platform │
│ [✓] Detect installed capabilities │
│ [→] Resolve missing dependencies │
│ [ ] Install Python (Layer 2) │
│ [ ] Install Ollama (Layer 4) │
│ [ ] Install LiteLLM (Layer 6) │
│ [ ] Install Open WebUI (Layer 12) │
│ [ ] Configure connections │
│ [ ] Verify health │
│ [ ] Export ready │
└──────────────────────────────────────────────────────┘
13.2 Routines
Routines are reusable infrastructure actions, not application macros:
| Routine | Actions |
|---|---|
| Morning Check | Verify all services, restart unhealthy, check updates, check GPU, check disk |
| Pre-Inference | GPU memory, temperature, ports, models, KV cache, endpoint ready |
| Before Export | Verify services, verify configs, clean logs, freeze versions, generate manifest |
| Before Commit | Lint, test, validate registry, validate templates, schema check |
One button. Comprehensive validation.
13.3 Next Best Action
AI-LSC suggests the operator's next step based on current state:
Good morning.
✓ GPU healthy
✓ Ollama healthy
⚠ Open WebUI update available (v0.3.12 → v0.3.14)
⚠ Research Workstation template has 1 missing dependency
Suggested: Verify Research Workstation
This is not AI. It is deterministic inference over the capability graph. The system knows what is installed, what is healthy, what is outdated, and what templates require. The recommendation follows directly.
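"Deterministic inference" here can be as plain as an ordered rule list. The sketch below assumes a priority order (unhealthy services, then incomplete templates, then updates) that this ADR does not state; `next_best_action` and the state layout are hypothetical.

```python
def next_best_action(state: dict) -> str:
    """Return one suggestion using a fixed, assumed rule priority."""
    # Rule 1: an unhealthy service outranks everything else.
    for service, healthy in state["services"].items():
        if not healthy:
            return f"Restart {service}"
    # Rule 2: a template with missing dependencies comes next.
    for template, missing in state["templates"].items():
        if missing:
            return f"Verify {template}"
    # Rule 3: pending updates come last.
    if state["updates"]:
        return f"Update {state['updates'][0]}"
    return "No action needed"


morning = {
    "services": {"GPU": True, "Ollama": True},
    "templates": {"Research Workstation": ["1 missing dependency"]},
    "updates": ["Open WebUI"],
}
print(next_best_action(morning))  # Verify Research Workstation
```

The same state always produces the same suggestion, and changing the recommendation means changing a rule, not retraining anything.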
13.4 Activity Timeline
Every infrastructure action is recorded with a timestamp:
09:13 Installed LiteLLM
09:15 Verified CUDA (driver 550.54, CUDA 12.4)
09:16 Generated template: Research Workstation
09:20 Exported Podman image: research_ws_v1.0
09:27 Health check passed (13/13 capabilities)
Timelines are queryable, filterable, and exportable. They provide audit trail and operational memory.
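A queryable timeline needs little more than structured events and a filter. This is a sketch with hypothetical names (`Event`, `query`); the storage backend is not described in this ADR, and the date attached to the sample entries is arbitrary.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Event:
    at: datetime
    action: str   # e.g. "installed", "verified", "exported"
    subject: str


def query(events: list[Event], action: Optional[str] = None,
          since: Optional[datetime] = None) -> list[Event]:
    """Filter events by action and/or earliest timestamp."""
    return [
        e for e in events
        if (action is None or e.action == action)
        and (since is None or e.at >= since)
    ]


day = (2026, 6, 28)  # arbitrary date for the sample entries
events = [
    Event(datetime(*day, 9, 13), "installed", "LiteLLM"),
    Event(datetime(*day, 9, 15), "verified", "CUDA"),
    Event(datetime(*day, 9, 16), "generated", "template: Research Workstation"),
    Event(datetime(*day, 9, 20), "exported", "Podman image: research_ws_v1.0"),
    Event(datetime(*day, 9, 27), "verified", "health check"),
]

for e in query(events, action="verified"):
    print(e.at.strftime("%H:%M"), e.subject)
```

Storing the action as its own field, rather than as free text, is what makes filtering and export straightforward.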
13.5 Workspaces
Workspaces group related infrastructure by purpose, not by tool:
Research → inference + vector_db + ui + monitoring
Coding → development + inference + endpoints + ui
RAG → inference + vector_db + relational_db + ui
Cluster → distributed + inference + monitoring + containers
Click a workspace. Everything related appears. One context for one purpose.
14. Adaptive Templates
A single template adapts to the host hardware, installed software, and available runtimes. The Resolver selects implementations based on constraints, not preferences.
"Research Workstation" on different hardware:
Laptop (CPU, 16GB RAM):
→ llama.cpp (quantized, CPU inference)
→ Chroma (in-process vector store, minimal memory)
→ LiteLLM (lightweight gateway)
→ Glances (lightweight monitoring)
→ Open WebUI (browser interface)
Desktop (RTX 4090, 64GB RAM):
→ Ollama (CUDA-accelerated inference)
→ Qdrant (production vector store with GPU-accelerated HNSW)
→ LiteLLM (gateway) + vLLM (high-throughput serving)
→ Prometheus + Grafana (full monitoring stack)
→ LibreChat (multi-provider interface)
Server (Dual MI300X, 256GB RAM):
→ SGLang (ROCm-optimized inference)
→ Milvus (distributed vector store)
→ LiteLLM (cluster gateway)
→ Prometheus + Grafana + AlertManager (production monitoring)
→ Open WebUI (load-balanced)
Same template. Same intent. Different reality. The Resolver is what makes the template portable.
15. Rationale
Why Capability as the central abstraction?
Because tools are ephemeral. The AI landscape changes monthly. New inference engines appear. Old ones are abandoned. Monitoring stacks get replaced. Vector databases get acquired and deprecated.
But the capabilities those tools provide are remarkably stable. "The machine can run LLM inference" was a useful statement in 2023 and will very likely still be one in 2030. The implementation changes. The capability does not.
Building around capabilities means AI-LSC's architecture decays at the rate of the AI industry's conceptual evolution, not its tool churn. Conceptual evolution is far slower.
Why not just use Terraform / Kubernetes?
Because those tools solve a different problem. Terraform manages cloud infrastructure declaratively. Kubernetes orchestrates containers at scale. Neither understands that "install Qdrant" implies "the machine now has vector database capability" — nor should they. That is AI-LSC's domain.
AI-LSC is specifically designed for the local AI operator who needs to assemble, validate, and reproduce AI stacks on single machines or small clusters. It fills the gap between "install scripts" and "cloud orchestration."
Why role-based resolution instead of tool-specific templates?
Because a template that hardcodes Qdrant cannot run on a machine that only has Milvus. A template that hardcodes Ollama cannot leverage an existing vLLM installation. Role-based resolution makes templates portable, shareable, and future-proof without requiring the template author to anticipate every possible provider.
16. Consequences
Positive
- Tool swaps are zero-cost above the Registry. Replacing a provider requires only a new Registry entry with the same capability mapping. Templates, pipelines, skills, and dashboards are unaffected.
- Templates are shareable across heterogeneous hardware. The same recipe produces appropriate deployments on laptops, desktops, and servers.
- New capabilities can be added without modifying existing templates. Adding a "Speech-to-Text" capability does not require touching any Research Workstation template.
- Container exports carry semantic meaning, not just filesystem state. Importing a container immediately reveals its capabilities.
- Diagnostics can reason about relationships, not just individual process health.
Neutral
- The Resolver is the most complex component. It must understand hardware detection, OS differences, dependency graphs, conflict resolution, and provider selection. This is acceptable because the Resolver is a single, well-bounded component.
- The capability vocabulary must be curated. New capabilities require consensus on naming, boundaries, and provider criteria. This is a governance concern, not a technical one.
Risks
- Over-abstraction. If the capability vocabulary is too coarse ("compute"), it loses discriminating power. If too fine ("qdrant-hnsw-gpu"), it reverts to tool-specific coupling. The granularity must be calibrated through real-world use.
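- Resolver complexity. A naive Resolver that tries every combination of providers takes time exponential in the number of roles, and dependency resolution with conflicts is NP-hard in general. The Resolver must use heuristics, caching, and constraint propagation to remain fast.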
- Capability drift. As the AI ecosystem evolves, capabilities may split or merge. "Inference" might split into "Text Inference" and "Multimodal Inference." The architecture must handle capability evolution without breaking existing templates.
17. Architecture Completeness
Current state of implementation (v3.0 Ankh of Jah):
Registry (tool metadata, 115 tools)           ███████████████████░  95%
Templates (stack recipes, 4 templates)        ███████████░░░░░░░░░  55%
Resolver (dependency expansion, planning)     ██████░░░░░░░░░░░░░░  30%
Installer (native, git, npm, pip)             ███████████████████░  95%
Verification (install checks, health probes)  █████████████████░░░  85%
Health (service status, GPU monitoring)       █████████████░░░░░░░  65%
Export (Podman, Docker, LXC configs)          ████████████████░░░░  80%
Monitoring (glances integration, Prometheus)  ██████████░░░░░░░░░░  50%
Skills (capability-derived skills)            █████░░░░░░░░░░░░░░░  25%
Pipelines (capability graph execution)        ████░░░░░░░░░░░░░░░░  20%
Dashboards (capability health display)        ███████░░░░░░░░░░░░░  35%
Stack Doctor (diagnostic reasoning)           ███░░░░░░░░░░░░░░░░░  15%
Missions (guided deployment flows)            ██░░░░░░░░░░░░░░░░░░  10%
Workspaces (purpose-based grouping)           █████░░░░░░░░░░░░░░░  25%
Activity Timeline                             ████░░░░░░░░░░░░░░░░  20%
Next Best Action                              ██░░░░░░░░░░░░░░░░░░  10%
Documentation (this ADR, README, guides)      ███████████░░░░░░░░░  55%
Tests                                         ████░░░░░░░░░░░░░░░░  20%
The pattern is clear: the foundation (Registry, Installer, Verification) is strong. The intelligence layer (Resolver, Stack Doctor, Missions) is where the next investment goes. The UI layer (Dashboards, Workspaces, Timeline) follows the intelligence layer.
18. Feature Policy (Ankh of Jah Stabilization)
v3.0 enters a stabilization phase. New feature work slows; stabilization work takes priority.
Allowed
- Bug fixes
- Registry additions (new tool metadata, new providers)
- New templates (stack recipes)
- Installer verification and hardening
- UI polish and usability improvements
- Documentation
- Tests
- Capability vocabulary refinement
- Resolver heuristic improvements
Not Allowed
- New architectural concepts
- New runtime systems
- Major UI redesigns
- New registry formats (schema changes)
- Agent execution (deferred to v4.0)
- Cluster orchestration (deferred to v4.0)
- Remote node management (deferred to v4.0)
v4.0 Scope (Deferred)
v4.0 adds the agentic execution layer: an LLM operates AI-LSC through function calling, using the agents/ bridge to start and stop services, pull models, inject skills, and diagnose issues through natural language. The layer is already designed (the agents/ package exists, tool_bridge and ollama_tools are implemented, and the Redis pub/sub infrastructure is in place) but is intentionally not activated in v3.0.
19. Project Philosophy
AI-LSC is a native-first, metadata-driven infrastructure manager for local AI systems. It treats AI software as reusable infrastructure rather than isolated applications, enabling reproducible deployments, validation, monitoring, and export of complete AI environments.
This single paragraph is the decision filter for every proposed feature. If a feature supports this philosophy — making AI infrastructure easier to deploy, validate, reproduce, and understand — it belongs. If it does not, it does not.
AI-LSC's biggest competitor is not another AI launcher. It is the manual process that most developers still follow: reading installation guides, cloning repositories, creating Python environments, debugging version conflicts, writing ad hoc shell scripts, and hoping they can recreate the setup six months later.
If AI-LSC can replace that with: select a template, review the execution plan, deploy, verify, export — then it has solved a real engineering problem.
20. The Architectural Vocabulary
These terms are stable. They will not change in v3.0 patches. They may evolve in v4.0, but only with explicit ADR amendment.
| Term | Definition |
|---|---|
| Capability | A named, validated unit of infrastructure that a machine possesses or does not. The central abstraction. |
| Template / Stack Recipe | A declarative document expressing infrastructure intent. Specifies capabilities and roles, not tools. |
| Resolver | The planning engine that maps intent to execution. Detects hardware, resolves roles, expands dependencies, produces plans. |
| Registry | The knowledge base of individual tools. Each entry maps a tool to its capabilities, installers, launchers, health probes, and exporters. |
| Role | A capability category with multiple possible providers (e.g., "Vector Database" → Qdrant, Chroma, Milvus). |
| Skill | A capability-derived behavior. Available when all required capabilities are present and healthy. |
| Pipeline | A directed graph of capability requirements. Consumes capabilities; does not name tools. |
| Connection | A declared relationship between two components in a Stack Recipe. Used for validation and diagnostics. |
| Stack Doctor | A diagnostic reasoning engine that traces problems across component boundaries using the connection graph. |
| Mission | A named, scoped deployment objective with a clear completion state. |
| Routine | A reusable infrastructure action (health check, pre-flight, cleanup). |
| Workspace | A purpose-based grouping of related infrastructure. |
| Frozen | An exact snapshot of a validated environment, pinned at every version. |
| Layer | One of 13 categories of AI infrastructure. Tools register into layers. Templates can reference layers. |
| Runtime | The execution target: native, Podman, Docker, LXC, cluster, or remote. |
Ankh of Jah marks the point where AI-LSC stopped being a Python application and became a platform architecture. Future releases build on this foundation. They do not revisit it.