ADR-001: The Capability Architecture

AI-LSC v3.0 — Ankh of Jah

This is the single architectural definition for AI-LSC. Every module, every template, every resolver path either implements something defined here or it does not belong.

Status

Accepted. Adopted as the foundational architecture for v3.0 (Ankh of Jah) and all subsequent releases. The agentic execution layer is deferred to v4.0.

1. Context

AI-LSC did not begin as an architecture. It began as a question:

"Can I stop manually juggling a dozen AI tools on a Linux machine?"

v1 answered: yes, with a monolithic script. v2 answered: yes, with a modular registry and layers. v3 answers a different question entirely:

"Can a system understand AI infrastructure well enough to deploy, validate, diagnose, and reproduce it — without the operator thinking about individual tools?"

The shift is from tool-first to system-first. Earlier development asked "how do we add support for X?" Current development asks "where does X belong in the architecture?" That is not a cosmetic change. It is a phase change.

Three releases revealed a consistent pattern: the same architectural verbs kept reappearing across unrelated features. Install, verify, configure, launch, monitor, export, diagnose, reproduce. Every tool needed them. Every stack needed them. Every container needed them. The repetition was not a failure to abstract — it was evidence of an abstraction waiting to be named.

This document names it.

2. The Foundational Object: Capability

Every system has one concept that, if removed, causes the entire structure to collapse. For AI-LSC, that concept is Capability.

A Capability is a named, validated unit of infrastructure that a machine either possesses or does not. It is not a tool. It is not a process. It is not a package. It is a statement about the machine.

"Inference"    — this machine can run LLM inference.
"Vector Store"  — this machine can store and query embeddings.
"Monitoring"    — this machine can observe its own services.
"GPU Compute"   — this machine has CUDA/cuDNN available.

Capabilities are discovered, not declared. A tool provides capabilities. A template requires capabilities. A pipeline consumes capabilities. A container exports capabilities. A dashboard reports capabilities. A skill extends capabilities. Monitoring validates capabilities.

Every subsystem points at Capability. No subsystem points at Tool directly except the Registry, which maps tools to the capabilities they provide.

This single inversion eliminates most of the coupling in the application:

Tool         ──provides──►  Capability  ◄──requires──  Template
                              ▲
Pipeline     ──consumes──────┘
                              ▲
Container    ──exports───────┘
                              ▲
Dashboard    ──reports───────┘
                              ▲
Skill        ──extends───────┘
                              ▲
Monitoring   ──validates─────┘

Swap Ollama for vLLM. Swap Grafana for another observability stack. Swap Qdrant for Milvus. Nothing above the Registry layer notices. The capability model remains stable even when implementations evolve, technologies are replaced, or entirely new categories of AI software emerge.
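
As a minimal sketch of this inversion, assume a hypothetical in-memory registry. The names `REGISTRY`, `machine_capabilities`, and `template_satisfied` are illustrative and are not taken from the AI-LSC codebase:

```python
# Hypothetical registry: the only place where tool names appear.
REGISTRY = {
    "ollama": {"inference"},
    "vllm": {"inference"},
    "qdrant": {"vector_database"},
    "milvus": {"vector_database"},
    "grafana": {"monitoring"},
}


def machine_capabilities(installed_tools):
    """Return the union of capabilities provided by the installed tools."""
    capabilities = set()
    for tool in installed_tools:
        capabilities |= REGISTRY.get(tool, set())
    return capabilities


def template_satisfied(required, installed_tools):
    """A template asks about capabilities, never about tools."""
    return set(required) <= machine_capabilities(installed_tools)


required = ["inference", "vector_database"]
print(template_satisfied(required, ["ollama", "qdrant"]))  # True
print(template_satisfied(required, ["vllm", "milvus"]))    # True
print(template_satisfied(required, ["grafana"]))           # False
```

Swapping both providers leaves the template check untouched, because only the registry mapping mentions tools.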

3. The Architecture Pipeline

AI-LSC is not an installer. It is a pipeline from intent to infrastructure.

┌──────────────────────────────────────────────────────────────────┐
│                         USER INTENT                              │
│                                                                  │
│  "I want a Research Workstation"                                  │
│  "I want a RAG Server"                                            │
│  "I want a GPU Inference Cluster"                                │
│  "I want a Coding Assistant"                                      │
└────────────────────────┬─────────────────────────────────────────┘
                         │
                         ▼
┌──────────────────────────────────────────────────────────────────┐
│                      TEMPLATE (Recipe)                             │
│                     Desired Architecture                          │
│                                                                  │
│  Research Workstation  │  RAG Appliance  │  Inference Node       │
└────────────────────────┬─────────────────────────────────────────┘
                         │
                         ▼
┌──────────────────────────────────────────────────────────────────┐
│                       RESOLVER                                    │
│                    Infrastructure Planning                        │
│                                                                  │
│  • Detect hardware         • Detect OS                            │
│  • Detect installed sw     • Detect conflicts                     │
│  • Expand dependencies     • Select implementations                │
│  • Produce execution plan                                           │
└────────────────────────┬─────────────────────────────────────────┘
                         │
                         ▼
┌──────────────────────────────────────────────────────────────────┐
│                       REGISTRY                                    │
│                   Individual Components                            │
│                                                                  │
│  Every tool knows: Install · Update · Verify · Launch             │
│                   Health · Configure · Container · Export         │
└────────────────────────┬─────────────────────────────────────────┘
                         │
                         ▼
┌──────────────────────────────────────────────────────────────────┐
│                        RUNTIME                                    │
│                                                                  │
│  Native  ·  Podman  ·  Docker  ·  LXC  ·  Cluster  ·  Remote     │
└──────────────────────────────────────────────────────────────────┘

The Resolver is the brain. It is the only component that translates between the declarative world of templates and the imperative world of package managers, container runtimes, and service launchers. No other component performs this translation. This constraint ensures that adding a new runtime target (say, Kubernetes) requires changes only in the Registry (new tool entries) and Runtime (new executor), never in templates or pipelines.

4. Stack Recipes (Templates as Intent)

4.1 What a Template Is

A template is infrastructure intent, not an install script. It declares what the operator wants the machine to become. It does not duplicate install logic, configuration logic, or launch logic — the Registry already owns all of that.

The current template format is a flat list of tool IDs. This is functional but insufficient for the capability architecture. The evolved format — the Stack Recipe — declares capabilities, roles, connections, and startup semantics:

# Stack Recipe — evolved template format (v4.0 target)
stack:
  name: Claude Memory Assistant
  version: "1.0"
  maturity: official          # official | community | local | frozen

capabilities:
  required:
    - inference               # needs an LLM engine
    - vector_database         # needs embedding storage
    - relational_database     # needs structured storage
    - web_interface           # needs a browser-accessible UI
  optional:
    - monitoring
    - automation

components:
  inference:
    engine: ollama
    model: llama3
  memory:
    vectordb: qdrant
    embedding_model: nomic-embed-text
  database:
    engine: postgres
  ui:
    provider: open_webui

connections:
  - from: inference
    to: vector_database
    protocol: embedding
  - from: inference
    to: relational_database
    protocol: session_store
  - from: ui
    to: inference
    protocol: openai_compat

startup:
  order:
    - relational_database
    - vector_database
    - inference
    - ui
  health_wait:
    - relational_database    # UI waits until DB is accepting connections
    - vector_database
    - inference

health:
  checks:
    - capability: inference
      probe: GET /api/tags
    - capability: vector_database
      probe: GET /collections
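
One consistency check that a recipe like the one above makes possible is sketched here. It assumes, as an illustration rather than a documented AI-LSC rule, that the target of a connection must start before its source. The function name `check_startup_order` is hypothetical:

```python
def check_startup_order(order, connections):
    """Return (source, target, reason) for every connection that the
    startup order cannot satisfy."""
    position = {name: index for index, name in enumerate(order)}
    problems = []
    for conn in connections:
        source, target = conn["from"], conn["to"]
        if source not in position or target not in position:
            problems.append((source, target, "missing from startup.order"))
        elif position[target] > position[source]:
            problems.append((source, target, "target starts after source"))
    return problems


# Values copied from the recipe above.
order = ["relational_database", "vector_database", "inference", "ui"]
connections = [
    {"from": "inference", "to": "vector_database", "protocol": "embedding"},
    {"from": "inference", "to": "relational_database", "protocol": "session_store"},
    {"from": "ui", "to": "inference", "protocol": "openai_compat"},
]

print(check_startup_order(order, connections))  # []
```

An empty list means every dependency is already running when the component that needs it starts.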

4.2 What a Template Is Not

A template does not contain:

Installation commands (the Registry knows how to install)
File paths (the Resolver knows the layout)
Port assignments (conflict detection is automatic)
OS-specific logic (the Resolver handles this)
Dependency installation order beyond what startup.order declares

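A template also does not hardcode implementations. It specifies roles, and any provider named in a recipe is a preference hint that the Resolver may override (Section 5):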

components:
  vector_database:
    role: vector_store        # NOT "qdrant"

The Resolver maps vector_store to whatever provider is installed or available. On one machine that is Qdrant. On another it is Milvus. On a third the Resolver recommends Chroma. The template never changes.

4.3 Template Maturity

Templates have a maturity level that signals trust and intent:

Level	Meaning	Use Case
Official	Maintained by the AI-LSC project	Curated reference stacks
Community	Shared by users, reviewed	Experimentation, collaboration
Local	Created by the operator	Personal workflows, one-off stacks
Frozen	Exact snapshot of a validated environment	Reproducibility, CI/CD, audit

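A Frozen template pins every version, every config hash, every capability signature. Deploying a Frozen template on a different machine is intended to reproduce the same environment: identical versions, configuration hashes, and capability signatures. Host-specific details such as hardware and drivers can still differ, so the equivalence applies to the pinned artifacts rather than to the whole machine. This is the mechanism for long-term reproducibility: not containerization alone, but declarative infrastructure with verified provenance.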
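
A minimal sketch of version and config pinning follows, assuming configurations can be represented as JSON-serializable dictionaries. The functions `config_hash`, `freeze`, and `drift`, and the sample version and path values, are illustrative assumptions:

```python
import hashlib
import json


def config_hash(config):
    """SHA-256 of a canonical JSON encoding, so key order does not matter."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def freeze(components):
    """Pin each component's version and configuration hash."""
    return {
        name: {"version": spec["version"], "config_hash": config_hash(spec["config"])}
        for name, spec in components.items()
    }


def drift(frozen, live_components):
    """Names of components whose version or config differs from the snapshot."""
    live = freeze(live_components)
    return sorted(name for name in frozen if live.get(name) != frozen[name])


# Illustrative values only.
validated = {"qdrant": {"version": "1.9.0", "config": {"port": 6333, "storage": "/data"}}}
snapshot = freeze(validated)

changed = {"qdrant": {"version": "1.9.0", "config": {"storage": "/data", "port": 6334}}}
print(drift(snapshot, validated))  # []
print(drift(snapshot, changed))    # ['qdrant']
```

Hashing a canonical encoding keeps the snapshot small while still detecting any change to a pinned configuration.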

5. Role-Based Resolution

What distinguishes AI-LSC from a typical "AI launcher" is that templates specify roles, not implementations.

A role is a capability category with multiple possible providers:

Role: Inference Engine
  Providers:  Ollama  ·  llama.cpp  ·  vLLM  ·  TensorRT-LLM  ·  LM Studio

Role: Vector Database
  Providers:  Qdrant  ·  Chroma  ·  Milvus  ·  Weaviate  ·  FAISS

Role: LLM Gateway
  Providers:  LiteLLM  ·  OpenRouter  ·  Local proxy

Role: Monitoring
  Providers:  Grafana + Prometheus  ·  Glances  ·  Netdata

Role: Agent Frontend
  Providers:  Open WebUI  ·  LibreChat  ·  AnythingLLM  ·  Continue

The Resolver performs role resolution in this order:

1. Already installed? Use what is present.
2. Compatible with hardware? Select the best fit (GPU → CUDA-aware provider).
3. Template preference? Honor explicit provider hints.
4. Fallback chain. Try each candidate in order.
5. Recommend. If nothing installs cleanly, report what is needed.
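
The resolution order above can be sketched as a single function. This is a simplification under stated assumptions: hardware fit is reduced to a hypothetical `needs_gpu` flag, the fallback step picks the first compatible candidate instead of attempting installs, and the flag values shown for each tool are illustrative:

```python
def resolve_role(candidates, installed, has_gpu, preferred=None):
    """Pick a provider for one role.

    candidates: list of {"name": str, "needs_gpu": bool}, in fallback order.
    Returns (provider_name_or_None, reason).
    """
    # 1. Already installed: use what is present.
    for candidate in candidates:
        if candidate["name"] in installed:
            return candidate["name"], "installed"
    # 2. Hardware: drop providers the machine cannot run.
    compatible = [c for c in candidates if has_gpu or not c["needs_gpu"]]
    # 3. Template preference: honor the hint if it survived step 2.
    for candidate in compatible:
        if candidate["name"] == preferred:
            return candidate["name"], "preferred"
    # 4. Fallback chain: first compatible candidate in declared order.
    if compatible:
        return compatible[0]["name"], "fallback"
    # 5. Recommend: nothing fits, so report instead of installing.
    return None, "no compatible provider"


inference = [
    {"name": "vllm", "needs_gpu": True},
    {"name": "ollama", "needs_gpu": False},
    {"name": "llama.cpp", "needs_gpu": False},
]
print(resolve_role(inference, installed=set(), has_gpu=False, preferred="vllm"))
# ('ollama', 'fallback')
print(resolve_role(inference, installed={"llama.cpp"}, has_gpu=True))
# ('llama.cpp', 'installed')
```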

This means a single template shared between two machines can resolve to completely different toolsets:

"Research Workstation" template

Laptop (CPU-only):
  → llama.cpp (CPU inference)
  → LiteLLM (gateway)
  → Chroma (lightweight vector store)
  → Open WebUI (interface)

Desktop (RTX 4090):
  → Ollama (CUDA inference)
  → vLLM (high-throughput serving)
  → Qdrant (production vector store)
  → LibreChat (multi-provider interface)

Same template. Different reality. The Resolver is what makes that work.

6. Component Connections

Installing tools side-by-side is not an architecture. Understanding how they interact is.

The Stack Recipe format includes a connections section that declares relationships between components. These are not just documentation — they are inputs to the Stack Doctor (Section 12) and the Resolver's validation engine.

A connection declaration:

connections:
  - from: ui            # Open WebUI
    to: inference        # Ollama
    protocol: openai_compat    # Expects OpenAI-compatible API
  - from: ui
    to: vector_database
    protocol: embedding  # Needs embedding endpoint

The Resolver uses connections to:

Validate that protocols are compatible (OpenAI-compat ↔ OpenAI-compat).
Detect likely misconfigurations (OLLAMA_HOST=localhost when UI is remote).
Generate connection-specific health checks.
Produce diagnostic suggestions when connections fail.

This is dependency injection for infrastructure. The template declares the graph. The Resolver validates the graph. The Runtime instantiates the graph.
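
A minimal sketch of the protocol-compatibility check, assuming each component advertises the protocols it serves. The `serves` table and the function name `validate_connections` are hypothetical:

```python
def validate_connections(connections, serves):
    """Return one error string per connection the graph cannot satisfy."""
    errors = []
    for conn in connections:
        source, target, protocol = conn["from"], conn["to"], conn["protocol"]
        if target not in serves:
            errors.append(f"{source} -> {target}: target is not a declared component")
        elif protocol not in serves[target]:
            errors.append(f"{source} -> {target}: {target} does not serve {protocol}")
    return errors


# Hypothetical table of protocols each component serves.
serves = {"inference": {"openai_compat"}, "vector_database": {"embedding"}}
connections = [
    {"from": "ui", "to": "inference", "protocol": "openai_compat"},
    {"from": "ui", "to": "vector_database", "protocol": "grpc"},
]
print(validate_connections(connections, serves))
# ['ui -> vector_database: vector_database does not serve grpc']
```

Because the graph is plain data, the same declaration can feed validation, health-check generation, and diagnostics.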

7. The 13-Layer Model

AI-LSC organizes all AI infrastructure into 13 layers. Each layer represents a category of capability. Tools register into one (sometimes two) layers. Templates reference layers instead of individual tools when expressing broad requirements.

Layer 1   Host Platform          — OS, kernel, filesystem, base packages
Layer 2   Development Env        — Python, Rust, Node.js, Go, build tools
Layer 3   GPU Runtime             — CUDA, cuDNN, ROCm, Vulkan compute
Layer 4   Inference Engines      — Ollama, llama.cpp, vLLM, TensorRT-LLM
Layer 5   Distributed Runtime     — Ray, Kubeflow, cluster schedulers
Layer 6   AI Endpoints            — LiteLLM, model routers, API gateways
Layer 7   Data & Knowledge       — PostgreSQL, MariaDB, data pipelines
Layer 8   Knowledge Management   — Qdrant, Chroma, Milvus, vector stores
Layer 9   Automation & Execution   — n8n, Airflow, task schedulers
Layer 10  Observability          — Prometheus, Grafana, Glances, logging
Layer 11  Intelligent Routing     — Fabric, Hermes, agent dispatchers
Layer 12  User Interfaces         — Open WebUI, LibreChat, AnythingLLM
Layer 13  Containers             — Podman, Docker, LXC, export targets

A template can express requirements by layer:

capabilities:
  layers:
    - Inference Engines       # Layer 4
    - AI Endpoints            # Layer 6
    - Knowledge Management   # Layer 8
    - User Interfaces         # Layer 12

The Resolver fills in everything else. If the template needs inference (Layer 4) and the host has no GPU (Layer 3), the Resolver knows to recommend CPU-only providers and skip CUDA-dependent tools automatically.

Stress Test

The 13-layer model must accommodate any AI project without forcing it. A non-exhaustive validation set:

Project	Natural Layer Fit
Open WebUI	12 (User Interfaces)
LiteLLM	6 (AI Endpoints)
Qdrant	8 (Knowledge Management)
Ollama	4 (Inference Engines)
vLLM	4 (Inference Engines)
ComfyUI	12 (User Interfaces)
Flowise	12 (User Interfaces)
n8n	9 (Automation & Execution)
Prometheus	10 (Observability)
Ray	5 (Distributed Runtime)
Langflow	12 (User Interfaces)
Chroma	8 (Knowledge Management)
Milvus	8 (Knowledge Management)
llama.cpp	4 (Inference Engines)
TensorRT-LLM	4 (Inference Engines)
OpenHands	12 (User Interfaces)
Aider	2 (Development Env)
Continue	2 (Development Env)
Kubeflow	5 (Distributed Runtime)
Kafka	7 (Data & Knowledge)

Every project in the validation set has a natural primary layer, and none requires special casing. On this sample of 20 projects, the model appears to generalize well.

8. Skills as Derived Capabilities

Skills are not file lookups. They are capability queries.

The old model: "Does this Python file exist in the skills directory?" The new model: "Does this machine currently possess this capability?"

Skills derive from deployed, validated infrastructure:

Template: Research Workstation
    │
    ▼  Deployed
    │
    ▼  Verified
    │
    ▼  Registered as Capabilities
    │
    ▼  Skills become available:
    │
    ├── "Local RAG"        (has: inference + vector_database + web_interface)
    ├── "Python AI"        (has: development + inference)
    ├── "Vision"           (has: inference + multimodal_model)
    ├── "Speech"           (has: inference + whisper + tts)
    └── "Distributed Inference"  (has: inference + distributed_runtime)

A skill definition references capabilities, not tools:

skill:
  name: Local RAG
  requires:
    capabilities: [inference, vector_database, web_interface]
  optional:
    capabilities: [monitoring, relational_database]
  description: >
    End-to-end retrieval-augmented generation using local models.
    Available when the machine has an inference engine, a vector store,
    and a web interface — regardless of which specific tools provide them.

This means installing a new tool that provides an existing capability can silently unlock skills the operator never explicitly configured. Replace Qdrant with Milvus and every RAG skill still works, because the capability did not change — only the provider did.
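
The capability query behind a skill can be sketched as a subset test. The `SKILLS` table reuses two skills named above; the function name `available_skills` is an assumption:

```python
# Required capabilities per skill, taken from the examples above.
SKILLS = {
    "Local RAG": {"inference", "vector_database", "web_interface"},
    "Python AI": {"development", "inference"},
}


def available_skills(healthy_capabilities):
    """A skill is available when all of its required capabilities are healthy."""
    healthy = set(healthy_capabilities)
    return sorted(name for name, required in SKILLS.items() if required <= healthy)


print(available_skills({"inference", "development"}))
# ['Python AI']
print(available_skills({"inference", "development", "vector_database", "web_interface"}))
# ['Local RAG', 'Python AI']
```

No tool name appears anywhere in the check, which is why a provider swap cannot disable a skill.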

9. Pipelines Consume Capabilities

A pipeline is a directed graph of capability requirements. It never names a tool. It names what it needs:

Pipeline: Document RAG

  [Source] → [Chunking] → [Embedding] → [Vector Store] → [Retriever] → [LLM] → [Output]

Each node is a capability. The Resolver maps each node to a tool at runtime:

Embedding:
  → nomic-embed-text (via Ollama)
  or
  → bge-small (via llama.cpp)

Vector Store:
  → Qdrant
  or
  → Chroma

LLM:
  → Ollama (llama3)
  or
  → vLLM (deepseek-coder-33b)

The pipeline graph never changes when implementations change. This is what makes pipelines portable across machines, containers, and clusters.
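
A minimal sketch of this separation uses the standard-library `graphlib` module (Python 3.9+). The graph mirrors the Document RAG pipeline above; the `plan` function and the `"internal"` placeholder for nodes that the text does not map to a provider are assumptions:

```python
from graphlib import TopologicalSorter

# Each node maps to the set of nodes it depends on.
DOCUMENT_RAG = {
    "chunking": {"source"},
    "embedding": {"chunking"},
    "vector_store": {"embedding"},
    "retriever": {"vector_store"},
    "llm": {"retriever"},
    "output": {"llm"},
}


def plan(pipeline, bindings):
    """Order the nodes and attach whichever provider this machine resolved."""
    ordered = TopologicalSorter(pipeline).static_order()
    return [(node, bindings.get(node, "internal")) for node in ordered]


machine_a = {"embedding": "nomic-embed-text (via Ollama)",
             "vector_store": "Qdrant", "llm": "Ollama (llama3)"}
machine_b = {"embedding": "bge-small (via llama.cpp)",
             "vector_store": "Chroma", "llm": "vLLM (deepseek-coder-33b)"}

for node, provider in plan(DOCUMENT_RAG, machine_a):
    print(f"{node:13} {provider}")
```

Calling `plan` with `machine_b` yields the same node order with different providers, so the graph itself stays identical across machines.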

10. Container Export as Capability Export

A container image is not a bag of tools. It is a frozen capability set.

When an operator exports a Research Workstation to Podman, the exported image carries a capability manifest alongside the filesystem layers:

Research_Workstation_v1.0

  Capabilities:
    ✓ Inference (Ollama, llama3)
    ✓ GPU Compute (CUDA 12.4, cuDNN 9.1)
    ✓ Vector Database (Qdrant)
    ✓ LLM Gateway (LiteLLM)
    ✓ Web Interface (Open WebUI)
    ✓ Monitoring (Prometheus + Grafana)
    ✓ Relational Database (PostgreSQL)

  Stack Recipe: embedded (frozen)
  Template: Research Workstation v1.0
  Exported: 2026-06-28
  Architecture: x86_64

When another machine imports this image, AI-LSC reads the manifest and immediately knows what the container provides, without scanning or probing the image. The capabilities were verified at export time and are declared in the manifest.

Export targets are format-agnostic:

Recipe → Resolver → Generate Deployment
                    ├── Podman Quadlet
                    ├── Docker Compose
                    ├── LXC Config
                    └── Kubernetes YAML (future)

The recipe never changes. Only the exporter changes.

11. Dashboards Report Capability Health

The dashboard does not display process status. It displays infrastructure health.

┌──────────────────────────────────────────────────────┐
│  Research Workstation                     ████████ 92%│
│                                                       │
│  Host Platform          ✓                             │
│  Development Env        ✓                             │
│  GPU Runtime            ⚠ CUDA Update Available       │
│  Inference Engines      ✓  Ollama · llama3            │
│  AI Endpoints           ✓  LiteLLM :4000               │
│  Data & Knowledge       ✓  PostgreSQL :5432            │
│  Knowledge Management   ✓  Qdrant :6333                │
│  Automation             —                             │
│  Observability          ✓  Grafana · Prometheus         │
│  Intelligent Routing    ✓  Fabric                      │
│  User Interfaces        ✓  Open WebUI :8080            │
│  Containers             2 specialist images            │
│                                                       │
│  Templates: 7 installed    Skills: 12 available        │
└──────────────────────────────────────────────────────┘

Each row is a capability, not a tool. The status reflects whether the machine possesses that capability in a healthy state, regardless of which tool provides it. If the operator swaps Grafana for Netdata, the Observability row still shows the same status — because the capability did not change.
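
A minimal sketch of how probe results could collapse into capability rows, assuming a capability counts as healthy when at least one of its providers is healthy. That rule and the name `capability_rows` are assumptions:

```python
def capability_rows(probes):
    """probes: iterable of (capability, provider, healthy).

    Returns {capability: healthy}; provider names are dropped on purpose.
    """
    rows = {}
    for capability, _provider, healthy in probes:
        rows[capability] = rows.get(capability, False) or healthy
    return rows


before = [("observability", "grafana", True), ("inference", "ollama", True)]
after = [("observability", "netdata", True), ("inference", "ollama", True)]

print(capability_rows(before))
# {'observability': True, 'inference': True}
print(capability_rows(before) == capability_rows(after))
# True
```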

12. Stack Doctor

The Stack Doctor is a reasoning engine, not a log viewer. It understands relationships between components and can diagnose problems that span multiple tools.

Example diagnosis:

DIAGNOSIS: Open WebUI cannot reach Ollama

REASON: OLLAMA_HOST is set to 127.0.0.1, so Ollama accepts
        connections on the loopback interface only. Open WebUI
        connects to port 11434 from another address.
        Connection is refused.

RECOMMENDATION:
  Option A: Set OLLAMA_HOST=0.0.0.0 in the Ollama environment
  Option B: Run Open WebUI on the same host and point it at 127.0.0.1:11434
  Option C: Route through the LiteLLM proxy

Example conflict detection:

DIAGNOSIS: Port conflict detected

  LiteLLM wants port 4000  ✓ (available)
  vLLM wants port 8000      ✗ (occupied by TensorRT-LLM)

RECOMMENDATION:
  Move vLLM to port 8001
  or
  Disable TensorRT-LLM if not needed

The Stack Doctor uses the connection graph from the Stack Recipe to trace problems across component boundaries. It does not just check if a process is running — it checks if the capability chain is intact from end to end.
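
The port-conflict case can be sketched as a pure function over two tables. The input shapes and the name `find_port_conflicts` are assumptions, and the sketch does not handle two tools requesting the same free port:

```python
def find_port_conflicts(wanted, occupied):
    """wanted: {tool: port}; occupied: {port: current_owner}.

    Returns one finding per tool whose port belongs to another process,
    with the next port that is neither occupied nor requested.
    """
    findings = []
    for tool, port in sorted(wanted.items()):
        owner = occupied.get(port)
        if owner is not None and owner != tool:
            free = port + 1
            while free in occupied or free in wanted.values():
                free += 1
            findings.append({"tool": tool, "port": port,
                             "occupied_by": owner, "suggested_port": free})
    return findings


wanted = {"LiteLLM": 4000, "vLLM": 8000}
occupied = {8000: "TensorRT-LLM"}
print(find_port_conflicts(wanted, occupied))
# [{'tool': 'vLLM', 'port': 8000, 'occupied_by': 'TensorRT-LLM', 'suggested_port': 8001}]
```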

13. Operator Workflows

13.1 Missions

Complex deployments are presented as Missions, not wizards. A Mission is a named, scoped objective with a clear completion state:

┌──────────────────────────────────────────────────────┐
│  MISSION: Build Coding Assistant                      │
│                                                       │
│  Estimated effort: 8 minutes                           │
│  Status: Planning...                                   │
│                                                       │
│  [✓] Validate host platform                            │
│  [✓] Detect installed capabilities                     │
│  [→] Resolve missing dependencies                      │
│  [ ] Install Python (Layer 2)                          │
│  [ ] Install Ollama (Layer 4)                          │
│  [ ] Install LiteLLM (Layer 6)                         │
│  [ ] Install Open WebUI (Layer 12)                     │
│  [ ] Configure connections                             │
│  [ ] Verify health                                     │
│  [ ] Export ready                                      │
└──────────────────────────────────────────────────────┘

13.2 Routines

Routines are reusable infrastructure actions, not application macros:

Routine	Actions
Morning Check	Verify all services, restart unhealthy, check updates, check GPU, check disk
Pre-Inference	GPU memory, temperature, ports, models, KV cache, endpoint ready
Before Export	Verify services, verify configs, clean logs, freeze versions, generate manifest
Before Commit	Lint, test, validate registry, validate templates, schema check

One button. Comprehensive validation.

13.3 Next Best Action

AI-LSC suggests the operator's next step based on current state:

Good morning.
  ✓ GPU healthy
  ✓ Ollama healthy
  ⚠ Open WebUI update available (v0.3.12 → v0.3.14)
  ⚠ Research Workstation template has 1 missing dependency

Suggested: Verify Research Workstation

This is not AI. It is deterministic inference over the capability graph. The system knows what is installed, what is healthy, what is outdated, and what templates require. The recommendation follows directly.
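
A minimal sketch of such a deterministic rule chain follows. The priority order (unhealthy services, then missing template dependencies, then updates) and the shape of the `state` dictionary are assumptions chosen to reproduce the example above:

```python
def next_best_action(state):
    """Return the first applicable suggestion; rules are checked in a fixed order."""
    if state["unhealthy"]:
        return f"Restart {state['unhealthy'][0]}"
    for template, missing in state["missing_dependencies"].items():
        if missing > 0:
            return f"Verify {template}"
    if state["updates"]:
        return f"Update {state['updates'][0]}"
    return "Nothing to do"


state = {
    "unhealthy": [],
    "missing_dependencies": {"Research Workstation": 1},
    "updates": ["Open WebUI"],
}
print(next_best_action(state))  # Verify Research Workstation
```

The same state always produces the same suggestion, which keeps the recommendation auditable.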

13.4 Activity Timeline

Every infrastructure action is recorded with a timestamp:

09:13  Installed LiteLLM
09:15  Verified CUDA (driver 550.54, CUDA 12.4)
09:16  Generated template: Research Workstation
09:20  Exported Podman image: research_ws_v1.0
09:27  Health check passed (13/13 capabilities)

Timelines are queryable, filterable, and exportable. They provide an audit trail and operational memory.
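
A minimal sketch of a queryable timeline, using the entries above. The `Event` fields, the action labels, and the `query` function are illustrative assumptions:

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class Event:
    at: time
    action: str
    detail: str


TIMELINE = [
    Event(time(9, 13), "install", "LiteLLM"),
    Event(time(9, 15), "verify", "CUDA (driver 550.54, CUDA 12.4)"),
    Event(time(9, 16), "generate", "template: Research Workstation"),
    Event(time(9, 20), "export", "Podman image: research_ws_v1.0"),
    Event(time(9, 27), "health", "13/13 capabilities"),
]


def query(events, action=None, since=None):
    """Filter by action label and/or earliest timestamp."""
    return [
        event for event in events
        if (action is None or event.action == action)
        and (since is None or event.at >= since)
    ]


for event in query(TIMELINE, since=time(9, 20)):
    print(event.at.strftime("%H:%M"), event.action, event.detail)
# 09:20 export Podman image: research_ws_v1.0
# 09:27 health 13/13 capabilities
```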

13.5 Workspaces

Workspaces group related infrastructure by purpose, not by tool:

Research   → inference + vector_db + ui + monitoring
Coding     → development + inference + endpoints + ui
RAG        → inference + vector_db + relational_db + ui
Cluster    → distributed + inference + monitoring + containers

Click a workspace. Everything related appears. One context for one purpose.

14. Adaptive Templates

A single template adapts to the host hardware, installed software, and available runtimes. The Resolver selects implementations from detected constraints first; template preferences apply only where the constraints allow.

"Research Workstation" on different hardware:

Laptop (CPU, 16GB RAM):
  → llama.cpp (quantized, CPU inference)
  → Chroma (in-process vector store, minimal memory)
  → LiteLLM (lightweight gateway)
  → Glances (lightweight monitoring)
  → Open WebUI (browser interface)

Desktop (RTX 4090, 64GB RAM):
  → Ollama (CUDA-accelerated inference)
  → Qdrant (production vector store with GPU-accelerated HNSW)
  → LiteLLM (gateway) + vLLM (high-throughput serving)
  → Prometheus + Grafana (full monitoring stack)
  → LibreChat (multi-provider interface)

Server (Dual MI300X, 256GB RAM):
  → SGLang (ROCm-optimized inference)
  → Milvus (distributed vector store)
  → LiteLLM (cluster gateway)
  → Prometheus + Grafana + AlertManager (production monitoring)
  → Open WebUI (load-balanced)

Same template. Same intent. Different reality. The Resolver is what makes the template portable.

15. Rationale

Why Capability as the central abstraction?

Because tools are ephemeral. The AI landscape changes monthly. New inference engines appear. Old ones are abandoned. Monitoring stacks get replaced. Vector databases get acquired and deprecated.

But the capabilities those tools provide are remarkably stable. "The machine can run LLM inference" has been true since 2023 and will very likely still be true in 2030. The implementation changes. The capability does not.

Building around capabilities means AI-LSC's architecture decays at the rate of the AI industry's conceptual evolution, not its tool churn. Conceptual evolution is far slower.

Why not just use Terraform / Kubernetes?

Because those tools solve a different problem. Terraform manages cloud infrastructure declaratively. Kubernetes orchestrates containers at scale. Neither understands that "install Qdrant" implies "the machine now has vector database capability" — nor should they. That is AI-LSC's domain.

AI-LSC is specifically designed for the local AI operator who needs to assemble, validate, and reproduce AI stacks on single machines or small clusters. It fills the gap between "install scripts" and "cloud orchestration."

Why role-based resolution instead of tool-specific templates?

Because a template that hardcodes Qdrant cannot run on a machine that only has Milvus. A template that hardcodes Ollama cannot leverage an existing vLLM installation. Role-based resolution makes templates portable, shareable, and future-proof without requiring the template author to anticipate every possible provider.

16. Consequences

Positive

Tool swaps are zero-cost above the Registry. Replacing a provider requires only a new Registry entry with the same capability mapping. Templates, pipelines, skills, and dashboards are unaffected.
Templates are shareable across heterogeneous hardware. The same recipe produces appropriate deployments on laptops, desktops, and servers.
New capabilities can be added without modifying existing templates. Adding a "Speech-to-Text" capability does not require touching any Research Workstation template.
Container exports carry semantic meaning, not just filesystem state. Importing a container immediately reveals its capabilities.
Diagnostics can reason about relationships, not just individual process health.

Neutral

The Resolver is the most complex component. It must understand hardware detection, OS differences, dependency graphs, conflict resolution, and provider selection. This is acceptable because the Resolver is a single, well-bounded component.
The capability vocabulary must be curated. New capabilities require consensus on naming, boundaries, and provider criteria. This is a governance concern, not a technical one.

Risks

Over-abstraction. If the capability vocabulary is too coarse ("compute"), it loses discriminating power. If too fine ("qdrant-hnsw-gpu"), it reverts to tool-specific coupling. The granularity must be calibrated through real-world use.
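Resolver complexity. A naive Resolver that tries every combination of providers takes time exponential in the number of roles, and dependency resolution with conflicts is NP-hard in general. The Resolver must use heuristics, caching, and constraint propagation to remain fast.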
Capability drift. As the AI ecosystem evolves, capabilities may split or merge. "Inference" might split into "Text Inference" and "Multimodal Inference." The architecture must handle capability evolution without breaking existing templates.

17. Architecture Completeness

Current state of implementation (v3.0 Ankh of Jah):

Registry (tool metadata, 115 tools)           ███████████████████░ 95%
Templates (stack recipes, 4 templates)        ███████████░░░░░░░░░ 55%
Resolver (dependency expansion, planning)     ██████░░░░░░░░░░░░░░ 30%
Installer (native, git, npm, pip)             ███████████████████░ 95%
Verification (install checks, health probes)  █████████████████░░░ 85%
Health (service status, GPU monitoring)       █████████████░░░░░░░ 65%
Export (Podman, Docker, LXC configs)          ████████████████░░░░ 80%
Monitoring (Glances integration, Prometheus)  ██████████░░░░░░░░░░ 50%
Skills (capability-derived skills)            █████░░░░░░░░░░░░░░░ 25%
Pipelines (capability graph execution)        ████░░░░░░░░░░░░░░░░ 20%
Dashboards (capability health display)        ███████░░░░░░░░░░░░░ 35%
Stack Doctor (diagnostic reasoning)           ███░░░░░░░░░░░░░░░░░ 15%
Missions (guided deployment flows)            ██░░░░░░░░░░░░░░░░░░ 10%
Workspaces (purpose-based grouping)           █████░░░░░░░░░░░░░░░ 25%
Activity Timeline                             ████░░░░░░░░░░░░░░░░ 20%
Next Best Action                              ██░░░░░░░░░░░░░░░░░░ 10%
Documentation (this ADR, README, guides)      ███████████░░░░░░░░░ 55%
Tests                                         ████░░░░░░░░░░░░░░░░ 20%

The pattern is clear: the foundation (Registry, Installer, Verification) is strong. The intelligence layer (Resolver, Stack Doctor, Missions) is where the next investment goes. The UI layer (Dashboards, Workspaces, Timeline) follows the intelligence layer.

18. Feature Policy (Ankh of Jah Stabilization)

v3.0 enters a stabilization phase. Feature work slows; stabilization work takes priority.

Allowed

Bug fixes
Registry additions (new tool metadata, new providers)
New templates (stack recipes)
Installer verification and hardening
UI polish and usability improvements
Documentation
Tests
Capability vocabulary refinement
Resolver heuristic improvements

Not Allowed

New architectural concepts
New runtime systems
Major UI redesigns
New registry formats (schema changes)
Agent execution (deferred to v4.0)
Cluster orchestration (deferred to v4.0)
Remote node management (deferred to v4.0)

v4.0 Scope (Deferred)

The agentic execution layer, in which an LLM operates AI-LSC through function-calling, uses the agents/ bridge to start and stop services, pull models, inject skills, and diagnose issues through natural language. It is architecturally designed (the agents/ package exists, tool_bridge and ollama_tools are implemented, and the Redis pub/sub infrastructure is in place) but intentionally not activated in v3.0.

19. Project Philosophy

AI-LSC is a native-first, metadata-driven infrastructure manager for local AI systems. It treats AI software as reusable infrastructure rather than isolated applications, enabling reproducible deployments, validation, monitoring, and export of complete AI environments.

This single paragraph is the decision filter for every proposed feature. If a feature supports this philosophy — making AI infrastructure easier to deploy, validate, reproduce, and understand — it belongs. If it does not, it does not.

AI-LSC's biggest competitor is not another AI launcher. It is the manual process that most developers still follow: reading installation guides, cloning repositories, creating Python environments, debugging version conflicts, writing ad hoc shell scripts, and hoping they can recreate the setup six months later.

If AI-LSC can replace that with: select a template, review the execution plan, deploy, verify, export — then it has solved a real engineering problem.

20. The Architectural Vocabulary

These terms are stable. They will not change in v3.0 patches. They may evolve in v4.0, but only with explicit ADR amendment.

Term	Definition
Capability	A named, validated unit of infrastructure that a machine possesses or does not. The central abstraction.
Template / Stack Recipe	A declarative document expressing infrastructure intent. Specifies capabilities and roles, not tools.
Resolver	The planning engine that maps intent to execution. Detects hardware, resolves roles, expands dependencies, produces plans.
Registry	The knowledge base of individual tools. Each entry maps a tool to its capabilities, installers, launchers, health probes, and exporters.
Role	A capability category with multiple possible providers (e.g., "Vector Database" → Qdrant, Chroma, Milvus).
Skill	A capability-derived behavior. Available when all required capabilities are present and healthy.
Pipeline	A directed graph of capability requirements. Consumes capabilities; does not name tools.
Connection	A declared relationship between two components in a Stack Recipe. Used for validation and diagnostics.
Stack Doctor	A diagnostic reasoning engine that traces problems across component boundaries using the connection graph.
Mission	A named, scoped deployment objective with a clear completion state.
Routine	A reusable infrastructure action (health check, pre-flight, cleanup).
Workspace	A purpose-based grouping of related infrastructure.
Frozen	An exact snapshot of a validated environment, pinned at every version.
Layer	One of 13 categories of AI infrastructure. Tools register into layers. Templates can reference layers.
Runtime	The execution target: native, Podman, Docker, LXC, cluster, or remote.

Ankh of Jah marks the point where AI-LSC stopped being a Python application and became a platform architecture. Future releases build on this foundation. They do not revisit it.
