Last Updated: March 2026
Verified by Yamlr Safety Engine

🗺️ Yamlr Architectural Mental Model

"The Autonomous Manifest Control Plane"

This document outlines the core patterns and high-level logic that drive Yamlr's unique healing capabilities.


🏗️ 1. The Multi-Stage Pipeline

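Every file processed by Yamlr travels through a deterministic pipeline of five numbered stages, plus an intermediate graph-building step (3.5) between filtering and structural healing.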

| Stage | Name | Component | Responsibility |
| --- | --- | --- | --- |
| 1 | Lexing | lexer.py | Non-destructive text-to-shard translation. Preserves comments. |
| 2 | Scanning | scanner.py | Identity Archeology: Mining "Sovereign Keys" from corrupt YAML. |
| 3 | Filtering | auditor.py | Schema validation and OPA logic gating. |
| 3.5 | Graph Build | resource_graph.py | Build cross-resource dependency graph from identities. |
| 4 | Structural | healer.py | Applying "Indentation Physics" to reconstruct blocks. |
| 5 | Export | exporter.py | Final serialization with Semantic DNA verification. |
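
As a rough illustration of what "deterministic" means here, the sketch below runs placeholder stages in a fixed order. The `Context` class, `make_stage`, and `process` are hypothetical names invented for this example; they are not Yamlr's actual interfaces, and the stage bodies only record that they ran.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Context:
    """Hypothetical per-file state handed from stage to stage."""
    text: str
    trace: List[str] = field(default_factory=list)


Stage = Callable[[Context], Context]


def make_stage(label: str) -> Stage:
    """Build a placeholder stage that only records that it ran."""
    def run(ctx: Context) -> Context:
        ctx.trace.append(label)
        return ctx
    return run


# Stage order mirrors the pipeline table; the bodies are placeholders.
PIPELINE: List[Stage] = [make_stage(name) for name in (
    "1 lexing", "2 scanning", "3 filtering",
    "3.5 graph build", "4 structural", "5 export",
)]


def process(text: str) -> Context:
    """Run every stage in a fixed order: same input, same path."""
    ctx = Context(text)
    for stage in PIPELINE:
        ctx = stage(ctx)
    return ctx
```

Keeping the stage list as plain data makes the order easy to audit and to test in isolation.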

🧠 2. Core Design Patterns

A. Sovereign Key Pattern

In corrupt YAML, identity markers (like kind or apiVersion) are often indented incorrectly. Yamlr identifies the "shallowest" candidates for these keys and promotes them to the root identity, so a resource can still be identified even when a standard parser rejects the file with an indentation error.
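
The idea of promoting the shallowest candidate can be sketched as a line scan that never invokes a YAML parser. This is a minimal illustration under assumptions: `find_sovereign_keys`, `IDENTITY_KEYS`, and the regular expression are hypothetical and not taken from scanner.py, and ties are broken by first occurrence.

```python
import re
from typing import Dict, Tuple

IDENTITY_KEYS = ("apiVersion", "kind")  # the identity markers named in the text
# Optional "- " prefix, then a key, a colon, and the rest of the line.
KEY_LINE = re.compile(r"^(\s*)(?:-\s+)?([A-Za-z][\w.-]*):\s*(.*)$")


def find_sovereign_keys(text: str) -> Dict[str, str]:
    """Return the value of the shallowest occurrence of each identity key."""
    best: Dict[str, Tuple[int, str]] = {}
    for line in text.splitlines():
        match = KEY_LINE.match(line)
        if not match:
            continue
        indent = len(match.group(1))
        key, value = match.group(2), match.group(3).strip()
        if key not in IDENTITY_KEYS:
            continue
        # Keep the candidate with the smallest indentation; first one wins ties.
        if key not in best or indent < best[key][0]:
            best[key] = (indent, value)
    return {key: value for key, (_, value) in best.items()}


broken = (
    "   apiVersion: apps/v1\n"
    " kind: Deployment\n"
    "metadata:\n"
    "  name: web\n"
    "spec:\n"
    "      template:\n"
    "        kind: Pod\n"
)
identity = find_sovereign_keys(broken)
```

In this input the nested `kind: Pod` is ignored because a shallower `kind` exists, even though neither sits at column zero.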

B. Indentation Physics

Yamlr does not use a standard DOM-based parser for healing. Instead, it uses a Vertical Scope Stack. As shards stream through the engine, we track the "pressure" of indentation to determine logical parentage, much like how a human eye scans code.
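
A scope stack of this kind can be sketched in a few lines. The function below is a simplified stand-in, not the healer's code: `assign_parents` is a hypothetical name, it handles only `key: value` lines (no list items or multi-line scalars), and it treats any deeper indentation as a child regardless of the exact width.

```python
from typing import List, Optional, Tuple


def assign_parents(lines: List[str]) -> List[Tuple[str, Optional[str]]]:
    """Pair each key with its parent key using indentation alone."""
    stack: List[Tuple[int, str]] = []  # (indent, key) for each open scope
    result: List[Tuple[str, Optional[str]]] = []
    for raw in lines:
        stripped = raw.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments do not open a scope
        indent = len(raw) - len(raw.lstrip(" "))
        key = stripped.split(":", 1)[0]
        # Close every scope that sits at this depth or deeper.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        parent = stack[-1][1] if stack else None
        result.append((key, parent))
        stack.append((indent, key))
    return result


uneven = [
    "metadata:",
    "   name: web",
    "   labels:",
    "         app: web",
    "spec:",
    " replicas: 2",
]
parents = assign_parents(uneven)
```

Because only relative depth matters, the inconsistent indent widths in `uneven` still resolve to the intended nesting.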

C. Semantic DNA Integrity

To ensure "Yamlr Never Breaks Production," we calculate a hash of the manifest's semantic meaning before and after healing. If the healing logic accidentally changes a value or a nested structure (that wasn't part of a fix), the DNA hash will mismatch, and the write operation is blocked.
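
One way to picture the check is a hash over a canonical serialization of the parsed data, so that formatting and key order do not affect the result. The sketch below assumes the manifest is already available as plain Python data; `semantic_hash` and `safe_to_write` are hypothetical names, and the sketch does not model how intended fixes are excluded from the comparison.

```python
import hashlib
import json
from typing import Any


def semantic_hash(data: Any) -> str:
    """Hash parsed manifest data, ignoring key order and whitespace."""
    canonical = json.dumps(data, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def safe_to_write(before: Any, after: Any) -> bool:
    """Allow the write only when the semantic hash is unchanged."""
    return semantic_hash(before) == semantic_hash(after)


original = {"kind": "Deployment", "spec": {"replicas": 2}}
reordered = {"spec": {"replicas": 2}, "kind": "Deployment"}
mutated = {"kind": "Deployment", "spec": {"replicas": 3}}
```

Here `reordered` passes the check while `mutated` is blocked, because only the latter changes a value.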


🔗 3. Safety Engine Layer (2026-02)

A. ResourceGraph (core/resource_graph.py + core/graph_edges.py)

Models cross-resource dependencies as a directed graph with 17 edge types:

               ┌─────────────┐
               │  EdgeType   │
               ├─────────────┤
  Config ──────│ CONFIGMAP   │  Volume/Env bindings
  Config ──────│ SECRET      │  Volume/Env bindings
  Config ──────│ PVC         │  Storage claims
  Config ──────│ SA          │  ServiceAccount bindings
  Traffic ─────│ SVC_SELECT  │  Service → Pod selectors
  Traffic ─────│ INGRESS_BE  │  Ingress → Service backends
  Traffic ─────│ GATEWAY_RT  │  HTTPRoute → Service backends
  Owner ───────│ OWNER_REF   │  Deployment → ReplicaSet chains
  Scheduling ──│ TOLERATES   │  Pod ↔ Node tolerations
  Scheduling ──│ NODE_SELECT │  Pod → Node affinity
  Scheduling ──│ PRIORITY_CL │  Pod → PriorityClass
  Scheduling ──│ RUNTIME_CL  │  Pod → RuntimeClass
  Security ────│ NET_POLICY  │  NetworkPolicy → Pod targets
  Disruption ──│ PDB_TARGET  │  PDB → Pod targets
  CRD ─────────│ CRD_INST    │  Custom resource → CRD definition
  Init ────────│ INIT_CONT   │  Init container → shared volumes
  HPA ─────────│ HPA_TARGET  │  HPA → scalable targets
               └─────────────┘

Key methods: build(), dependents(), dependencies(), blast_radius(), critical_path().

Architecture: Edge-building logic is extracted to core/graph_edges.py (~270 lines) with an ALL_EDGE_BUILDERS registry. resource_graph.py (~380 lines) handles graph structure and traversal.
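
To make the graph queries concrete, here is a small sketch of a typed directed graph with a transitive "blast radius" lookup. `MiniGraph` and `add_edge` are hypothetical and far simpler than ResourceGraph; the method names `dependents()`, `dependencies()`, and `blast_radius()` are borrowed from the list above, but their real signatures and semantics are assumptions. An edge from A to B is read here as "A depends on B".

```python
from collections import defaultdict, deque
from enum import Enum
from typing import Dict, List, Set, Tuple


class EdgeType(Enum):
    CONFIGMAP = "configmap"    # two of the 17 edge types, for illustration
    SVC_SELECT = "svc_select"


class MiniGraph:
    """Directed graph where an edge src -> dst means src depends on dst."""

    def __init__(self) -> None:
        self._out: Dict[str, List[Tuple[str, EdgeType]]] = defaultdict(list)
        self._in: Dict[str, List[Tuple[str, EdgeType]]] = defaultdict(list)

    def add_edge(self, src: str, dst: str, kind: EdgeType) -> None:
        self._out[src].append((dst, kind))
        self._in[dst].append((src, kind))

    def dependencies(self, node: str) -> List[str]:
        return [dst for dst, _ in self._out[node]]

    def dependents(self, node: str) -> List[str]:
        return [src for src, _ in self._in[node]]

    def blast_radius(self, node: str) -> Set[str]:
        """Everything that depends on `node`, directly or transitively."""
        seen: Set[str] = set()
        queue = deque([node])
        while queue:  # breadth-first walk over incoming edges
            current = queue.popleft()
            for parent in self.dependents(current):
                if parent not in seen:
                    seen.add(parent)
                    queue.append(parent)
        seen.discard(node)  # a cycle must not report the node itself
        return seen


graph = MiniGraph()
graph.add_edge("Deployment/web", "ConfigMap/app-config", EdgeType.CONFIGMAP)
graph.add_edge("Service/web", "Deployment/web", EdgeType.SVC_SELECT)
```

Changing `ConfigMap/app-config` in this toy graph reaches both the Deployment and the Service in front of it, which is the kind of cascade a blast-radius query is meant to surface.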

B. Risk Classifier (analyzers/risk_classifier.py)

Connective Tissue Warning: The RiskClassifier is tightly coupled with CrossResourceAnalyzer (which feeds it findings) and ResourceGraph (which it queries to determine blast radius and cascading impact scores).

Heuristic-based risk assessment engine with 15 risk categories and 14 scoring rules; 11 of the categories are listed below.

| Category | Scope | Risk Level |
| --- | --- | --- |
| Traffic Routing Failure | Cluster | HIGH |
| Data Dependency Failure | Namespace | MEDIUM |
| API Deprecation Risk | Cluster | HIGH |
| Orphan Configuration Waste | Namespace | LOW |
| Security Policy Violation | Cluster | HIGH |
| Scheduling Configuration Risk | Namespace | MEDIUM |
| Network Isolation Risk | Cluster | HIGH |
| Disruption Tolerance Risk | Namespace | MEDIUM |
| Image Pull Failure Risk | Namespace | HIGH |
| CRD Dependency Risk | Namespace | MEDIUM |
| Replica Configuration Risk | Namespace | MEDIUM |

Scoring heuristics prioritize as follows: NetworkPolicy and image findings are always HIGH (security/availability), orphans are always LOW, and everything else is context-dependent.
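
The fixed and context-dependent cases can be sketched as a short decision function. Everything below is illustrative: `classify`, the `blast_radius` input, and the `threshold` default are assumptions chosen for the example, not the classifier's real scoring rules.

```python
from enum import Enum


class Risk(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Categories with a fixed level, taken from the heuristics described above.
ALWAYS_HIGH = {"Network Isolation Risk", "Image Pull Failure Risk"}
ALWAYS_LOW = {"Orphan Configuration Waste"}


def classify(category: str, blast_radius: int, threshold: int = 5) -> Risk:
    """Return a risk level; the threshold rule is a hypothetical example."""
    if category in ALWAYS_HIGH:
        return Risk.HIGH
    if category in ALWAYS_LOW:
        return Risk.LOW
    # Context-dependent case: escalate when many resources are affected.
    return Risk.HIGH if blast_radius >= threshold else Risk.MEDIUM
```

Checking the fixed categories first keeps the pinned levels from being overridden by context, however large the affected set is.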

C. Version Intelligence (core/deprecations.py)

Extended deprecation database covering Kubernetes 1.9 through 1.35, with:

  • API-level deprecations (kind/group → replacement)
  • Field-level deprecations (nested field paths)
  • check_field() for field-specific migration guidance
  • upgrade_risk_score() for version-aware risk quantification

🛑 4. Modularity Guardrails

Yamlr enforces a strict rule of fewer than 500 lines per module.

  • Infrastructure: Managed by YamlrManager (I/O, Catalog, Context).
  • Orchestration: Managed by YamlrEngine (Facade, Multi-processing).
  • Specialization: Logic is offloaded to mixins and sub-modules (e.g., scanner_identity.py, processor.py).
  • CLI Help: Decomposed into cli/help/help_scan.py, help_heal.py, help_toolbox.py, help_utils.py.
  • Graph Edges: Extracted to core/graph_edges.py from resource_graph.py.
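
A line-count rule like this is easy to check mechanically. The helper below is a sketch of one possible check, not a tool shipped with Yamlr; `oversized_modules` and `MAX_LINES` are hypothetical names, and it counts physical lines, including blanks and comments.

```python
from pathlib import Path
from typing import List, Tuple

MAX_LINES = 500  # modules must stay strictly below this count


def oversized_modules(root: Path, limit: int = MAX_LINES) -> List[Tuple[str, int]]:
    """List (relative path, line count) for every .py file at or over the limit."""
    offenders: List[Tuple[str, int]] = []
    for path in sorted(root.rglob("*.py")):
        with path.open(encoding="utf-8") as handle:
            count = sum(1 for _ in handle)
        if count >= limit:
            offenders.append((str(path.relative_to(root)), count))
    return offenders
```

Run against a source tree, for example `oversized_modules(Path("src"))`, an empty list means every module is within the limit.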

🧬 5. Cross-File Dependency Map

graph TD
    CLI[cli/main.py] --> Engine[core/engine/base.py]
    Engine --> Pipeline[core/pipeline.py]
    Engine --> Processor[core/engine/processor.py]
    Engine --> ResourceGraph[core/resource_graph.py]
    
    %% Pipeline dependencies
    Pipeline --> Lexer[core/lexer.py]
    Pipeline --> Scanner[core/scanner.py]
    Pipeline --> Auditor[core/auditor.py]
    Pipeline --> Healer[healers/healer.py]
    Pipeline --> Exporter[core/exporter.py]
    
    %% The "Surgical" Connective Tissue
    Processor --> CrossResource[analyzers/cross_resource.py]
    CrossResource --> RiskClassifier[analyzers/risk_classifier.py]
    RiskClassifier --> ResourceGraph
    ResourceGraph --> GraphEdges[core/graph_edges.py]
    
    %% Stage 10 Wiring
    Pipeline --> RiskStage[core/stages/risk.py]
    RiskStage --> RiskClassifier
    RiskStage --> ShieldEngine[pro/shield.py]
    
    %% Miscellaneous
    Auditor --> Deprecations[core/deprecations.py]
    Engine --> IO[core/io.py]
    Engine --> ConfigManager[core/config.py]
    CLI --> HelpFormatter[cli/help_formatter.py]
    HelpFormatter --> HelpModules[cli/help/*]

🚀 6. The "Magic Moment" Factory

Yamlr provides a zero-config entry point for developers:

from yamlr.core.engine import YamlrEngine
engine = YamlrEngine.quick(".")
results = engine.batch_heal(".")

This factory leverages Industrial-Grade Fallbacks, including an embedded minimal catalog, to ensure a functional experience even in "asset-stripped" environments.
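
The fallback idea can be illustrated with a loader that prefers an on-disk catalog and drops back to a built-in copy when the asset is missing or unreadable. This sketch does not reflect how `YamlrEngine.quick` is implemented: `load_catalog`, `EMBEDDED_CATALOG`, the JSON format, and the catalog contents are all assumptions.

```python
import copy
import json
from pathlib import Path
from typing import Any, Dict, Optional

# Hypothetical minimal catalog bundled with the code itself.
EMBEDDED_CATALOG: Dict[str, Any] = {"kinds": ["Deployment", "Service", "ConfigMap"]}


def load_catalog(path: Optional[Path]) -> Dict[str, Any]:
    """Load the catalog from disk, falling back to the embedded copy."""
    if path is not None:
        try:
            return json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            # Missing file, bad encoding, or invalid JSON: use the fallback.
            pass
    # Return a copy so callers cannot mutate the shared default.
    return copy.deepcopy(EMBEDDED_CATALOG)
```

Treating an unreadable asset the same as a missing one keeps the zero-config path working in stripped-down installs, at the cost of a smaller catalog.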