Yamlr Architectural Mental Model
"The Autonomous Manifest Control Plane"
This document outlines the core patterns and high-level logic that drive Yamlr's unique healing capabilities.
1. The Multi-Stage Pipeline
Every file processed by Yamlr travels through a deterministic pipeline of five numbered stages, plus an intermediate graph-build step (3.5) between filtering and structural healing.
| Stage | Name | Component | Responsibility |
|---|---|---|---|
| 1 | Lexing | lexer.py | Non-destructive text-to-shard translation. Preserves comments. |
| 2 | Scanning | scanner.py | Identity Archeology: Mining "Sovereign Keys" from corrupt YAML. |
| 3 | Filtering | auditor.py | Schema validation and OPA logic gating. |
| 3.5 | Graph Build | resource_graph.py | Build cross-resource dependency graph from identities. |
| 4 | Structural | healer.py | Applying "Indentation Physics" to reconstruct blocks. |
| 5 | Export | exporter.py | Final serialization with Semantic DNA verification. |
2. Core Design Patterns
A. Sovereign Key Pattern
In corrupt YAML, identity markers (such as kind or apiVersion) are often incorrectly indented. Yamlr identifies the "shallowest" candidates for these keys and promotes them to the root identity, so a resource can still be identified when a standard YAML parser rejects the file because of its indentation.
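The promotion rule above can be sketched in a few lines. This is a minimal illustration, assuming a line-based regex scan; `find_sovereign_keys` and `IDENTITY_KEYS` are hypothetical names and do not reflect Yamlr's actual scanner API.

```python
import re

# Hypothetical set of identity markers; the text above names kind and apiVersion.
IDENTITY_KEYS = ("kind", "apiVersion")

_KEY_RE = re.compile(
    r"^(?P<indent>[ \t]*)(?:-\s+)?(?P<key>[A-Za-z]+)\s*:\s*(?P<value>\S.*)?$"
)


def find_sovereign_keys(text):
    """Return the shallowest occurrence of each identity key in raw text."""
    best = {}  # key -> (indent depth, value)
    for line in text.splitlines():
        match = _KEY_RE.match(line)
        if not match or match["key"] not in IDENTITY_KEYS:
            continue
        value = (match["value"] or "").strip()
        if not value:
            continue  # a bare "kind:" carries no identity
        depth = len(match["indent"].expandtabs(2))
        key = match["key"]
        # Keep the candidate with the smallest indentation; ties keep the first seen.
        if key not in best or depth < best[key][0]:
            best[key] = (depth, value)
    return {key: value for key, (_, value) in best.items()}


# Both root keys are wrongly indented, and a nested "kind" appears deeper down.
corrupt = """\
   apiVersion: apps/v1
 kind: Deployment
metadata:
  name: web
spec:
  template:
      kind: Pod
"""
identity = find_sovereign_keys(corrupt)
```

Because the scan never builds a tree, it keeps working on input that a strict parser would refuse; the trade-off is that it can only answer the narrow question of identity.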
B. Indentation Physics
Yamlr does not use a standard DOM-based parser for healing. Instead, it uses a Vertical Scope Stack. As shards stream through the engine, we track the "pressure" of indentation to determine logical parentage, much like how a human eye scans code.
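A scope stack of this kind might look like the sketch below. It assumes simple `key: value` lines and space indentation; `assign_parents` is a hypothetical helper, not the healer's real interface.

```python
from typing import List, Optional, Tuple


def assign_parents(lines: List[str]) -> List[Tuple[str, Optional[str]]]:
    """Pair each key with its logical parent using a stack of open scopes."""
    stack: List[Tuple[int, str]] = []  # (indent, key) for every open scope
    result: List[Tuple[str, Optional[str]]] = []
    for raw in lines:
        if not raw.strip() or raw.lstrip().startswith("#"):
            continue  # blank lines and comments do not change scope
        indent = len(raw) - len(raw.lstrip(" "))
        key = raw.strip().split(":", 1)[0]
        # Any scope at the same depth or deeper has closed, so pop it.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        parent = stack[-1][1] if stack else None
        result.append((key, parent))
        stack.append((indent, key))
    return result


# "metadata" is over-indented, yet it still resolves to "template".
parents = assign_parents([
    "spec:",
    "  replicas: 2",
    "  template:",
    "       metadata:",
    "         labels:",
    "  strategy: {}",
])
```

Only relative depth matters here: a child is whatever sits deeper than the scope on top of the stack, which is why irregular indentation widths do not break parent resolution.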
C. Semantic DNA Integrity
To ensure "Yamlr Never Breaks Production," we calculate a hash of the manifest's semantic meaning before and after healing. If the healing logic accidentally changes a value or a nested structure (that wasn't part of a fix), the DNA hash will mismatch, and the write operation is blocked.
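One way to approximate such a check is to hash a canonical serialization of the parsed manifest. The sketch below assumes the manifest is already available as plain Python dicts and that SHA-256 over sorted-key JSON is an acceptable fingerprint; `semantic_dna` and `safe_to_write` are hypothetical names, and how Yamlr excludes intended fixes from the comparison is not covered here.

```python
import hashlib
import json


def semantic_dna(manifest: dict) -> str:
    """Hash the parsed manifest so that formatting and key order do not matter."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def safe_to_write(before: dict, after: dict) -> bool:
    """Allow the write only if the semantic fingerprint is unchanged."""
    return semantic_dna(before) == semantic_dna(after)


before = {"kind": "Deployment", "spec": {"replicas": 2}}
reordered = {"spec": {"replicas": 2}, "kind": "Deployment"}  # layout change only
changed = {"kind": "Deployment", "spec": {"replicas": 3}}    # value drift

layout_only_ok = safe_to_write(before, reordered)
drift_blocked = not safe_to_write(before, changed)
```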
3. Safety Engine Layer (2026-02)
A. ResourceGraph (core/resource_graph.py + core/graph_edges.py)
Models cross-resource dependencies as a directed graph with 17 edge types:
| Group | EdgeType | Relationship |
|---|---|---|
| Config | CONFIGMAP | Volume/Env bindings |
| Config | SECRET | Volume/Env bindings |
| Config | PVC | Storage claims |
| Config | SA | ServiceAccount bindings |
| Traffic | SVC_SELECT | Service → Pod selectors |
| Traffic | INGRESS_BE | Ingress → Service backends |
| Traffic | GATEWAY_RT | HTTPRoute → Service backends |
| Owner | OWNER_REF | Deployment → ReplicaSet chains |
| Scheduling | TOLERATES | Pod → Node tolerations |
| Scheduling | NODE_SELECT | Pod → Node affinity |
| Scheduling | PRIORITY_CL | Pod → PriorityClass |
| Scheduling | RUNTIME_CL | Pod → RuntimeClass |
| Security | NET_POLICY | NetworkPolicy → Pod targets |
| Disruption | PDB_TARGET | PDB → Pod targets |
| CRD | CRD_INST | Custom resource → CRD definition |
| Init | INIT_CONT | Init container → shared volumes |
| HPA | HPA_TARGET | HPA → scalable targets |
Key methods: build(), dependents(), dependencies(), blast_radius(), critical_path().
Architecture: Edge-building logic is extracted to core/graph_edges.py (~270 lines) with an ALL_EDGE_BUILDERS registry. resource_graph.py (~380 lines) handles graph structure and traversal.
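To make the traversal methods concrete, here is a small stand-in graph with a breadth-first blast_radius(). It assumes an edge from A to B means "A depends on B"; `TinyResourceGraph` and `add_edge` are hypothetical, and only the method names dependents() and blast_radius() and the edge type labels come from the text above.

```python
from collections import defaultdict, deque


class TinyResourceGraph:
    """Illustrative stand-in for a resource graph; not Yamlr's ResourceGraph."""

    def __init__(self):
        self._dependents = defaultdict(set)  # target -> sources that need it
        self._edge_types = {}                # (source, target) -> edge type

    def add_edge(self, source, target, edge_type):
        """Record that `source` depends on `target`."""
        self._dependents[target].add(source)
        self._edge_types[(source, target)] = edge_type

    def dependents(self, resource):
        """Resources that depend directly on `resource`."""
        return set(self._dependents.get(resource, ()))

    def blast_radius(self, resource):
        """Resources that depend on `resource` directly or transitively."""
        seen = set()
        queue = deque([resource])
        while queue:
            current = queue.popleft()
            for dependent in self._dependents.get(current, ()):
                if dependent not in seen and dependent != resource:
                    seen.add(dependent)
                    queue.append(dependent)
        return seen


graph = TinyResourceGraph()
graph.add_edge("Deployment/web", "ConfigMap/app-config", "CONFIGMAP")
graph.add_edge("Service/web", "Deployment/web", "SVC_SELECT")
graph.add_edge("Ingress/web", "Service/web", "INGRESS_BE")

impacted = graph.blast_radius("ConfigMap/app-config")
```

Storing the reverse adjacency (target to sources) keeps the "what breaks if this changes" query a plain breadth-first walk.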
B. Risk Classifier (analyzers/risk_classifier.py)
Connective Tissue Warning: The RiskClassifier is tightly coupled with CrossResourceAnalyzer (which feeds it findings) and ResourceGraph (which it queries to determine blast radius and cascading impact scores).
The classifier is a heuristic risk assessment engine with 15 risk categories and 14 scoring rules. The table below lists 11 of the categories.
| Category | Scope | Risk Level |
|---|---|---|
| Traffic Routing Failure | Cluster | HIGH |
| Data Dependency Failure | Namespace | MEDIUM |
| API Deprecation Risk | Cluster | HIGH |
| Orphan Configuration Waste | Namespace | LOW |
| Security Policy Violation | Cluster | HIGH |
| Scheduling Configuration Risk | Namespace | MEDIUM |
| Network Isolation Risk | Cluster | HIGH |
| Disruption Tolerance Risk | Namespace | MEDIUM |
| Image Pull Failure Risk | Namespace | HIGH |
| CRD Dependency Risk | Namespace | MEDIUM |
| Replica Configuration Risk | Namespace | MEDIUM |
Scoring heuristics prioritize: NetworkPolicy/Image → always HIGH (security/availability). Orphans → always LOW. Everything else → context-dependent.
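The three-way priority above can be expressed as a short function. This sketch assumes the context-dependent case is decided by blast radius with a made-up threshold of 3; the `classify` function, the threshold, and the mapping of "NetworkPolicy/Image" to the two category names are assumptions, not the real scoring rules.

```python
# Category names are taken from the table above; the grouping is an assumption.
ALWAYS_HIGH = {"Network Isolation Risk", "Image Pull Failure Risk"}
ALWAYS_LOW = {"Orphan Configuration Waste"}


def classify(category: str, blast_radius: int, threshold: int = 3) -> str:
    """Return a risk level using fixed overrides first, then context."""
    if category in ALWAYS_HIGH:
        return "HIGH"
    if category in ALWAYS_LOW:
        return "LOW"
    # Hypothetical context rule: wide impact escalates an otherwise MEDIUM finding.
    return "HIGH" if blast_radius >= threshold else "MEDIUM"


levels = [
    classify("Image Pull Failure Risk", blast_radius=0),
    classify("Orphan Configuration Waste", blast_radius=50),
    classify("Data Dependency Failure", blast_radius=1),
    classify("Data Dependency Failure", blast_radius=5),
]
```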
C. Version Intelligence (core/deprecations.py)
Extended deprecation database covering K8s 1.9 → 1.35 with:
- API-level deprecations (kind/group → replacement)
- Field-level deprecations (nested field paths)
- check_field() for field-specific migration guidance
- upgrade_risk_score() for version-aware risk quantification
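An API-level lookup of this kind could be sketched as follows. The three table entries are well-known Kubernetes API removals; the data layout, the `check_api` helper, and the signature and scoring formula of `upgrade_risk_score` are assumptions for illustration only.

```python
# (apiVersion, kind) -> (first release that no longer serves it, replacement)
API_REMOVALS = {
    ("extensions/v1beta1", "Deployment"): ((1, 16), "apps/v1"),
    ("extensions/v1beta1", "Ingress"): ((1, 22), "networking.k8s.io/v1"),
    ("policy/v1beta1", "PodDisruptionBudget"): ((1, 25), "policy/v1"),
}


def parse_version(text):
    """Turn '1.22' or 'v1.22' into (1, 22) so versions compare numerically."""
    major, minor = text.lstrip("v").split(".")[:2]
    return int(major), int(minor)


def check_api(api_version, kind, target):
    """Return migration guidance if the API is not served in `target`, else None."""
    entry = API_REMOVALS.get((api_version, kind))
    if entry is None:
        return None
    removed_in, replacement = entry
    if parse_version(target) >= removed_in:
        return f"{kind} {api_version} is not served in {target}; use {replacement}"
    return None


def upgrade_risk_score(manifests, target):
    """Hypothetical score: fraction of manifests that break on `target`."""
    if not manifests:
        return 0.0
    broken = sum(1 for api, kind in manifests if check_api(api, kind, target))
    return broken / len(manifests)


score = upgrade_risk_score(
    [("extensions/v1beta1", "Ingress"), ("apps/v1", "Deployment")],
    target="1.22",
)
```

Versions are compared as integer tuples rather than strings, since a string comparison would rank "1.9" above "1.16".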
4. Modularity Guardrails
Yamlr enforces a strict rule of fewer than 500 lines per module.
- Infrastructure: Managed by YamlrManager (I/O, Catalog, Context).
- Orchestration: Managed by YamlrEngine (Facade, Multi-processing).
- Specialization: Logic is offloaded to mixins and sub-modules (e.g., scanner_identity.py, processor.py).
- CLI Help: Decomposed into cli/help/help_scan.py, help_heal.py, help_toolbox.py, help_utils.py.
- Graph Edges: Extracted to core/graph_edges.py from resource_graph.py.
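A line-count guardrail like this is easy to check mechanically. The sketch below is one possible check, assuming every physical line counts toward the limit; `oversized_modules` is a hypothetical helper and is not described in the source.

```python
from pathlib import Path

MAX_LINES = 500  # modules must stay below this many lines


def count_lines(path: Path) -> int:
    """Count physical lines, including blanks and comments."""
    with path.open(encoding="utf-8") as handle:
        return sum(1 for _ in handle)


def oversized_modules(root, limit: int = MAX_LINES):
    """Return (path, line_count) for every .py file at or above the limit."""
    offenders = []
    for path in sorted(Path(root).rglob("*.py")):
        lines = count_lines(path)
        if lines >= limit:
            offenders.append((str(path), lines))
    return offenders
```

Run in CI, a non-empty result from `oversized_modules(".")` could fail the build and prompt a split like the ones listed above.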
5. Cross-File Dependency Map
graph TD
CLI[cli/main.py] --> Engine[core/engine/base.py]
Engine --> Pipeline[core/pipeline.py]
Engine --> Processor[core/engine/processor.py]
Engine --> ResourceGraph[core/resource_graph.py]
%% Pipeline dependencies
Pipeline --> Lexer[core/lexer.py]
Pipeline --> Scanner[core/scanner.py]
Pipeline --> Auditor[core/auditor.py]
Pipeline --> Healer[healers/healer.py]
Pipeline --> Exporter[core/exporter.py]
%% The "Surgical" Connective Tissue
Processor --> CrossResource[analyzers/cross_resource.py]
CrossResource --> RiskClassifier[analyzers/risk_classifier.py]
RiskClassifier --> ResourceGraph
ResourceGraph --> GraphEdges[core/graph_edges.py]
%% Stage 10 Wiring
Pipeline --> RiskStage[core/stages/risk.py]
RiskStage --> RiskClassifier
RiskStage --> ShieldEngine[pro/shield.py]
%% Miscellaneous
Auditor --> Deprecations[core/deprecations.py]
Engine --> IO[core/io.py]
Engine --> ConfigManager[core/config.py]
CLI --> HelpFormatter[cli/help_formatter.py]
HelpFormatter --> HelpModules[cli/help/*]
6. The "Magic Moment" Factory
Yamlr provides a zero-config entry point for developers:
from yamlr.core.engine import YamlrEngine
engine = YamlrEngine.quick(".")
results = engine.batch_heal(".")
This factory leverages Industrial-Grade Fallbacks, including an embedded minimal catalog, to ensure a functional experience even in "asset-stripped" environments.