Why this page matters
This page explains how Validation Flow fits into the wider ZeroKernel execution model, what problem it is meant to solve, and what trade-off you are actually accepting when you use it in production firmware. The goal is not to treat Validation Flow as an isolated API call, but to understand where it sits inside bounded scheduling, queue discipline, fault visibility, and profile selection.
Read this topic as an operational contract. Start from the smallest working path, wire it into a lean profile first, and only expand into richer routing, diagnostics, or transport state after you can prove that the timing outcome is still worth the extra flash and RAM. That mindset is what keeps ZeroKernel useful on small boards instead of turning it into another bloated abstraction.
The safest pattern is always the same: define the runtime boundary, keep the hot path short, measure the effect with compare scripts, and only then scale complexity. The examples below are not filler; they show the smallest repeatable patterns you can lift into real firmware when you need clean integration instead of ad-hoc loops.
Three practical patterns
Pattern 1: full desktop gate. Use this when you need a credible regression pass before publishing numbers or changing docs.

```shell
bash scripts/run_desktop_tests.sh
bash scripts/run_desktop_benchmark.sh --enforce-performance
bash scripts/run_resource_matrix.sh --enforce-budget
```
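The three gates above only give a credible pass when they run in order and stop at the first failure. A minimal host-side sketch of that discipline (the `run_gates` helper is illustrative, not part of the ZeroKernel tooling; the script paths mirror the commands above):

```python
import subprocess

def run_gates(gates):
    """Run validation gate commands in order; stop at the first failure.

    Each gate is an argv list. Returns (passed, failed_gate), where
    failed_gate is None when every gate succeeded.
    """
    for gate in gates:
        if subprocess.run(gate).returncode != 0:
            return False, gate
    return True, None

# The same three gates as above, in the documented order:
DESKTOP_GATES = [
    ["bash", "scripts/run_desktop_tests.sh"],
    ["bash", "scripts/run_desktop_benchmark.sh", "--enforce-performance"],
    ["bash", "scripts/run_resource_matrix.sh", "--enforce-budget"],
]
# passed, failed = run_gates(DESKTOP_GATES)
```

Stopping at the first failure keeps later gates from producing numbers on top of a build that already regressed.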
Pattern 2: hardware compare on the target board. Run a focused hardware compare instead of guessing whether a change helped or hurt.

```shell
bash scripts/run_esp32_modules_compare.sh /dev/ttyUSB1
bash scripts/run_esp32_real_project_demo.sh /dev/ttyUSB1
```
Pattern 3: profile lock. Lock the build into the intended profile before treating a benchmark or compare as authoritative.

```
-DZEROKERNEL_PROFILE_LEAN_NET
-DZEROKERNEL_ENABLE_DIAGNOSTICS=0
-DZEROKERNEL_ENABLE_LEGACY_LABEL_API=0
```
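One cheap way to enforce the profile lock is to assert that the build command line actually carries the lean-profile defines before accepting any numbers. A host-side sketch (the flag names come from the block above; the `missing_lean_flags` checker and the sample `cc` command line are illustrative):

```python
REQUIRED_LEAN_FLAGS = (
    "-DZEROKERNEL_PROFILE_LEAN_NET",
    "-DZEROKERNEL_ENABLE_DIAGNOSTICS=0",
    "-DZEROKERNEL_ENABLE_LEGACY_LABEL_API=0",
)

def missing_lean_flags(build_command: str):
    """Return the lean-profile flags absent from a build command line."""
    tokens = build_command.split()
    return [flag for flag in REQUIRED_LEAN_FLAGS if flag not in tokens]

# Hypothetical build line missing the legacy-API lockout:
cmd = "cc -Os -DZEROKERNEL_PROFILE_LEAN_NET -DZEROKERNEL_ENABLE_DIAGNOSTICS=0 main.c"
# missing_lean_flags(cmd) -> ["-DZEROKERNEL_ENABLE_LEGACY_LABEL_API=0"]
```

Failing fast here prevents the classic mistake of benchmarking a diagnostics-enabled build and publishing it as a lean result.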
What to verify while you use it
- Validate timing before you validate aesthetics. A cleaner API is not a win if fast misses rise.
- Prefer the smallest profile that still matches the workload, then add optional modules only when the measured payoff is obvious.
- Keep callbacks and transport steps bounded so watchdog, panic flow, and queue limits remain meaningful.
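Keeping a step bounded means giving it an explicit per-tick budget so leftover work rolls over instead of stretching the tick. A language-neutral sketch in Python (the queue, handler, and budget names are illustrative, not ZeroKernel API):

```python
from collections import deque

def drain_bounded(queue: deque, handle, budget: int) -> int:
    """Process at most `budget` items per tick so the tick stays
    bounded and watchdog, panic flow, and queue limits stay meaningful.

    Returns the number of items actually processed; leftovers wait
    for the next tick instead of starving other tasks.
    """
    processed = 0
    while queue and processed < budget:
        handle(queue.popleft())
        processed += 1
    return processed

q = deque(range(10))
done = drain_bounded(q, handle=lambda item: None, budget=4)
# done == 4 and len(q) == 6: the remaining items roll over to the next tick
```

The unbounded alternative, `while queue: handle(...)`, is exactly the pattern that makes a watchdog deadline meaningless under burst load.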
Common mistakes that make results misleading
- Do not copy a demo pattern into production firmware without measuring it on the real board and real build profile you plan to ship.
- Do not read success counters without reading queue depth, timing, and workload label next to them.
- Do not enable heavier diagnostics and compatibility flags in a lean target just because the defaults looked convenient.
Recommended working sequence
1. Boot the runtime, register the minimum useful task set, and prove that the baseline timing is clean before adding optional layers.
2. Introduce routing, diagnostics, or transport one layer at a time so the cost and payoff remain obvious.
3. Update docs, charts, or public claims only after the same workload survives the same validation path more than once.
Why the validation flow is strict
ZeroKernel claims deterministic timing, bounded queues, and measurable trade-offs. Those claims only remain credible if the validation path is repeatable. That is why the validation flow is documented as a real process rather than as a loose suggestion. A result that cannot be reproduced with the same steps is not a reliable basis for docs, release notes, or product positioning.
The correct order is deliberate: start with local logic, confirm the footprint, then move to hardware. This keeps expensive board time focused on problems that only boards can reveal. It also prevents a misleading situation where a board appears stable but the desktop benchmark or compile-time budget already regressed in a way no one noticed because the wrong script was run first.
Another reason this matters is communication. A benchmark table is only useful if someone else can rerun the same flow later and get a result that is meaningfully comparable. The sequence below is the documented way to make those numbers defensible.
Recommended sequence
- Run desktop tests and benchmark gates.
- Run the resource matrix and confirm budgets still pass.
- Run one or more hardware compare scripts on the target board.
- Only update docs, benchmark tables, or marketing claims after the numbers are repeatable.
```shell
bash scripts/run_desktop_tests.sh
bash scripts/run_desktop_benchmark.sh --enforce-performance
bash scripts/run_resource_matrix.sh --enforce-budget
bash scripts/run_esp32_real_project_demo.sh /dev/ttyUSB1
```
What to log every time
- Fast misses and lag values for deterministic claims.
- Queue depth and success rate for transport claims.
- Build profile and board target for every published number.
The difference between a casual experiment and a useful engineering record is usually the log quality. If a result only says "it looked faster," it is almost useless later. If it records board, profile, timing, queue depth, and success rate, you can compare it against future changes and understand which trade-off moved.
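The fields above map naturally onto a fixed record shape. A sketch of such a record (field names are illustrative and track the bullets above, not any ZeroKernel log format):

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class ValidationRecord:
    board: str           # e.g. "esp32" -- every published number needs this
    profile: str         # build profile, e.g. "lean_net"
    workload: str        # explicit workload label, never implied
    fast_misses: int     # deterministic-timing claim
    worst_lag_us: int    # deterministic-timing claim
    queue_depth: int     # transport claim
    success_rate: float  # transport claim

# Hypothetical single run, fully labeled:
rec = ValidationRecord("esp32", "lean_net", "synthetic-burst",
                       fast_misses=0, worst_lag_us=180,
                       queue_depth=3, success_rate=0.998)
# asdict(rec) gives a plain dict you can diff against a later run
```

A frozen record also makes it harder to "touch up" a number after the fact; a new run produces a new record.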
What not to publish too early
- Single-run synthetic output without a clear workload label.
- Network module results presented as stable if they are still BETA.
- Resource costs without the corresponding runtime payoff.
Sparse but honest data is better than impressive but misleading data. It is acceptable to call a result synthetic, BETA, or still under validation. It is not acceptable to present it as if it were already a stable production finding. The labeling discipline is part of the engineering quality, not just a presentation detail.
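Labeling discipline can be made mechanical: refuse to publish anything that is not explicitly marked stable. A hedged sketch (the status strings mirror the labels used in this section; the `may_publish` helper is illustrative):

```python
PUBLISHABLE = {"stable"}
KNOWN_LABELS = {"synthetic", "beta", "under-validation", "stable"}

def may_publish(label: str) -> bool:
    """Only results explicitly labeled stable may be published;
    synthetic, BETA, and under-validation results stay internal."""
    normalized = label.strip().lower()
    if normalized not in KNOWN_LABELS:
        # An unlabeled result is worse than a BETA one: reject outright.
        raise ValueError(f"unlabeled result: {label!r}")
    return normalized in PUBLISHABLE

# may_publish("BETA") -> False; may_publish("stable") -> True
```

The hard error on unknown labels is deliberate: a missing label should block publication, not silently pass.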
Three practical validation patterns
Pattern 1: core runtime change
Run desktop tests and benchmark gates first.
Use hardware only when the change is expected to affect board-visible timing.
Pattern 2: footprint-sensitive release
Run the resource matrix and compare the new overhead against the runtime payoff before updating docs.
Pattern 3: publishable board result
Repeat the same hardware run, note the board and profile, and only publish the number once the workload is clearly labeled.
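Pattern 3 can be enforced with a simple repeatability check: publish only when repeated runs of the same labeled workload agree within a tolerance. A sketch (the 5% threshold and the `is_repeatable` helper are illustrative choices, not documented ZeroKernel policy):

```python
def is_repeatable(runs, rel_tolerance=0.05):
    """True when every run lands within rel_tolerance of the mean,
    i.e. the number survives repetition of the same workload."""
    if len(runs) < 2:
        return False  # a single run is never publishable
    mean = sum(runs) / len(runs)
    return all(abs(r - mean) <= rel_tolerance * abs(mean) for r in runs)

# is_repeatable([1000])            -> False (single run)
# is_repeatable([1000, 1010, 990]) -> True  (within 5% of the mean)
# is_repeatable([1000, 1500])      -> False (runs disagree)
```

The check is only meaningful when every run in the list carries the same board, profile, and workload label; mixing runs defeats the comparison.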
Validation FAQ
Do I need real WiFi to start validating?
No. Start with simulated or bounded workloads so you can isolate runtime behavior before external network variability enters the picture.
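A simulated workload can be as small as a seeded generator of bounded bursts: the same seed always reproduces the same arrival pattern, so two runs stay comparable and network jitter never enters the picture. A sketch (the generator shape is illustrative):

```python
import random

def synthetic_bursts(seed: int, ticks: int, max_burst: int):
    """Yield a reproducible arrival count per tick.

    The same seed always produces the same workload, and no tick
    ever exceeds max_burst, so the queue behavior under test stays
    bounded and repeatable.
    """
    rng = random.Random(seed)            # instance-local RNG: no global state
    for _ in range(ticks):
        yield rng.randint(0, max_burst)  # bounded arrivals per tick
```

When publishing numbers from such a workload, record the seed alongside the workload label so the run can be regenerated exactly.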
When is a benchmark worth adding to the docs?
When it is repeatable, tied to a defined workload, and explains the trade-off clearly.
What if a number looks impressive but only on one run?
Do not publish it yet. Repeat it, keep the workload label explicit, and make sure the same result survives the documented flow.
Are synthetic tests still useful?
Yes. They are useful for isolating scheduler and queue behavior, as long as they are labeled honestly and not presented as field-proofed production evidence.