Commands and Scripts

This is the operational command layer: the scripts you actually run to validate timing, compare workloads, measure resource cost, and keep regressions visible.

Why this page matters

This page explains how Commands and Scripts fits into the wider ZeroKernel execution model, what problem it is meant to solve, and what trade-off you are actually accepting when you use it in production firmware. The goal is not to treat Commands and Scripts as an isolated API call, but to understand where it sits inside bounded scheduling, queue discipline, fault visibility, and profile selection.

Read this topic as an operational contract. Start from the smallest working path, wire it into a lean profile first, and only expand into richer routing, diagnostics, or transport state after you can prove that the timing outcome is still worth the extra flash and RAM. That mindset is what keeps ZeroKernel useful on small boards instead of turning it into another bloated abstraction.

The safest pattern is always the same: define the runtime boundary, keep the hot path short, measure the effect with compare scripts, and only then scale complexity. The examples below are not filler; they show the smallest repeatable patterns you can lift into real firmware when you need clean integration instead of ad-hoc loops.

Three practical patterns

Full validation sequence

Use this when you need a credible regression pass before publishing numbers or changing docs.

Shell
    bash scripts/run_desktop_tests.sh
    bash scripts/run_desktop_benchmark.sh --enforce-performance
    bash scripts/run_resource_matrix.sh --enforce-budget

Hardware compare pass

Run a focused hardware compare instead of guessing whether a change helped or hurt.

Shell
    bash scripts/run_esp32_modules_compare.sh /dev/ttyUSB1
    bash scripts/run_esp32_real_project_demo.sh /dev/ttyUSB1

Lean build guard

Lock the build into the intended profile before treating a benchmark or compare as authoritative.

Text
    -DZEROKERNEL_PROFILE_LEAN_NET
    -DZEROKERNEL_ENABLE_DIAGNOSTICS=0
    -DZEROKERNEL_ENABLE_LEGACY_LABEL_API=0

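If these flags are driven from shell, collecting them into a single variable keeps every benchmark build consistent. A minimal sketch only: the LEAN_FLAGS name and the cmake line are illustrative assumptions, not repository conventions.

```shell
# Illustrative: keep the lean-profile defines in one place so every
# benchmark build uses exactly the same flags. The variable name and the
# cmake example below are assumptions, not part of the repository.
LEAN_FLAGS="-DZEROKERNEL_PROFILE_LEAN_NET \
-DZEROKERNEL_ENABLE_DIAGNOSTICS=0 \
-DZEROKERNEL_ENABLE_LEGACY_LABEL_API=0"

# Example of where the flags would be passed (illustrative):
#   cmake -B build -DCMAKE_C_FLAGS="$LEAN_FLAGS"
echo "$LEAN_FLAGS"
```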

What to verify while you use it

  • Validate timing before you validate aesthetics. A cleaner API is not a win if fast misses rise.
  • Prefer the smallest profile that still matches the workload, then add optional modules only when the measured payoff is obvious.
  • Keep callbacks and transport steps bounded so watchdog, panic flow, and queue limits remain meaningful.

Common mistakes that make results misleading

  • Do not copy a demo pattern into production firmware without measuring it on the real board and real build profile you plan to ship.
  • Do not read success counters without reading queue depth, timing, and workload label next to them.
  • Do not enable heavier diagnostics and compatibility flags in a lean target just because the defaults looked convenient.

Recommended working sequence

Start from the smallest valid path

Boot the runtime, register the minimum useful task set, and prove that the baseline timing is clean before adding optional layers.

Add one layer, then measure it

Introduce routing, diagnostics, or transport one layer at a time so the cost and payoff remain obvious.
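One way to keep that cost visible is to time the same workload before and after the new layer and look only at the delta. A minimal sketch; the sleep commands are placeholders for the real lean-build and layered-build benchmark invocations.

```shell
# Illustrative A/B timing: measure the workload once for the lean build,
# once with one extra layer enabled, and print the delta. The sleep
# commands are placeholders for the real benchmark invocations.
measure_ms() {
  start=$(date +%s%N)          # nanoseconds since epoch (GNU date)
  "$@" > /dev/null 2>&1
  end=$(date +%s%N)
  echo $(( (end - start) / 1000000 ))
}

baseline_ms=$(measure_ms sleep 0.05)   # placeholder: lean-profile run
layered_ms=$(measure_ms sleep 0.05)    # placeholder: run with one layer added
echo "baseline=${baseline_ms}ms layered=${layered_ms}ms delta=$((layered_ms - baseline_ms))ms"
```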

Publish only repeatable results

Update docs, charts, or public claims only after the same workload survives the same validation path more than once.

Why the command layer matters

ZeroKernel makes concrete claims: deterministic timing, bounded queues, measurable trade-offs, and repeatable field-style validation. Those claims only stay credible if the command layer is disciplined. The scripts in this repository are not decorative helpers; they are the repeatable path that turns source code changes into evidence. That is why this page documents them like operational tools instead of casual convenience scripts.

The correct mental model is to treat these scripts as part of the engineering process. Desktop scripts answer logic and performance questions cheaply. Resource scripts answer footprint questions. Hardware scripts answer "does this still behave on a board?" Each group solves a different problem, and mixing their roles is one of the easiest ways to get a false sense of safety from a passing command.

The command layer is also where release discipline starts. If your team cannot rerun the same script and get a comparable result, then benchmark tables, README claims, and release notes all become less trustworthy. That is why the examples below are written as repeatable recipes, not just as single isolated commands.
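That grouping can be written down as a tiny dispatcher so nobody has to remember which script answers which question. A sketch only; the change-type labels (logic, footprint, board) are invented for this example.

```shell
# Illustrative dispatcher: map a change type to the script group that
# answers it. The labels are invented for this example; only the script
# paths come from this page.
pick_scripts() {
  case "$1" in
    logic)     echo "bash scripts/run_desktop_tests.sh" ;;
    footprint) echo "bash scripts/run_resource_matrix.sh --enforce-budget" ;;
    board)     echo "bash scripts/run_esp32_modules_compare.sh /dev/ttyUSB1" ;;
    *)         echo "unknown change type: $1" >&2; return 1 ;;
  esac
}

pick_scripts footprint
```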

Core validation commands

Command: bash scripts/run_desktop_tests.sh
Purpose: Runs unit and regression tests for the core runtime.
Use it when: You touched scheduler logic, config, queues, or state transitions.

Command: bash scripts/run_desktop_benchmark.sh --enforce-performance
Purpose: Runs benchmark gates for string and fast paths.
Use it when: You changed hot-path behavior or anything that can affect throughput.

Command: bash scripts/run_resource_matrix.sh --enforce-budget
Purpose: Compiles across target families and checks budget gates.
Use it when: You added features, changed defaults, or plan to publish footprint numbers.

Command: bash scripts/run_workload_matrix.sh
Purpose: Builds the compare workload suite in one pass.
Use it when: You want to confirm the example matrix still compiles coherently.

Shell
    # Minimal core sanity pass
    bash scripts/run_desktop_tests.sh

    # Performance-sensitive change
    bash scripts/run_desktop_benchmark.sh --enforce-performance

    # Release-grade preflight
    bash scripts/run_desktop_tests.sh
    bash scripts/run_desktop_benchmark.sh --enforce-performance
    bash scripts/run_resource_matrix.sh --enforce-budget


ESP32 hardware compare scripts

Hardware compare scripts are where local engineering claims meet physical constraints. They compile, flash, capture output, and restore the board to a safe identity firmware after the test. That restore step matters. It keeps the device from being left behind in a stress loop, transport-heavy demo, or temporary benchmark mode after validation completes.
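The overall shape of such a script can be sketched as below. flash_firmware and capture_serial are hypothetical stand-ins for the real flash and capture steps, and a production script would also restore on failure (for example with a trap), not only on success.

```shell
# Sketch of the compare-script shape: flash the compare image, capture
# output, then restore the identity firmware. flash_firmware and
# capture_serial are stand-ins, not real repository commands.
PORT="/dev/ttyUSB1"
flash_firmware() { echo "flash: $2 -> $1"; }
capture_serial() { echo "capture: $1"; }

run_compare() {
  flash_firmware "$1" compare
  log=$(capture_serial "$1")      # a real script would save this for analysis
  flash_firmware "$1" identity    # restore: never leave the compare image behind
}

run_compare "$PORT"
```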

Shell
    bash scripts/run_esp32_modules_compare.sh /dev/ttyUSB1
    bash scripts/run_esp32_env_monitor_compare.sh /dev/ttyUSB1
    bash scripts/run_esp32_gateway_compare.sh /dev/ttyUSB1
    bash scripts/run_esp32_industrial_compare.sh /dev/ttyUSB1
    bash scripts/run_esp32_real_project_demo.sh /dev/ttyUSB1


Use these scripts when a desktop result is not enough. For example, queue discipline can look clean in a local benchmark and still behave differently after flashing because upload flow, serial timing, and board drivers add real-world friction that never appears in a desktop-only run.

Shell
    # Focused transport tuning
    bash scripts/run_esp32_gateway_compare.sh /dev/ttyUSB1

    # Publishable, more realistic pass
    bash scripts/run_esp32_real_project_demo.sh /dev/ttyUSB1


What each command should prove

A passing command is not enough by itself. Each command should leave behind useful evidence. That evidence may be a timing number, a budget pass, a queue depth summary, or a board output snapshot. If a command succeeds but does not answer a real engineering question, it has not done its job yet.

  • Determinism: fast misses should stay at zero in tuned compares.
  • Queue health: pressure should be bounded, not grow without limit.
  • Trade-off clarity: memory cost must have a measurable timing or throughput payoff.
Text
    Good compare output should answer:
    - Did sample_runs stay flat or improve?
    - Did fast_miss remain zero?
    - Did queue_max stay bounded?
    - Did the added footprint pay for a real runtime gain?

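Those questions can be turned into a mechanical check on a captured log. A sketch only: the log format below is invented for the example, and the queue_max bound of 32 is an arbitrary placeholder, not a project limit.

```shell
# Illustrative log checker for the questions above. The log format and
# the queue bound (32) are invented for this example.
check_compare_log() {
  fast_miss=$(awk -F': *' '/^fast_miss/ {print $2}' "$1")
  queue_max=$(awk -F': *' '/^queue_max/ {print $2}' "$1")
  [ "$fast_miss" = "0" ] || { echo "FAIL: fast_miss=$fast_miss"; return 1; }
  [ "$queue_max" -le 32 ] || { echo "FAIL: queue_max=$queue_max"; return 1; }
  echo "PASS: fast_miss=0 queue_max=$queue_max"
}

# Example log in the invented format:
cat > /tmp/compare.log <<'EOF'
sample_runs: 64
fast_miss: 0
queue_max: 12
EOF

check_compare_log /tmp/compare.log
```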

Three practical command recipes

Most changes fit a small number of command patterns. Treat these as reusable runbooks. They save time, keep the output easier to interpret, and reduce the chance of over-testing one area while forgetting another.

Shell
    # Recipe 1: scheduler-only change
    bash scripts/run_desktop_tests.sh
    bash scripts/run_desktop_benchmark.sh --enforce-performance

Shell
    # Recipe 2: size-sensitive change
    bash scripts/run_desktop_tests.sh
    bash scripts/run_resource_matrix.sh --enforce-budget

Shell
    # Recipe 3: board-facing result
    bash scripts/run_desktop_tests.sh
    bash scripts/run_resource_matrix.sh --enforce-budget
    bash scripts/run_esp32_modules_compare.sh /dev/ttyUSB1


Commands FAQ

Should I run everything every time?

No. Run the narrowest script that answers the change you made, then run the full matrix before publishing numbers.

Why are hardware compare scripts separate from desktop tests?

Desktop tests validate logic and local regressions. Hardware compares validate timing, flashing, serial capture, and board-visible behavior.

What if a benchmark passes but the board compare looks worse?

Trust the board for deployment decisions. Desktop results are useful, but the board is where real timing, transport, and toolchain behavior meet.

Should I publish numbers from one local run?

No. Repeat the run, note the board and profile, and publish only numbers you can reproduce with a clearly named workload.
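A minimal repeatability gate can make that policy mechanical. A sketch only: run_workload is a stand-in that would print one timing number from the real benchmark, and the 5% tolerance is an example threshold, not a project rule.

```shell
# Illustrative repeatability gate: run the same measurement twice and
# refuse to publish if the two results differ by more than a tolerance.
# run_workload is a stand-in for the real benchmark command.
run_workload() { echo 1042; }   # stand-in: would print one timing number

a=$(run_workload)
b=$(run_workload)
diff=$((a - b))
if [ "$diff" -lt 0 ]; then diff=$((-diff)); fi
if [ "$diff" -le $((a / 20)) ]; then   # example tolerance: within 5% of run 1
  echo "repeatable: $a vs $b"
else
  echo "NOT repeatable: $a vs $b"
fi
```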