
State Machine and Panic Flow

ZeroKernel is more than a callback list. This page explains the explicit runtime state model and how the panic path keeps failures from going unnoticed.

Why this page matters

This page shows where the state machine and panic flow fit in the wider ZeroKernel execution model, what problem they solve, and what trade-off you accept when you use them in production firmware. Treat them not as isolated API calls but as part of bounded scheduling, queue discipline, fault visibility, and profile selection.

Read this topic as an operational contract. Start from the smallest working path, wire it into a lean profile first, and only expand into richer routing, diagnostics, or transport state after you can prove that the timing outcome is still worth the extra flash and RAM. That mindset is what keeps ZeroKernel useful on small boards instead of turning it into another bloated abstraction.

The safest pattern is always the same: define the runtime boundary, keep the hot path short, measure the effect with compare scripts, and only then scale complexity. The examples below are not filler; they show the smallest repeatable patterns you can lift into real firmware when you need clean integration instead of ad-hoc loops.

Three practical patterns

Core cadence pattern

Use one bounded task for the hot path, then let the scheduler keep the phase aligned over time.

C++
    ZeroKernel.begin(boardMillis);
    ZeroKernel.addTask("Fast", fastTask, 10, 0, true);
    ZeroKernel.tick();
  
Deferred work pattern

Move non-critical routing and transport out of the immediate task body so fast paths stay predictable.

C++
    const auto key = ZeroKernel.makeTopicKey("telemetry.sample");
    ZeroKernel.publishDeferredFast(key, sampleValue);
    ZeroKernel.flushEvents();
  
Runtime visibility pattern

Read the timing report and stats together so you can prove the cost of each abstraction layer.

C++
    const auto stats = ZeroKernel.getStats();
    const auto timing = ZeroKernel.getTimingReport();
    Serial.println(timing.maxTickMs);
  

What to verify while you use it

  • Validate timing before you validate aesthetics. A cleaner API is not a win if fast misses rise.
  • Prefer the smallest profile that still matches the workload, then add optional modules only when the measured payoff is obvious.
  • Keep callbacks and transport steps bounded so watchdog, panic flow, and queue limits remain meaningful.

Common mistakes that make results misleading

  • Do not copy a demo pattern into production firmware without measuring it on the real board and real build profile you plan to ship.
  • Do not read success counters without reading queue depth, timing, and workload label next to them.
  • Do not enable heavier diagnostics and compatibility flags in a lean target just because the defaults looked convenient.

Recommended working sequence

Start from the smallest valid path

Boot the runtime, register the minimum useful task set, and prove that the baseline timing is clean before adding optional layers.

Add one layer, then measure it

Introduce routing, diagnostics, or transport one layer at a time so the cost and payoff remain obvious.
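One way to make this step mechanical is to compare a baseline run against a run that includes the new layer. The sketch below is illustrative only: `RunSample`, `fastMisses`, and `layerWithinBudget` are hypothetical names and not part of ZeroKernel; only `maxTickMs` mirrors the field printed in the runtime visibility example.

```cpp
#include <cstdint>

// Hypothetical snapshot of one measurement run (not a ZeroKernel type).
// maxTickMs mirrors the field shown in the runtime visibility example;
// fastMisses is an assumed counter of missed fast-path deadlines.
struct RunSample {
    std::uint32_t maxTickMs;
    std::uint32_t fastMisses;
};

// Accept a new layer only if it adds no fast-path misses compared with the
// baseline and keeps the worst-case tick inside the given budget.
constexpr bool layerWithinBudget(const RunSample& baseline,
                                 const RunSample& withLayer,
                                 std::uint32_t maxTickBudgetMs) {
    return withLayer.fastMisses <= baseline.fastMisses &&
           withLayer.maxTickMs <= maxTickBudgetMs;
}
```

Record the baseline sample on the same board and build profile as the candidate, otherwise the comparison says little about the layer itself.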

Publish only repeatable results

Update docs, charts, or public claims only after the same workload survives the same validation path more than once.

Kernel state model

Text
    BOOT -> NORMAL -> DEGRADED -> SAFE_MODE -> RECOVERY
                             \-> PANIC
  

Not every firmware uses every state, but the model gives the runtime a consistent contract for escalation and recovery.
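As a rough illustration, the diagram can be encoded as a transition check. This is a minimal sketch, not ZeroKernel's actual types: `KernelState` and `isAllowedTransition` are hypothetical names, and the table contains only the arrows drawn above, reading the PANIC branch as leaving DEGRADED. What follows RECOVERY is not shown in the diagram, so the sketch gives it no successor.

```cpp
#include <cstdint>

// Hypothetical mirror of the states in the diagram (not ZeroKernel's enum).
enum class KernelState : std::uint8_t {
    Boot, Normal, Degraded, SafeMode, Recovery, Panic
};

// Returns true if `to` is a direct successor of `from` in the diagram.
// Assumption: Panic is terminal, and Recovery has no successor because
// the diagram does not draw one.
inline bool isAllowedTransition(KernelState from, KernelState to) {
    switch (from) {
        case KernelState::Boot:
            return to == KernelState::Normal;
        case KernelState::Normal:
            return to == KernelState::Degraded;
        case KernelState::Degraded:
            return to == KernelState::SafeMode || to == KernelState::Panic;
        case KernelState::SafeMode:
            return to == KernelState::Recovery;
        case KernelState::Recovery:
        case KernelState::Panic:
            return false;
    }
    return false;  // unreachable for valid enum values
}
```

Firmware that skips some states would extend the table with its own arrows; the point of the check is that every escalation is an explicit, reviewable entry.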

Panic path

triggerPanic() is the explicit hard stop for conditions that should not be silently tolerated. The panic path is useful for unrecoverable runtime corruption, repeated critical task failure, or test-mode verification.
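The "repeated critical task failure" case could be handled with a small latch that counts consecutive failures. The sketch below is an assumption-heavy illustration: `CriticalFailureGuard` and `PanicHook` are hypothetical, and because the signature of `triggerPanic()` is not shown on this page, the panic call is left behind a caller-supplied hook instead of being invoked directly.

```cpp
#include <cstdint>

// Hypothetical helper (not part of ZeroKernel): counts consecutive failures
// of one critical task and fires a panic hook once the limit is reached.
// In real firmware the hook would wrap the runtime's panic entry point.
class CriticalFailureGuard {
public:
    using PanicHook = void (*)();

    CriticalFailureGuard(std::uint8_t limit, PanicHook hook)
        : limit_(limit), hook_(hook) {}

    // Call after every run of the critical task.
    // Returns true only on the call that fires the panic hook.
    bool report(bool succeeded) {
        if (panicked_) {
            return false;  // latched: panic fires at most once
        }
        if (succeeded) {
            consecutive_ = 0;  // any success resets the streak
            return false;
        }
        if (consecutive_ < limit_) {
            ++consecutive_;
        }
        if (consecutive_ >= limit_) {
            panicked_ = true;
            if (hook_ != nullptr) {
                hook_();
            }
            return true;
        }
        return false;
    }

    bool hasPanicked() const { return panicked_; }

private:
    std::uint8_t limit_;
    PanicHook hook_;
    std::uint8_t consecutive_ = 0;
    bool panicked_ = false;
};
```

Counting consecutive failures, not total failures, keeps an occasional transient error from escalating, while a task that never recovers still reaches panic in a bounded number of runs.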

Safe mode vs panic

  • Safe mode keeps the firmware alive with reduced behavior.
  • Panic prioritizes visibility and containment over feature continuity.
  • Use safe mode first if the system can still provide value in a reduced state.
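The rule of thumb in the list above can be written down as a single decision function. This is only a sketch under stated assumptions: `FaultAssessment`, `Escalation`, and `chooseEscalation` are hypothetical names, and how a given firmware decides whether its runtime state is still trustworthy is left to the application.

```cpp
// Hypothetical escalation targets (not ZeroKernel types).
enum class Escalation { SafeMode, Panic };

// Hypothetical summary of a fault, filled in by application code.
struct FaultAssessment {
    bool runtimeStateTrusted;   // false if runtime data may be corrupted
    bool reducedServiceUseful;  // true if a reduced feature set still has value
};

// Prefer safe mode whenever the runtime can still be trusted and a reduced
// feature set is worth providing; otherwise choose panic so the failure is
// visible and contained.
constexpr Escalation chooseEscalation(const FaultAssessment& fault) {
    return (fault.runtimeStateTrusted && fault.reducedServiceUseful)
               ? Escalation::SafeMode
               : Escalation::Panic;
}
```

Keeping the decision in one pure function makes it easy to review which conditions are allowed to end in panic.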

State and panic FAQ

Should production firmware ever call triggerPanic?

Yes, but only for explicitly defined unrecoverable states. Panic should be intentional, not a substitute for normal control flow.

Is degraded mode mandatory?

No. It exists so firmware can respond proportionally instead of jumping straight from healthy to panic.

What is the safest way to validate this behavior on real hardware?

Start from the leanest profile that still matches the topic, run the narrowest compare script for this behavior, and only then move to heavier mixed workloads. Do not jump straight to a fully loaded build if the base timing is not yet proven.