doctor
Container health check protocol and sandbox-level diagnostics.
Defines the shared DoctorCheck / CheckVerdict protocol
used across the terok package chain (sandbox → agent → terok). Each
package contributes domain-specific checks; the top-level terok sickbay
orchestrates execution inside containers via podman exec.
Sandbox-level checks verify host-side service reachability from within a container (vault token broker TCP, SSH signer TCP) and shield firewall state.
CheckVerdict(severity, detail, fixable=False)
dataclass
DoctorCheck(category, label, probe_cmd, evaluate, fix_cmd=None, fix_description='', host_side=False)
dataclass
A single health check to run inside (or against) a container.
The probe_cmd is executed via podman exec <cname> ... by the
orchestrator. The evaluate callable interprets the result.
If fix_cmd is set, the orchestrator may offer it when the check
fails with fixable=True.
Dual execution modes:
- Container mode (host_side=False): the orchestrator runs probe_cmd via podman exec and passes the result to evaluate. The standalone doctor command runs the same probe_cmd directly via subprocess on the host.
- Host-side mode (host_side=True): the orchestrator bypasses probe_cmd entirely and performs the check via Python APIs (e.g. ShieldManager), then passes resolved state to evaluate. The standalone doctor command calls evaluate(0, "", "") and the function performs the check itself or reports a neutral result.
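The two execution modes can be illustrated from the standalone doctor command's point of view. The following is a minimal sketch, not the package's implementation: the dataclasses are stand-ins that mirror the documented signatures, run_standalone is a hypothetical helper, probe_cmd is assumed to be an argv list, and the severity strings "ok" and "error" are assumed values.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class CheckVerdict:
    severity: str  # assumed values: "ok", "warn", "error"
    detail: str
    fixable: bool = False


@dataclass
class DoctorCheck:
    category: str
    label: str
    probe_cmd: Sequence[str]  # assumed to be an argv list
    evaluate: Callable[[int, str, str], CheckVerdict]
    fix_cmd: Optional[Sequence[str]] = None
    fix_description: str = ""
    host_side: bool = False


def run_standalone(check: DoctorCheck) -> CheckVerdict:
    """Hypothetical standalone runner covering both execution modes."""
    if check.host_side:
        # Host-side mode: no probe is run; evaluate does the work itself.
        return check.evaluate(0, "", "")
    # Container mode, standalone variant: run probe_cmd directly on the host.
    proc = subprocess.run(
        list(check.probe_cmd), capture_output=True, text=True, check=False
    )
    return check.evaluate(proc.returncode, proc.stdout, proc.stderr)


# Illustrative container-mode check: the probe just asks the interpreter to answer.
probe = DoctorCheck(
    category="env",
    label="interpreter responds",
    probe_cmd=[sys.executable, "-c", "print('pong')"],
    evaluate=lambda rc, out, err: CheckVerdict(
        "ok" if rc == 0 and out.strip() == "pong" else "error",
        out.strip() or err.strip(),
    ),
)

# Illustrative host-side check that reports a neutral result.
host = DoctorCheck(
    category="shield",
    label="shield state",
    probe_cmd=[],
    evaluate=lambda rc, out, err: CheckVerdict("ok", "neutral: no host state available"),
    host_side=True,
)
```

Calling run_standalone(probe) spawns the probe and feeds its result to evaluate, while run_standalone(host) never touches probe_cmd. An orchestrator would differ only in wrapping the probe in podman exec, which is left out here.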
category
instance-attribute
Grouping key: "bridge", "env", "mount", "network",
"shield", "git".
label
instance-attribute
Human-readable check name shown in output.
probe_cmd
instance-attribute
Shell command to run inside the container via podman exec.
evaluate
instance-attribute
(returncode, stdout, stderr) → CheckVerdict.
fix_cmd = None
class-attribute
instance-attribute
Optional remediation command for podman exec.
fix_description = ''
class-attribute
instance-attribute
Shown to the operator before applying the fix.
host_side = False
class-attribute
instance-attribute
If True, the check runs on the host (not via podman exec).
The orchestrator calls evaluate(0, "", "") and the evaluate
function performs the host-side check itself.
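To show how the attributes fit together, here is a hedged sketch of one check with an evaluate callable and a remediation command. The dataclasses are stand-ins mirroring the documented signatures. Everything specific to the example is hypothetical: the /workspace path, the probe and fix commands, the severity strings, and the should_offer_fix helper, which reads "fails" as any non-"ok" severity.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class CheckVerdict:
    severity: str  # assumed values: "ok", "warn", "error"
    detail: str
    fixable: bool = False


@dataclass
class DoctorCheck:
    category: str
    label: str
    probe_cmd: Sequence[str]  # assumed to be an argv list
    evaluate: Callable[[int, str, str], CheckVerdict]
    fix_cmd: Optional[Sequence[str]] = None
    fix_description: str = ""
    host_side: bool = False


def evaluate_writable(returncode: int, stdout: str, stderr: str) -> CheckVerdict:
    """Interpret the exit status of a hypothetical `test -w` probe."""
    if returncode == 0:
        return CheckVerdict("ok", "workspace is writable")
    # Mark the failure as fixable so a caller may offer fix_cmd.
    return CheckVerdict(
        "error", stderr.strip() or "workspace is not writable", fixable=True
    )


def should_offer_fix(check: DoctorCheck, verdict: CheckVerdict) -> bool:
    """Offer a fix only for a failed, fixable verdict on a check that has fix_cmd."""
    return (
        verdict.severity != "ok"
        and verdict.fixable
        and check.fix_cmd is not None
    )


mount_check = DoctorCheck(
    category="mount",
    label="workspace writable",
    probe_cmd=["test", "-w", "/workspace"],
    evaluate=evaluate_writable,
    fix_cmd=["chmod", "u+w", "/workspace"],
    fix_description="Make /workspace writable for the container user",
)
```

Because evaluate only sees (returncode, stdout, stderr), it can be exercised without a container, for example mount_check.evaluate(1, "", "") to simulate a failed probe.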
sandbox_doctor_checks(*, token_broker_port=None, ssh_signer_port=None, desired_shield_state=None)
Return sandbox-level health checks for in-container diagnostics.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| token_broker_port | int \| None | Token broker TCP port (skip check if …) | None |
| ssh_signer_port | int \| None | SSH signer TCP port (skip check if …) | None |
| desired_shield_state | str \| None | Expected shield state from … | None |
Returns:
| Type | Description |
|---|---|
| list[DoctorCheck] | List of … |
Source code in src/terok_sandbox/doctor.py
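The keyword-only, optional-port shape of this factory can be sketched with a stand-in. This is not the code in src/terok_sandbox/doctor.py: sandbox_checks_sketch and _tcp_check are hypothetical names, the probe command and the host.containers.internal hostname are assumptions, and the sketch reads the parameter descriptions as "no port given, no check emitted". The shield check is left out because its inputs are not described here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class CheckVerdict:
    severity: str  # assumed values: "ok", "warn", "error"
    detail: str
    fixable: bool = False


@dataclass
class DoctorCheck:
    category: str
    label: str
    probe_cmd: Sequence[str]  # assumed to be an argv list
    evaluate: Callable[[int, str, str], CheckVerdict]
    fix_cmd: Optional[Sequence[str]] = None
    fix_description: str = ""
    host_side: bool = False


def _tcp_check(label: str, port: int) -> DoctorCheck:
    """Build a hypothetical TCP reachability check for one host-side service."""
    script = (
        "import socket; "
        f"socket.create_connection(('host.containers.internal', {port}), timeout=2)"
    )

    def evaluate(returncode: int, stdout: str, stderr: str) -> CheckVerdict:
        if returncode == 0:
            return CheckVerdict("ok", f"port {port} reachable")
        return CheckVerdict("error", stderr.strip() or f"port {port} unreachable")

    return DoctorCheck(
        category="network",
        label=label,
        probe_cmd=["python3", "-c", script],
        evaluate=evaluate,
    )


def sandbox_checks_sketch(
    *,
    token_broker_port: Optional[int] = None,
    ssh_signer_port: Optional[int] = None,
) -> List[DoctorCheck]:
    """Stand-in factory: emit one check per port that was actually supplied."""
    checks: List[DoctorCheck] = []
    if token_broker_port is not None:
        checks.append(_tcp_check("vault token broker", token_broker_port))
    if ssh_signer_port is not None:
        checks.append(_tcp_check("SSH signer", ssh_signer_port))
    return checks
```

A caller that knows only the token broker port would pass token_broker_port alone and receive a single check; the port numbers themselves come from the caller's configuration.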
make_recovery_acknowledged_check()
Warn when the operator hasn't confirmed they saved the recovery key.
Two severity bands depending on the resolved tier when the marker
is absent — the session-file tier dies on the next reboot, so
"unconfirmed AND session-only" is a genuine error (you are
literally one reboot away from losing the vault), while every
durable tier (keyring, systemd-creds, config) is "only" a warn
(machine-bound; needs an off-host copy for disaster recovery).
Intentionally NOT bundled into sandbox_doctor_checks:
that list is consumed per-container by terok's sickbay, and a
host-bound recovery check would render once per task. Top-level
callers (the terok-sandbox doctor CLI, terok's host-level
sickbay row) invoke this factory directly so the check renders
exactly once.
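The two severity bands described above amount to a small decision table. The sketch below is an illustration under assumptions, not the factory's code: recovery_verdict is a hypothetical helper, the tier names are taken from the prose and assumed to be plain strings, and the "ok" severity for the acknowledged case is assumed.

```python
from dataclasses import dataclass


@dataclass
class CheckVerdict:
    severity: str  # "error" and "warn" come from the prose; "ok" is assumed
    detail: str
    fixable: bool = False


# Tiers named in the description; the exact identifiers are assumptions.
SESSION_TIER = "session-file"
DURABLE_TIERS = frozenset({"keyring", "systemd-creds", "config"})


def recovery_verdict(acknowledged: bool, tier: str) -> CheckVerdict:
    """Map (marker present?, resolved tier) to a severity band."""
    if acknowledged:
        return CheckVerdict("ok", "recovery key acknowledged")
    if tier == SESSION_TIER:
        # Session-only storage does not survive a reboot.
        return CheckVerdict(
            "error", "recovery key unconfirmed and vault key is session-only"
        )
    # Durable tiers are machine-bound, so an off-host copy is still needed.
    return CheckVerdict(
        "warn", f"recovery key unconfirmed; {tier} tier is machine-bound"
    )
```

Unknown tier names fall into the warn band in this sketch; whether the real check treats them differently is not stated here.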