terok_util
terok-util — shared utility library for the terok-* sibling packages.
terok-util sits at the bottom of the terok dependency chain. Every
sibling package depends on it; it depends on nothing else in the
ecosystem (only stdlib + platformdirs + ruamel.yaml).
It collects the small set of cross-cutting helpers that would otherwise
be duplicated — or, worse, quietly diverge — across
terok-shield,
terok-clearance,
terok-sandbox,
terok-executor,
and terok.
What lives here, by module:
- cli_types: argparse-driven CLI registry vocabulary (CommandDef, ArgDef, CommandTree, KeyRow).
- fs: small filesystem helpers (ensure_dir, ensure_dir_writable, write_sensitive_file).
- logging: soft-fail file logger (BestEffortLogger).
- yaml: round-trip YAML facade over ruamel.yaml (load, dump).
- paths: XDG-aware namespace path resolution (namespace_state_dir, namespace_config_dir, namespace_runtime_dir).
- config_stack: layered config merge engine (ConfigStack, deep_merge).
- security: untrusted-string TTY sanitiser (sanitize_tty).
- podman: rootless --userns=keep-id builder (podman_userns_args).
The rule for what belongs here: if two or more terok-* packages
need it, it lives in terok-util. Single-package helpers stay in
the package that owns them. The __all__ declaration below is the
contract — symbols listed are stable across minor releases; anything
underscore-prefixed or absent from __all__ is internal and may change
without notice.
__all__ = ['ArgDef', 'BestEffortLogger', 'CommandDef', 'CommandTree', 'ConfigStack', 'KeyRow', 'deep_merge', 'ensure_dir', 'ensure_dir_writable', 'config_file_paths', 'host_uid', 'namespace_config_dir', 'namespace_runtime_dir', 'namespace_state_dir', 'podman_userns_args', 'read_config_section', 'read_config_top_level', 'sanitize_tty', 'write_sensitive_file']
module-attribute
ArgDef(name, help='', type=None, default=None, action=None, dest=None, nargs=None, required=False)
dataclass
Definition of a single CLI argument.
- name (instance-attribute)
- help = '' (class-attribute, instance-attribute)
- type = None (class-attribute, instance-attribute)
- default = None (class-attribute, instance-attribute)
- action = None (class-attribute, instance-attribute)
- dest = None (class-attribute, instance-attribute)
- nargs = None (class-attribute, instance-attribute)
- required = False (class-attribute, instance-attribute)
CommandDef(name, help='', handler=None, args=(), children=(), group='', epilog='', extras=dict())
dataclass
One node in a command tree — a leaf verb or a group of verbs.
Attributes:
| Name | Type | Description |
|---|---|---|
| name | str | Verb name as it appears on the CLI. |
| help | str | One-line help string. |
| handler | Callable[..., Any] \| None | Callable implementing the verb. |
| args | tuple[ArgDef, ...] | Argument definitions parsed by argparse. |
| children | tuple[CommandDef, ...] | Sub-verbs. Non-empty makes this node a group. |
| group | str | Free-form tag used by per-subsystem grouping (unrelated to the […] |
| epilog | str | Optional long-form text rendered after the argparse argument list in […] |
| extras | Mapping[str, Any] | Bag of package-specific metadata downstream consumers ignore (shield's […] |
The combination of a frozen dataclass and structural sharing is the
load-bearing part of the wrap-once-share-everywhere story: when a
consumer overlays a handler at one path, the modified
CommandDef is referenced from
every shortcut that also points at that path. Identity is what
makes the overlay propagate.
- name (instance-attribute)
- help = '' (class-attribute, instance-attribute)
- handler = None (class-attribute, instance-attribute)
- args = () (class-attribute, instance-attribute)
- children = () (class-attribute, instance-attribute)
- group = '' (class-attribute, instance-attribute)
- epilog = '' (class-attribute, instance-attribute)
- extras = field(default_factory=dict) (class-attribute, instance-attribute)
is_group
property
Whether this node carries children (i.e. is a verb group).
with_handler(handler)
CommandTree(roots)
A forest of CommandDef nodes.
The unit of composition for CLI registries: each package exposes its
own CommandTree; consumers walk it structurally, overlay
handlers where they wrap a concept, extend with their own verbs,
and wire the result into argparse.
Composition is identity-preserving — nodes the consumer doesn't
touch share object identity with their pre-overlay counterparts, so
a shortcut that splices the same subtree at the consumer's top
level reaches the same modified handler. terok shield install
and terok executor sandbox shield install resolving to the
same wrap is a direct consequence.
Build a tree from an iterable of top-level verbs/groups.
Source code in src/terok_util/cli_types.py
roots
property
The top-level verbs in this tree, in declaration order.
__iter__()
__len__()
__add__(other)
Concatenate forests — other's roots appended to this one's.
Source code in src/terok_util/cli_types.py
find_at(path)
Return the CommandDef at path.
path is a sequence of verb names from the root. An empty
path is rejected (no synthetic root). KeyError if any
segment doesn't match a child name.
Source code in src/terok_util/cli_types.py
overlay(overrides)
Return a new tree with handlers replaced at the named paths.
overrides maps verb-name tuples (e.g. ("vault", "status"))
to replacement handlers. Each match produces one new
CommandDef via replace;
ancestors are likewise replaced because their children
tuples now hold a new node, but unrelated siblings share
identity with the input tree.
Sandbox-vocab paths use the operator-facing verb names — same names you'd type on the CLI — so the override map reads like a routing table.
Source code in src/terok_util/cli_types.py
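The identity-preserving behaviour described for overlay can be sketched with a plain frozen dataclass and dataclasses.replace. This is an illustration only, not the terok_util implementation: Node and overlay_one are hypothetical stand-ins for CommandDef and CommandTree.overlay, and raising KeyError for an unknown path is an assumption borrowed from find_at.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from typing import Any, Callable


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for CommandDef (name, handler, children only)."""

    name: str
    handler: Callable[..., Any] | None = None
    children: tuple["Node", ...] = ()


def overlay_one(roots: tuple[Node, ...], path: tuple[str, ...], handler) -> tuple[Node, ...]:
    """Return new roots with the handler at `path` replaced.

    Nodes that are not on the path are reused as-is, so they keep their identity.
    """
    if not path:
        raise ValueError("empty path: there is no synthetic root")
    head, rest = path[0], path[1:]
    if not any(node.name == head for node in roots):
        raise KeyError(head)  # assumption: unknown segments are an error
    rebuilt = []
    for node in roots:
        if node.name != head:
            rebuilt.append(node)  # untouched sibling: same object
        elif rest:
            # Ancestor on the path: new node holding the rebuilt children.
            rebuilt.append(replace(node, children=overlay_one(node.children, rest, handler)))
        else:
            rebuilt.append(replace(node, handler=handler))
    return tuple(rebuilt)


def old_status():
    return "old"


def new_status():
    return "new"


status = Node("status", handler=old_status)
unlock = Node("unlock", handler=old_status)
vault = Node("vault", children=(status, unlock))
shield = Node("shield", children=(Node("install"),))
roots = (vault, shield)

patched = overlay_one(roots, ("vault", "status"), new_status)
```

After the call, patched[1] is the very same object as shield and patched[0].children[1] is the same object as unlock; only vault and status were rebuilt, and the input tree is unchanged.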
extend_at(path, additions)
Return a new tree with additions appended at the path's children.
Empty path extends the top-level forest. Otherwise the
CommandDef at path
must be a group — a leaf (one with handler set and no
children) cannot be extended; trying to do so would produce
a hybrid node argparse can't represent (handler + subparsers
on the same parser). Raises TypeError rather than
silently inventing such a node.
Source code in src/terok_util/cli_types.py
walk()
Yield (path, command) for every node in the tree, depth-first.
wire(target)
Wire this tree's verbs as subparsers under target, recursively.
target may be either an
ArgumentParser (a fresh
add_subparsers() action is created) or an existing
argparse._SubParsersAction (the tree mounts straight under
it; the private name has no public docs target). The second
form lets a consumer mix legacy register-style subparsers
with structural
CommandTree ones under
the same root parser without colliding on argparse's
one-subparsers-per-parser rule.
The same CommandDef
wired at multiple positions (deep nesting + shortcuts) yields
independent argparse subparser instances, but each subparser's
dispatch reads back the same handler object — so concept
translations applied via overlay apply uniformly across
every entry point that references the modified node.
Source code in src/terok_util/cli_types.py
dispatch(args)
staticmethod
Invoke the handler stored on args by CommandTree.wire.
Bridges argparse's parsed-args namespace to the handler kwargs
the CommandDef declared.
Async handlers are detected and run via asyncio.run so
consumers don't need separate dispatch paths per handler
flavour.
Source code in src/terok_util/cli_types.py
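The wire-then-dispatch flow can be approximated with plain argparse. The sketch below is not the library's code: the _handler attribute name and the dispatch_sketch helper are hypothetical, and it assumes every remaining namespace field is passed to the handler as a keyword argument.

```python
import argparse
import asyncio
import inspect


def dispatch_sketch(args: argparse.Namespace):
    """Call the handler stored on the namespace, passing the other fields as kwargs."""
    kwargs = dict(vars(args))
    handler = kwargs.pop("_handler")  # hypothetical attribute name
    if inspect.iscoroutinefunction(handler):
        # Async handlers get their own event loop; sync ones are called directly.
        return asyncio.run(handler(**kwargs))
    return handler(**kwargs)


def status(verbose: bool = False) -> str:
    return "status verbose" if verbose else "status"


async def install(target: str) -> str:
    await asyncio.sleep(0)
    return f"installed {target}"


parser = argparse.ArgumentParser(prog="demo")
subparsers = parser.add_subparsers()

status_parser = subparsers.add_parser("status")
status_parser.add_argument("--verbose", action="store_true")
status_parser.set_defaults(_handler=status)

install_parser = subparsers.add_parser("install")
install_parser.add_argument("target")
install_parser.set_defaults(_handler=install)

sync_result = dispatch_sketch(parser.parse_args(["status", "--verbose"]))
async_result = dispatch_sketch(parser.parse_args(["install", "shield"]))
```

Storing the handler with set_defaults is what lets several subparsers point at one handler object, which matches the description of overlays applying uniformly across entry points.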
KeyRow
ConfigStack()
Ordered collection of config scopes, lowest-priority first.
Usage::

    stack = ConfigStack()
    # Scopes are pushed lowest-priority first.
    stack.push(ConfigScope("global", global_path, global_data))
    # Later scopes override earlier ones when resolved.
    stack.push(ConfigScope("project", proj_path, proj_data))
    resolved = stack.resolve()

Initialise an empty config stack.
Source code in src/terok_util/config_stack.py
scopes
property
Read-only access to the scope list (for diagnostics).
push(scope)
resolve()
Deep-merge all scopes in order and return the result.
resolve_section(key)
Resolve only a single top-level section across all scopes.
Respects the same semantics as
resolve — in
particular, None values trigger deletion via
deep_merge.
Returns {} when the highest-priority scope has a non-dict
value for key (e.g. services: tcp instead of
services: {mode: tcp}). Callers can call
resolve and
inspect the raw shape if they need to distinguish "missing"
from "wrong-shape", but resolve_section is contract-typed
as a mapping accessor and so coerces non-mappings to empty.
Source code in src/terok_util/config_stack.py
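The "mapping accessor" contract of resolve_section can be pictured as a small coercion step applied to an already-resolved config. The helper below is a hypothetical illustration (section_of is not a terok_util name) and says nothing about how the library merges scopes internally.

```python
from typing import Any


def section_of(resolved: dict, key: str) -> dict:
    """Return the mapping stored under `key`, or {} when it is missing or not a dict."""
    value: Any = resolved.get(key)
    return value if isinstance(value, dict) else {}


wrong_shape = section_of({"services": "tcp"}, "services")
right_shape = section_of({"services": {"mode": "tcp"}}, "services")
missing = section_of({}, "services")
```

Both the missing key and the wrong-shape value come back as an empty dict, which is why a caller that needs to tell them apart has to inspect the full resolve() result.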
BestEffortLogger(log_path_fn)
Append timestamped lines to a state-file log; soft-fail on any error.
The destination is supplied as a callable rather than an eager
Path so XDG / env-var overrides applied between construction
and write time still take effect.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| log_path_fn | Callable[[], Path] | Zero-arg callable returning the destination path. Called on every write so tests overriding […] | required |
Bind the destination resolver.
Source code in src/terok_util/logging.py
log(message, *, level='DEBUG')
Append one [timestamp] LEVEL: message line. Never raises.
File creation goes through os.open with mode 0o600 so the
log lands owner-only by construction — atomically, without
relying on the process umask. The mode bits are honoured by
the kernel only on creation; existing files keep whatever perms
they were created with.
Source code in src/terok_util/logging.py
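A minimal sketch of the create-with-0o600 append described above, assuming a POSIX system. The append_line name, the ISO timestamp format, and the choice to swallow only OSError are assumptions for illustration; the library's own error handling may be broader.

```python
import os
import stat
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def append_line(path: Path, message: str, level: str = "DEBUG") -> bool:
    """Append one timestamped line; return False instead of raising on I/O errors."""
    try:
        stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
        # The mode argument only applies if this call creates the file.
        fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
        with os.fdopen(fd, "a", encoding="utf-8") as handle:
            handle.write(f"[{stamp}] {level}: {message}\n")
        return True
    except OSError:
        return False


with tempfile.TemporaryDirectory() as tmp:
    log_file = Path(tmp) / "demo.log"
    wrote = append_line(log_file, "hello")
    mode = stat.S_IMODE(log_file.stat().st_mode)
    # The parent directory does not exist, so the write is dropped silently.
    dropped = append_line(Path(tmp) / "no-such-dir" / "demo.log", "lost")
```

Because the permission bits are passed to os.open at creation, there is no window in which the file exists with wider permissions, unlike a create-then-chmod sequence.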
debug(message)
warning(message)
warn_user(component, message)
Print a structured warning to stderr and append it to the log file.
Stderr output is run through
sanitize_tty so attacker
bytes in component / message (e.g. originating from foreign
config files) can't smuggle terminal escapes into the operator's
terminal. The file-side write is unsanitised so the log keeps
the original bytes for forensic review.
Source code in src/terok_util/logging.py
deep_merge(base, override)
Recursively merge override into base, returning a new dict.
Rules
- Dicts are merged recursively by default.
- A None value in override deletes the corresponding key.
- A bare "_inherit" string keeps the base value unchanged (equivalent to omitting the key, but explicit).
- Lists in override replace the base list wholesale unless the list contains the sentinel string "_inherit", in which case the sentinel is replaced by the base list elements (splice).
- A dict in override that contains _inherit: true keeps all parent keys and overlays the rest (the _inherit key itself is stripped from the result).
Source code in src/terok_util/config_stack.py
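The merge rules above can be exercised with a small re-implementation. This is a sketch written from the rule list, not the library's source: deep_merge_sketch is a hypothetical name, and it treats a dict with and without the _inherit key the same way (recursive merge, marker stripped), which is an assumption where the rules leave room.

```python
from __future__ import annotations

from typing import Any

INHERIT = "_inherit"


def deep_merge_sketch(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Merge `override` into a copy of `base` following the rules listed above."""
    result = dict(base)
    for key, value in override.items():
        if key == INHERIT:
            continue  # marker key is stripped from the result
        if value is None:
            result.pop(key, None)  # None deletes the key
        elif isinstance(value, str) and value == INHERIT:
            continue  # bare sentinel: keep whatever base had
        elif isinstance(value, list):
            base_items = base.get(key) if isinstance(base.get(key), list) else []
            merged: list[Any] = []
            for item in value:
                if isinstance(item, str) and item == INHERIT:
                    merged.extend(base_items)  # splice the base list in place
                else:
                    merged.append(item)
            result[key] = merged
        elif isinstance(value, dict):
            parent = base.get(key) if isinstance(base.get(key), dict) else {}
            result[key] = deep_merge_sketch(parent, value)
        else:
            result[key] = value
    return result


base = {"image": "fedora", "ports": [80], "env": {"A": "1", "B": "2"}, "debug": True}
override = {
    "image": "_inherit",
    "ports": [443, "_inherit"],
    "env": {"_inherit": True, "B": "3"},
    "debug": None,
}
merged = deep_merge_sketch(base, override)
```

Here merged keeps image from the base, splices the base port after 443, overlays B inside env, and drops debug; base itself is left untouched.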
ensure_dir(path)
ensure_dir_writable(path, label)
Create path if needed and verify it is writable, or exit with an error.
Source code in src/terok_util/fs.py
write_sensitive_file(path, content)
Atomically create path with mode 0o600 and write content.
Returns True if the file was created, False if it already existed.
Parent directories are created with mode 0o700.
Refuses to operate if path.parent is a symbolic link — chmod would
otherwise follow the link target. Opens the file with O_NOFOLLOW
so a planted symlink at the final path cannot redirect the write.
Source code in src/terok_util/fs.py
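The create-once, owner-only write can be sketched with os.open flags, assuming a POSIX system. The create_private_file name and the ValueError raised for a symlinked parent are assumptions; the source does not say how the real function signals that refusal.

```python
import os
import stat
import tempfile
from pathlib import Path


def create_private_file(path: Path, content: str) -> bool:
    """Create `path` owner-only and write `content`; return False if it already exists."""
    if path.parent.is_symlink():
        # Assumption: refusal is signalled with an exception.
        raise ValueError(f"refusing to write under symlinked directory: {path.parent}")
    # Note: with parents=True only the leaf directory receives the 0o700 mode.
    path.parent.mkdir(mode=0o700, parents=True, exist_ok=True)
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL | os.O_NOFOLLOW
    try:
        fd = os.open(path, flags, 0o600)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(content)
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "secrets" / "token"
    first = create_private_file(target, "s3cret")
    second = create_private_file(target, "overwritten?")
    stored = target.read_text(encoding="utf-8")
    file_mode = stat.S_IMODE(target.stat().st_mode)
```

O_EXCL is what makes the second call a no-op rather than an overwrite, so the original content survives.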
config_file_paths()
Ordered config.yml locations with scope labels (lowest → highest priority).
TEROK_CONFIG_FILE → single override (no layering). Otherwise:
/etc/terok/config.yml (system) → ~/.config/terok/config.yml
(user). Root processes see only the system path.
Public so consumers can render an "edit one of these to override X" hint to the operator (which file gets the highest priority, where on disk the operator would put the override, etc.).
Source code in src/terok_util/paths.py
host_uid()
Return the current process's UID as the initial user namespace sees it.
Inside an unprivileged user namespace (rootless podman / crun
hook, sandboxed CI runner, unshare -U), os.geteuid() returns
the inner-userns UID — typically 0 even when the operator
really ran the program as UID 1000. Network peers (D-Bus
SO_PEERCRED, AUTH EXTERNAL) and kernel-level checks see the
outer (host) UID via the userns uid_map translation, so a
process that advertises its inner UID over the wire is rejected for
a credential mismatch. This helper hands callers the outer UID those
peers expect.
The mapping comes from /proc/self/uid_map. When it is
unavailable (macOS, BSD, exotic chroot) or no row covers the
effective UID, the bare geteuid() answer is returned — correct on
systems without Linux user namespaces.
Source code in src/terok_util/paths.py
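The uid_map translation can be shown as a pure function over the file's text. Each row of /proc/self/uid_map holds three integers: the first ID inside the namespace, the first ID outside it, and the length of the range. The outer_uid helper is a hypothetical illustration and handles one level of namespace nesting only.

```python
def outer_uid(inner_uid: int, uid_map_text: str) -> int:
    """Translate a namespace-local UID to the parent namespace using uid_map rows."""
    for row in uid_map_text.splitlines():
        fields = row.split()
        if len(fields) != 3:
            continue  # skip blank or malformed rows
        try:
            inside_start, outside_start, length = (int(field) for field in fields)
        except ValueError:
            continue
        if inside_start <= inner_uid < inside_start + length:
            return outside_start + (inner_uid - inside_start)
    # No row covers the UID: fall back to the value we were given.
    return inner_uid


rootless = "         0       1000          1\n         1     100000      65536\n"
as_host_user = outer_uid(0, rootless)
as_subuid = outer_uid(5, rootless)
no_map = outer_uid(1000, "")
```

With the rootless map above, inner UID 0 translates to 1000, while the empty map returns the input unchanged, mirroring the geteuid() fallback described for platforms without uid_map.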
namespace_config_dir(subdir='', *, env_var=None)
Resolve a config directory under the terok/ namespace.
Priority: env_var → /etc/terok/<subdir> (root) → platformdirs
→ ~/.config/terok/<subdir>. env_var is keyword-only.
Source code in src/terok_util/paths.py
namespace_runtime_dir(subdir='', *, env_var=None)
Resolve a runtime directory under the terok/ namespace.
Priority: env_var → /run/terok/<subdir> (root)
→ $XDG_RUNTIME_DIR/terok/<subdir> → $XDG_STATE_HOME/terok/<subdir>
→ ~/.local/state/terok/<subdir>. env_var is keyword-only.
Source code in src/terok_util/paths.py
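The runtime-directory priority chain reads naturally as a series of early returns. The sketch below takes the environment, the root flag, and the home directory as explicit parameters so it can be run anywhere; those parameters, the runtime_dir_sketch name, and the DEMO_RUNTIME_DIR variable are all hypothetical, and returning the env_var override without appending subdir is an assumption.

```python
from __future__ import annotations

from pathlib import Path
from typing import Mapping


def runtime_dir_sketch(
    subdir: str = "",
    *,
    env_var: str | None = None,
    environ: Mapping[str, str],
    is_root: bool,
    home: Path,
) -> Path:
    """Walk the documented priority chain and return the first match."""
    if env_var and environ.get(env_var):
        return Path(environ[env_var])  # assumption: override is used as-is
    if is_root:
        return Path("/run/terok") / subdir
    if environ.get("XDG_RUNTIME_DIR"):
        return Path(environ["XDG_RUNTIME_DIR"]) / "terok" / subdir
    if environ.get("XDG_STATE_HOME"):
        return Path(environ["XDG_STATE_HOME"]) / "terok" / subdir
    return home / ".local" / "state" / "terok" / subdir


home = Path("/home/demo")
overridden = runtime_dir_sketch(
    "sandbox", env_var="DEMO_RUNTIME_DIR",
    environ={"DEMO_RUNTIME_DIR": "/custom"}, is_root=True, home=home,
)
as_root = runtime_dir_sketch("sandbox", environ={}, is_root=True, home=home)
with_xdg = runtime_dir_sketch(
    "sandbox", environ={"XDG_RUNTIME_DIR": "/run/user/1000"}, is_root=False, home=home
)
fallback = runtime_dir_sketch("sandbox", environ={}, is_root=False, home=home)
```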
namespace_state_dir(subdir='', *, env_var=None)
Resolve a state directory under the terok/ namespace.
Priority:
- env_var (package-specific override, e.g. TEROK_SANDBOX_STATE_DIR)
- TEROK_ROOT env var (namespace override)
- config.yml → paths.root (Podman model; all packages honour it)
- Platform default (/var/lib/terok/<subdir> for root, XDG data dir otherwise)
env_var is keyword-only so a positional second argument can never accidentally be reinterpreted as an override name.
Source code in src/terok_util/paths.py
read_config_section(section)
Read a top-level section from layered terok configs (cached, fail-silent).
Merges system and user config.yml files via
ConfigStack — user values
override system defaults at the leaf level. Lazy-imports
config_stack so importing paths doesn't drag the YAML
parser into a process that only needs the platform defaults.
Source code in src/terok_util/paths.py
read_config_top_level(key)
Read a top-level scalar / list / mapping from layered terok configs.
Counterpart to
read_config_section for
keys whose value isn't a dict — e.g. the ecosystem-wide
experimental: true opt-in or a bare log_level: debug knob.
Returns the merged value (user wins over system) or None when
the key is absent or the config files can't be loaded. Cached for
the lifetime of the process; tests that need to flush the cache
reach for the private _config_top_level_cache.
Source code in src/terok_util/paths.py
podman_userns_args()
Return user namespace args for rootless podman so UID 1000 maps correctly.
Maps the host user to container UID/GID 1000, the conventional non-root
dev user in terok container images. Returns an empty list when
running as root — root podman runs in the host user namespace and
has no need to remap.
Source code in src/terok_util/podman.py
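A rough sketch of the root versus rootless branch. The exact argument string the library returns is not shown in this page; the keep-id:uid=1000,gid=1000 form used here is an assumption based on Podman's --userns=keep-id option, and userns_args_sketch with its euid parameter is a hypothetical helper added so the branch can be exercised without being root.

```python
from __future__ import annotations

import os


def userns_args_sketch(euid: int | None = None) -> list[str]:
    """Return extra podman arguments for rootless runs, or [] when running as root."""
    uid = os.geteuid() if euid is None else euid
    if uid == 0:
        return []  # root podman uses the host user namespace, nothing to remap
    # Assumed flag form: map the invoking user to UID/GID 1000 in the container.
    return ["--userns=keep-id:uid=1000,gid=1000"]


root_args = userns_args_sketch(0)
rootless_args = userns_args_sketch(1000)
```

Returning a list keeps the call site simple: the result can be spliced straight into a podman run argument vector whether or not it is empty.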
sanitize_tty(s)
Replace terminal control characters with safe representations.
Whitespace controls (newline, carriage return, tab) become spaces.
All other characters in Unicode category C (control, format,
surrogate, private use, unassigned) are rendered as \xNN hex
escapes. Printable text passes through unchanged.
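The sanitising rule can be sketched with unicodedata. This is an illustration written from the description above, not the library's code: sanitize_sketch is a hypothetical name, and the escape width for code points above 0xFF (simply more hex digits) is an assumption.

```python
import unicodedata

_WHITESPACE_CONTROLS = {"\n", "\r", "\t"}


def sanitize_sketch(text: str) -> str:
    """Replace whitespace controls with spaces and other category-C characters with hex escapes."""
    out = []
    for ch in text:
        if ch in _WHITESPACE_CONTROLS:
            out.append(" ")
        elif unicodedata.category(ch).startswith("C"):
            # Rendered as literal backslash-x text, so the terminal never sees the raw byte.
            out.append(f"\\x{ord(ch):02x}")
        else:
            out.append(ch)
    return "".join(out)


escaped = sanitize_sketch("ok\tthen\x1b[2J")
untouched = sanitize_sketch("plain text")
```

The escape character in the first input becomes the four visible characters \x1b, so the trailing [2J is printed as text instead of clearing the screen.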