runtime
¶
Container runtime surface — protocol + concrete backends.
Public imports live here; callers should never reach into the backend modules directly.
DEFAULT_GUEST_SSHD_PORT = 22
module-attribute
¶
DEFAULT_SSH_HOST = '127.0.0.1'
module-attribute
¶
DEFAULT_SSH_USER = 'dev'
module-attribute
¶
__all__ = ['Container', 'ContainerRemoveResult', 'ContainerRuntime', 'ExecResult', 'Image', 'LogStream', 'PortReservation', 'KrunRuntime', 'NullRuntime', 'PodmanRuntime', 'KrunContainer', 'FakeKrunTransport', 'KrunTransport', 'TcpEndpoint', 'TcpSSHTransport', 'podman_port_resolver', 'DEFAULT_GUEST_SSHD_PORT', 'DEFAULT_SSH_HOST', 'DEFAULT_SSH_USER', 'GpuConfigError', 'check_gpu_available']
module-attribute
¶
FakeKrunTransport()
¶
In-memory KrunTransport for tests.
Mirrors NullRuntime's
pre-register-then-replay shape so tests that already understand the
null backend pick this up by analogy. Records every call so tests
can assert dispatch without a real sshd listener.
Source code in src/terok_sandbox/runtime/krun.py
exec_calls = []
instance-attribute
¶
exec_stdio_calls = []
instance-attribute
¶
set_result(container_name, cmd, result)
¶
Pre-register the result exec
returns for exact cmd on container_name.
Source code in src/terok_sandbox/runtime/krun.py
exec(container, cmd, *, timeout=None)
¶
Return a pre-registered result, or empty success.
Source code in src/terok_sandbox/runtime/krun.py
exec_stdio(container, cmd, *, stdin, stdout, stderr=None, env=None, timeout=None)
¶
Record the call and return exit code 0 (no I/O is moved).
Source code in src/terok_sandbox/runtime/krun.py
login_command(container, *, command=())
¶
Return a placeholder argv ["fake-login", <name>, *command].
Tests assert on the shape; no real transport is contacted.
Source code in src/terok_sandbox/runtime/krun.py
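The pre-register-then-replay shape described for this fake can be sketched as follows. This is an illustrative stand-in, not the code in krun.py: the class name `ReplayTransport`, the lookup keyed by container name instead of a container handle, and the string-typed `stdout`/`stderr` fields are all assumptions made to keep the example self-contained.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ExecResult:
    """Stand-in mirroring the documented ExecResult(exit_code, stdout, stderr)."""

    exit_code: int
    stdout: str  # assumption: the real field type may differ
    stderr: str


@dataclass
class ReplayTransport:
    """Hypothetical fake: canned results keyed by (container name, exact argv)."""

    exec_calls: list = field(default_factory=list)
    _results: dict = field(default_factory=dict)

    def set_result(self, container_name, cmd, result):
        # Register the result that exec returns for this exact argv.
        self._results[(container_name, tuple(cmd))] = result

    def exec(self, container_name, cmd, *, timeout=None):
        key = (container_name, tuple(cmd))
        self.exec_calls.append(key)  # record the call for later assertions
        # Unregistered commands fall back to an empty success.
        return self._results.get(key, ExecResult(0, "", ""))

    def login_command(self, container_name, *, command=()):
        # Placeholder argv; nothing is contacted.
        return ["fake-login", container_name, *command]


fake = ReplayTransport()
fake.set_result("task-1", ["uname", "-r"], ExecResult(0, "6.8.0\n", ""))
hit = fake.exec("task-1", ["uname", "-r"])   # replays the registered result
miss = fake.exec("task-1", ["true"])         # falls back to empty success
```

A test can then assert on `fake.exec_calls` to check dispatch without any sshd listener running.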
KrunContainer(name, *, runtime, transport)
¶
Bases: PodmanContainer
Container handle for krun-managed microVMs.
Subclasses PodmanContainer
because podman --runtime krun honours every lifecycle verb
(state, start/stop, logs, inspect) — only login_command diverges,
since podman exec can't enter the guest. That single override
routes through the held KrunTransport
so the operator gets an SSH argv that actually reaches in.
Source code in src/terok_sandbox/runtime/krun.py
login_command(*, command=())
¶
Return the transport's interactive-attach argv for this container.
KrunRuntime(*, transport, podman=None)
¶
Container runtime that launches tasks inside KVM microVMs.
Composition, not inheritance: holds a
PodmanRuntime for every
lifecycle verb (podman --runtime krun is just podman driving a
different OCI runtime) and a
KrunTransport for the
one verb that can't go through podman — exec.
The transport is required: there is no sensible default beyond a
real SSH-over-passt-TCP implementation, and the fake exists explicitly
for tests. Production callers wire the real transport at the
ContainerRuntime selection point
in the orchestrator.
Source code in src/terok_sandbox/runtime/krun.py
container(name)
¶
Return a KrunContainer
handle wrapping the podman container — same lifecycle, krun-aware
login_command.
Return type stays the Container
Protocol rather than the narrower concrete class: mypy treats
Protocol method return types as invariant, so a narrower
annotation breaks structural ContainerRuntime matching for
downstream consumers (terok's _runtime: ContainerRuntime
assignment was the loud failure). The runtime value is
genuinely a KrunContainer — callers needing the concrete
type cast at the call site.
Source code in src/terok_sandbox/runtime/krun.py
containers_with_prefix(prefix)
¶
Same prefix lookup as podman; rewrap each handle as a
KrunContainer so its
login_command routes through the TCP-SSH transport.
Same Protocol-invariance rationale as
container
for the wider declared return type.
Source code in src/terok_sandbox/runtime/krun.py
image(ref)
¶
images(*, dangling_only=False)
¶
exec(container, cmd, *, timeout=None)
¶
Route to the transport — typically SSH-over-passt-TCP.
Source code in src/terok_sandbox/runtime/krun.py
exec_stdio(container, cmd, *, stdin, stdout, stderr=None, env=None, timeout=None)
¶
Route stdio-bridged exec to the transport.
Source code in src/terok_sandbox/runtime/krun.py
force_remove(containers)
¶
KrunTransport
¶
Bases: Protocol
How KrunRuntime reaches into a
running microVM to run commands.
The exec divergence is forced by libkrun: a microVM is sealed after boot and cannot accept injected processes. The real implementation speaks SSH to an sshd inside the guest, reachable through a per-task host TCP port that podman's passt has forwarded into the guest namespace. That path is defined by the SSH wire protocol rather than by in-tree code we would have to invent.
Kept narrow on purpose — only the two operations
ContainerRuntime
needs to route through it. Lifecycle stays on the podman side.
exec(container, cmd, *, timeout=None)
¶
Run cmd inside the guest backed by container; return its outcome.
exec_stdio(container, cmd, *, stdin, stdout, stderr=None, env=None, timeout=None)
¶
Bridge byte streams to cmd inside the guest; return its exit code.
Source code in src/terok_sandbox/runtime/krun.py
login_command(container, *, command=())
¶
Return an argv for os.execvp to attach interactively.
Mirrors the protocol method on
Container.login_command
but routed through the transport so the krun runtime can hand the
operator an SSH invocation instead of podman exec.
Source code in src/terok_sandbox/runtime/krun.py
TcpEndpoint(port, host=DEFAULT_SSH_HOST)
dataclass
¶
A host TCP endpoint reachable via podman's passt port-forward.
port is the host-side TCP port podman bound for this container's
-p <port>:22 mapping; host is the loopback address that port
was bound to.
Fields are int-coerced and range-checked in __post_init__ — the
transport interpolates port into the ssh argv and host into the
user@host token, so a string carrying shell metacharacters or
structural junk would otherwise reach the system ssh CLI. Catching
it here means a bad endpoint_resolver fails loudly at
construction rather than silently building a hostile invocation.
port
instance-attribute
¶
host = DEFAULT_SSH_HOST
class-attribute
instance-attribute
¶
__post_init__()
¶
Coerce + bound-check both fields so the ssh argv stays safe.
Source code in src/terok_sandbox/runtime/krun_transport.py
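A minimal sketch of construction-time validation in the spirit of the description above. The class name `Endpoint`, the frozen dataclass, and the choice to validate `host` as an IP literal via `ipaddress` are assumptions for illustration; the actual checks in krun_transport.py may differ.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical endpoint that refuses junk before it can reach an ssh argv."""

    port: int
    host: str = "127.0.0.1"

    def __post_init__(self):
        # int() rejects strings such as "22; rm -rf /" with ValueError.
        port = int(self.port)
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")
        # Assumption: only IP literals are accepted; anything else raises ValueError.
        ipaddress.ip_address(self.host)
        # The dataclass is frozen, so store the coerced value explicitly.
        object.__setattr__(self, "port", port)


ok = Endpoint("2222")  # a numeric string is coerced to the int 2222
```

With this shape, a faulty resolver that returns a hostile string fails at construction with `ValueError` instead of producing an ssh invocation.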
TcpSSHTransport(*, identity_file, endpoint_resolver, ssh_user=DEFAULT_SSH_USER, ssh_binary='ssh')
¶
OpenSSH-over-loopback-TCP implementation of
KrunTransport.
Holds the host-side identity (private key path) and an endpoint
resolver that maps a Container
to a TcpEndpoint.
The transport never touches the credentials vault directly — the
orchestrator exports the %host key to a tmpfs file and passes
that path in, keeping vault access out of the runtime layer.
Source code in src/terok_sandbox/runtime/krun_transport.py
exec(container, cmd, *, timeout=None)
¶
Run cmd in the guest and return its outcome.
Each cmd token is shlex.quoted into a single remote
command string so the in-guest shell treats embedded
metacharacters as literal data — argv semantics are preserved
across the inherently-shell-parsed ssh wire format.
Source code in src/terok_sandbox/runtime/krun_transport.py
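The quoting step can be illustrated with the standard library alone. The helper name `remote_command` is hypothetical; the sketch only shows how per-token `shlex.quote` keeps argv boundaries intact once a shell re-parses the string.

```python
import shlex


def remote_command(cmd):
    """Join argv tokens into one string that a POSIX shell parses back to cmd."""
    if not cmd:
        raise ValueError("cmd must not be empty")
    return " ".join(shlex.quote(token) for token in cmd)


argv = ["sh", "-c", "echo $HOME; ls *.txt", "it's"]
wire = remote_command(argv)

# Parsing the wire string with POSIX shell rules recovers the original tokens.
round_trip = shlex.split(wire)
```

`shlex.join(argv)` (Python 3.8+) produces the same string; the explicit form is used here to mirror the per-token wording of the docstring.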
exec_stdio(container, cmd, *, stdin, stdout, stderr=None, env=None, timeout=None)
¶
Bridge byte streams to cmd in the guest; return its exit code.
Environment variables are propagated via a remote env prefix
rather than SendEnv so the transport doesn't depend on the
guest's AcceptEnv whitelist. Env var names are
validated against [A-Za-z_][A-Za-z0-9_]* because the remote
env command expects bare identifiers; values and cmd
tokens are shlex.quoted so embedded shell metacharacters
cross the wire as literal data.
Source code in src/terok_sandbox/runtime/krun_transport.py
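A sketch of the `env` prefix approach, assuming a hypothetical helper `remote_command_with_env`. It validates names against the pattern quoted above and quotes values and command tokens; it is not the transport's actual code.

```python
import re
import shlex

# Same identifier pattern as quoted in the docstring above.
_ENV_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def remote_command_with_env(cmd, env=None):
    """Build `env NAME=value ... cmd...` as a single shell-safe string."""
    parts = []
    if env:
        parts.append("env")
        for name, value in env.items():
            if not _ENV_NAME.fullmatch(name):
                raise ValueError(f"invalid environment variable name: {name!r}")
            # Only the value is quoted; the name is already a bare identifier.
            parts.append(f"{name}={shlex.quote(value)}")
    parts.extend(shlex.quote(token) for token in cmd)
    return " ".join(parts)


wire = remote_command_with_env(["agent", "--stdio"], {"TOKEN": "a b;c"})
# wire == "env TOKEN='a b;c' agent --stdio"
```

Rejecting bad names up front matters because a name is interpolated unquoted; a value such as `a b;c` is harmless once quoted.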
login_command(container, *, command=())
¶
Return an ssh argv that attaches a PTY to the guest's shell.
Mirrors what PodmanContainer.login_command
does for the conventional runtime — emits the argv the operator
(or terok login) execs into. Adds -tt so sshd allocates
a real PTY even when stdin isn't a terminal (the caller may be
running under tmux or an IDE proxy).
Both the empty-command path (interactive login → bash -l)
and the explicit-command path land at /workspace via
_at_workspace, so the operator's starting cwd matches what
podman exec gives under crun. Argv tokens past -- are shlex.quoted
(same helper the exec paths use) so the SSH wire format
preserves argv semantics across the login-shell parse on the
far side.
Source code in src/terok_sandbox/runtime/krun_transport.py
NullRuntime()
¶
Stub ContainerRuntime for tests and dry-run modes.
All state lives in dictionaries on the runtime instance. Tests
pre-populate fixtures via the set_container_state,
add_image, etc. helpers.
Source code in src/terok_sandbox/runtime/null.py
set_container_state(name, state)
¶
set_container_image(name, image_ref)
¶
set_container_rw_size(name, bytes_)
¶
set_exit_code(name, code)
¶
Record the exit code Container.wait will return for name.
set_ready_result(name, ready)
¶
Record the outcome Container.stream_initial_logs returns.
add_image(ref, *, repository='', tag='', size='', created='', labels=None, history=())
¶
Register an image fixture.
Source code in src/terok_sandbox/runtime/null.py
set_exec_result(container_name, cmd, result)
¶
Pre-register the result exec returns for exact cmd.
Source code in src/terok_sandbox/runtime/null.py
set_exec_stdio_script(container_name, cmd, script, *, exit_code=0)
¶
Pre-register a stdio interaction for exec_stdio.
script is a sequence of ("read", bytes) / ("write", bytes)
steps replayed in order: read consumes the matching prefix from
the caller-supplied stdin; write emits the bytes to stdout.
Use this to drive deterministic ACP-handshake tests without spinning
up a real container.
Source code in src/terok_sandbox/runtime/null.py
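The replay semantics of such a script can be sketched with in-memory buffers. The function name `replay_script` and the use of `AssertionError` on a mismatch are assumptions; the sketch only demonstrates the ordered read/write steps described above.

```python
import io


def replay_script(script, stdin, stdout, *, exit_code=0):
    """Replay ("read", bytes) / ("write", bytes) steps against byte streams."""
    for action, payload in script:
        if action == "read":
            # Consume exactly as many bytes as the step expects and compare.
            got = stdin.read(len(payload))
            if got != payload:
                raise AssertionError(f"expected {payload!r}, got {got!r}")
        elif action == "write":
            stdout.write(payload)
        else:
            raise ValueError(f"unknown script step: {action!r}")
    return exit_code


script = [("read", b"ping\n"), ("write", b"pong\n")]
out = io.BytesIO()
code = replay_script(script, io.BytesIO(b"ping\n"), out)
```

After the call, `out` holds the scripted response and `code` is the registered exit code, which is enough to drive a handshake test deterministically.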
container(name)
¶
Return a NullContainer handle.
containers_with_prefix(prefix)
¶
Return fixtures whose name starts with prefix-.
Source code in src/terok_sandbox/runtime/null.py
image(ref)
¶
images(*, dangling_only=False)
¶
Return fixture images; dangling_only filters by tag == "<none>".
Source code in src/terok_sandbox/runtime/null.py
exec(container, cmd, *, timeout=None)
¶
Return a pre-registered result, or a default empty success.
Source code in src/terok_sandbox/runtime/null.py
exec_stdio(container, cmd, *, stdin, stdout, stderr=None, env=None, timeout=None)
¶
Replay a pre-registered stdio script, or no-op with exit code 0.
Records every call (with env) for test inspection. When a script is
registered for (container, cmd), replays it in order: read
consumes from stdin and asserts a match; write pushes bytes to
stdout. Without a script, returns immediately with exit code 0
— matches the empty-success default of exec.
Source code in src/terok_sandbox/runtime/null.py
force_remove(containers)
¶
Record the call and clear every fixture for each container.
Source code in src/terok_sandbox/runtime/null.py
reserve_port(host='127.0.0.1')
¶
Reserve a real host port (even null backend callers want a live port).
GpuConfigError(message, *, hint=_CDI_HINT)
¶
Bases: RuntimeError
CDI/NVIDIA misconfiguration detected during container launch.
Store the CDI hint alongside the standard error message.
Source code in src/terok_sandbox/runtime/podman.py
hint = hint
instance-attribute
¶
PodmanRuntime
¶
The default ContainerRuntime — talks to the podman CLI.
container(name)
¶
containers_with_prefix(prefix)
¶
Return handles for every container whose name starts with prefix-.
Single podman ps -a call under the hood; the returned handles
are lazy (fresh inspect on property access).
Source code in src/terok_sandbox/runtime/podman.py
image(ref)
¶
images(*, dangling_only=False)
¶
Enumerate local images.
dangling_only narrows to untagged <none>:<none> entries.
Source code in src/terok_sandbox/runtime/podman.py
exec(container, cmd, *, timeout=None)
¶
Run cmd inside container via podman exec.
Lets FileNotFoundError (podman missing) and
subprocess.TimeoutExpired propagate unchanged.
Raises ValueError if cmd is empty — podman exec with
no argv is never a valid request and catching it here avoids a
later IndexError in the debug log.
Source code in src/terok_sandbox/runtime/podman.py
exec_stdio(container, cmd, *, stdin, stdout, stderr=None, env=None, timeout=None)
¶
Bridge byte streams to podman exec -i for cmd inside container.
Synchronous: spawns the child, runs three daemon pump threads
(one per stream) copying bytes until either side reaches
EOF or the child exits, joins the pumps, returns the exit code.
Async callers drive this via
run_in_executor.
Lets FileNotFoundError (podman missing) propagate. On
timeout, terminates the child (terminate → 2 s wait → kill) and
re-raises TimeoutExpired.
Source code in src/terok_sandbox/runtime/podman.py
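The pump-thread pattern and the terminate, wait, kill escalation can be sketched generically. This is not the code in podman.py: the function name `bridge_stdio` is hypothetical, the child is an arbitrary argv instead of `podman exec -i`, and the join timeout is an assumed value.

```python
import io
import subprocess
import sys
import threading


def _pump(src, dst, *, close_dst=False):
    """Copy bytes from src to dst until src reaches EOF or dst goes away."""
    try:
        # read1 returns as soon as any bytes are available, so interactive
        # traffic is forwarded without waiting for a full chunk.
        while chunk := src.read1(4096):
            dst.write(chunk)
            dst.flush()
    except (OSError, ValueError):
        pass  # the peer closed its end; stop forwarding quietly
    finally:
        if close_dst:
            try:
                dst.close()  # signals EOF on the child's stdin
            except (OSError, ValueError):
                pass


def bridge_stdio(argv, *, stdin, stdout, stderr=None, timeout=None):
    """Run argv with its stdio bridged to byte streams; return the exit code."""
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE if stderr is not None else subprocess.DEVNULL,
    )
    pumps = [
        threading.Thread(target=_pump, args=(stdin, proc.stdin),
                         kwargs={"close_dst": True}, daemon=True),
        threading.Thread(target=_pump, args=(proc.stdout, stdout), daemon=True),
    ]
    if stderr is not None:
        pumps.append(
            threading.Thread(target=_pump, args=(proc.stderr, stderr), daemon=True)
        )
    for pump in pumps:
        pump.start()
    try:
        code = proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.terminate()              # polite request first
        try:
            proc.wait(timeout=2)      # 2 s grace period, as described above
        except subprocess.TimeoutExpired:
            proc.kill()               # escalate
            proc.wait()
        raise
    for pump in pumps:
        pump.join(timeout=5)          # assumption: bounded join for stuck streams
    return code


out = io.BytesIO()
child = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"]
code = bridge_stdio(child, stdin=io.BytesIO(b"hello"), stdout=out)
```

From asyncio code, a blocking function of this shape would typically be driven with `loop.run_in_executor(None, ...)`, which matches the note about async callers.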
force_remove(containers)
¶
Best-effort podman rm -f of each container.
Continues through individual failures. An already-absent container counts as removed — the post-condition holds.
Source code in src/terok_sandbox/runtime/podman.py
reserve_port(host='127.0.0.1')
¶
container_states(prefix)
¶
Return {container_name: state} for matching containers.
Optimisation over [c.state for c in containers_with_prefix(prefix)]
— single podman ps -a instead of N inspects. Backend-specific;
not part of the ContainerRuntime protocol.
Source code in src/terok_sandbox/runtime/podman.py
container_rw_sizes(prefix)
¶
Return {container_name: rw_bytes} for matching containers.
Single podman ps --size call — --size is expensive (overlay
diffs) but one bulk call beats N inspects. Backend-specific; not
part of the ContainerRuntime protocol.
Source code in src/terok_sandbox/runtime/podman.py
Container
¶
Bases: Protocol
Handle to a container managed by a ContainerRuntime.
Handles are cheap — construction does not verify that the container
exists. Operations return sensible defaults (None, False, [])
when the underlying container is absent, matching podman's own semantics.
name
instance-attribute
¶
state
property
¶
Lifecycle state ("running", "exited", ...) or None.
running
property
¶
Shortcut: state == "running".
image
property
¶
Handle to the image this container was created from, or None.
rw_size
property
¶
Writable-layer size in bytes, or None if unavailable.
start()
¶
Start the container. Raises RuntimeError on failure.
stop(*, timeout=10)
¶
wait(timeout=None)
¶
Block until the container exits; return its exit code.
Raises TimeoutError when timeout elapses.
copy_in(src, dest)
¶
login_command(*, command=())
¶
Return an argv suitable for os.execvp to attach interactively.
Empty command uses the backend default (typically tmux
new-session -A -s main).
Source code in src/terok_sandbox/runtime/protocol.py
logs(*, follow=False, tail=None)
¶
stream_initial_logs(ready_check, timeout_sec)
¶
Stream logs until ready_check returns True or timeout_sec elapses.
Returns True if the ready marker was seen, False on timeout.
Each line is printed to stdout as it is received.
Source code in src/terok_sandbox/runtime/protocol.py
ContainerRemoveResult(name, removed, error=None)
dataclass
¶
Per-container outcome from ContainerRuntime.force_remove.
ContainerRuntime
¶
Bases: Protocol
The container runtime — factory for handles, plus operations that have no single-object receiver.
One instance per process, typically constructed at the top-level entry
point and threaded down through higher layers (Sandbox, executor's
AgentRunner, terok's CLI/TUI).
container(name)
¶
Return a handle to the container named name.
Does not verify existence; call Container.state for that.
containers_with_prefix(prefix)
¶
image(ref)
¶
Return a handle to the image identified by tag or ID ref.
Does not verify existence; call Image.exists for that.
images(*, dangling_only=False)
¶
Enumerate local images.
dangling_only narrows to untagged images (those listed as
<none>:<none>).
exec(container, cmd, *, timeout=None)
¶
Run cmd inside container and return its completion record.
The operation that diverges most across backends: podman uses
podman exec; the krun backend uses SSH over a passt-forwarded
TCP port.
Source code in src/terok_sandbox/runtime/protocol.py
exec_stdio(container, cmd, *, stdin, stdout, stderr=None, env=None, timeout=None)
¶
Run cmd inside container with stdio bridged to caller-supplied streams.
Forwards bytes bidirectionally between stdin/stdout/stderr and the
spawned process — distinct from exec, which captures output into
an ExecResult. Used by the host-side ACP proxy to bridge a Unix
socket to an in-container ACP-stdio agent without the runtime ever
materialising the conversation.
Blocks until the child exits; returns the exit code. EOF on either side terminates forwarding cleanly. Implementations are expected to be transport-agnostic — stdin/stdout are arbitrary byte streams (a socket's file-object face, a pipe end, a test buffer).
Source code in src/terok_sandbox/runtime/protocol.py
force_remove(containers)
¶
Forcibly stop and remove containers.
Best-effort — continues through individual failures and returns
one ContainerRemoveResult per input. An already-absent
container counts as removed (the post-condition holds).
Source code in src/terok_sandbox/runtime/protocol.py
reserve_port(host='127.0.0.1')
¶
Reserve a free TCP port on host.
The returned PortReservation exposes the port number via
reservation.port and releases the socket on close. Use it to
pass a pre-reserved port to an external process.
Source code in src/terok_sandbox/runtime/protocol.py
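One common way to implement such a reservation is to bind a socket to port 0 and keep it open. The class name `LoopbackPortReservation` is hypothetical and the sketch is not taken from the library; it only shows the hold-then-release behaviour described above.

```python
import socket


class LoopbackPortReservation:
    """Hypothetical reservation: holds a bound socket until closed."""

    def __init__(self, host="127.0.0.1"):
        self._sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # Port 0 asks the kernel to pick any free port on this host.
        self._sock.bind((host, 0))
        self.port = self._sock.getsockname()[1]

    def close(self):
        # Releasing the socket frees the port for the external process.
        self._sock.close()

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()


with LoopbackPortReservation() as reservation:
    reserved_port = reservation.port  # hand this number to the external process
```

There is an unavoidable window between closing the reservation and the external process binding the port, so the reservation is best closed immediately before the hand-off.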
ExecResult(exit_code, stdout, stderr)
dataclass
¶
Outcome of ContainerRuntime.exec.
Backend-neutral so the SSH-over-passt krun backend can fill it from
an SSH response without pretending to be a subprocess.CompletedProcess.
Image
¶
Bases: Protocol
Handle to a local container image. Cheap to construct.
ref
instance-attribute
¶
Tag ("terok-l2-cli:abcd") or ID ("sha256:...") used on lookup.
id
property
¶
Resolved image ID, or None if the image is not present.
repository
property
¶
Repository portion of the tag ("<none>" for dangling).
tag
property
¶
Tag portion ("<none>" for dangling).
size
property
¶
Podman-rendered human-readable size ("1.2GB").
created
property
¶
Podman-rendered creation timestamp.
exists()
¶
labels()
¶
history()
¶
LogStream
¶
Bases: Protocol
Context-managed iterator over decoded log lines.
__exit__ releases the backing process (or the krun-backend
equivalent). Safe to use in a with block wrapped around a for line in
stream loop.
PortReservation
¶
Bases: Protocol
Context manager for a reserved host-side TCP port.
The port is held open for the lifetime of the reservation; closing releases it. Use it to pass a port number to an external process that will bind it shortly.
podman_port_resolver(*, guest_port=DEFAULT_GUEST_SSHD_PORT, host=DEFAULT_SSH_HOST)
¶
Return a resolver that reads the forwarded host port via podman port.
The orchestrator launches the container with -p <reserved>:22;
podman already records that mapping in its own metadata, so this
resolver just asks for it back — no terok-private annotation in the
middle. podman port <name> <guest_port>/tcp emits a single
<host_ip>:<host_port> line per matching mapping, which is
exactly what we need.
The resolved host is overridden to host (loopback by default) so
the SSH connect goes through 127.0.0.1 even when pasta bound
the forward to 0.0.0.0; trusting whatever podman reports would
open the door to reaching the guest via a routable interface.
Source code in src/terok_sandbox/runtime/krun_transport.py
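The parsing and host-override step can be sketched as a pure function over the text that `podman port` prints. The function name `parse_podman_port` and the choice of `LookupError` for an empty result are assumptions; invoking podman itself is left out so the example runs anywhere.

```python
def parse_podman_port(output, *, host="127.0.0.1"):
    """Take the first `<host_ip>:<host_port>` line; pin the host to `host`."""
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # rpartition splits on the last colon, which also copes with
        # bracketed IPv6 forms such as "[::]:40123".
        _reported_host, _, port_text = line.rpartition(":")
        port = int(port_text)
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")
        # The reported host is deliberately discarded in favour of `host`.
        return host, port
    raise LookupError("no port mapping reported")


endpoint = parse_podman_port("0.0.0.0:40123\n")
# endpoint == ("127.0.0.1", 40123)
```

Discarding the reported address is the point of the override: even if the forward was bound to `0.0.0.0`, the connection target stays on loopback.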
check_gpu_available()
¶
Return True when a CDI spec declares the nvidia.com/gpu kind.
Wizards call this to decide whether to offer the NVIDIA base image;
the on-launch check_gpu_error
path is the authoritative one and stays in place. Any failure
(missing podman, missing CDI dirs, unreadable spec) collapses to
False so callers can treat this as a pure yes/no signal.