Terok executor

terok_executor

terok-executor: single-agent task runner for hardened Podman containers.

Builds agent images, launches instrumented containers, and manages the lifecycle of one AI coding agent at a time. Designed for standalone use (terok-executor run claude .) and as a library for terok orchestration.

The public surface is __all__ below. Key entry points:

Implementation-detail types (raw config schema fragments, ACP error classes, internal result types, sidecar image / inject helpers) stay in their submodules; reach into terok_executor.<sub> when you need them.

__version__ = _meta_version('terok-executor') module-attribute

COMMANDS = CommandTree(OWN_COMMANDS + (CommandDef(name='sandbox', help='Sandbox subsystem (full deep tree — same verbs as terok-sandbox)', children=(SANDBOX_TREE.roots)),) + VAULT_COMMANDS) module-attribute

AGENT_COMMANDS = (RUN_COMMAND, RUN_TOOL_COMMAND, AUTH_COMMAND, AGENTS_COMMAND, BUILD_COMMAND, SETUP_COMMAND, UNINSTALL_COMMAND, LIST_COMMAND, STOP_COMMAND, SHOW_CONFIG_COMMAND, ACP_COMMAND) module-attribute

AGENTS_LABEL = 'ai.terok.agents' module-attribute

OCI label naming the roster entries baked into an L1 image.

DEFAULT_BASE_IMAGE = 'fedora:44' module-attribute

Default base OS image when none is specified.

AUTH_PROVIDERS = {} module-attribute

All known auth providers (agents + tools), keyed by name. Loaded from resources/agents/*.yaml.

VAULT_COMMANDS = (SANDBOX_TREE.find_at(('vault',)),) module-attribute

AGENT_PROVIDERS = {} module-attribute

All agent providers, keyed by name. Loaded from resources/agents/*.yaml.

PROVIDER_NAMES = () module-attribute

__all__ = ['__version__', 'ACPEndpointStatus', 'acp_socket_is_live', 'list_authenticated_agents', 'AGENT_PROVIDERS', 'AgentProvider', 'CLIOverrides', 'PROVIDER_NAMES', 'get_provider', 'resolve_provider_value', 'AgentConfigSpec', 'parse_md_agent', 'prepare_agent_config_dir', 'AUTH_PROVIDERS', 'Authenticator', 'AuthSession', 'prepare_oauth_session', 'store_api_key', 'bundled_default_instructions', 'resolve_instructions', 'ConfigScope', 'ConfigStack', 'ExecutorConfigView', 'RawImageSection', 'AGENTS_LABEL', 'DEFAULT_BASE_IMAGE', 'BuildError', 'ImageBuilder', 'ImageSet', 'build_project_image', 'scan_leaked_credentials', 'AgentRoster', 'AGENT_COMMANDS', 'COMMANDS', 'VAULT_COMMANDS', 'SharedMountStorageInfo', 'TaskStorageInfo', 'AgentRunner', 'ContainerEnvSpec', 'assemble_container_env', 'inject_prompt', 'seed_workspace_from_clone_cache', 'ensure_sandbox_ready', 'KrunHost', 'KrunHostKeypair', 'ensure_krun_host_keypair'] module-attribute

ACPEndpointStatus

Bases: StrEnum

Live state of a per-task ACP endpoint.

The host classifier (Project.acp_endpoints) attaches one of these to every running task; the value drives both the rendered row in acp list and the decision acp connect makes about whether to spawn a daemon.

ACTIVE = 'active' class-attribute instance-attribute

Daemon up, socket bound, ready for client connections.

READY = 'ready' class-attribute instance-attribute

Task running with at least one authenticated agent — a daemon will spawn on first terok acp connect.

UNSUPPORTED = 'unsupported' class-attribute instance-attribute

Task running but no in-image agents are authenticated. Connect would fail; surface honestly so the user knows to authenticate.
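As a rough illustration of how the three states relate, the sketch below mirrors the documented values and derives a state from two inputs. `EndpointStatus`, `classify_endpoint`, and its parameters are hypothetical stand-ins; the real classification lives in Project.acp_endpoints and may consider more than this. The stand-in subclasses `str` and `Enum` so it also runs on Python versions before 3.11, whereas the documented class derives from StrEnum.

```python
from enum import Enum


class EndpointStatus(str, Enum):
    """Stand-in mirroring the documented ACPEndpointStatus values."""

    ACTIVE = "active"
    READY = "ready"
    UNSUPPORTED = "unsupported"


def classify_endpoint(socket_live: bool, authenticated_agents: set) -> EndpointStatus:
    """Hypothetical classifier: pick a status for one running task."""
    if socket_live:
        # Daemon is up and bound; clients can connect right away.
        return EndpointStatus.ACTIVE
    if authenticated_agents:
        # No daemon yet, but a connect attempt could spawn one.
        return EndpointStatus.READY
    # Nothing in the image is authenticated, so a connect would fail.
    return EndpointStatus.UNSUPPORTED
```

Because the members are strings, a caller can compare a status directly against `"ready"` when rendering a list row.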

ExecutorConfigView

Bases: SandboxConfigView

The slice of config.yml that the executor owns, plus the sections the sandbox owns (inherited transitively).

Inherits all eight sandbox-owned sections from SandboxConfigView and adds the executor-owned image: section. extra="allow" keeps the view tolerant of foreign top-level keys (terok's tui: / logs: / tasks: / git: / hooks:), so standalone terok-executor run flows don't crash on a complete ecosystem config and there is no need to vendor a list of terok's section names here.

terok's RawGlobalConfig inherits from this class and flips back to extra="forbid": the topmost layer knows every section, so a typo at the top level is caught there.

The class also exposes staticmethods for reading and writing the image: section on disk: image_agents(), image_base_image(), and set_image_agents(selection). The schema thus owns both the shape and the canonical accessors for its owned section, rather than scattering one helper per operation across a separate config module.

model_config = ConfigDict(extra='allow') class-attribute instance-attribute

image = Field(default_factory=RawImageSection) class-attribute instance-attribute

image_agents() staticmethod

Return the effective image.agents, or None when unset.

None distinguishes "field absent" from "all" (the explicit "every roster entry" selector).

Source code in src/terok_executor/config_schema.py
@staticmethod
def image_agents() -> str | None:
    """Return the effective ``image.agents``, or ``None`` when unset.

    ``None`` distinguishes "field absent" from ``"all"`` (the
    explicit "every roster entry" selector).
    """
    from terok_util import read_config_section

    return read_config_section("image").get("agents") or None

image_base_image() staticmethod

Return the explicit image.base_image, or None when unset.

Callers apply the schema fallback themselves (DEFAULT_BASE_IMAGE) — keeping that constant out of this module preserves the foundation-layer boundary (schema sits below container/build).

Source code in src/terok_executor/config_schema.py
@staticmethod
def image_base_image() -> str | None:
    """Return the explicit ``image.base_image``, or ``None`` when unset.

    Callers apply the schema fallback themselves
    ([`DEFAULT_BASE_IMAGE`][terok_executor.DEFAULT_BASE_IMAGE]) —
    keeping that constant out of this module preserves the
    foundation-layer boundary (schema sits below container/build).
    """
    from terok_util import read_config_section

    return read_config_section("image").get("base_image") or None

set_image_agents(selection) staticmethod

Write selection into image.agents and return the file path.

Caller validates selection up-front (typically via AgentRoster.validate_selection).

Invalidates terok-util's process-wide read_config_section cache before returning so the next image_agents() / image_base_image() call observes the freshly-written value rather than the in-memory snapshot taken before the write.

Source code in src/terok_executor/config_schema.py
@staticmethod
def set_image_agents(selection: str) -> Path:
    """Write *selection* into ``image.agents`` and return the file path.

    Caller validates *selection* up-front (typically via
    [`AgentRoster.validate_selection`][terok_executor.AgentRoster.validate_selection]).

    Invalidates terok-util's process-wide ``read_config_section``
    cache before returning so the next ``image_agents()`` /
    ``image_base_image()`` call observes the freshly-written value
    rather than the in-memory snapshot taken before the write.
    """
    from terok_util import paths as _util_paths

    from terok_executor.config import writable_config_path
    from terok_executor.integrations.sandbox import yaml_update_section

    path = writable_config_path()
    yaml_update_section(path, "image", {"agents": selection})
    _util_paths._config_section_cache.clear()
    return path
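The write-then-invalidate pattern described for set_image_agents can be sketched with the standard library alone. Everything here is an assumption made for illustration: `read_section`, `update_section`, and `_section_cache` are hypothetical names, and JSON stands in for YAML so that the example has no third-party dependencies.

```python
import json
import tempfile
from pathlib import Path

_section_cache: dict = {}  # process-wide cache, keyed by section name


def read_section(path: Path, section: str) -> dict:
    """Return one config section, reading the file only on a cache miss."""
    if section not in _section_cache:
        data = json.loads(path.read_text()) if path.exists() else {}
        _section_cache[section] = data.get(section, {})
    return _section_cache[section]


def update_section(path: Path, section: str, values: dict) -> Path:
    """Merge *values* into *section* on disk, then drop the stale cache."""
    data = json.loads(path.read_text()) if path.exists() else {}
    data.setdefault(section, {}).update(values)
    path.write_text(json.dumps(data))
    _section_cache.clear()  # the next read observes the freshly written value
    return path


def demo() -> tuple:
    """Read, write, and read again to show the cache being refreshed."""
    with tempfile.TemporaryDirectory() as tmp:
        cfg = Path(tmp) / "config.json"
        before = read_section(cfg, "image").get("agents")
        update_section(cfg, "image", {"agents": "all,-vibe"})
        after = read_section(cfg, "image").get("agents")
    return before, after
```

Without the `clear()` call, the second read in `demo` would return the empty snapshot cached before the write.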

RawImageSection

Bases: BaseModel

The image: section — base image, agent roster, Dockerfile snippets.

Strict on its own keys (extra="forbid"). Same shape used in both the global config.yml (defaults across projects) and per-project project.yml (project overrides).

model_config = ConfigDict(extra='forbid') class-attribute instance-attribute

base_image = Field(default='fedora:44', description='Base container image for builds') class-attribute instance-attribute

family = Field(default=None, description='Package family for the L0/L1 build (``deb`` or ``rpm``). Leave unset to auto-detect from *base_image*; set explicitly when the image is outside the known allowlist.') class-attribute instance-attribute

agents = Field(default=None, description='Comma-separated roster entries to install in L1, or "all". Prefix a name with "-" to exclude it from the selection (e.g. "all,-vibe" or just "-vibe" — both mean "everything except vibe"). Inherits from the global config when unset.') class-attribute instance-attribute

user_snippet_inline = Field(default=None, description='Inline Dockerfile snippet injected into the project image') class-attribute instance-attribute

user_snippet_file = Field(default=None, description='Path to a file containing a Dockerfile snippet') class-attribute instance-attribute
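The selection syntax documented for the agents field ("all", comma-separated names, and a "-" prefix for exclusions) can be illustrated with a small parser. This is a sketch under stated assumptions, not the logic behind AgentRoster.validate_selection: `resolve_selection` is a hypothetical helper, and the roster in the usage note is made up from names that appear in this reference.

```python
def resolve_selection(selection: str, roster: tuple) -> tuple:
    """Expand a selection such as "all,-vibe" against *roster* (hypothetical helper)."""
    tokens = [t.strip() for t in selection.split(",") if t.strip()]
    if not tokens:
        raise ValueError("empty selection")

    included = [t for t in tokens if not t.startswith("-")]
    excluded = {t[1:] for t in tokens if t.startswith("-")}
    named = set(included) - {"all"}

    unknown = (named | excluded) - set(roster)
    if unknown:
        raise ValueError(f"unknown roster entries: {sorted(unknown)}")

    # "all" or an exclusion-only selection starts from the full roster.
    base = set(roster) if ("all" in included or not included) else named
    selected = base - excluded
    # Preserve roster order so the result is deterministic.
    return tuple(name for name in roster if name in selected)
```

With a roster of `("claude", "codex", "vibe")`, both `"all,-vibe"` and `"-vibe"` resolve to `("claude", "codex")`, matching the equivalence stated in the field description.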

BuildError

Bases: RuntimeError

Raised when base-image construction cannot complete.

The CLI maps this to a user-facing error message; library callers can catch it without being terminated by SystemExit.
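The split between library and CLI handling might look like the sketch below. The `BuildError` class here is a local stand-in with the same base class as the documented one; `build_images`, `library_call`, and `cli_main` are hypothetical functions invented for the example.

```python
import sys


class BuildError(RuntimeError):
    """Stand-in for the documented exception (same base class)."""


def build_images(base_image: str) -> str:
    """Hypothetical build step that fails for an empty base image."""
    if not base_image:
        raise BuildError("no base image given")
    return f"built from {base_image}"


def library_call(base_image: str) -> str:
    """Library callers catch the error and keep running."""
    try:
        return build_images(base_image)
    except BuildError as exc:
        return f"recovered: {exc}"


def cli_main(base_image: str) -> None:
    """A CLI wrapper turns the same error into a message and an exit code."""
    try:
        print(build_images(base_image))
    except BuildError as exc:
        print(f"error: {exc}", file=sys.stderr)
        raise SystemExit(1) from exc
```

Raising a RuntimeError subclass from the build layer keeps the decision to terminate the process in the outermost caller.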

ImageBuilder(base_image=DEFAULT_BASE_IMAGE, family=None) dataclass

Build pipeline for terok agent container images.

Holds the (base_image, family) the L0/L1/L2 build stack is anchored on. Build operations are instance-bound; pure family detection, image introspection, and resource staging stay as static methods — they don't depend on the builder's state.

Two scopes of operations:

  • Instance methods — apply self.base_image and self.family to a podman build (build_base, build_sidecar, ensure_default_l1), tag computations (l0_tag, l1_tag, l1_sidecar_tag), and Dockerfile rendering (render_l0, render_l1, render_l1_sidecar).
  • Static methods — pure helpers that operate on arbitrary inputs: detect_family, image_agents, stage_scripts, stage_tmux_config, stage_toad_agents.

base_image = DEFAULT_BASE_IMAGE class-attribute instance-attribute

Base OS image the L0 layer FROMs (e.g. fedora:44).

family = None class-attribute instance-attribute

Package family override ("deb" / "rpm") — auto-detected when None.

l0_tag property

L0 image tag for self.base_image.

l1_sidecar_tag property

L1 sidecar image tag for self.base_image.

build_base(*, agents='all', rebuild=False, full_rebuild=False, build_dir=None, tag_as_default=False)

Build L0 + L1 images for agents; returns the resulting tag pair.

See module-level build_base_images for the parameter contract.

Source code in src/terok_executor/container/build.py
def build_base(
    self,
    *,
    agents: str | tuple[str, ...] = "all",
    rebuild: bool = False,
    full_rebuild: bool = False,
    build_dir: Path | None = None,
    tag_as_default: bool = False,
) -> ImageSet:
    """Build L0 + L1 images for *agents*; returns the resulting tag pair.

    See module-level ``build_base_images`` for the parameter contract.
    """
    return build_base_images(
        self.base_image,
        family=self.family,
        agents=agents,
        rebuild=rebuild,
        full_rebuild=full_rebuild,
        build_dir=build_dir,
        tag_as_default=tag_as_default,
    )

build_sidecar(*, tool_name='coderabbit', rebuild=False, full_rebuild=False, build_dir=None)

Build the L1 sidecar image variant for tool_name; returns the tag.

Source code in src/terok_executor/container/build.py
def build_sidecar(
    self,
    *,
    tool_name: str = "coderabbit",
    rebuild: bool = False,
    full_rebuild: bool = False,
    build_dir: Path | None = None,
) -> str:
    """Build the L1 sidecar image variant for *tool_name*; returns the tag."""
    return build_sidecar_image(
        self.base_image,
        family=self.family,
        tool_name=tool_name,
        rebuild=rebuild,
        full_rebuild=full_rebuild,
        build_dir=build_dir,
    )

ensure_default_l1(agents='all')

Return the default-alias L1 tag, building the default L1 if absent.

Source code in src/terok_executor/container/build.py
def ensure_default_l1(self, agents: str | tuple[str, ...] = "all") -> str:
    """Return the default-alias L1 tag, building the default L1 if absent."""
    return ensure_default_l1(self.base_image, family=self.family, agents=agents)

l1_tag(agents=None)

L1 image tag for agents under self.base_image (alias when None).

Source code in src/terok_executor/container/build.py
def l1_tag(self, agents: tuple[str, ...] | None = None) -> str:
    """L1 image tag for *agents* under ``self.base_image`` (alias when ``None``)."""
    return l1_image_tag(self.base_image, agents)

render_l0()

Render the L0 Dockerfile for this base.

Instance-bound because L0 is anchored on self.base_image.

Source code in src/terok_executor/container/build.py
def render_l0(self) -> str:
    """Render the L0 Dockerfile for this base.

    Instance-bound because L0 is anchored on ``self.base_image``.
    """
    return render_l0(self.base_image, family=self._family)

render_l1(l0_tag, *, family, agents='all', cache_bust='0') staticmethod

Render the L1 CLI Dockerfile for agents on top of l0_tag.

Static because the L1 stage depends only on the L0 tag and the package family — it never touches self.base_image. Callers that already have an ImageBuilder can pass builder._family to thread the resolved family through.

Source code in src/terok_executor/container/build.py
@staticmethod
def render_l1(
    l0_tag: str,
    *,
    family: str,
    agents: tuple[str, ...] | str = "all",
    cache_bust: str = "0",
) -> str:
    """Render the L1 CLI Dockerfile for *agents* on top of *l0_tag*.

    Static because the L1 stage depends only on the L0 tag and the
    package family — it never touches ``self.base_image``.  Callers
    that already have an [`ImageBuilder`][terok_executor.container.build.ImageBuilder]
    can pass ``builder._family`` to thread the resolved family through.
    """
    return render_l1(l0_tag, family=family, agents=agents, cache_bust=cache_bust)

render_l1_sidecar(l0_tag, *, family, tool_name='coderabbit', cache_bust='0') staticmethod

Render the L1 sidecar Dockerfile for tool_name on top of l0_tag.

Static for the same reason as render_l1.

Source code in src/terok_executor/container/build.py
@staticmethod
def render_l1_sidecar(
    l0_tag: str,
    *,
    family: str,
    tool_name: str = "coderabbit",
    cache_bust: str = "0",
) -> str:
    """Render the L1 sidecar Dockerfile for *tool_name* on top of *l0_tag*.

    Static for the same reason as [`render_l1`][terok_executor.container.build.ImageBuilder.render_l1].
    """
    return render_l1_sidecar(l0_tag, family=family, tool_name=tool_name, cache_bust=cache_bust)

detect_family(base_image, override=None) staticmethod

Resolve the package family ("deb" / "rpm") for base_image.

Source code in src/terok_executor/container/build.py
@staticmethod
def detect_family(base_image: str, override: str | None = None) -> str:
    """Resolve the package family (``"deb"`` / ``"rpm"``) for *base_image*."""
    return detect_family(base_image, override)
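Family resolution with an override and an allowlist fallback could be sketched as follows. The allowlist contents, the name `detect_family_sketch`, and the error behaviour are all assumptions; this reference does not show the real table or how unknown images are reported.

```python
# Hypothetical allowlist; the real table is not shown in this reference.
KNOWN_FAMILIES = {"fedora": "rpm", "debian": "deb", "ubuntu": "deb"}


def detect_family_sketch(base_image: str, override=None) -> str:
    """Return "deb" or "rpm" for *base_image*, honouring an explicit override."""
    if override is not None:
        if override not in ("deb", "rpm"):
            raise ValueError(f"unsupported family: {override!r}")
        return override
    # Reduce "registry/path/name:tag" to the bare image name.
    name = base_image.rsplit("/", 1)[-1].split(":", 1)[0]
    try:
        return KNOWN_FAMILIES[name]
    except KeyError:
        raise ValueError(
            f"cannot detect family for {base_image!r}; set it explicitly"
        ) from None
```

An image outside the allowlist then needs the explicit override, which is the situation the family field description refers to.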

image_agents(image) staticmethod

Return roster agent names from an L1 image's ai.terok.agents label.

Source code in src/terok_executor/container/build.py
@staticmethod
def image_agents(image: str) -> set[str]:
    """Return roster agent names from an L1 image's ``ai.terok.agents`` label."""
    return image_agents(image)

stage_scripts(dest) staticmethod

Stage shell helper scripts (hilfe, terok-*) into dest.

Source code in src/terok_executor/container/build.py
@staticmethod
def stage_scripts(dest: Path) -> None:
    """Stage shell helper scripts (``hilfe``, ``terok-*``) into *dest*."""
    stage_scripts(dest)

stage_tmux_config(dest) staticmethod

Stage the tmux config into dest.

Source code in src/terok_executor/container/build.py
@staticmethod
def stage_tmux_config(dest: Path) -> None:
    """Stage the tmux config into *dest*."""
    stage_tmux_config(dest)

stage_toad_agents(dest) staticmethod

Stage toad agent metadata into dest.

Source code in src/terok_executor/container/build.py
@staticmethod
def stage_toad_agents(dest: Path) -> None:
    """Stage toad agent metadata into *dest*."""
    stage_toad_agents(dest)

ImageSet(l0, l1, l1_sidecar=None) dataclass

L0 + L1 image tags produced by a build.

l0 instance-attribute

L0 base dev image tag (e.g. terok-l0:fedora-44).

l1 instance-attribute

L1 agent CLI image tag (e.g. terok-l1-cli:fedora-44).

l1_sidecar = None class-attribute instance-attribute

L1 sidecar image tag, if built (e.g. terok-l1-sidecar:fedora-44).

ContainerEnvSpec(task_id, provider_name, workspace_host_path, code_repo=None, clone_from=None, branch=None, git_author_name=None, git_author_email=None, git_committer_name=None, git_committer_email=None, authorship='agent', human_name='Nobody', human_email='nobody@localhost', credential_scope='standalone', credential_set='default', vault_transport='direct', vault_required=False, scan_leaked_creds=False, enabled_vault_patch_providers=None, disabled_vault_patch_providers=None, expose_credential_providers=frozenset(), unrestricted=True, timezone=None, agent_config_dir=None, shared_dir=None, shared_mount='/shared', task_dir=None, envs_dir=None, extra_volumes=()) dataclass

Specification for container environment assembly.

All fields use primitives or Path — no terok-specific types. Callers pre-resolve domain-specific decisions (security class, authorship mode, SSH mount, gate mirror creation) and pass results here.

task_id instance-attribute

Unique task identifier.

provider_name instance-attribute

Agent provider name (e.g. "claude", "codex").

workspace_host_path instance-attribute

Host-side workspace directory — caller pre-creates, mounted as /workspace:Z.

code_repo = None class-attribute instance-attribute

Git URL to clone inside the container (→ CODE_REPO).

clone_from = None class-attribute instance-attribute

Secondary clone source for online-mode gate optimization (→ CLONE_FROM).

branch = None class-attribute instance-attribute

Git branch to check out (→ GIT_BRANCH).

git_author_name = None class-attribute instance-attribute

Resolved from roster provider if None.

git_author_email = None class-attribute instance-attribute

git_committer_name = None class-attribute instance-attribute

git_committer_email = None class-attribute instance-attribute

authorship = 'agent' class-attribute instance-attribute

Authorship mode consumed by in-container wrappers (→ TEROK_GIT_AUTHORSHIP).

human_name = 'Nobody' class-attribute instance-attribute

Human operator name (→ HUMAN_GIT_NAME). terok resolves from project config / git config; standalone uses the default or --git-identity-from-host.

human_email = 'nobody@localhost' class-attribute instance-attribute

Human operator email (→ HUMAN_GIT_EMAIL).

credential_scope = 'standalone' class-attribute instance-attribute

Scope for vault token creation. terok passes project.id.

credential_set = 'default' class-attribute instance-attribute

Vault storage namespace to read credentials from. Pairs with Authenticator.run's credential_set — if the auth flow stored a token under set foo, the runtime must read from set foo too or the container will see empty env. Default "default" matches the shared host-wide bucket every standalone caller uses; terok overrides for per-project credentials.

vault_transport = 'direct' class-attribute instance-attribute

Vault transport mode: "direct" (HTTP base URL) or "socket" (Unix socket path via socket_env).

vault_required = False class-attribute instance-attribute

When True, raise SystemExit if the vault is unreachable. When False (default), soft-fail to empty env.

scan_leaked_creds = False class-attribute instance-attribute

When True, scan shared mounts for real credential files and emit warnings. Standalone mode defaults to off; terok enables this.

enabled_vault_patch_providers = None class-attribute instance-attribute

Provider subset whose shared config patches should be applied.

None means "all providers with patches". An empty set disables vault config patching entirely. terok uses this to gate experimental OAuth routing without affecting standalone executor defaults.

disabled_vault_patch_providers = None class-attribute instance-attribute

Provider subset whose previously managed config patch values should be removed if still owned by terok. None removes nothing.

expose_credential_providers = frozenset() class-attribute instance-attribute

Providers whose credential file should remain writable in-container.

By default every provider with a vault.credential_file gets the file mounted read-only on top of its shared config dir, so an in-container /login cannot taint the host copy (terok-ai/terok#873). Providers in this set keep the writable bind — used by terok's experimental expose_oauth_token mode where the agent intentionally manages its own token.

unrestricted = True class-attribute instance-attribute

Enable auto-approve flags for all agents.

timezone = None class-attribute instance-attribute

IANA timezone name propagated to the container as TZ.

None (the default) means detect the host's timezone via terok_executor._util.detect_host_timezone — the container then follows the host. Pass an explicit string ("UTC", "Europe/Prague") to override, including to pin the container to UTC for reproducible runs. If neither detection nor an override yields a zone, TZ is not set and the image default applies.

agent_config_dir = None class-attribute instance-attribute

Pre-prepared agent config directory (→ /home/dev/.terok:Z).

shared_dir = None class-attribute instance-attribute

Host-side shared directory. Created by the assembly function if set.

shared_mount = '/shared' class-attribute instance-attribute

Container-side mount point for the shared directory.

task_dir = None class-attribute instance-attribute

Host-side task directory. A temp dir is created if None.

envs_dir = None class-attribute instance-attribute

Base directory for shared config mounts. Uses paths.mounts_dir if None.

extra_volumes = () class-attribute instance-attribute

Additional volume specs from the caller (e.g. SSH mounts from terok).
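Several fields above name the environment variable they map to (the "→" annotations). The sketch below collects just those mappings into one function. `EnvSpecSketch` and `env_from_spec` are hypothetical; the real assemble_container_env handles many more fields (vault, mounts, host timezone detection) that are omitted here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EnvSpecSketch:
    """Reduced stand-in carrying only the fields mapped below."""

    code_repo: Optional[str] = None
    clone_from: Optional[str] = None
    branch: Optional[str] = None
    authorship: str = "agent"
    human_name: str = "Nobody"
    human_email: str = "nobody@localhost"
    timezone: Optional[str] = None


def env_from_spec(spec: EnvSpecSketch) -> dict:
    """Build the env dict, leaving out variables whose field is None."""
    candidates = {
        "CODE_REPO": spec.code_repo,
        "CLONE_FROM": spec.clone_from,
        "GIT_BRANCH": spec.branch,
        "TEROK_GIT_AUTHORSHIP": spec.authorship,
        "HUMAN_GIT_NAME": spec.human_name,
        "HUMAN_GIT_EMAIL": spec.human_email,
        "TZ": spec.timezone,  # host detection is skipped in this sketch
    }
    return {key: value for key, value in candidates.items() if value is not None}
```

Dropping None values rather than emitting empty strings mirrors the documented timezone behaviour, where an unresolved zone leaves TZ unset so the image default applies.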

AgentRunner(*, sandbox=None, runtime=None, roster=None, base_image='fedora:44', family=None, cfg=None)

Composes sandbox + agent config into a single container launch.

All three run methods follow the same flow:

  1. Ensure L0+L1 images exist (build if missing)
  2. Prepare agent-config directory (wrapper, instructions, prompt)
  3. Assemble environment variables and volume mounts
  4. Optionally set up gate (mirror repo, create token)
  5. Launch container via podman
Source code in src/terok_executor/container/runner.py
def __init__(
    self,
    *,
    sandbox: Sandbox | None = None,
    runtime: ContainerRuntime | None = None,
    roster: AgentRoster | None = None,
    base_image: str = "fedora:44",
    family: str | None = None,
    cfg: SandboxConfig | None = None,
) -> None:
    if sandbox is not None and runtime is not None and sandbox.runtime is not runtime:
        # Split backends would mean port reservations on one runtime
        # get used by containers launched via a different runtime —
        # a subtle class of bug (``run_web`` vs ``sandbox.run``) that
        # is easier to rule out at construction time.
        raise ValueError(
            "AgentRunner: sandbox.runtime and runtime must be the same backend "
            "instance; pass only one or ensure sandbox was constructed with runtime"
        )
    self._base_image = base_image
    self._family = family
    self._sandbox: Sandbox | None = sandbox
    self._runtime: ContainerRuntime | None = runtime
    self._roster: AgentRoster | None = roster
    self._cfg: SandboxConfig | None = cfg

sandbox property

Lazy-init sandbox facade.

When an explicit runtime was supplied but no sandbox, the sandbox is constructed with that same runtime so the two share one backend instance.

runtime property

Return the container runtime used for observation and lifecycle.

Falls back to the sandbox's runtime when the caller did not supply one — keeps the two in sync by construction.

roster property

Lazy-init agent roster.
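The constructor check and the two lazy properties work together so that the sandbox and runtime always share one backend. A minimal sketch of that arrangement, using hypothetical stand-in classes (`FakeRuntime`, `FakeSandbox`, `RunnerSketch`) rather than the real Sandbox and ContainerRuntime types:

```python
class FakeRuntime:
    """Stand-in for a container runtime backend."""


class FakeSandbox:
    """Stand-in for the sandbox facade; creates a runtime if none is given."""

    def __init__(self, runtime=None):
        self.runtime = runtime if runtime is not None else FakeRuntime()


class RunnerSketch:
    def __init__(self, *, sandbox=None, runtime=None):
        if sandbox is not None and runtime is not None and sandbox.runtime is not runtime:
            raise ValueError("sandbox.runtime and runtime must be the same instance")
        self._sandbox = sandbox
        self._runtime = runtime

    @property
    def sandbox(self):
        # Built on first access, reusing an explicit runtime when present.
        if self._sandbox is None:
            self._sandbox = FakeSandbox(runtime=self._runtime)
        return self._sandbox

    @property
    def runtime(self):
        # Fall back to the sandbox's runtime so both stay in sync.
        if self._runtime is None:
            self._runtime = self.sandbox.runtime
        return self._runtime
```

Whichever of the two is supplied, the other is derived from it, so a mismatch can only arise when both are passed explicitly, and that case is rejected up front.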

run_headless(provider, repo, *, prompt, branch=None, model=None, max_turns=None, timeout=1800, gate=True, name=None, follow=False, unrestricted=True, gpu=False, memory=None, cpus=None, hooks=None, human_name=None, human_email=None, authorship=None, shared_dir=None, shared_mount='/shared', timezone=None, project_id='', task_id='', dossier_path=None)

Launch a headless agent run. Returns container name.

The agent executes the prompt against repo (local path or git URL) and exits when done or when timeout is reached. Set follow=True to block until the agent finishes (the CLI does this by default).

project_id, task_id, dossier_path propagate the terok orchestrator's identity into the per-container supervisor sidecar. Defaults preserve the standalone-executor case (no terok above).

Source code in src/terok_executor/container/runner.py
def run_headless(
    self,
    provider: str,
    repo: str,
    *,
    prompt: str,
    branch: str | None = None,
    model: str | None = None,
    max_turns: int | None = None,
    timeout: int = 1800,
    gate: bool = True,
    name: str | None = None,
    follow: bool = False,
    unrestricted: bool = True,
    gpu: bool = False,
    memory: str | None = None,
    cpus: str | None = None,
    hooks: LifecycleHooks | None = None,
    human_name: str | None = None,
    human_email: str | None = None,
    authorship: str | None = None,
    shared_dir: Path | None = None,
    shared_mount: str = "/shared",
    timezone: str | None = None,
    project_id: str = "",
    task_id: str = "",
    dossier_path: Path | str | None = None,
) -> str:
    """Launch a headless agent run. Returns container name.

    The agent executes the *prompt* against *repo* (local path or git URL)
    and exits when done or when *timeout* is reached.  Set *follow=True*
    to block until the agent finishes (the CLI does this by default).

    *project_id*, *task_id*, *dossier_path* propagate the terok
    orchestrator's identity into the per-container supervisor sidecar.
    Defaults preserve the standalone-executor case (no terok above).
    """
    return self._run(
        provider=provider,
        repo=repo,
        prompt=prompt,
        branch=branch,
        model=model,
        max_turns=max_turns,
        timeout=timeout,
        gate=gate,
        name=name,
        follow=follow,
        mode="headless",
        unrestricted=unrestricted,
        gpu=gpu,
        memory=memory,
        cpus=cpus,
        hooks=hooks,
        human_name=human_name,
        human_email=human_email,
        authorship=authorship,
        shared_dir=shared_dir,
        shared_mount=shared_mount,
        timezone=timezone,
        project_id=project_id,
        supervisor_task_id=task_id,
        dossier_path=dossier_path,
    )
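A caller-side sketch of a headless run, written against the documented keyword signature. `summarize_repo` and `RecordingRunner` are hypothetical: the double exists only so the example runs without Podman or terok-executor installed, and the container name it returns is made up. With the library installed, an AgentRunner instance could presumably be passed in its place.

```python
class RecordingRunner:
    """Test double exposing the documented run_headless keyword signature."""

    def __init__(self):
        self.calls = []

    def run_headless(self, provider, repo, *, prompt, timeout=1800, follow=False, **kwargs):
        self.calls.append(
            {"provider": provider, "repo": repo, "prompt": prompt,
             "timeout": timeout, "follow": follow, **kwargs}
        )
        return f"{provider}-task"  # made-up container name


def summarize_repo(runner, repo: str) -> str:
    """Hypothetical helper: run one headless prompt and wait for it to finish."""
    return runner.run_headless(
        "claude",
        repo,
        prompt="Summarize the open TODO comments in this repository.",
        timeout=900,  # seconds; the documented default is 1800
        follow=True,  # block until the agent exits, as the CLI does
        gate=True,
    )
```

Passing follow=True matters for library callers because the documented default is False, in which case the method returns the container name without waiting.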

run_interactive(provider, repo, *, branch=None, gate=True, name=None, unrestricted=True, gpu=False, memory=None, cpus=None, hooks=None, human_name=None, human_email=None, authorship=None, shared_dir=None, shared_mount='/shared', timezone=None, project_id='', task_id='', dossier_path=None)

Launch an interactive container. Returns container name.

The container stays up after init; user logs in via podman exec.

See run_headless for the project_id / task_id / dossier_path semantics.

Source code in src/terok_executor/container/runner.py
def run_interactive(
    self,
    provider: str,
    repo: str,
    *,
    branch: str | None = None,
    gate: bool = True,
    name: str | None = None,
    unrestricted: bool = True,
    gpu: bool = False,
    memory: str | None = None,
    cpus: str | None = None,
    hooks: LifecycleHooks | None = None,
    human_name: str | None = None,
    human_email: str | None = None,
    authorship: str | None = None,
    shared_dir: Path | None = None,
    shared_mount: str = "/shared",
    timezone: str | None = None,
    project_id: str = "",
    task_id: str = "",
    dossier_path: Path | str | None = None,
) -> str:
    """Launch an interactive container. Returns container name.

    The container stays up after init; user logs in via ``podman exec``.

    See [`run_headless`][terok_executor.container.runner.AgentRunner.run_headless]
    for the *project_id* / *task_id* / *dossier_path* semantics.
    """
    return self._run(
        provider=provider,
        repo=repo,
        branch=branch,
        gate=gate,
        name=name,
        mode="interactive",
        unrestricted=unrestricted,
        gpu=gpu,
        memory=memory,
        cpus=cpus,
        hooks=hooks,
        human_name=human_name,
        human_email=human_email,
        authorship=authorship,
        shared_dir=shared_dir,
        shared_mount=shared_mount,
        timezone=timezone,
        project_id=project_id,
        supervisor_task_id=task_id,
        dossier_path=dossier_path,
    )

run_web(repo, *, port=None, branch=None, gate=True, name=None, public_url=None, unrestricted=True, gpu=False, memory=None, cpus=None, hooks=None, human_name=None, human_email=None, authorship=None, shared_dir=None, shared_mount='/shared', timezone=None, project_id='', task_id='', dossier_path=None)

Launch a toad web container. Returns container name.

If port is None, an available port is auto-allocated.

See run_headless for the project_id / task_id / dossier_path semantics.

Source code in src/terok_executor/container/runner.py
def run_web(
    self,
    repo: str,
    *,
    port: int | None = None,
    branch: str | None = None,
    gate: bool = True,
    name: str | None = None,
    public_url: str | None = None,
    unrestricted: bool = True,
    gpu: bool = False,
    memory: str | None = None,
    cpus: str | None = None,
    hooks: LifecycleHooks | None = None,
    human_name: str | None = None,
    human_email: str | None = None,
    authorship: str | None = None,
    shared_dir: Path | None = None,
    shared_mount: str = "/shared",
    timezone: str | None = None,
    project_id: str = "",
    task_id: str = "",
    dossier_path: Path | str | None = None,
) -> str:
    """Launch a toad web container. Returns container name.

    If *port* is None, an available port is auto-allocated.

    See [`run_headless`][terok_executor.container.runner.AgentRunner.run_headless]
    for the *project_id* / *task_id* / *dossier_path* semantics.
    """
    if port is None:
        with self.runtime.reserve_port() as reservation:
            port = reservation.port
    return self._run(
        provider="claude",  # toad uses claude as default
        repo=repo,
        branch=branch,
        gate=gate,
        name=name,
        mode="web",
        port=port,
        public_url=public_url,
        unrestricted=unrestricted,
        gpu=gpu,
        memory=memory,
        cpus=cpus,
        hooks=hooks,
        human_name=human_name,
        human_email=human_email,
        authorship=authorship,
        shared_dir=shared_dir,
        shared_mount=shared_mount,
        timezone=timezone,
        project_id=project_id,
        supervisor_task_id=task_id,
        dossier_path=dossier_path,
    )
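Auto-allocation of a free port is commonly done by binding to port 0 and reading back what the OS assigned. The sketch below shows that technique with the standard library; `reserve_port_sketch` and `pick_port` are hypothetical, and how the runtime's own reserve_port is implemented is not shown in this reference.

```python
import socket
from contextlib import contextmanager


@contextmanager
def reserve_port_sketch(host: str = "127.0.0.1"):
    """Hold a free TCP port for the duration of the ``with`` block."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, 0))  # port 0 asks the OS for any free port
        yield sock.getsockname()[1]
    finally:
        sock.close()  # the port is released here


def pick_port(port=None) -> int:
    """Return *port* when given, otherwise an OS-assigned free port."""
    if port is None:
        with reserve_port_sketch() as reserved:
            port = reserved
    return port
```

In this sketch the port is no longer held once the `with` block exits, so another process could in principle claim it before a container binds it.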

run_tool(tool, repo, *, tool_args=(), branch=None, gate=True, name=None, follow=True, timeout=600, timezone=None, project_id='', task_id='', dossier_path=None)

Launch a sidecar tool container. Returns container name.

Runs the named tool in a lightweight sidecar L1 image (no agent CLIs). The tool receives the real API key from the credential store — not a phantom token.

See run_headless for the project_id / task_id / dossier_path semantics.

Source code in src/terok_executor/container/runner.py
def run_tool(
    self,
    tool: str,
    repo: str,
    *,
    tool_args: tuple[str, ...] = (),
    branch: str | None = None,
    gate: bool = True,
    name: str | None = None,
    follow: bool = True,
    timeout: int = 600,
    timezone: str | None = None,
    project_id: str = "",
    task_id: str = "",
    dossier_path: Path | str | None = None,
) -> str:
    """Launch a sidecar tool container. Returns container name.

    Runs the named tool in a lightweight sidecar L1 image (no agent
    CLIs).  The tool receives the real API key from the credential
    store — not a phantom token.

    See [`run_headless`][terok_executor.container.runner.AgentRunner.run_headless]
    for the *project_id* / *task_id* / *dossier_path* semantics.
    """
    return self._run(
        provider=tool,
        repo=repo,
        mode="tool",
        gate=gate,
        name=name,
        follow=follow,
        timeout=timeout,
        tool_args=tool_args,
        branch=branch,
        timezone=timezone,
        project_id=project_id,
        supervisor_task_id=task_id,
        dossier_path=dossier_path,
    )

launch_prepared(*, env, volumes, image, command, name, task_dir, gpu=False, memory=None, cpus=None, unrestricted=True, sealed=False, hooks=None, extra_args=None, hostname=None, annotations=None, runtime=None, project_id='', task_id='', dossier_path=None, per_container=None)

Launch a container from a caller-prepared env, volumes, image, and command.

Use this when the caller has already assembled the environment and volume specs — e.g. the terok orchestrator, which computes project-specific env via build_task_env_and_volumes and owns the container naming policy. For end-to-end runs from a repo and prompt (CLI-style), use run_headless, run_interactive, or run_web instead.

In sealed isolation mode (sealed=True), the sandbox splits the launch into create → copy_to → start instead of a single run, so no host↔container bind mounts remain after startup.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| env | dict[str, str] | Environment variables injected into the container. | required |
| volumes | list[VolumeSpec] | Host↔container directory specs (sandbox decides mount vs inject). | required |
| image | str | Image tag to run. | required |
| command | list[str] | Command + args to execute as PID 1. | required |
| name | str | Container name (must be unique on the host). | required |
| task_dir | Path | Per-task directory used for per-container shield state. | required |
| gpu | bool | Pass GPU device args when True. | False |
| memory | str \| None | Podman --memory value ("4g" etc.); None = unlimited. | None |
| cpus | str \| None | Podman --cpus value ("2.0" etc.); None = unlimited. | None |
| unrestricted | bool | When False, adds --security-opt no-new-privileges. | True |
| sealed | bool | Enable sealed isolation (no bind mounts). | False |
| hooks | LifecycleHooks \| None | Optional lifecycle callbacks fired around the launch. | None |
| extra_args | list[str] \| None | Additional raw podman run flags (e.g. port publishing). | None |
| hostname | str \| None | Override the in-container hostname (podman --hostname). When None (default), podman assigns the short container ID. | None |
| annotations | Mapping[str, str] \| None | OCI annotations forwarded as --annotation k=v; validated against SAFE_ANNOTATION_KEYS. Typed channel for orchestrator metadata the shield reads, distinct from the freeform extra_args. | None |
| runtime | str \| None | OCI runtime selector forwarded to RunSpec.runtime. None (default) leaves the choice to podman; "krun" selects the libkrun microVM backend and also drives shield's dnsmasq bind selection. Prefer this over passing --runtime via extra_args: sandbox emits the flag itself and shield reads the value to pick the right firewall topology. | None |
| project_id | str | Identity written into the per-container supervisor sidecar so the supervisor can scope its state to the calling terok project. Default "" preserves the standalone-executor case where no terok orchestrator sits above the runner. | '' |
| task_id | str | Per-task identity written into the supervisor sidecar alongside project_id. Default "" for the standalone case. | '' |
| dossier_path | Path \| str \| None | Path to the per-task dossier file the shield reads at container start. Default None omits the field from the sidecar; only orchestrated runs carry a dossier. | None |
| per_container | PerContainerResources \| None | Pre-allocated per-container socket dir / TCP ports. When provided, the launch uses these instead of allocating its own, so a caller that already threaded the same instance through env assembly (assemble_container_env) keeps the vault-routing env vars and the supervisor binding on identical ports. Default None allocates internally (the standalone path, and external callers that assemble env without per-container routing). | None |

Returns:

| Type | Description |
| --- | --- |
| str | The container name (same as name). |

Raises:

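| Type | Description |
| --- | --- |
| BuildError | When GPU was requested but the host has no functioning NVIDIA CDI, or when the supervisor sidecar could not be written (the launch is refused rather than run unsupervised). |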

Source code in src/terok_executor/container/runner.py
def launch_prepared(
    self,
    *,
    env: dict[str, str],
    volumes: list[VolumeSpec],
    image: str,
    command: list[str],
    name: str,
    task_dir: Path,
    gpu: bool = False,
    memory: str | None = None,
    cpus: str | None = None,
    unrestricted: bool = True,
    sealed: bool = False,
    hooks: LifecycleHooks | None = None,
    extra_args: list[str] | None = None,
    hostname: str | None = None,
    annotations: Mapping[str, str] | None = None,
    runtime: str | None = None,
    project_id: str = "",
    task_id: str = "",
    dossier_path: Path | str | None = None,
    per_container: PerContainerResources | None = None,
) -> str:
    """Launch a container from a caller-prepared env, volumes, image, and command.

    Use this when the caller has already assembled the environment and
    volume specs — e.g. the terok orchestrator, which computes
    project-specific env via ``build_task_env_and_volumes`` and owns
    the container naming policy.  For end-to-end runs from a repo and
    prompt (CLI-style), use [`run_headless`][terok_executor.container.runner.AgentRunner.run_headless], [`run_interactive`][terok_executor.container.runner.AgentRunner.run_interactive],
    or [`run_web`][terok_executor.container.runner.AgentRunner.run_web] instead.

    In sealed isolation mode (*sealed=True*), the sandbox splits the
    launch into ``create`` → ``copy_to`` → ``start`` instead of a
    single ``run`` — no host↔container bind mounts remain after startup.

    Args:
        env: Environment variables injected into the container.
        volumes: Host↔container directory specs (sandbox decides mount vs inject).
        image: Image tag to run.
        command: Command + args to execute as PID 1.
        name: Container name (must be unique on the host).
        task_dir: Per-task directory used for per-container shield state.
        gpu: Pass GPU device args when True.
        memory: Podman ``--memory`` value (``"4g"`` etc.); ``None`` = unlimited.
        cpus: Podman ``--cpus`` value (``"2.0"`` etc.); ``None`` = unlimited.
        unrestricted: When False, adds ``--security-opt no-new-privileges``.
        sealed: Enable sealed isolation (no bind mounts).
        hooks: Optional lifecycle callbacks fired around the launch.
        extra_args: Additional raw ``podman run`` flags (e.g. port publishing).
        hostname: Override the in-container hostname (podman ``--hostname``).
            When ``None`` (default), podman assigns the short container ID.
        annotations: OCI annotations forwarded as ``--annotation k=v``;
            validated against
            [`SAFE_ANNOTATION_KEYS`][terok_sandbox.sandbox.SAFE_ANNOTATION_KEYS].
            Typed channel for orchestrator metadata the shield reads,
            distinct from the freeform *extra_args*.
        runtime: OCI runtime selector forwarded to
            [`RunSpec.runtime`][terok_sandbox.sandbox.RunSpec.runtime].
            ``None`` (default) leaves the choice to podman; ``"krun"``
            selects the libkrun microVM backend and also drives
            shield's dnsmasq bind selection.  Prefer this over
            passing ``--runtime`` via *extra_args* — sandbox emits
            the flag itself and shield reads the value to pick the
            right firewall topology.
        project_id: Identity written into the per-container
            supervisor sidecar so the supervisor can scope its
            state to the calling terok project.  Default ``""``
            preserves the standalone-executor case where no terok
            orchestrator sits above the runner.
        task_id: Per-task identity written into the supervisor
            sidecar alongside *project_id*.  Default ``""`` for
            the standalone case.
        dossier_path: Path to the per-task dossier file the
            shield reads at container start.  Default ``None``
            omits the field from the sidecar — only orchestrated
            runs carry a dossier.
        per_container: Pre-allocated per-container socket dir / TCP
            ports.  When provided, the launch uses these instead of
            allocating its own — so a caller that already threaded
            the same instance through env assembly
            ([`assemble_container_env`][terok_executor.container.env.assemble_container_env])
            keeps the vault-routing env vars and the supervisor
            binding on identical ports.  Default ``None`` allocates
            internally (the standalone path, and external callers
            that assemble env without per-container routing).

    Returns:
        The container name (same as *name*).

    Raises:
        BuildError: When GPU was requested but the host has no functioning
            NVIDIA CDI, or when the supervisor sidecar could not be written.
    """
    from terok_executor.integrations.sandbox import (
        GpuConfigError,
        RunSpec,
        Sharing,
        VolumeSpec,
        allocate_per_container_resources,
    )

    from .sidecar import write_supervisor_sidecar

    cfg = self.sandbox.config

    # Per-container socket dir / TCP ports.  Allocated here so the
    # mount, the env vars the in-container bridge reads, and the
    # sidecar JSON the supervisor reads all see the same values —
    # the only path that keeps concurrent containers from colliding
    # on the singletons baked into ``cfg``.  When the caller already
    # allocated one (``_run`` threads it through env assembly too),
    # reuse that instance so the vault-routing env vars and this
    # binding land on identical ports — a second allocation here
    # would hand back different ports and re-introduce the TCP-mode
    # cross-container collision.
    if per_container is None:
        per_container = allocate_per_container_resources(cfg, name)

    # Bind-mount the per-container socket dir at /run/terok/.  The
    # supervisor's later-bound vault.sock + ssh-agent.sock surface
    # inside the container via this single mount (instead of two
    # singleton file-mounts that two containers would collide on).
    env = dict(env)
    volumes = list(volumes)
    volumes.append(
        VolumeSpec(
            per_container.container_runtime_dir,
            "/run/terok",
            sharing=Sharing.SHARED,
            live=True,
        )
    )
    # TCP-mode env vars carry the per-container port, not the
    # host-singleton ``cfg.token_broker_port`` — the launch flow
    # routes through the per-container ports only.
    if cfg.services_mode == "tcp":
        if per_container.token_broker_port is not None:
            env["TEROK_TOKEN_BROKER_PORT"] = str(per_container.token_broker_port)
        if per_container.ssh_signer_port is not None:
            env["TEROK_SSH_SIGNER_PORT"] = str(per_container.ssh_signer_port)
        if per_container.gate_port is not None:
            env["TEROK_GATE_PORT"] = str(per_container.gate_port)

    # The gate is wired when the prepared env carries a gate token
    # (set by ``_setup_gate`` / the orchestrator).  When active, the
    # supervisor needs the mirror base path, the token, and — in TCP
    # mode — the port to serve the gate in-process.
    gate_active = "TEROK_GATE_TOKEN" in env

    # Write the per-container supervisor sidecar before podman run.
    # The terok-sandbox OCI hook installed by ``terok-sandbox setup``
    # reads this file on container start and spawns one supervisor
    # per container; without it the supervisor refuses to start.
    sidecar_path = write_supervisor_sidecar(
        name,
        cfg=cfg,
        per_container=per_container,
        project_id=project_id,
        task_id=task_id,
        dossier_path=dossier_path,
        gate_base_path=str(cfg.gate_base_path) if gate_active else None,
        gate_token=env["TEROK_GATE_TOKEN"] if gate_active else None,
        gate_port=per_container.gate_port if gate_active else None,
    )
    # Fail closed: a missing sidecar means the supervisor OCI hook
    # never fires, so the container would launch with no vault,
    # clearance, or signer behind it. Refuse the launch rather than
    # run unsupervised.
    if sidecar_path is None:
        raise BuildError(
            f"supervisor sidecar write failed for {name}; refusing to launch "
            "an unsupervised container (no vault/clearance/signer)"
        )
    # The supervisor OCI hook fires only when this annotation is
    # present (matched by ``when.annotations`` in the hook
    # descriptor) and reads its value as the sidecar location —
    # no XDG guessing, one anchor.
    spec_annotations = dict(annotations or {})
    spec_annotations["terok.sandbox.sidecar"] = str(sidecar_path)

    loopback_ports = tuple(
        p
        for p in (
            per_container.gate_port,
            per_container.token_broker_port,
            per_container.ssh_signer_port,
        )
        if p is not None
    )

    spec = RunSpec(
        container_name=name,
        image=image,
        env=env,
        volumes=tuple(volumes),
        command=tuple(command),
        task_dir=task_dir,
        gpu_enabled=gpu,
        memory=memory,
        cpus=cpus,
        extra_args=tuple(extra_args or ()),
        unrestricted=unrestricted,
        sealed=sealed,
        hostname=hostname,
        annotations=spec_annotations,
        runtime=runtime,
        loopback_ports=loopback_ports,
    )

    try:
        self.sandbox.run(spec, hooks=hooks)
    except GpuConfigError as exc:
        raise BuildError(str(exc)) from exc

    return name
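
The reuse rule for per_container can be illustrated with a toy allocator. Everything in this sketch is a hypothetical stand-in (FakeResources, fake_allocate, assemble_env, launch, and the DEMO_BROKER_URL variable are invented for illustration and are not part of terok_executor or terok_sandbox). It only shows why a second allocation would leave env values pointing at a port nothing is bound to.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass(frozen=True)
class FakeResources:
    """Hypothetical stand-in for PerContainerResources."""

    token_broker_port: int


_next_port = count(40000)


def fake_allocate(name: str) -> FakeResources:
    """Hypothetical allocator: every call hands back a fresh port."""
    return FakeResources(token_broker_port=next(_next_port))


def assemble_env(res: FakeResources) -> dict:
    """Env assembly that bakes the port into a derived value (made-up variable)."""
    return {"DEMO_BROKER_URL": f"http://127.0.0.1:{res.token_broker_port}"}


def launch(name: str, per_container: Optional[FakeResources] = None) -> int:
    """Mimic the launch step: reuse the caller's instance or allocate a new one."""
    if per_container is None:
        per_container = fake_allocate(name)
    return per_container.token_broker_port  # port the supervisor would bind


# Shared instance: the env value and the bound port agree.
shared = fake_allocate("task-a")
env_a = assemble_env(shared)
bound_a = launch("task-a", per_container=shared)

# Separate allocations: the env value points at a different port.
env_b = assemble_env(fake_allocate("task-b"))
bound_b = launch("task-b")
```

Passing the same instance to both steps is what keeps the two views in sync; the default of None remains fine when env assembly never embedded per-container ports in the first place.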

wait_for_exit(container_name, timeout=None)

Block until container_name exits; return its exit code.

Raises TimeoutError when timeout elapses before the container exits — signalled out of band so a container that legitimately exits with code 124 (the timeout(1) convention) is returned unambiguously as its real exit code, not conflated with the wait timing out.

Raises RuntimeError when podman wait itself fails (non-zero returncode, e.g. unknown container) or returns output that is not a container exit code — the podman error is never impersonated as the container's exit code, which would let a "no such container" diagnostic leak out as exit code 125.

Raises FileNotFoundError when podman is not on PATH.

The wait loop is intentionally re-implemented here rather than delegated to Sandbox.wait_for_exit, which swallows subprocess.TimeoutExpired and returns the 124 sentinel. That is fine for fire-and-forget generic waits but lossy for task-level callers that need to record the real exit code.

Source code in src/terok_executor/container/runner.py
def wait_for_exit(
    self,
    container_name: str,
    timeout: float | None = None,
) -> int:
    """Block until *container_name* exits; return its exit code.

    Raises [`TimeoutError`][TimeoutError] when *timeout* elapses before the
    container exits — signalled out of band so a container that
    legitimately exits with code 124 (the ``timeout(1)`` convention)
    is returned unambiguously as its real exit code, not conflated
    with the wait timing out.

    Raises [`RuntimeError`][RuntimeError] when ``podman wait`` itself fails
    (non-zero returncode, e.g. unknown container) or returns output
    that is not a container exit code — the podman error is never
    impersonated as the container's exit code, which would let a
    "no such container" diagnostic leak out as exit code 125.

    Raises [`FileNotFoundError`][FileNotFoundError] when ``podman`` is not on PATH.
    Intentionally re-implements the wait loop instead of delegating
    to `Sandbox.wait_for_exit`, which swallows
    [`subprocess.TimeoutExpired`][subprocess.TimeoutExpired] and returns the 124 sentinel
    — fine for fire-and-forget generic waits, lossy for task-level
    callers that need to record the real exit code.
    """
    import subprocess

    try:
        proc = subprocess.run(
            ["podman", "wait", container_name],
            check=False,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired as exc:
        raise TimeoutError(
            f"container {container_name!r} did not exit within {timeout}s"
        ) from exc

    if proc.returncode != 0:
        detail = (proc.stderr or proc.stdout or "").strip() or "<no output>"
        raise RuntimeError(
            f"podman wait {container_name!r} failed (returncode={proc.returncode}): {detail}"
        )

    stdout = (proc.stdout or "").strip()
    try:
        return int(stdout)
    except ValueError as exc:
        raise RuntimeError(
            f"podman wait {container_name!r} returned unexpected output: "
            f"stdout={proc.stdout!r}, stderr={proc.stderr!r}"
        ) from exc
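
A caller can keep the three documented outcomes apart with ordinary exception handling. The sketch below is a minimal illustration, assuming only the behaviour described above; describe_wait and the three stub functions are hypothetical names, and wait stands for any callable with the same signature, such as a bound wait_for_exit method.

```python
def describe_wait(wait, container_name, timeout=None):
    """Map the documented outcomes of a wait call onto a status string."""
    try:
        code = wait(container_name, timeout=timeout)
    except TimeoutError:
        # The wait itself timed out; the container may still be running.
        return "still running"
    except (RuntimeError, FileNotFoundError) as exc:
        # podman failed or is missing; this is not a container exit code.
        return f"wait failed: {exc}"
    return f"exited with code {code}"


# Stubs standing in for the real method so the sketch runs without podman.
def exits_124(name, timeout=None):
    return 124  # a container that really exited with 124


def times_out(name, timeout=None):
    raise TimeoutError(f"container {name!r} did not exit within {timeout}s")


def podman_fails(name, timeout=None):
    raise RuntimeError("no such container")
```

Because the timeout arrives as an exception, a genuine exit code of 124 flows through the normal return path and is never mistaken for the wait timing out.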

logs(container_name, *, tail=None, timestamps=False, since=None)

Return the container's logged output as a single string.

One-shot retrieval for the "just show me what ran" case. For live streaming (human watching), use stream_logs_process; for archival, use capture_logs.

Raises RuntimeError when podman logs returns a non-zero status (e.g. unknown container) — the diagnostic is surfaced rather than impersonated as empty output. FileNotFoundError propagates when podman is not on PATH.

Source code in src/terok_executor/container/runner.py
def logs(
    self,
    container_name: str,
    *,
    tail: int | None = None,
    timestamps: bool = False,
    since: str | None = None,
) -> str:
    """Return the container's logged output as a single string.

    One-shot retrieval for the "just show me what ran" case.  For live
    streaming (human watching), use [`stream_logs_process`][terok_executor.container.runner.AgentRunner.stream_logs_process]; for
    archival, use [`capture_logs`][terok_executor.container.runner.AgentRunner.capture_logs].

    Raises [`RuntimeError`][RuntimeError] when ``podman logs`` returns a non-zero
    status (e.g. unknown container) — the diagnostic is surfaced rather
    than impersonated as empty output.  [`FileNotFoundError`][FileNotFoundError]
    propagates when ``podman`` is not on PATH.
    """
    import subprocess

    cmd = _build_logs_cmd(container_name, tail=tail, timestamps=timestamps, since=since)
    proc = subprocess.run(cmd, capture_output=True, text=True, check=False)
    if proc.returncode != 0:
        detail = (proc.stderr or proc.stdout or "").strip() or "<no output>"
        raise RuntimeError(
            f"podman logs {container_name!r} failed (returncode={proc.returncode}): {detail}"
        )
    return (proc.stdout or "") + (proc.stderr or "")
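
The internal _build_logs_cmd helper is not shown on this page. As a rough guide to what such a builder might produce, here is a hypothetical sketch that maps the keyword arguments onto podman logs options (--follow, --tail, --timestamps and --since). The function name and the exact flag order are assumptions, not the library's actual implementation.

```python
def sketch_logs_cmd(container_name, *, follow=False, tail=None, timestamps=False, since=None):
    """Hypothetical builder: translate keyword options into a podman logs argv."""
    cmd = ["podman", "logs"]
    if follow:
        cmd.append("--follow")
    if tail is not None:
        cmd += ["--tail", str(tail)]
    if timestamps:
        cmd.append("--timestamps")
    if since is not None:
        cmd += ["--since", since]
    cmd.append(container_name)  # the container goes last, after all options
    return cmd
```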

capture_logs(container_name, dest, *, timestamps=True, timeout=60.0)

Capture a container's logs to dest; return True on success.

Streams stdout directly to dest (bytes) so large logs do not need to fit in memory. Used at task-archive time to freeze the container's output onto the host filesystem before removal.

On any failure — missing podman, podman error, timeout — dest is removed and False is returned so the caller sees one signal, not a partially-written file.

Source code in src/terok_executor/container/runner.py
def capture_logs(
    self,
    container_name: str,
    dest: Path,
    *,
    timestamps: bool = True,
    timeout: float = 60.0,
) -> bool:
    """Capture a container's logs to *dest*; return ``True`` on success.

    Streams stdout directly to *dest* (bytes) so large logs do not need
    to fit in memory.  Used at task-archive time to freeze the
    container's output onto the host filesystem before removal.

    On any failure — missing podman, podman error, timeout — *dest* is
    removed and ``False`` is returned so the caller sees one signal,
    not a partially-written file.
    """
    import subprocess

    cmd = _build_logs_cmd(container_name, timestamps=timestamps)
    try:
        with dest.open("wb") as f:
            proc = subprocess.run(
                cmd,
                stdout=f,
                stderr=subprocess.PIPE,
                timeout=timeout,
                check=False,
            )
    except (FileNotFoundError, subprocess.TimeoutExpired, OSError):
        dest.unlink(missing_ok=True)
        return False

    if proc.returncode != 0:
        dest.unlink(missing_ok=True)
        return False
    return True
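
The all-or-nothing file contract can be exercised without podman by substituting an arbitrary command. The sketch below is a generalised illustration under that assumption: capture_to_file is a hypothetical name, and the Python child processes merely stand in for podman logs.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def capture_to_file(cmd, dest, timeout=60.0):
    """Stream a command's stdout into dest; leave no file behind on failure."""
    try:
        with dest.open("wb") as f:
            proc = subprocess.run(
                cmd, stdout=f, stderr=subprocess.PIPE, timeout=timeout, check=False
            )
    except (FileNotFoundError, subprocess.TimeoutExpired, OSError):
        dest.unlink(missing_ok=True)
        return False
    if proc.returncode != 0:
        # The command wrote something but failed: discard the partial file.
        dest.unlink(missing_ok=True)
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.log"
    bad = Path(tmp) / "bad.log"
    good_ok = capture_to_file([sys.executable, "-c", "print('hello')"], good)
    good_bytes = good.read_bytes() if good_ok else b""
    bad_ok = capture_to_file(
        [sys.executable, "-c", "import sys; print('partial'); sys.exit(2)"], bad
    )
    bad_left_behind = bad.exists()
```

The failing command does write output before exiting, yet the caller still sees only False and no file, which is the single signal the method promises.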

stream_logs_process(container_name, *, follow=False, tail=None, timestamps=False, merge_stderr=False)

Spawn a long-running podman logs process; return the Popen.

The raw subprocess handle is exposed deliberately: live-log consumers (TUI log viewer, interactive task logs -f) need fd-level control — select() between reads, SIGINT handling, stop-event polling — that a higher-level iterator abstraction would hide badly. Every current caller's event loop already looks like select([proc.stdout], …) → read1() so returning the Popen matches existing patterns instead of fighting them.

Caller owns the subprocess. Typical pattern:

proc = runner.stream_logs_process(cname, follow=True)
try:
    for chunk in iter(proc.stdout.read1, b""):
        ...
finally:
    proc.terminate()
    proc.wait()

When merge_stderr is True, stderr is folded into stdout (matches subprocess.STDOUT); otherwise stderr is a separate pipe the caller can drain.

FileNotFoundError propagates when podman is not on PATH — callers handle it (usually as a user-facing "podman not installed" error).

Source code in src/terok_executor/container/runner.py
def stream_logs_process(
    self,
    container_name: str,
    *,
    follow: bool = False,
    tail: int | None = None,
    timestamps: bool = False,
    merge_stderr: bool = False,
) -> subprocess.Popen[bytes]:
    """Spawn a long-running ``podman logs`` process; return the ``Popen``.

    The raw subprocess handle is exposed deliberately: live-log
    consumers (TUI log viewer, interactive ``task logs -f``) need
    fd-level control — ``select()`` between reads, SIGINT handling,
    stop-event polling — that a higher-level iterator abstraction
    would hide badly.  Every current caller's event loop already looks
    like ``select([proc.stdout], …) → read1()`` so returning the
    ``Popen`` matches existing patterns instead of fighting them.

    Caller owns the subprocess.  Typical pattern::

        proc = runner.stream_logs_process(cname, follow=True)
        try:
            for chunk in iter(proc.stdout.read1, b""):
                ...
        finally:
            proc.terminate()
            proc.wait()

    When *merge_stderr* is True, stderr is folded into stdout
    (matches ``subprocess.STDOUT``); otherwise stderr is a separate
    pipe the caller can drain.

    [`FileNotFoundError`][FileNotFoundError] propagates when ``podman`` is not on
    PATH — callers handle it (usually as a user-facing "podman not
    installed" error).
    """
    import subprocess

    cmd = _build_logs_cmd(container_name, follow=follow, tail=tail, timestamps=timestamps)
    return subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT if merge_stderr else subprocess.PIPE,
    )
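
The select-then-read1 loop mentioned above might look like the following sketch. It assumes a POSIX host (select on pipes is not available on Windows) and uses a short-lived Python child process in place of podman logs; drain and its parameters are hypothetical names.

```python
import select
import subprocess
import sys
import threading


def drain(proc, stop, poll_interval=0.2):
    """Read from proc.stdout until EOF or until the stop event is set."""
    chunks = []
    while not stop.is_set():
        # Wake up regularly so a stop request is noticed between reads.
        ready, _, _ = select.select([proc.stdout], [], [], poll_interval)
        if not ready:
            continue
        chunk = proc.stdout.read1(65536)
        if not chunk:  # empty bytes means the child closed its stdout
            break
        chunks.append(chunk)
    return b"".join(chunks)


child = subprocess.Popen(
    [sys.executable, "-c", "print('a'); print('b')"],
    stdout=subprocess.PIPE,
)
try:
    output = drain(child, threading.Event())
finally:
    # The caller owns the process: always terminate and reap it.
    child.terminate()
    child.wait()
```

Setting the event from another thread (for example a UI handler) ends the loop within one poll interval, which is the kind of control a plain iterator would hide.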

Authenticator(provider) dataclass

Vendor-credential acquisition for a single agent.

Wraps the authenticate flow behind a stable class so callers that orchestrate a multi-step setup (terok project init, the standalone terok-executor auth command, the TUI auth flow) talk to one named surface bound to self.provider.

The discovery counterparts (list_authenticated_agents, scan_leaked_credentials) stay as module-level functions in their owning submodules. Folding them in here would create a tach cycle through terok_executor.acp and terok_executor.credentials.vault_commands, which already depend on this module transitively.

provider instance-attribute

Auth provider name (e.g. "claude").

run(project_id, *, mounts_dir, image=None, expose_token=False, oauth_enabled=True, credential_set='default')

Run the auth flow for self.provider; see module-level docs.

Mirrors the parameters of the underlying authenticate free function — instance-bound self.provider replaces the old positional provider arg.

Source code in src/terok_executor/credentials/auth.py
def run(
    self,
    project_id: str | None,
    *,
    mounts_dir: Path,
    image: str | Callable[[], str] | None = None,
    expose_token: bool = False,
    oauth_enabled: bool = True,
    credential_set: str = "default",
) -> None:
    """Run the auth flow for ``self.provider``; see module-level docs.

    Mirrors the parameters of the underlying ``authenticate`` free
    function — instance-bound ``self.provider`` replaces the old
    positional ``provider`` arg.
    """
    authenticate(
        project_id,
        self.provider,
        mounts_dir=mounts_dir,
        image=image,
        expose_token=expose_token,
        oauth_enabled=oauth_enabled,
        credential_set=credential_set,
    )

prepare_oauth(project_id, *, mounts_dir, image, expose_token=False, credential_set='default')

Build an AuthSession without running it.

Frontends that own their own UI loop (e.g. the terok Textual TUI, which wants to dispatch the OAuth container into a new terminal tab or via tmux instead of inline) build the session here, run session.argv however they like, then call session.capture() on success. The CLI's blocking authenticate path is just another such caller — see _run_auth_container.

Source code in src/terok_executor/credentials/auth.py
def prepare_oauth(
    self,
    project_id: str | None,
    *,
    mounts_dir: Path,
    image: str,
    expose_token: bool = False,
    credential_set: str = "default",
) -> AuthSession:
    """Build an [`AuthSession`][terok_executor.AuthSession] without running it.

    Frontends that own their own UI loop (e.g. the terok Textual TUI,
    which wants to dispatch the OAuth container into a new terminal
    tab or via tmux instead of inline) build the session here, run
    ``session.argv`` however they like, then call ``session.capture()``
    on success.  The CLI's blocking ``authenticate`` path is just
    another such caller — see ``_run_auth_container``.
    """
    info = AUTH_PROVIDERS.get(self.provider)
    if not info:
        available = ", ".join(AUTH_PROVIDERS)
        raise SystemExit(f"Unknown auth provider: {self.provider}. Available: {available}")
    if not info.supports_oauth:
        raise SystemExit(
            f"Provider {self.provider!r} does not support OAuth — use store_api_key() instead."
        )
    return prepare_oauth_session(
        info,
        project_id,
        mounts_dir=mounts_dir,
        image=image,
        expose_token=expose_token,
        credential_set=credential_set,
    )
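
A frontend that drives the session itself would follow a build, run, capture, clean-up sequence. The sketch below shows that sequence against a hypothetical FakeSession that exposes only the attributes used here (banner, argv, capture, cleanup and the context-manager methods); run_prepared_session is likewise an illustrative name, and a real caller would obtain the session from prepare_oauth instead.

```python
import subprocess
import sys


def run_prepared_session(session, run=subprocess.run):
    """Run session.argv and capture credentials only on a zero exit status."""
    with session:  # __exit__ triggers cleanup() on every path
        print(session.banner)
        result = run(session.argv, check=False)
        if result.returncode != 0:
            return False
        session.capture()
        return True


class FakeSession:
    """Hypothetical stand-in for AuthSession that records what was called."""

    banner = "Authenticating (demo)"

    def __init__(self, argv):
        self.argv = argv
        self.events = []

    def capture(self):
        self.events.append("capture")

    def cleanup(self):
        self.events.append("cleanup")

    def __enter__(self):
        return self

    def __exit__(self, *_exc):
        self.cleanup()


ok_session = FakeSession([sys.executable, "-c", "pass"])
ok_result = run_prepared_session(ok_session)

failed_session = FakeSession([sys.executable, "-c", "raise SystemExit(3)"])
failed_result = run_prepared_session(failed_session)
```

The run parameter is where a TUI could substitute its own launcher (a new terminal tab, tmux, and so on), provided it returns an object with a returncode.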

AuthSession(provider, project_id, container_name, argv, banner, auth_dir, mounts_dir, credential_set='default', expose_token=False, _tmpdir=None) dataclass

A prepared-but-not-run OAuth auth container session.

Built by Authenticator.prepare_oauth (or the module-level prepare_oauth_session helper). Hold-don't-call: the caller is responsible for running argv (synchronously, in a new terminal tab, suspended TUI, etc.) and calling capture() afterwards. Use as a context manager so the temp dir and any dangling container are cleaned up on exit.

provider instance-attribute

Provider descriptor (label, banner hint, mount points).

project_id instance-attribute

Project scope for the banner; None for host-wide auth.

container_name instance-attribute

Podman container name (used for cleanup and -it log clarity).

argv instance-attribute

The podman run … command line — run this however you like.

banner instance-attribute

Banner text to display before launching argv.

auth_dir instance-attribute

Temp dir bind-mounted as the container's auth config target.

Lives until cleanup() (or __exit__). Credential extraction in capture() reads from here, so don't remove it manually.

mounts_dir instance-attribute

Base directory for the shared post-capture mount (OAuth providers only).

credential_set = 'default' class-attribute instance-attribute

Which credential set in the vault DB receives the captured token.

expose_token = False class-attribute instance-attribute

When True, real credential files are copied into the shared mount (tier 3).

title property

Short human-readable title ("Authenticating Claude (host-wide)").

capture()

Extract credentials from auth_dir, store them in the vault DB.

Call after argv exits successfully. Safe to call multiple times (the underlying extractor is idempotent on a stable credential file).

Source code in src/terok_executor/credentials/auth.py
def capture(self) -> None:
    """Extract credentials from ``auth_dir``, store them in the vault DB.

    Call after ``argv`` exits successfully.  Safe to call multiple
    times (the underlying extractor is idempotent on a stable
    credential file).
    """
    _capture_credentials(
        self.provider.name,
        self.auth_dir,
        self.credential_set,
        mounts_base=self.mounts_dir,
        auth_provider=self.provider,
        expose_token=self.expose_token,
    )

cleanup()

Release the temp dir and force-remove any lingering container.

Idempotent. __exit__ calls this automatically.

Source code in src/terok_executor/credentials/auth.py
def cleanup(self) -> None:
    """Release the temp dir and force-remove any lingering container.

    Idempotent.  ``__exit__`` calls this automatically.
    """
    subprocess.run(
        ["podman", "rm", "-f", self.container_name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    if self._tmpdir is not None:
        self._tmpdir.cleanup()
        self._tmpdir = None

__enter__()

Return self; the heavy lifting already happened in the factory.

Source code in src/terok_executor/credentials/auth.py
def __enter__(self) -> AuthSession:
    """Return self; the heavy lifting already happened in the factory."""
    return self

__exit__(*_exc)

Run cleanup on context-manager exit.

Source code in src/terok_executor/credentials/auth.py
def __exit__(self, *_exc: object) -> None:
    """Run ``cleanup`` on context-manager exit."""
    self.cleanup()

KrunHost(*, cfg=None)

Host-side krun launch context — vault keypair + runtime + launch args.

One instance per launch. The first access to keypair opens the vault DB and materialises the %host private/public keypair to tmpfs; every subsequent access on the same instance reuses the cached result, so calling both runtime and launch_args pays that cost only once.

Requires the vault to be unlocked — the krun runtime is gated on experimental: true upstream and assumes the operator has the vault open for the session. A NoPassphraseError from the underlying vault open propagates unchanged so the orchestrator can render its own remediation hint.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| cfg | SandboxConfig \| None | Sandbox config used to open the credential DB. None means use the zero-arg default, which is appropriate for standalone executor flows; terok injects its own enriched config when calling. | None |

Bind a krun launch to cfg; the keypair is loaded on first access.

Source code in src/terok_executor/krun.py
def __init__(self, *, cfg: SandboxConfig | None = None):
    """Bind a krun launch to *cfg*; the keypair is loaded on first access."""
    self._cfg = cfg
    self._keypair: KrunHostKeypair | None = None

__slots__ = ('_cfg', '_keypair') class-attribute instance-attribute

keypair property

Vault-backed %host keypair materialised to tmpfs (cached).

First access opens the vault DB and writes the OpenSSH-PEM private + public-key line to a tmpfs cache directory; subsequent accesses on the same instance return the same object without reopening the vault.
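
The load-once behaviour can be pictured with a small lazy property. This is a hypothetical miniature of the caching pattern, not the real KrunHost: LazyHost, the loader argument and expensive_load are invented for illustration, and the loader merely records how often it was called.

```python
class LazyHost:
    """Hypothetical miniature of a slot-based, load-on-first-access property."""

    __slots__ = ("_loader", "_keypair")

    def __init__(self, loader):
        self._loader = loader
        self._keypair = None  # nothing is loaded at construction time

    @property
    def keypair(self):
        if self._keypair is None:
            # First access pays the cost; later accesses reuse the result.
            self._keypair = self._loader()
        return self._keypair


calls = []


def expensive_load():
    calls.append("open vault")  # stands in for opening the vault DB
    return object()


host = LazyHost(expensive_load)
first = host.keypair
second = host.keypair
```

The cache lives on the instance, so a fresh instance per launch starts with a fresh load, which matches the "one instance per launch" guidance above.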

runtime()

Construct a production KrunRuntime in one call.

Wires together the three production pieces — the cached host keypair, the TCP-over-passt SSH transport, and a fresh PodmanRuntime for lifecycle. The experimental-flag gate stays on the orchestrator side (this factory is reachable only when the gate is open).

Source code in src/terok_executor/krun.py
def runtime(self) -> KrunRuntime:
    """Construct a production [`KrunRuntime`][terok_sandbox.KrunRuntime] in one call.

    Wires together the three production pieces — the cached host
    keypair, the TCP-over-passt SSH transport, and a fresh
    [`PodmanRuntime`][terok_sandbox.PodmanRuntime] for lifecycle.
    The experimental-flag gate stays on the orchestrator side (this
    factory is reachable only when the gate is open).
    """
    kp = self.keypair
    transport = TcpSSHTransport(
        identity_file=kp.private_path,
        endpoint_resolver=podman_port_resolver(),
    )
    return KrunRuntime(transport=transport, podman=PodmanRuntime())

launch_args()

Extra podman run args terok must splice in for a krun launch.

Four things reach across the orchestrator/runtime boundary into executor's domain (the L0 image, the host keypair, the in-guest init-ssh-and-repo.sh, and the DNS forwarder address), so they live here together rather than being open-coded in terok's _project_runtime_flags:

  • Bind-mount the live host pubkey over the L0's empty placeholder at /etc/ssh/authorized_keys.d/terok. z is the shared SELinux relabel (never Z — the host pubkey is host-wide and concurrent containers share the source).
  • Set TEROK_CONTAINER_RUNTIME=krun so the init script's krun gate fires.
  • Override the L0's USER dev directive with --user root so the in-guest sshd can start, listen on TCP 22, and drop to the authenticated user on connection. USER dev is the right default under crun (AI agents that refuse uid 0); under krun the session uid comes from which ssh user@… the operator picks.
  • --dns 169.254.1.1 — kept for shield-bypass; under shield-up the bind-mounted resolv.conf overrides this anyway.

Does not include --runtime krun itself or krun's microVM-sizing annotations; those are orchestrator-level decisions that terok keeps.

Source code in src/terok_executor/krun.py
def launch_args(self) -> list[str]:
    """Extra ``podman run`` args terok must splice in for a krun launch.

    Four things that all reach across the orchestrator/runtime
    boundary into executor's domain — the L0 image, the host
    keypair, the in-guest ``init-ssh-and-repo.sh``, and the DNS
    forwarder address — so they live here together rather than
    being open-coded in terok's ``_project_runtime_flags``:

    - Bind-mount the live host pubkey over the L0's empty
      placeholder at ``/etc/ssh/authorized_keys.d/terok``.  ``z`` is
      the shared SELinux relabel (never ``Z`` — the host pubkey is
      host-wide and concurrent containers share the source).
    - Set ``TEROK_CONTAINER_RUNTIME=krun`` so the init script's krun
      gate fires.
    - Override the L0's ``USER dev`` directive with ``--user root``
      so the in-guest sshd can start, listen on TCP 22, and drop to
      the authenticated user on connection.  ``USER dev`` is the
      right default under crun (AI agents that refuse uid 0); under
      krun the session uid comes from which ``ssh user@…`` the
      operator picks.
    - ``--dns 169.254.1.1`` — kept for shield-bypass; under
      shield-up the bind-mounted resolv.conf overrides this anyway.

    Doesn't include ``--runtime krun`` itself or krun's microVM-sizing
    annotations — those are orchestrator-level decisions terok keeps.
    """
    kp = self.keypair
    return [
        "-v",
        f"{kp.public_path}:/etc/ssh/authorized_keys.d/terok:ro,z",
        "-e",
        "TEROK_CONTAINER_RUNTIME=krun",
        "--user",
        "root",
        "--dns",
        _PASTA_DNS_FORWARDER,
    ]
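
As a rough sketch of how an orchestrator might splice these arguments into a full command line: `build_podman_command` and the image name below are hypothetical, and the DNS value assumes `_PASTA_DNS_FORWARDER` equals the `169.254.1.1` address named in the docstring. Placing `--runtime krun` on the orchestrator side follows the division of responsibility described above.

```python
def krun_launch_args(public_key_path: str, dns_forwarder: str = "169.254.1.1") -> list[str]:
    """Mirror of the list returned above, with the pubkey path passed in."""
    return [
        "-v",
        f"{public_key_path}:/etc/ssh/authorized_keys.d/terok:ro,z",
        "-e",
        "TEROK_CONTAINER_RUNTIME=krun",
        "--user",
        "root",
        "--dns",
        dns_forwarder,
    ]


def build_podman_command(image: str, extra_args: list[str]) -> list[str]:
    """Hypothetical orchestrator-side helper: it owns ``--runtime krun``."""
    return ["podman", "run", "--runtime", "krun", *extra_args, image]


cmd = build_podman_command("example/l0:latest", krun_launch_args("/tmp/host.pub"))
```

Options precede the image name, so the spliced arguments sit between the runtime selection and the image.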

KrunHostKeypair(private_path, public_path, public_line, fingerprint, created) dataclass

Materialised view of the %host infrastructure keypair.

Returned by ensure_krun_host_keypair. Carries the tmpfs path to the OpenSSH-PEM private key (ready for ssh -i) and the matching public-key file (ready to bind-mount into the krun guest at /etc/ssh/authorized_keys.d/terok), so callers don't have to redo the DER→PEM conversion or re-derive the public line from raw blobs.

private_path instance-attribute

tmpfs path holding the OpenSSH-PEM private key (0600 perms).

public_path instance-attribute

Sibling .pub file (0644 perms) carrying the public line.

public_line instance-attribute

Single-line OpenSSH public key (ssh-ed25519 AAAA… comment).

fingerprint instance-attribute

Canonical SHA256:… fingerprint over the SSH wire-format blob.

created instance-attribute

True when this call minted the key; False when it was loaded.

AgentConfigSpec(tasks_root, task_id, subagents, selected_agents=None, prompt=None, provider='claude', instructions=None, default_agent=None, mounts_base=None) dataclass

Groups parameters for preparing an agent-config directory.

tasks_root instance-attribute

task_id instance-attribute

subagents instance-attribute

selected_agents = None class-attribute instance-attribute

prompt = None class-attribute instance-attribute

provider = 'claude' class-attribute instance-attribute

instructions = None class-attribute instance-attribute

default_agent = None class-attribute instance-attribute

mounts_base = None class-attribute instance-attribute

__post_init__()

Coerce mutable sequences to tuples for true immutability.

Defensive against callers that build the spec from json.loads / yaml.load output where the runtime types are list instead of tuple. Mypy sees the static annotations and reports the isinstance(..., list) branches as unreachable; the runtime coercion remains correct.

Source code in src/terok_executor/provider/agents.py
def __post_init__(self) -> None:
    """Coerce mutable sequences to tuples for true immutability.

    Defensive against callers that build the spec from
    ``json.loads`` / ``yaml.load`` output where the runtime types are
    ``list`` instead of ``tuple``.  Mypy sees the static annotations
    and reports the ``isinstance(..., list)`` branches as unreachable;
    the runtime coercion remains correct.
    """
    if isinstance(self.subagents, list):  # type: ignore[unreachable]
        object.__setattr__(self, "subagents", tuple(self.subagents))  # type: ignore[unreachable]
    if isinstance(self.selected_agents, list):
        object.__setattr__(self, "selected_agents", tuple(self.selected_agents))  # type: ignore[unreachable]
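
The coercion pattern can be tried in isolation. The sketch below assumes a frozen dataclass (which is why `object.__setattr__` is needed); `Spec` is a hypothetical two-field stand-in, not the real `AgentConfigSpec`.

```python
from __future__ import annotations

import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    """Hypothetical two-field stand-in for AgentConfigSpec."""

    subagents: tuple[str, ...]
    selected_agents: tuple[str, ...] | None = None

    def __post_init__(self) -> None:
        # A frozen dataclass blocks normal assignment, so the coercion
        # goes through object.__setattr__, as in the method above.
        if isinstance(self.subagents, list):
            object.__setattr__(self, "subagents", tuple(self.subagents))
        if isinstance(self.selected_agents, list):
            object.__setattr__(self, "selected_agents", tuple(self.selected_agents))


# json.loads yields lists, which is the case the coercion defends against.
raw = json.loads('{"subagents": ["reviewer", "tester"], "selected_agents": ["reviewer"]}')
spec = Spec(**raw)
```

After construction both fields are tuples, so the instance is hashable and cannot be mutated through a shared list reference.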

AgentProvider(name, label, binary, git_author_name, git_author_email, headless_subcommand, prompt_flag, auto_approve_env, auto_approve_flags, output_format_flags, model_flag, max_turns_flag, verbose_flag, supports_session_resume, resume_flag, continue_flag, session_file, supports_agents_json, supports_session_hook, supports_add_dir, log_format, opencode_config=None, refuse_subcommands=()) dataclass

Describes how to run one AI coding agent (all modes: interactive + headless).

name instance-attribute

Short key used in CLI dispatch (e.g. "claude", "codex").

label instance-attribute

Human-readable display name (e.g. "Claude", "Codex").

binary instance-attribute

CLI binary name (e.g. "claude", "codex", "opencode").

git_author_name instance-attribute

AI identity name for Git author/committer policy application.

git_author_email instance-attribute

AI identity email for Git author/committer policy application.

headless_subcommand instance-attribute

Subcommand for headless mode (e.g. "exec" for codex, "run" for opencode).

None means the binary uses flags only (e.g. claude -p).

prompt_flag instance-attribute

Flag for passing the prompt.

"-p" for flag-based, "" for positional (after subcommand).

auto_approve_env instance-attribute

Environment variables for fully autonomous execution.

Injected into the container env by _apply_unrestricted_env() when TEROK_UNRESTRICTED=1. Read by agents regardless of launch path. Claude uses /etc/claude-code/managed-settings.json instead.

auto_approve_flags instance-attribute

CLI flags injected by the shell wrapper when TEROK_UNRESTRICTED=1.

Only for agents that lack an env var or managed config mechanism (currently Codex only). Empty for all other agents — their env vars and /etc/ config files handle permissions across all launch paths.

output_format_flags instance-attribute

Flags for structured output (e.g. ("--output-format", "stream-json")).

model_flag instance-attribute

Flag for model override ("--model", "--agent", or None).

max_turns_flag instance-attribute

Flag for maximum turns ("--max-turns" or None).

verbose_flag instance-attribute

Flag for verbose output ("--verbose" or None).

supports_session_resume instance-attribute

Whether the provider supports resuming a previous session.

resume_flag instance-attribute

Flag to resume a session (e.g. "--resume", "--session").

continue_flag instance-attribute

Flag to continue a session (e.g. "--continue").

session_file instance-attribute

Filename in /home/dev/.terok/ for stored session ID.

Providers that capture session IDs via plugin or post-run parsing set this to a filename (e.g. "opencode-session.txt"). Providers with their own hook mechanism (Claude) or no session support set this to None.

supports_agents_json instance-attribute

Whether the provider supports --agents JSON (Claude only).

supports_session_hook instance-attribute

Whether the provider supports SessionStart hooks (Claude only).

supports_add_dir instance-attribute

Whether the provider supports --add-dir "/" (Claude only).

log_format instance-attribute

Log format identifier: "claude-stream-json" or "plain".

opencode_config = None class-attribute instance-attribute

Configuration for OpenCode-based providers (Blablador, KISSKI, etc.).

When set, this provider uses OpenCode with a custom OpenAI-compatible API. The configuration includes API endpoints, model preferences, and provider-specific settings that are injected into the container environment.

refuse_subcommands = () class-attribute instance-attribute

Subcommands the in-container wrapper refuses with a friendly error.

Used to block credential-handling flows (login, logout, setup-token) that would otherwise pollute the host-shared mount — operators authenticate on the host via terok auth instead. Best effort only; the firewall is the actual enforcement (terok-ai/terok#873).

uses_opencode_instructions property

Whether the provider uses OpenCode's instruction system.

apply_config(config, overrides=None)

Resolve config values for this provider with best-effort feature mapping.

CLI flag overrides take precedence over config values. When this provider lacks a feature, an analogue is used where possible (e.g. injecting max-turns guidance into the prompt), and a warning is emitted for features that have no analogue.

Source code in src/terok_executor/provider/providers.py
def apply_config(
    self,
    config: dict[str, Any],
    overrides: CLIOverrides | None = None,
) -> ProviderConfig:
    """Resolve config values for this provider with best-effort feature mapping.

    CLI flag *overrides* take precedence over *config* values.  When this
    provider lacks a feature, an analogue is used where possible (e.g.
    injecting max-turns guidance into the prompt), and a warning is
    emitted for features that have no analogue.
    """
    if overrides is None:
        overrides = CLIOverrides()

    warnings: list[str] = []
    prompt_parts: list[str] = []

    # --- Model ---
    cfg_model = resolve_provider_value("model", config, self.name)
    model = overrides.model or (str(cfg_model) if cfg_model is not None else None)
    if model and not self.model_flag:
        warnings.append(
            f"{self.label} does not support model selection; ignoring model={model!r}"
        )
        model = None

    # --- Max turns ---
    cfg_turns = resolve_provider_value("max_turns", config, self.name)
    max_turns_raw = overrides.max_turns if overrides.max_turns is not None else cfg_turns
    max_turns: int | None = int(max_turns_raw) if max_turns_raw is not None else None
    if max_turns is not None and not self.max_turns_flag:
        # Best-effort: inject into prompt as guidance
        prompt_parts.append(f"Important: complete this task in no more than {max_turns} steps.")
        warnings.append(
            f"{self.label} does not support --max-turns; "
            f"added guidance to prompt instead ({max_turns} steps)"
        )
        max_turns = None

    # --- Timeout ---
    cfg_timeout = resolve_provider_value("timeout", config, self.name)
    timeout = (
        overrides.timeout
        if overrides.timeout is not None
        else (int(cfg_timeout) if cfg_timeout is not None else 1800)
    )

    # --- Subagents (warning only — filtering is handled elsewhere) ---
    subagents = config.get("subagents")
    if subagents and not self.supports_agents_json:
        warnings.append(
            f"{self.label} does not support sub-agents (--agents); "
            f"sub-agent definitions will be ignored"
        )

    # --- Instructions ---
    # Claude receives instructions via --append-system-prompt in the wrapper.
    # Codex receives instructions via -c model_instructions_file=... in the wrapper.
    # OpenCode-based providers receive instructions via opencode.json `instructions`
    # array (injected by prepare_agent_config_dir).
    # Remaining providers get best-effort prompt prepending.
    instructions = overrides.instructions
    if (
        instructions
        and self.name not in {"claude", "codex"}
        and not self.uses_opencode_instructions
    ):
        prompt_parts.insert(0, instructions)

    return ProviderConfig(
        model=model,
        max_turns=max_turns,
        timeout=timeout,
        prompt_extra="\n".join(prompt_parts),
        warnings=tuple(warnings),
    )
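
The max-turns branch is the clearest instance of the best-effort mapping. The following standalone sketch reproduces only that branch; `resolve_max_turns` is a hypothetical helper written for illustration, not a function of the library.

```python
from __future__ import annotations


def resolve_max_turns(
    override: int | None,
    configured: int | None,
    max_turns_flag: str | None,
) -> tuple[int | None, list[str]]:
    """Return (max_turns, prompt_parts) following the precedence shown above."""
    # A CLI override wins whenever it is set, even when it is 0.
    raw = override if override is not None else configured
    max_turns = int(raw) if raw is not None else None
    prompt_parts: list[str] = []
    if max_turns is not None and not max_turns_flag:
        # No native flag: degrade to prompt guidance and clear the value.
        prompt_parts.append(
            f"Important: complete this task in no more than {max_turns} steps."
        )
        max_turns = None
    return max_turns, prompt_parts


with_flag = resolve_max_turns(override=5, configured=20, max_turns_flag="--max-turns")
without_flag = resolve_max_turns(override=None, configured=20, max_turns_flag=None)
```

With a native flag the override is returned as a number; without one the limit moves into the prompt text and the numeric value is dropped.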

build_headless_command(*, timeout, model=None, max_turns=None)

Assemble the bash command string for a headless agent run.

The command assumes:

  • init-ssh-and-repo.sh has already set up the workspace
  • The prompt is in /home/dev/.terok/prompt.txt
  • For Claude, the claude() wrapper function is sourced via bash -l

Returns a bash command string suitable for ["bash", "-lc", cmd]. Dispatches to provider-specific assembly: Claude routes through the shell wrapper (which adds --add-dir, --agents, git env); everything else uses the generic shape with subcommand + flags.

Source code in src/terok_executor/provider/providers.py
def build_headless_command(
    self,
    *,
    timeout: int,
    model: str | None = None,
    max_turns: int | None = None,
) -> str:
    """Assemble the bash command string for a headless agent run.

    The command assumes:

    - ``init-ssh-and-repo.sh`` has already set up the workspace
    - The prompt is in ``/home/dev/.terok/prompt.txt``
    - For Claude, the ``claude()`` wrapper function is sourced via ``bash -l``

    Returns a bash command string suitable for ``["bash", "-lc", cmd]``.
    Dispatches to provider-specific assembly: Claude routes through the
    shell wrapper (which adds ``--add-dir``, ``--agents``, git env);
    everything else uses the generic shape with subcommand + flags.
    """
    if self.name == "claude":
        return self._build_claude_command(timeout=timeout, model=model, max_turns=max_turns)
    return self._build_generic_command(timeout=timeout, model=model, max_turns=max_turns)

CLIOverrides(model=None, max_turns=None, timeout=None, instructions=None) dataclass

CLI flag overrides for a headless agent run.

model = None class-attribute instance-attribute

Explicit --model from CLI (takes precedence over config).

max_turns = None class-attribute instance-attribute

Explicit --max-turns from CLI.

timeout = None class-attribute instance-attribute

Explicit --timeout from CLI.

instructions = None class-attribute instance-attribute

Resolved instructions text. Delivery is provider-aware.

AgentRoster(_providers=dict(), _auth_providers=dict(), _vault_routes=dict(), _sidecar_specs=dict(), _installs=dict(), _helps=dict(), _mounts=(), _agent_names=(), _all_names=(), _web_ingress=frozenset()) dataclass

Queryable view over the loaded set of agents and tools.

Returned by load_roster; grouped accessors expose providers, auth providers, vault routes, sidecar specs, install snippets, and help blurbs by name.

providers property

All headless agent providers (kind: agent only).

auth_providers property

All auth providers (agents + tools with auth: section).

vault_routes property

All vault routes, keyed by provider name.

sidecar_specs property

All sidecar tool specs, keyed by tool name.

agent_names property

Names of kind: agent entries (for CLI completion).

all_names property

Names of all entries (agents + tools).

installs property

All install specs, keyed by roster name (entries without one are absent).

helps property

All help blurbs, keyed by roster name (entries without one are absent).

web_ingress property

Names of entries that publish a host HTTP port (web_ingress: true).

Consumers (e.g. terok's task launcher) use this to decide whether to allocate a published port and drop a per-task auth token into the container-visible config dir.

mounts property

All shared directory mounts (auth dirs + explicit mounts: sections).

Deduplicated by host_dir — if auth and mounts define the same directory, only one entry is returned.

resolve_selection(selection)

Resolve a user-supplied selection into the full set of roster names to install.

Accepts the literal string "all" (every roster entry that has an InstallSpec) or a tuple of selection tokens. Each token is either a roster name (include) or a name prefixed with - (exclude). The pseudo-name "all" is also valid as an include token, meaning "seed from every installable entry"; this combines naturally with excludes, e.g. ("all", "-vibe") installs everything except vibe. When no include tokens are present (only excludes), the seed is the full roster.

Includes are expanded transitively via depends_on before excludes are applied, so an exclude that names a dependency of a kept agent will silently drop that dependency — likely producing a broken image, but matching the user's literal request.

Returns the names sorted alphabetically — the canonical order used for the OCI label, the tag suffix, and the in-container manifest.

Raises ValueError if a requested include or exclude name is not in the roster, or TypeError if selection is a string other than "all" (a bare name like "claude" would otherwise be iterated into characters). Excludes that name a known agent but don't appear in the resolved include set are a no-op.

Source code in src/terok_executor/roster/loader.py
def resolve_selection(self, selection: str | tuple[str, ...]) -> tuple[str, ...]:
    """Resolve a user-supplied selection into the full set of roster names to install.

    Accepts the literal string ``"all"`` (every roster entry that has an
    [`InstallSpec`][terok_executor.roster.types.InstallSpec]) or a tuple of
    selection tokens.  Each token is either a roster name (include) or a
    name prefixed with ``-`` (exclude).  The pseudo-name ``"all"`` is also
    valid as an include token, meaning "seed from every installable
    entry"; this combines naturally with excludes, e.g. ``("all",
    "-vibe")`` installs everything except vibe.  When no include tokens
    are present (only excludes), the seed is the full roster.

    Includes are expanded transitively via ``depends_on`` *before*
    excludes are applied, so an exclude that names a dependency of a
    kept agent will silently drop that dependency — likely producing a
    broken image, but matching the user's literal request.

    Returns the names sorted alphabetically — the canonical order used
    for the OCI label, the tag suffix, and the in-container manifest.

    Raises ``ValueError`` if a requested include or exclude name is not
    in the roster, or ``TypeError`` if *selection* is a string other
    than ``"all"`` (a bare name like ``"claude"`` would otherwise be
    iterated into characters).  Excludes that name a known agent but
    don't appear in the resolved include set are a no-op.
    """
    if isinstance(selection, str):
        if selection != "all":
            raise TypeError(
                f"Selection must be the literal string 'all' or a tuple of "
                f"tokens, got {selection!r}"
            )
        return tuple(sorted(self._installs))

    includes = {t for t in selection if not t.startswith("-")}
    excludes = {t[1:] for t in selection if t.startswith("-")}

    referenced = (includes | excludes) - {"all"}
    unknown = referenced - set(self._installs)
    if unknown:
        avail = ", ".join(sorted(self._installs))
        raise ValueError(f"Unknown roster entries: {sorted(unknown)!r}. Available: {avail}")

    seed = set(self._installs) if "all" in includes or not includes else includes

    resolved: set[str] = set()
    stack = list(seed)
    while stack:
        name = stack.pop()
        if name in resolved:
            continue
        resolved.add(name)
        spec = self._installs.get(name)
        if spec is None:
            continue
        for dep in spec.depends_on:
            if dep not in self._installs:
                raise ValueError(
                    f"Agent {name!r} declares depends_on {dep!r}, "
                    f"which has no install: section in the roster"
                )
            if dep not in resolved:
                stack.append(dep)
    return tuple(sorted(resolved - excludes))
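
The include/exclude semantics are easiest to see on a toy roster. The sketch below is a simplified re-implementation of the tuple branch; the `INSTALLS` mapping and its dependency edges (including the `node` entry) are invented for illustration.

```python
from __future__ import annotations

# Hypothetical roster: name -> names it depends on.
INSTALLS: dict[str, tuple[str, ...]] = {
    "claude": (),
    "codex": ("node",),
    "node": (),
    "vibe": (),
}


def resolve(selection: tuple[str, ...]) -> tuple[str, ...]:
    """Simplified re-implementation of the tuple branch shown above."""
    includes = {t for t in selection if not t.startswith("-")}
    excludes = {t[1:] for t in selection if t.startswith("-")}
    unknown = (includes | excludes) - {"all"} - set(INSTALLS)
    if unknown:
        raise ValueError(f"Unknown roster entries: {sorted(unknown)!r}")
    seed = set(INSTALLS) if "all" in includes or not includes else includes
    resolved: set[str] = set()
    stack = list(seed)
    while stack:
        name = stack.pop()
        if name not in resolved:
            resolved.add(name)
            stack.extend(INSTALLS[name])  # expand dependencies transitively
    # Excludes apply last, so they can remove a dependency of a kept entry.
    return tuple(sorted(resolved - excludes))


with_deps = resolve(("codex",))
all_but_vibe = resolve(("all", "-vibe"))
only_excludes = resolve(("-vibe",))
broken = resolve(("codex", "-node"))
```

The last call shows the caveat from the docstring: excluding `node` drops a dependency that `codex` declared, and the result contains `codex` alone.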

get_provider(name, *, default_agent=None)

Resolve a provider name to an AgentProvider.

Falls back to default_agent, then "claude". Raises SystemExit if the resolved name is unknown.

Source code in src/terok_executor/roster/loader.py
def get_provider(self, name: str | None, *, default_agent: str | None = None) -> AgentProvider:
    """Resolve a provider name to an ``AgentProvider``.

    Falls back to *default_agent*, then ``"claude"``.
    Raises ``SystemExit`` if the resolved name is unknown.
    """
    from terok_executor.provider.providers import resolve_provider

    return resolve_provider(self._providers, name, default_agent=default_agent)

get_auth_provider(name)

Look up an auth provider by name.

Raises SystemExit if the name is unknown.

Source code in src/terok_executor/roster/loader.py
def get_auth_provider(self, name: str) -> AuthProvider:
    """Look up an auth provider by name.

    Raises ``SystemExit`` if the name is unknown.
    """
    info = self._auth_providers.get(name)
    if info is None:
        available = ", ".join(sorted(self._auth_providers))
        raise SystemExit(f"Unknown auth provider: {name!r}. Available: {available}")
    return info

get_sidecar_spec(name)

Look up a sidecar spec by tool name.

Raises SystemExit if the name has no sidecar configuration.

Source code in src/terok_executor/roster/loader.py
def get_sidecar_spec(self, name: str) -> SidecarSpec:
    """Look up a sidecar spec by tool name.

    Raises ``SystemExit`` if the name has no sidecar configuration.
    """
    spec = self._sidecar_specs.get(name)
    if spec is None:
        available = ", ".join(sorted(self._sidecar_specs)) or "(none)"
        raise SystemExit(f"No sidecar config for {name!r}. Available: {available}")
    return spec

generate_routes_json()

Generate the routes.json content for the sandbox vault server.

Returns a JSON object mapping provider name → VaultRouteEntry with empty/absent optional fields stripped. Raises ValueError if two providers declare the same route prefix.

Source code in src/terok_executor/roster/loader.py
def generate_routes_json(self) -> str:
    """Generate the ``routes.json`` content for the sandbox vault server.

    Returns a JSON object mapping provider name → [`VaultRouteEntry`][terok_executor.roster.schema.VaultRouteEntry]
    with empty/absent optional fields stripped.
    """
    from pydantic import TypeAdapter

    routes: dict[str, VaultRouteEntry] = {}
    prefix_owners: dict[str, str] = {}
    for route in self._vault_routes.values():
        existing = prefix_owners.get(route.route_prefix)
        if existing is not None:
            raise ValueError(
                f"Duplicate route prefix {route.route_prefix!r}: "
                f"providers {existing!r} and {route.provider!r}"
            )
        prefix_owners[route.route_prefix] = route.provider
        routes[route.provider] = VaultRouteEntry(
            upstream=route.upstream,
            auth_header=route.auth_header,
            auth_prefix=route.auth_prefix,
            path_upstreams=route.path_upstreams or None,
            oauth_extra_headers=route.oauth_extra_headers or None,
            oauth_refresh=route.oauth_refresh or None,
        )
    return (
        TypeAdapter(dict[str, VaultRouteEntry])
        .dump_json(routes, indent=2, exclude_none=True)
        .decode()
    )
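
The duplicate-prefix guard and the stripping of absent fields can be sketched with the standard library alone. Here plain dicts and `json.dumps` replace the pydantic `TypeAdapter`, only two of the optional fields are modelled, and the provider names and URLs are made up.

```python
from __future__ import annotations

import json


def routes_json(routes: list[dict[str, str]]) -> str:
    """Simplified analogue: reject duplicate prefixes, drop absent fields."""
    prefix_owners: dict[str, str] = {}
    out: dict[str, dict[str, str]] = {}
    for route in routes:
        provider, prefix = route["provider"], route["route_prefix"]
        existing = prefix_owners.get(prefix)
        if existing is not None:
            raise ValueError(
                f"Duplicate route prefix {prefix!r}: providers {existing!r} and {provider!r}"
            )
        prefix_owners[prefix] = provider
        fields = {"upstream": route.get("upstream"), "auth_header": route.get("auth_header")}
        out[provider] = {k: v for k, v in fields.items() if v is not None}
    return json.dumps(out, indent=2)


parsed = json.loads(
    routes_json(
        [
            {"provider": "alpha", "route_prefix": "alpha", "upstream": "https://alpha.example"},
            {
                "provider": "beta",
                "route_prefix": "beta",
                "upstream": "https://beta.example",
                "auth_header": "Authorization",
            },
        ]
    )
)

try:
    routes_json(
        [
            {"provider": "alpha", "route_prefix": "shared", "upstream": "https://alpha.example"},
            {"provider": "beta", "route_prefix": "shared", "upstream": "https://beta.example"},
        ]
    )
except ValueError as exc:
    duplicate_error = str(exc)
```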

collect_all_auto_approve_env()

Merge auto_approve.env from all providers into one dict. Raises ValueError if two providers set the same variable to different values.

Source code in src/terok_executor/roster/loader.py
def collect_all_auto_approve_env(self) -> dict[str, str]:
    """Merge ``auto_approve.env`` from all providers into one dict."""
    merged: dict[str, str] = {}
    for p in self._providers.values():
        for key, value in p.auto_approve_env.items():
            if key in merged and merged[key] != value:
                raise ValueError(
                    f"Conflicting auto_approve_env for {key!r}: "
                    f"{merged[key]!r} vs {value!r} (provider {p.name!r})"
                )
            merged[key] = value
    return merged
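
A standalone sketch of the same merge rule, where identical values may repeat and differing values are an error. The provider names and environment variable names are placeholders, not real agent settings.

```python
def merge_env(per_provider: dict[str, dict[str, str]]) -> dict[str, str]:
    """Merge per-provider env dicts; identical duplicates are allowed."""
    merged: dict[str, str] = {}
    for provider, env in per_provider.items():
        for key, value in env.items():
            if key in merged and merged[key] != value:
                raise ValueError(
                    f"Conflicting auto_approve_env for {key!r}: "
                    f"{merged[key]!r} vs {value!r} (provider {provider!r})"
                )
            merged[key] = value
    return merged


merged = merge_env(
    {
        "alpha": {"EXAMPLE_AUTO_APPROVE": "1"},
        "beta": {"EXAMPLE_AUTO_APPROVE": "1", "EXAMPLE_SKIP_PROMPTS": "true"},
    }
)

try:
    merge_env({"alpha": {"EXAMPLE_AUTO_APPROVE": "1"}, "beta": {"EXAMPLE_AUTO_APPROVE": "0"}})
except ValueError as exc:
    conflict_error = str(exc)
```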

collect_opencode_provider_env()

Collect env vars for all OpenCode-based providers.

Source code in src/terok_executor/roster/loader.py
def collect_opencode_provider_env(self) -> dict[str, str]:
    """Collect env vars for all OpenCode-based providers."""
    env: dict[str, str] = {}
    for p in self._providers.values():
        if p.opencode_config is not None:
            env.update(p.opencode_config.to_env(p.name))
    return env

shared() staticmethod

Return the process-wide cached roster.

Loaded on first access; every subsequent call returns the same instance. Use this from anywhere that just needs the global view; tests that mutate or replace the roster should call load_roster and keep the result local.

Source code in src/terok_executor/roster/loader.py
@staticmethod
def shared() -> AgentRoster:
    """Return the process-wide cached roster.

    Loaded on first access; every subsequent call returns the same
    instance.  Use this from anywhere that just needs the global
    view; tests that mutate or replace the roster should call
    [`load_roster`][terok_executor.roster.loader.load_roster] and
    keep the result local.
    """
    return _shared_roster()

parse_selection(raw) staticmethod

Normalise a user-supplied agent selection string.

Accepts a comma-list of selection tokens or the literal "all". Each token is either an agent name ("claude") or a name prefixed with - to exclude it from the selection ("-vibe"). The pseudo-name "all" is also valid as a token, so "all,-vibe" means "everything except vibe". When the input contains only excludes ("-vibe"), the selection seeds from every installable entry — same effect as "all,-vibe".

Whitespace is stripped, empty / whitespace-only entries dropped, and case folded. Empty or all-whitespace input collapses to "all" — the same shape AgentRoster.resolve_selection expects. Unknown names are not checked here; resolve_selection does that.

Source code in src/terok_executor/roster/loader.py
@staticmethod
def parse_selection(raw: str) -> str | tuple[str, ...]:
    """Normalise a user-supplied agent selection string.

    Accepts a comma-list of selection tokens or the literal ``"all"``.
    Each token is either an agent name (``"claude"``) or a name
    prefixed with ``-`` to exclude it from the selection
    (``"-vibe"``).  The pseudo-name ``"all"`` is also valid as a
    token, so ``"all,-vibe"`` means "everything except vibe".  When
    the input contains only excludes (``"-vibe"``), the selection
    seeds from every installable entry — same effect as
    ``"all,-vibe"``.

    Whitespace is stripped, empty / whitespace-only entries dropped,
    and case folded.  Empty or all-whitespace input collapses to
    ``"all"`` — the same shape
    [`AgentRoster.resolve_selection`][terok_executor.roster.loader.AgentRoster.resolve_selection]
    expects.  Unknown names are not checked here;
    ``resolve_selection`` does that.
    """
    folded = raw.strip().lower()
    if folded == "all" or not folded:
        return "all"
    tokens = tuple(n.strip() for n in folded.split(",") if n.strip())
    return tokens or "all"
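
A few inputs make the normalisation rules concrete. The function below is a standalone copy of the body above, so the examples run without the package installed.

```python
from __future__ import annotations


def parse_selection(raw: str) -> str | tuple[str, ...]:
    """Standalone copy of the static method shown above."""
    folded = raw.strip().lower()
    if folded == "all" or not folded:
        return "all"
    tokens = tuple(n.strip() for n in folded.split(",") if n.strip())
    return tokens or "all"


mixed = parse_selection(" Claude, ,-Vibe ")   # case folded, empty entry dropped
blank = parse_selection("   ")                # whitespace only
commas = parse_selection(",")                 # no usable tokens
with_all = parse_selection("all,-vibe")       # "all" kept as a token
```

Note the two return shapes: the bare string `"all"` for empty or literal input, and a tuple of tokens otherwise, including when `"all"` appears alongside excludes.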

validate_selection(raw)

Reject raw with SystemExit(2) if it names roster entries we don't have.

CLI-flavoured: prints an Invalid agent selection: … line on stderr and exits. Domain callers that just want the parsed tuple should use parse_selection + resolve_selection and handle ValueError themselves.

Source code in src/terok_executor/roster/loader.py
def validate_selection(self, raw: str) -> None:
    """Reject *raw* with ``SystemExit(2)`` if it names roster entries we don't have.

    CLI-flavoured: prints an ``Invalid agent selection: …`` line on
    stderr and exits.  Domain callers that just want the parsed
    tuple should use
    [`parse_selection`][terok_executor.roster.loader.AgentRoster.parse_selection]
    + [`resolve_selection`][terok_executor.roster.loader.AgentRoster.resolve_selection]
    and handle ``ValueError`` themselves.
    """
    try:
        self.resolve_selection(self.parse_selection(raw))
    except ValueError as exc:
        print(f"Invalid agent selection: {exc}", file=sys.stderr)
        raise SystemExit(2) from exc

prompt_selection()

Print the installed roster and read one line of executor grammar.

Empty input → "all". Non-interactive stdin (closed pipe) exits with a hint to pass the selection positionally instead.

Source code in src/terok_executor/roster/loader.py
def prompt_selection(self) -> str:
    """Print the installed roster and read one line of executor grammar.

    Empty input → ``"all"``.  Non-interactive stdin (closed pipe)
    exits with a hint to pass the selection positionally instead.
    """
    providers = self.providers
    print("\nAvailable agents:")
    for name in sorted(self.agent_names):
        provider = providers.get(name)
        label = provider.label if provider is not None else name
        print(f"  · {name}{label}")
    try:
        raw = input("\nType a comma list, or '-name' to exclude [all]: ").strip()
    except EOFError as exc:
        raise SystemExit(
            "No interactive stdin available.  Pass the selection positionally "
            "instead, e.g. `terok agents set all`."
        ) from exc
    return raw or "all"

ensure_vault_routes(cfg=None)

Generate routes.json from this roster and write it to disk.

The routes file is written to the path configured in SandboxConfig (typically ~/.local/share/terok/vault/routes.json).

When cfg is None, falls back to standalone defaults.

Returns the path to the written file.

Source code in src/terok_executor/roster/loader.py
def ensure_vault_routes(self, cfg: SandboxConfig | None = None) -> Path:
    """Generate ``routes.json`` from this roster and write it to disk.

    The routes file is written to the path configured in
    [`SandboxConfig`][terok_sandbox.SandboxConfig] (typically
    ``~/.local/share/terok/vault/routes.json``).

    When *cfg* is ``None``, falls back to standalone defaults.

    Returns the path to the written file.
    """
    if cfg is None:
        cfg = SandboxConfig()
    path = cfg.routes_path

    path.parent.mkdir(parents=True, exist_ok=True)
    content = self.generate_routes_json() + "\n"
    fd, tmp_name = tempfile.mkstemp(prefix=f".{path.name}.", dir=path.parent)
    tmp = Path(tmp_name)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(content)
            f.flush()
            os.fsync(f.fileno())
        tmp.replace(path)
    except BaseException:
        tmp.unlink(missing_ok=True)
        raise
    return path
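
The write in ensure_vault_routes follows the usual temp-file-then-rename pattern. Below is a minimal standalone sketch of that pattern using only the standard library; the helper name write_text_atomically and the example payload are illustrative assumptions, not part of terok-executor.

```python
import os
import tempfile
from pathlib import Path


def write_text_atomically(path: Path, content: str) -> Path:
    """Write *content* to *path* so readers never observe a partial file.

    Hypothetical helper mirroring the steps in ``ensure_vault_routes``.
    """
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file sits next to the target so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(prefix=f".{path.name}.", dir=path.parent)
    tmp = Path(tmp_name)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(content)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before renaming
        tmp.replace(path)  # atomic on POSIX within a single filesystem
    except BaseException:
        # Any failure, including KeyboardInterrupt, removes the temp file.
        tmp.unlink(missing_ok=True)
        raise
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as td:
        target = write_text_atomically(Path(td) / "vault" / "routes.json", "{}\n")
        print(target.read_text(encoding="utf-8"), end="")
```

Creating the temp file in the target's own directory matters here: a rename across filesystems would fall back to copy-and-delete semantics (or fail), which loses the atomicity the pattern is meant to provide.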

doctor_checks(*, token_broker_port=None)

Return agent-level health checks for in-container diagnostics.

Delegates to terok_executor.doctor for the actual check factories; this method is the canonical entry point so consumers can discover the checks through the roster.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| token_broker_port | int \| None | Host-side vault broker TCP port. None selects socket mode; any integer selects TCP mode. Base URL checks use the port (or the in-container loopback port) to derive the expected host. | None |

Source code in src/terok_executor/roster/loader.py
def doctor_checks(self, *, token_broker_port: int | None = None) -> list[DoctorCheck]:
    """Return agent-level health checks for in-container diagnostics.

    Delegates to
    [`terok_executor.doctor`][terok_executor.doctor] for the actual
    check factories; this method is the canonical entry point so
    consumers can discover the checks through the roster.

    Args:
        token_broker_port: Host-side vault broker TCP port.  ``None``
            selects socket mode; any integer selects TCP mode.  Base
            URL checks use the port (or the in-container loopback
            port) to derive the expected host.
    """
    from terok_executor.doctor import _build_agent_doctor_checks

    return _build_agent_doctor_checks(self, token_broker_port=token_broker_port)

SharedMountStorageInfo(name, label, bytes) dataclass

Disk usage for one shared config mount directory.

name instance-attribute

label instance-attribute

bytes instance-attribute

measure_all(mounts_base=None) classmethod

Measure each shared config mount directory.

Labels come from the agent roster when available, falling back to a title-cased version of the directory name.

Source code in src/terok_executor/storage.py
@classmethod
def measure_all(cls, mounts_base: Path | None = None) -> list[SharedMountStorageInfo]:
    """Measure each shared config mount directory.

    Labels come from the agent roster when available, falling back to
    a title-cased version of the directory name.
    """
    base = mounts_base or mounts_dir()
    if not base.is_dir():
        return []

    roster_mounts = AgentRoster.shared().mounts
    return sorted(
        (
            cls(
                name=d.name,
                label=_mount_label(d.name, roster_mounts),
                bytes=_dir_bytes(d),
            )
            for d in base.iterdir()
            if d.is_dir()
        ),
        key=lambda m: m.name,
    )

TaskStorageInfo(task_id, workspace_bytes, agent_config_bytes) dataclass

Disk usage snapshot for a single task's host directories.

task_id instance-attribute

workspace_bytes instance-attribute

agent_config_bytes instance-attribute

total_bytes property

Combined footprint of workspace and agent config.

measure(task_dir) classmethod

Measure a single task's disk footprint.

Expects the standard layout: <task_dir>/workspace-dangerous/ for the agent-writable code, <task_dir>/agent-config/ for per-task configuration.

Source code in src/terok_executor/storage.py
@classmethod
def measure(cls, task_dir: Path) -> TaskStorageInfo:
    """Measure a single task's disk footprint.

    Expects the standard layout: ``<task_dir>/workspace-dangerous/``
    for the agent-writable code, ``<task_dir>/agent-config/`` for
    per-task configuration.
    """
    return cls(
        task_id=task_dir.name,
        workspace_bytes=_dir_bytes(task_dir / "workspace-dangerous"),
        agent_config_bytes=_dir_bytes(task_dir / "agent-config"),
    )

measure_all(tasks_root) classmethod

Measure every task under tasks_root, sorted by task ID.

Source code in src/terok_executor/storage.py
@classmethod
def measure_all(cls, tasks_root: Path) -> list[TaskStorageInfo]:
    """Measure every task under *tasks_root*, sorted by task ID."""
    if not tasks_root.is_dir():
        return []
    return sorted(
        (cls.measure(d) for d in tasks_root.iterdir() if d.is_dir()),
        key=lambda t: t.task_id,
    )
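
To make the expected directory layout concrete, here is a small standalone sketch of the same measurement. The names TaskFootprint and dir_bytes are hypothetical stand-ins; in particular, dir_bytes is only an assumption about what the private _dir_bytes helper does (sum regular-file sizes, treat a missing directory as zero).

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


def dir_bytes(root: Path) -> int:
    """Sum the sizes of regular files under *root*; a missing directory counts as 0.

    Assumed behaviour, not the library's actual ``_dir_bytes``.
    """
    if not root.is_dir():
        return 0
    total = 0
    for p in root.rglob("*"):
        if p.is_file() and not p.is_symlink():
            total += p.stat().st_size
    return total


@dataclass(frozen=True)
class TaskFootprint:
    """Illustrative counterpart of ``TaskStorageInfo``."""

    task_id: str
    workspace_bytes: int
    agent_config_bytes: int

    @property
    def total_bytes(self) -> int:
        return self.workspace_bytes + self.agent_config_bytes

    @classmethod
    def measure(cls, task_dir: Path) -> "TaskFootprint":
        # Standard layout: <task_dir>/workspace-dangerous/ and <task_dir>/agent-config/
        return cls(
            task_id=task_dir.name,
            workspace_bytes=dir_bytes(task_dir / "workspace-dangerous"),
            agent_config_bytes=dir_bytes(task_dir / "agent-config"),
        )
```

With the real class, the equivalent call would be TaskStorageInfo.measure(task_dir), and total_bytes gives the combined figure.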

acp_socket_is_live(path)

Return True when a peer is currently accepting on path.

Distinguishes a live ACP daemon from a stale socket file left behind by a crash: a successful connect means a peer is listening, while ECONNREFUSED (and any other OSError) means the file is safe to unlink.

Source code in src/terok_executor/acp/daemon.py
def acp_socket_is_live(path: Path) -> bool:
    """Return ``True`` when a peer is currently accepting on *path*.

    Distinguishes a live ACP daemon from a stale socket file left
    behind by a crash: a successful ``connect`` means a peer is
    listening, while ``ECONNREFUSED`` (and any other ``OSError``)
    means the file is safe to unlink.
    """
    if not path.exists():
        return False
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as probe:
            probe.settimeout(0.2)
            probe.connect(str(path))
    except OSError:
        return False
    return True
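
One way a daemon might use this probe at startup is sketched below: refuse to start when a peer answers, otherwise unlink the stale file and bind. The probe is re-implemented here so the example runs on its own; bind_fresh is a hypothetical helper, not a terok-executor function.

```python
import socket
import tempfile
from pathlib import Path


def socket_is_live(path: Path, timeout: float = 0.2) -> bool:
    """Return True when a peer accepts connections on the Unix socket at *path*."""
    if not path.exists():
        return False
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as probe:
            probe.settimeout(timeout)
            probe.connect(str(path))
    except OSError:  # ECONNREFUSED, timeout, not a socket, ...
        return False
    return True


def bind_fresh(path: Path) -> socket.socket:
    """Hypothetical helper: bind *path*, clearing a stale socket file first."""
    if socket_is_live(path):
        raise RuntimeError(f"another daemon is already listening on {path}")
    path.unlink(missing_ok=True)  # stale leftover from a crash, or nothing at all
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(str(path))
    server.listen(16)
    return server
```

Note that each probe leaves a pending connection in the listener's backlog until it is accepted, so a very small backlog on the daemon side could make repeated probes time out and report a live daemon as stale.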

list_authenticated_agents(*, db_path=None, scope=DEFAULT_CREDENTIAL_SCOPE)

Return provider names that have stored credentials in scope.

Pure query against CredentialDB — no probing, no container exec. Used by the host-side acp list to classify endpoints in its status display; the roster itself doesn't gate probing on this anymore (file-based auth like Claude's OAuth lives outside the vault, so a vault-only filter would silently hide working agents).

Source code in src/terok_executor/acp/roster.py
def list_authenticated_agents(
    *,
    db_path: Path | None = None,
    scope: str = DEFAULT_CREDENTIAL_SCOPE,
) -> list[str]:
    """Return provider names that have stored credentials in *scope*.

    Pure query against [`CredentialDB`][terok_sandbox.CredentialDB] — no probing,
    no container exec.  Used by the host-side ``acp list`` to classify
    endpoints in its status display; the roster itself doesn't gate
    probing on this anymore (file-based auth like Claude's OAuth lives
    outside the vault, so a vault-only filter would silently hide
    working agents).
    """
    cfg = SandboxConfig()
    # ``db_path`` override exists for tests + multi-instance hosts; the
    # cfg still owns the tier policy so this caller never has to know
    # about the chain mechanism (session-file / systemd-creds /
    # keyring / config).
    db = cfg.open_credential_db(db_path)
    try:
        return list(db.list_credentials(scope))
    finally:
        db.close()

build_project_image(*, dockerfile, context_dir, target_tag, extra_tags=(), build_args=None, labels=None, no_cache=False, pull_always=False)

Build an OCI image from a pre-rendered Dockerfile.

The thin podman build invoker that the three opinionated factories in this module (build_base_images, build_sidecar_image, and terok's project/L2 build) share. Callers own Dockerfile rendering, tag naming, label computation, and build-context staging — this function only assembles flags and shells out.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dockerfile | Path | Path to the pre-rendered Dockerfile (-f). | required |
| context_dir | Path | Build context directory (final positional argument). | required |
| target_tag | str | Primary image tag (-t). | required |
| extra_tags | tuple[str, ...] | Additional tags applied to the same build (each becomes another -t on the command line; podman builds once and tags the result multiple times). | () |
| build_args | dict[str, str] \| None | --build-arg KEY=VALUE pairs. | None |
| labels | dict[str, str] \| None | --label KEY=VALUE pairs recorded in the OCI config. | None |
| no_cache | bool | Force full rebuild. | False |
| pull_always | bool | Pull the base image even if a local copy exists. | False |

Raises:

| Type | Description |
| --- | --- |
| BuildError | When podman is not on PATH or the build exits non-zero. |

Source code in src/terok_executor/container/build.py
def build_project_image(
    *,
    dockerfile: Path,
    context_dir: Path,
    target_tag: str,
    extra_tags: tuple[str, ...] = (),
    build_args: dict[str, str] | None = None,
    labels: dict[str, str] | None = None,
    no_cache: bool = False,
    pull_always: bool = False,
) -> None:
    """Build an OCI image from a pre-rendered Dockerfile.

    The thin ``podman build`` invoker that the three opinionated factories
    in this module ([`build_base_images`][terok_executor.container.build.build_base_images], [`build_sidecar_image`][terok_executor.container.build.build_sidecar_image],
    and terok's project/L2 build) share.  Callers own Dockerfile
    rendering, tag naming, label computation, and build-context staging —
    this function only assembles flags and shells out.

    Args:
        dockerfile: Path to the pre-rendered Dockerfile (``-f``).
        context_dir: Build context directory (final positional argument).
        target_tag: Primary image tag (``-t``).
        extra_tags: Additional tags applied to the same build (each becomes
            another ``-t`` on the command line — podman builds once and
            tags the result multiple times).
        build_args: ``--build-arg KEY=VALUE`` pairs.
        labels: ``--label KEY=VALUE`` pairs recorded in the OCI config.
        no_cache: Force full rebuild.
        pull_always: Pull the base image even if a local copy exists.

    Raises:
        BuildError: When podman is not on PATH or the build exits non-zero.
    """
    cmd = ["podman", "build", "-f", str(dockerfile)]
    for key, value in (build_args or {}).items():
        cmd += ["--build-arg", f"{key}={value}"]
    for key, value in (labels or {}).items():
        cmd += ["--label", f"{key}={value}"]
    cmd += ["-t", target_tag]
    for tag in extra_tags:
        cmd += ["-t", tag]
    if no_cache:
        cmd.append("--no-cache")
    if pull_always:
        cmd.append("--pull=always")
    cmd.append(str(context_dir))

    print("$", shlex.join(cmd))
    try:
        subprocess.run(cmd, check=True)
    except FileNotFoundError as exc:
        raise BuildError("podman not found; please install podman") from exc
    except subprocess.CalledProcessError as exc:
        raise BuildError(f"Image build failed: {exc}") from exc
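
The flag assembly can be looked at in isolation from the subprocess call. The sketch below reproduces the same ordering as a pure function so the resulting command line can be inspected or tested without podman installed; podman_build_argv is a hypothetical name, and the tag and label values in the usage part are made up for illustration.

```python
from __future__ import annotations

import shlex
from pathlib import Path


def podman_build_argv(
    *,
    dockerfile: Path,
    context_dir: Path,
    target_tag: str,
    extra_tags: tuple[str, ...] = (),
    build_args: dict[str, str] | None = None,
    labels: dict[str, str] | None = None,
    no_cache: bool = False,
    pull_always: bool = False,
) -> list[str]:
    """Return the argv that the build would run (illustrative, runs nothing)."""
    cmd = ["podman", "build", "-f", str(dockerfile)]
    for key, value in (build_args or {}).items():
        cmd += ["--build-arg", f"{key}={value}"]
    for key, value in (labels or {}).items():
        cmd += ["--label", f"{key}={value}"]
    cmd += ["-t", target_tag]
    for tag in extra_tags:  # one build, several tags
        cmd += ["-t", tag]
    if no_cache:
        cmd.append("--no-cache")
    if pull_always:
        cmd.append("--pull=always")
    cmd.append(str(context_dir))  # build context is the final positional argument
    return cmd


if __name__ == "__main__":
    argv = podman_build_argv(
        dockerfile=Path("build/Dockerfile"),
        context_dir=Path("build"),
        target_tag="example/project:latest",  # made-up tag
        labels={"ai.terok.agents": "claude"},  # label key from AGENTS_LABEL, value assumed
    )
    print("$", shlex.join(argv))
```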

seed_workspace_from_clone_cache(workspace_path, scope, *, origin_url=None, cfg=None)

Pre-populate workspace_path from the clone cache for scope.

Returns True if the workspace was successfully seeded.

After copying, rewrites the git origin remote to origin_url so that the in-container init script's sanity check (which compares origin against CODE_REPO) passes — the cache's origin points to a local file:// URL that won't match.

Skips seeding when the cache doesn't exist, the workspace already contains a .git directory, or the copy fails. Failures are logged and swallowed — the container falls back to a full clone.

Source code in src/terok_executor/container/cache.py
def seed_workspace_from_clone_cache(
    workspace_path: Path,
    scope: str,
    *,
    origin_url: str | None = None,
    cfg: SandboxConfig | None = None,
) -> bool:
    """Pre-populate *workspace_path* from the clone cache for *scope*.

    Returns ``True`` if the workspace was successfully seeded.

    After copying, rewrites the git origin remote to *origin_url* so that
    the in-container init script's sanity check (which compares origin
    against ``CODE_REPO``) passes — the cache's origin points to a local
    ``file://`` URL that won't match.

    Skips seeding when the cache doesn't exist, the workspace already
    contains a ``.git`` directory, or the copy fails.  Failures are
    logged and swallowed — the container falls back to a full clone.
    """
    if (workspace_path / ".git").is_dir():
        return False

    cache_dir = _resolve_cache_dir(scope, cfg)
    if cache_dir is None or not (cache_dir / ".git").is_dir():
        return False

    try:
        _logger.info("Seeding workspace from clone cache: %s -> %s", cache_dir, workspace_path)
        _copy_tree(cache_dir, workspace_path)
    except (OSError, shutil.Error, subprocess.CalledProcessError) as exc:
        _logger.warning("Clone cache seed failed (non-fatal): %s", exc)
        _wipe_workspace_contents(workspace_path)
        return False

    if not (workspace_path / ".git").is_dir():
        _logger.warning("Cache copy did not produce .git; falling back to container clone")
        _wipe_workspace_contents(workspace_path)
        return False

    if origin_url:
        _rewrite_origin(workspace_path, origin_url)

    return True
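
The skip-or-seed decision can be sketched with the standard library alone. This is a simplified illustration: seed_from_cache and _wipe_contents are hypothetical names, a plain shutil.copytree stands in for the private _copy_tree helper (whose real behaviour is not shown here), and the origin rewrite step is left out.

```python
import shutil
import tempfile
from pathlib import Path


def seed_from_cache(cache_dir: Path, workspace: Path) -> bool:
    """Copy *cache_dir* into *workspace*; return True only when a checkout landed."""
    if (workspace / ".git").is_dir():
        return False  # workspace already holds a checkout, leave it alone
    if not (cache_dir / ".git").is_dir():
        return False  # nothing usable in the cache
    try:
        # Assumption: a plain recursive copy stands in for the real copy helper.
        shutil.copytree(cache_dir, workspace, dirs_exist_ok=True)
    except (OSError, shutil.Error):
        _wipe_contents(workspace)  # drop the partial copy so a full clone can run
        return False
    return True


def _wipe_contents(directory: Path) -> None:
    """Remove everything inside *directory* but keep the directory itself."""
    if not directory.is_dir():
        return
    for child in directory.iterdir():
        if child.is_dir() and not child.is_symlink():
            shutil.rmtree(child)
        else:
            child.unlink()
```

Returning False in every non-success branch keeps the caller's logic simple: anything other than True means the container should perform a full clone.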

assemble_container_env(spec, roster, *, caller_manages_vault=False, per_container=None)

Assemble container environment variables and volume mounts.

This is the single source of truth for container env/volume assembly. Both AgentRunner._run() and terok's build_task_env_and_volumes() delegate here.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| spec | ContainerEnvSpec | What the caller wants: all host↔container contract fields. | required |
| roster | AgentRoster | Agent roster for shared mounts, vault routes, provider identity. | required |
| caller_manages_vault | bool | When True, skip phantom-token injection here, because the caller injects richer vault tokens itself (e.g. terok's per-provider OAuth tiers, socket transport, SSH signer). Shared config patches (api_base rewrites) still run because the vault is in use; only token injection is delegated. | False |

Returns:

| Type | Description |
| --- | --- |
| ContainerEnvResult | Assembled env dict, volume tuple, and resolved task_dir. |

Source code in src/terok_executor/container/env.py
def assemble_container_env(
    spec: ContainerEnvSpec,
    roster: AgentRoster,
    *,
    caller_manages_vault: bool = False,
    per_container: PerContainerResources | None = None,
) -> ContainerEnvResult:
    """Assemble container environment variables and volume mounts.

    This is the **single source of truth** for container env/volume assembly.
    Both ``AgentRunner._run()`` and terok's ``build_task_env_and_volumes()``
    delegate here.

    Args:
        spec: What the caller wants — all host↔container contract fields.
        roster: Agent roster for shared mounts, vault routes, provider identity.
        caller_manages_vault: When ``True``, skip phantom-token injection
            here — the caller injects richer vault tokens itself (e.g.
            terok's per-provider OAuth tiers, socket transport, SSH signer).
            Shared config patches (``api_base`` rewrites) still run because
            the vault **is** in use; only token injection is
            delegated.

    Returns:
        Assembled env dict, volume tuple, and resolved task_dir.
    """
    from terok_executor.paths import mounts_dir as _mounts_dir

    env: dict[str, str] = {}
    volumes: list[VolumeSpec] = []

    # 1. Base env
    env["TASK_ID"] = spec.task_id
    env["REPO_ROOT"] = "/workspace"
    env["GIT_RESET_MODE"] = "none"
    env["TEROK_CONTAINER_PROTOCOL"] = str(CONTAINER_PROTOCOL)
    env["CLAUDE_CONFIG_DIR"] = "/home/dev/.claude"

    # 1b. Timezone — explicit override wins, otherwise follow the host
    if tz := spec.timezone or detect_host_timezone():
        env["TZ"] = tz

    # 2. OpenCode provider env
    env.update(roster.collect_opencode_provider_env())

    # 3. Git identity
    env.update(_resolve_git_identity(spec, roster))

    # 4. Authorship env (for per-agent wrappers inside container)
    env["TEROK_GIT_AUTHORSHIP"] = spec.authorship
    env["HUMAN_GIT_NAME"] = spec.human_name
    env["HUMAN_GIT_EMAIL"] = spec.human_email

    # 5. Branch
    if spec.branch:
        env["GIT_BRANCH"] = spec.branch

    # 6. Repo URLs
    if spec.code_repo:
        env["CODE_REPO"] = spec.code_repo
    if spec.clone_from:
        env["CLONE_FROM"] = spec.clone_from

    # 7. Workspace volume
    volumes.append(VolumeSpec(spec.workspace_host_path, "/workspace", sharing=Sharing.PRIVATE))

    # 8. Shared config mounts from roster
    mounts_base = spec.envs_dir or _mounts_dir()
    task_dir = spec.task_dir or Path(tempfile.mkdtemp(prefix=f"terok-executor-{spec.task_id}-"))
    volumes += _shared_config_mounts(
        roster,
        mounts_base,
        expose_credential_providers=spec.expose_credential_providers,
    )

    # 8b. Re-apply vault config patches (idempotent — ensures shared mount
    #     dirs contain correct vault addresses even after state wipe).
    #
    #     NOT gated by caller_manages_vault: that flag only skips
    #     phantom-token injection here because the caller (terok) injects
    #     richer tokens itself — the vault is still in use and
    #     agents still need their config files rewritten to route through
    #     it.  Providers whose credential is exposed directly (Claude OAuth
    #     tier 3) are safe because they have no shared_config_patch.
    from terok_executor.credentials.vault_config import apply_shared_config_patches

    apply_shared_config_patches(
        roster,
        mounts_base,
        providers=spec.enabled_vault_patch_providers,
        disabled_providers=spec.disabled_vault_patch_providers,
    )

    # 9. Vault
    if not caller_manages_vault:
        env.update(
            _inject_vault_tokens(
                roster,
                spec.credential_scope,
                spec.task_id,
                vault_transport=spec.vault_transport,
                vault_required=spec.vault_required,
                credential_set=spec.credential_set,
                per_container=per_container,
            )
        )

    # 9b. Leaked credential scan (runs regardless of caller_manages_vault —
    #     the shared mounts exist either way)
    if spec.scan_leaked_creds:
        from terok_executor.credentials.vault_commands import scan_leaked_credentials

        leaked = scan_leaked_credentials(mounts_base)
        for provider, path in leaked:
            _logger.warning("Real credential in shared mount: %s: %s", provider, path)

    # 10. Agent config mount
    if spec.agent_config_dir:
        volumes.append(
            VolumeSpec(spec.agent_config_dir, "/home/dev/.terok", sharing=Sharing.PRIVATE)
        )

    # 11. Unrestricted mode
    if spec.unrestricted:
        env["TEROK_UNRESTRICTED"] = "1"
        env.update(roster.collect_all_auto_approve_env())

    # 12. Shared task directory
    if spec.shared_dir:
        spec.shared_dir.mkdir(parents=True, exist_ok=True)
        volumes.append(VolumeSpec(spec.shared_dir, spec.shared_mount))
        env["TEROK_SHARED_DIR"] = spec.shared_mount

    # 13. Extra volumes
    volumes.extend(spec.extra_volumes)

    return ContainerEnvResult(env=env, volumes=tuple(volumes), task_dir=task_dir)

inject_prompt(container_name, prompt_text)

Write a follow-up prompt into a stopped sealed container.

Writes prompt_text to a temp file and copies it into the container via podman cp. Works on stopped containers (unlike podman exec), which is the expected state during headless follow-ups.

Source code in src/terok_executor/container/inject.py
def inject_prompt(container_name: str, prompt_text: str) -> None:
    """Write a follow-up prompt into a stopped sealed container.

    Writes *prompt_text* to a temp file and copies it into the container
    via ``podman cp``.  Works on stopped containers (unlike ``podman exec``),
    which is the expected state during headless follow-ups.
    """
    from terok_executor.integrations.sandbox import Sandbox

    with tempfile.TemporaryDirectory() as td:
        prompt_file = Path(td) / "prompt.txt"
        prompt_file.write_text(prompt_text, encoding="utf-8")
        Sandbox().copy_to(container_name, prompt_file, "/home/dev/.terok/prompt.txt")

prepare_oauth_session(provider, project_id, *, mounts_dir, image, expose_token=False, credential_set='default')

Build an AuthSession without running it.

Creates a fresh temp dir, computes the podman run argv, and cleans up any leftover container of the same name (so re-auth after a previous abort isn't blocked). The caller drives execution and credential capture; see AuthSession.

The temp dir uses a clean slate so the vendor auth flow re-runs end to end — no stale config, no cached sessions.

Source code in src/terok_executor/credentials/auth.py
def prepare_oauth_session(
    provider: AuthProvider,
    project_id: str | None,
    *,
    mounts_dir: Path,
    image: str,
    expose_token: bool = False,
    credential_set: str = "default",
) -> AuthSession:
    """Build an [`AuthSession`][terok_executor.AuthSession] without running it.

    Creates a fresh temp dir, computes the ``podman run`` argv, and
    cleans up any leftover container of the same name (so re-auth
    after a previous abort isn't blocked).  The caller drives execution
    and credential capture; see [`AuthSession`][terok_executor.AuthSession].

    The temp dir uses a clean slate so the vendor auth flow re-runs end
    to end — no stale config, no cached sessions.
    """
    _check_podman()

    tmpdir = tempfile.TemporaryDirectory(prefix=f"terok-auth-{provider.name}-")
    host_dir = Path(tmpdir.name)

    # ``project_id`` must lead the container name; Podman rejects names
    # starting with ``_`` or other non-alphanumeric chars, so the
    # host-wide caller passes ``None`` and we fall back to ``host``.
    name_prefix = project_id or "host"
    container_name = f"{name_prefix}-auth-{provider.name}"
    _cleanup_existing_container(container_name)

    cmd = ["podman", "run", "--rm", *podman_userns_args(), "-it"]
    if provider.extra_run_args:
        cmd.extend(provider.extra_run_args)
    cmd.extend(["-v", f"{host_dir}:{provider.container_mount}:Z"])
    cmd.extend(["--name", container_name])
    cmd.append(image)
    cmd.extend(provider.command)

    scope = f"for project: {project_id}" if project_id else "(host-wide)"
    banner_lines = [
        f"Authenticating {provider.label} {scope}",
        "",
        *provider.banner_hint.splitlines(),
        "",
        f"$ {' '.join(map(str, cmd))}",
        "",
    ]

    return AuthSession(
        provider=provider,
        project_id=project_id,
        container_name=container_name,
        argv=cmd,
        banner="\n".join(banner_lines),
        auth_dir=host_dir,
        mounts_dir=mounts_dir,
        credential_set=credential_set,
        expose_token=expose_token,
        _tmpdir=tmpdir,
    )

store_api_key(provider, api_key, credential_set='default')

Store an API key directly in the credential DB (no container needed).

This is the non-interactive fast path for automated workflows and CI. The key is stored as {"type": "api_key", "key": "<value>"}.

Source code in src/terok_executor/credentials/auth.py
def store_api_key(
    provider: str,
    api_key: str,
    credential_set: str = "default",
) -> None:
    """Store an API key directly in the credential DB (no container needed).

    This is the non-interactive fast path for automated workflows and CI.
    The key is stored as ``{"type": "api_key", "key": "<value>"}``.
    """
    from terok_executor.integrations.sandbox import SandboxConfig

    cfg = SandboxConfig()
    db = cfg.open_credential_db(prompt_on_tty=True)
    try:
        db.store_credential(credential_set, provider, {"type": "api_key", "key": api_key})
        print(f"API key stored for {provider} (set: {credential_set})")
    finally:
        db.close()

scan_leaked_credentials(mounts_base)

Return (provider, host_path) for credential files found in shared mounts.

When the vault is active, real secrets should only live in the vault's sqlite DB — not in the shared config directories that get mounted into containers. This function checks each routed provider's mount for credential files that would leak real tokens alongside phantom ones.

Files injected by _write_claude_credentials_file are recognised by their dummy accessToken marker and skipped.

Symlinks are rejected to prevent a container from tricking the scan into reading arbitrary host files via a crafted symlink in the shared mount.

Source code in src/terok_executor/credentials/vault_commands.py
def scan_leaked_credentials(mounts_base: Path) -> list[tuple[str, Path]]:
    """Return ``(provider, host_path)`` for credential files found in shared mounts.

    When the vault is active, real secrets should only live in the
    vault's sqlite DB — not in the shared config directories that get mounted
    into containers.  This function checks each routed provider's mount for
    credential files that would leak real tokens alongside phantom ones.

    Files injected by `_write_claude_credentials_file`
    are recognised by their dummy ``accessToken`` marker and skipped.

    Symlinks are rejected to prevent a container from tricking the scan into
    reading arbitrary host files via a crafted symlink in the shared mount.
    """
    import stat

    from terok_executor.roster import AgentRoster

    roster = AgentRoster.shared()
    base_resolved = mounts_base.resolve(strict=False)
    leaked: list[tuple[str, Path]] = []
    for name, route in roster.vault_routes.items():
        if not route.credential_file:
            continue
        auth = roster.auth_providers.get(name)
        if not auth:
            continue
        try:
            path = mounts_base / auth.host_dir_name / route.credential_file
            # lstat: do not follow symlinks — reject them outright
            st = path.lstat()
            if stat.S_ISLNK(st.st_mode) or not stat.S_ISREG(st.st_mode):
                continue
            # Ensure resolved path stays within the mounts base
            if base_resolved not in path.resolve(strict=True).parents:
                continue
            if st.st_size > 0 and not (
                _is_injected_credentials_file(path) or _is_injected_codex_auth_file(path)
            ):
                leaked.append((name, path))
        except (OSError, TypeError) as exc:
            # Silently skipping turns a real leak into a no-result: the
            # operator would believe the scan was clean.  Surface a
            # warning so it's obvious which provider was not checked
            # and why; the loop continues so other providers still get
            # scanned.
            print(
                f"Warning [vault]: credential leak scan skipped {name!r}: {exc}",
                file=sys.stderr,
            )
            continue
    return leaked
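
The two path checks in the loop guard against different tricks, which a small standalone sketch may make clearer. The helper name is_contained_regular_file is hypothetical; it combines the lstat test (rejects a symlink as the final path component) with the containment test (rejects paths that escape through a symlinked parent directory).

```python
import stat
import tempfile
from pathlib import Path


def is_contained_regular_file(base: Path, path: Path) -> bool:
    """True only for a real, non-symlink file that resolves inside *base*."""
    try:
        st = path.lstat()  # inspects the link itself, not its target
    except OSError:
        return False
    if stat.S_ISLNK(st.st_mode) or not stat.S_ISREG(st.st_mode):
        return False
    try:
        resolved = path.resolve(strict=True)
    except OSError:
        return False
    # A symlinked parent directory passes lstat but resolves outside *base*.
    return base.resolve(strict=False) in resolved.parents


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as td:
        base = Path(td) / "mounts"
        outside = Path(td) / "outside"
        base.mkdir()
        outside.mkdir()
        (outside / "secret.txt").write_text("s", encoding="utf-8")
        (base / "escape").symlink_to(outside, target_is_directory=True)
        print(is_contained_regular_file(base, base / "escape" / "secret.txt"))
```

Unlike the scan itself, this sketch swallows OSError and returns False; the real function reports such cases on stderr so a skipped provider is not mistaken for a clean result.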

ensure_krun_host_keypair(*, cfg=None, runtime_dir=None)

Load (or mint, first call) the %host keypair and materialise it to tmpfs.

The vault is the system of record: the keypair lives in the sandbox credential DB under the %host infrastructure scope. This helper opens the DB, calls ensure_infra_keypair (which generates the key on first call and reloads it thereafter), and writes the OpenSSH-PEM private + the public-key line into runtime_dir (default: namespace_runtime_dir()).

The orchestrator bind-mounts public_path into the running krun guest at /etc/ssh/authorized_keys.d/terok so the guest's sshd accepts our private key. The L0 image itself ships an empty placeholder at that path; the bind-mount overlays it.

Rotation = clear the %host scope in the vault, then re-run. Typically called per task launch under krun (idempotent — loads on subsequent calls). New tasks pick up the new key; in-flight tasks keep what they had until they're stopped.

Requires the vault to be unlocked — the krun runtime is gated on experimental: true upstream and assumes the operator has the vault open for the session. A NoPassphraseError propagates unchanged so the orchestrator can render its own remediation hint.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| cfg | SandboxConfig \| None | Sandbox config used to open the credential DB. None means use the zero-arg default, which is appropriate for standalone executor flows; terok injects its own enriched config when calling. | None |
| runtime_dir | Path \| None | Override for the tmpfs cache directory. None uses namespace_runtime_dir, with a hard refusal to fall back to persistent disk. | None |

Source code in src/terok_executor/krun.py
def ensure_krun_host_keypair(
    *,
    cfg: SandboxConfig | None = None,
    runtime_dir: Path | None = None,
) -> KrunHostKeypair:
    """Load (or mint, first call) the ``%host`` keypair and materialise it to tmpfs.

    The vault is the system of record: the keypair lives in the sandbox
    credential DB under the ``%host`` infrastructure scope.  This
    helper opens the DB, calls
    [`ensure_infra_keypair`][terok_sandbox.ensure_infra_keypair] (which
    generates the key on first call and reloads it thereafter), and
    writes the OpenSSH-PEM private + the public-key line into
    *runtime_dir* (default:
    [`namespace_runtime_dir()`][terok_util.paths.namespace_runtime_dir]).

    The orchestrator bind-mounts ``public_path`` into the running
    krun guest at ``/etc/ssh/authorized_keys.d/terok`` so the
    guest's sshd accepts our private key.  The L0 image itself ships
    an empty placeholder at that path; the bind-mount overlays it.

    Rotation = clear the ``%host`` scope in the vault, then re-run.
    Typically called per task launch under krun (idempotent — loads
    on subsequent calls).  New tasks pick up the new key; in-flight
    tasks keep what they had until they're stopped.

    Requires the vault to be unlocked — the krun runtime is gated on
    ``experimental: true`` upstream and assumes the operator has the
    vault open for the session.  A ``NoPassphraseError`` propagates
    unchanged so the orchestrator can render its own remediation hint.

    Args:
        cfg: Sandbox config used to open the credential DB.  ``None``
            means use the zero-arg default — appropriate for standalone
            executor flows; terok injects its own enriched config when
            calling.
        runtime_dir: Override for the tmpfs cache directory.  ``None``
            uses [`namespace_runtime_dir`][terok_util.paths.namespace_runtime_dir],
            with a hard refusal to fall back to persistent disk.
    """
    target_dir = _ensure_safe_runtime_dir(runtime_dir)
    private = target_dir / f"{_HOST_KEYPAIR_BASENAME}.key"
    public = target_dir / f"{_HOST_KEYPAIR_BASENAME}.key.pub"

    db = (cfg or SandboxConfig()).open_credential_db(prompt_on_tty=False)
    try:
        infra = ensure_infra_keypair("%host", db=db, comment="krun-host (terok)")
    finally:
        db.close()

    _write_atomic(private, infra.private_pem, mode=0o600)
    _write_atomic(public, (infra.public_line + "\n").encode(), mode=0o644)
    return KrunHostKeypair(
        private_path=private,
        public_path=public,
        public_line=infra.public_line,
        fingerprint=infra.fingerprint,
        created=infra.created,
    )

parse_md_agent(file_path)

Parse a .md file with YAML frontmatter into an agent dict.

Expected format:

    ---
    name: agent-name
    description: ...
    tools: [Read, Grep]
    model: sonnet
    ---
    System prompt body...

Source code in src/terok_executor/provider/agents.py
def parse_md_agent(file_path: str) -> dict:
    """Parse a .md file with YAML frontmatter into an agent dict.

    Expected format:
        ---
        name: agent-name
        description: ...
        tools: [Read, Grep]
        model: sonnet
        ---
        System prompt body...
    """
    path = Path(file_path)
    if not path.is_file():
        return {}
    content = path.read_text(encoding="utf-8")
    # Split YAML frontmatter from body
    if content.startswith("---"):
        parts = content.split("---", 2)
        if len(parts) >= 3:
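            # safe_load: agent files are untrusted input, and PyYAML 6+ requires
            # an explicit Loader for yaml.load anyway.
            frontmatter = yaml.safe_load(parts[1]) or {}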
            if not isinstance(frontmatter, dict):
                frontmatter = {}
            body = parts[2].strip()
            frontmatter["prompt"] = body
            return frontmatter
    # No frontmatter: treat entire file as prompt
    return {"prompt": content.strip()}
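
The frontmatter split itself needs no YAML library, so it can be shown on its own. The sketch below isolates that step; split_frontmatter is a hypothetical helper that returns the raw YAML text (or None) plus the stripped body, leaving the actual YAML parsing to the caller.

```python
from __future__ import annotations


def split_frontmatter(content: str) -> tuple[str | None, str]:
    """Split ``---`` delimited frontmatter from the prompt body.

    Returns ``(raw_yaml, body)``; ``raw_yaml`` is None when the text has no
    complete frontmatter block, in which case the whole text is the body.
    """
    if content.startswith("---"):
        # maxsplit=2 keeps any later '---' (e.g. a Markdown rule) inside the body.
        parts = content.split("---", 2)
        if len(parts) >= 3:
            return parts[1], parts[2].strip()
    return None, content.strip()


if __name__ == "__main__":
    text = "---\nname: reviewer\nmodel: sonnet\n---\nYou review code.\n"
    raw_yaml, body = split_frontmatter(text)
    print(repr(raw_yaml))
    print(repr(body))
```

Because the split is purely textual, a literal `---` inside the YAML block itself would end the frontmatter early; agent files that need such values would have to avoid that sequence.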

prepare_agent_config_dir(spec)

Create and populate the agent-config directory for a task.

Writes:

- terok-executor.sh (always): wrapper functions with git env vars
- agents.json (only when provider supports it and sub-agents are non-empty)
- prompt.txt (if prompt given, headless only)
- instructions.md (always): custom instructions or a neutral default
- <envs>/_claude-config/settings.json: SessionStart hook (Claude only)
- opencode.json entries: instructions path injected into shared OpenCode and Blablador configs

Parameters:

Name | Type | Description | Default
spec | AgentConfigSpec | All agent-config parameters bundled in an AgentConfigSpec. | required

Returns the agent_config_dir path.

Source code in src/terok_executor/provider/agents.py
def prepare_agent_config_dir(spec: AgentConfigSpec) -> Path:
    """Create and populate the agent-config directory for a task.

    Writes:
    - terok-executor.sh (always) — wrapper functions with git env vars
    - agents.json (only when provider supports it and sub-agents are non-empty)
    - prompt.txt (if prompt given, headless only)
    - instructions.md (always) — custom instructions or a neutral default
    - <envs>/_claude-config/settings.json — SessionStart hook (Claude only)
    - opencode.json entries — ``instructions`` path injected into shared
      OpenCode and Blablador configs

    Args:
        spec: All agent-config parameters bundled in an [`AgentConfigSpec`][terok_executor.provider.agents.AgentConfigSpec].

    Returns the agent_config_dir path.
    """
    from .providers import get_provider as _get_provider

    resolved = _get_provider(spec.provider, default_agent=spec.default_agent)

    task_dir = spec.tasks_root / str(spec.task_id)
    agent_config_dir = task_dir / "agent-config"
    ensure_dir(agent_config_dir)

    # Build agents JSON — only for providers that support --agents (Claude)
    has_agents = False
    if resolved.supports_agents_json and spec.subagents:
        agents_json = _subagents_to_json(spec.subagents, spec.selected_agents)
        agents_dict = json.loads(agents_json)
        if agents_dict:  # non-empty dict
            (agent_config_dir / "agents.json").write_text(agents_json, encoding="utf-8")
            has_agents = True
    elif not resolved.supports_agents_json and (spec.subagents or spec.selected_agents):
        import warnings

        warnings.warn(
            f"{resolved.label} does not support sub-agents (--agents); "
            f"sub-agent definitions will be ignored.",
            stacklevel=2,
        )

    # Write instructions file — always present so opencode.json `instructions`
    # references never point to a missing file.  When no custom instructions
    # are configured, a neutral default is used.
    _DEFAULT_INSTRUCTIONS = "Follow the project's coding conventions and existing patterns."

    instructions_text = spec.instructions or _DEFAULT_INSTRUCTIONS
    (agent_config_dir / "instructions.md").write_text(instructions_text, encoding="utf-8")

    # Inject instructions path into opencode.json configs on the host so
    # all OpenCode-based providers discover them natively (works for both
    # interactive and headless modes).
    mounts_base = spec.mounts_base
    if mounts_base is None:
        raise ValueError("mounts_base is required in AgentConfigSpec")
    _inject_opencode_instructions(mounts_base / "_opencode-config" / "opencode.json")
    for _p in AGENT_PROVIDERS.values():
        if _p.opencode_config is not None:
            _inject_opencode_instructions(
                mounts_base / f"_{_p.name}-config" / "opencode" / "opencode.json"
            )

    # Write shell wrapper functions for ALL providers so interactive CLI users
    # can invoke any agent (each provider gets its own shell function).
    from .wrappers import generate_all_wrappers

    wrapper = generate_all_wrappers(has_agents)
    (agent_config_dir / "terok-executor.sh").write_text(wrapper, encoding="utf-8")

    # Write SessionStart hook — only for providers that support it (Claude)
    if resolved.supports_session_hook:
        shared_claude_dir = mounts_base / "_claude-config"
        ensure_dir_writable(shared_claude_dir, "_claude-config")
        _write_session_hook(shared_claude_dir / "settings.json")

    # Prompt (headless only)
    if spec.prompt is not None:
        (agent_config_dir / "prompt.txt").write_text(spec.prompt, encoding="utf-8")

    return agent_config_dir

bundled_default_instructions()

Read and return the bundled default instructions from package resources.

Source code in src/terok_executor/provider/instructions.py
def bundled_default_instructions() -> str:
    """Read and return the bundled default instructions from package resources."""
    ref = importlib.resources.files("terok_executor.resources.instructions").joinpath("default.md")
    return ref.read_text(encoding="utf-8")

resolve_instructions(config, provider_name, project_root=None)

Resolve instructions from a merged config dict.

Supports:

  • Flat string: returned as-is
  • Per-provider dict: uses resolve_provider_value, falls back to _default
  • List (with _inherit): splices bundled default at each _inherit sentinel
  • Absent/None: returns bundled default

After resolving the YAML value, appends the contents of project_root/instructions.md (if it exists and is non-empty).

Returns the final instructions text.

Source code in src/terok_executor/provider/instructions.py
def resolve_instructions(
    config: dict[str, Any],
    provider_name: str,
    project_root: Path | None = None,
) -> str:
    """Resolve instructions from a merged config dict.

    Supports:
    - Flat string: returned as-is
    - Per-provider dict: uses [`resolve_provider_value`][terok_executor.resolve_provider_value], falls back to ``_default``
    - List (with ``_inherit``): splices bundled default at each ``_inherit`` sentinel
    - Absent/None: returns bundled default

    After resolving the YAML value, appends the contents of
    ``project_root/instructions.md`` (if it exists and is non-empty).

    Returns the final instructions text.
    """
    from .providers import resolve_provider_value

    val = config.get("instructions")
    default = bundled_default_instructions()

    if val is None:
        base = default
    elif isinstance(val, dict):
        resolved = resolve_provider_value("instructions", config, provider_name)
        if resolved is None:
            base = default
        elif isinstance(resolved, list):
            base = _splice_inherit(resolved, default)
        elif resolved == _INHERIT_SENTINEL:
            base = default
        else:
            base = str(resolved)
    elif isinstance(val, list):
        base = _splice_inherit(val, default)
    elif val == _INHERIT_SENTINEL:
        # Bare _inherit string → same as absent (use bundled default)
        base = default
    else:
        base = str(val)

    # Append standalone instructions file (purely additive)
    file_text = _read_instructions_file(project_root)
    if file_text:
        return f"{base}\n\n{file_text}" if base else file_text
    return base
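The branches above can be summarised with a small stand-in. This is a sketch under stated assumptions, not the library's code: `resolve_sketch` and `splice_inherit` are hypothetical names, the default text is passed in rather than read from package resources, the blank-line join in `splice_inherit` is an assumption (the real `_splice_inherit` is not shown here), and the `project_root/instructions.md` append step is left out.

```python
from __future__ import annotations

from typing import Any

INHERIT = "_inherit"  # sentinel name taken from the docstring above


def splice_inherit(items: list[Any], default: str) -> str:
    """Assumed behaviour: swap each sentinel for the default, join with blank lines."""
    return "\n\n".join(default if item == INHERIT else str(item) for item in items)


def resolve_sketch(config: dict[str, Any], provider: str, default: str) -> str:
    """Hypothetical stand-in for the YAML-value part of resolve_instructions."""
    val = config.get("instructions")
    if isinstance(val, dict):
        # Per-provider dict: provider entry first, then "_default"
        picked = val.get(provider)
        val = picked if picked is not None else val.get("_default")
    if val is None or val == INHERIT:
        return default
    if isinstance(val, list):
        return splice_inherit(val, default)
    return str(val)


DEFAULT = "Be careful."
absent = resolve_sketch({}, "claude", DEFAULT)
flat = resolve_sketch({"instructions": "Use tabs."}, "claude", DEFAULT)
per_provider = resolve_sketch(
    {"instructions": {"claude": "Be terse.", "_default": "_inherit"}}, "codex", DEFAULT
)
spliced = resolve_sketch({"instructions": ["_inherit", "Also run tests."]}, "claude", DEFAULT)
```

Here `absent` and `per_provider` both come out as the default text, `flat` is returned unchanged, and `spliced` places the default ahead of the extra line.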

get_provider(name, *, default_agent=None)

Resolve a provider name against the global AGENT_PROVIDERS registry.

Convenience wrapper around resolve_provider.

Source code in src/terok_executor/provider/providers.py
def get_provider(name: str | None, *, default_agent: str | None = None) -> AgentProvider:
    """Resolve a provider name against the global [`AGENT_PROVIDERS`][terok_executor.provider.providers.AGENT_PROVIDERS] registry.

    Convenience wrapper around [`resolve_provider`][terok_executor.provider.providers.resolve_provider].
    """
    return resolve_provider(AGENT_PROVIDERS, name, default_agent=default_agent)

resolve_provider_value(key, config, provider_name)

Extract a provider-aware config value.

Supports two forms:

  • Flat value (model: opus) → same for all providers.
  • Per-provider dict (model: {claude: opus, codex: o3, _default: fast}) → looks up provider_name, falls back to _default, then None.

Returns None when the key is absent or has no match for the provider.

Null override behaviour: when a per-provider dict maps a provider to null (Python None), that None is treated as "no value" and the resolver falls back to _default. This is intentional — it allows a lower-priority config layer to set a provider-specific value that a higher-priority layer can effectively unset by mapping it to null, letting the _default (or None) bubble up instead.

Internal to provider config resolution — full config-stack composition (build_agent_config_stack, resolve_agent_config) lives in terok, which owns the global/project/preset layer semantics.

Source code in src/terok_executor/provider/providers.py
def resolve_provider_value(
    key: str,
    config: dict[str, Any],
    provider_name: str,
) -> Any | None:
    """Extract a provider-aware config value.

    Supports two forms:

    * **Flat value** — ``model: opus`` → same for all providers.
    * **Per-provider dict** — ``model: {claude: opus, codex: o3, _default: fast}``
      → looks up *provider_name*, falls back to ``_default``, then ``None``.

    Returns ``None`` when the key is absent or has no match for the provider.

    **Null override behaviour**: when a per-provider dict maps a provider to
    ``null`` (Python ``None``), that ``None`` is treated as "no value" and the
    resolver falls back to ``_default``.  This is intentional — it allows a
    lower-priority config layer to set a provider-specific value that a
    higher-priority layer can effectively *unset* by mapping it to ``null``,
    letting the ``_default`` (or ``None``) bubble up instead.

    Internal to provider config resolution — full config-stack composition
    (``build_agent_config_stack``, ``resolve_agent_config``) lives in terok,
    which owns the global/project/preset layer semantics.
    """
    val = config.get(key)
    if val is None:
        return None
    if isinstance(val, dict):
        provider_val = val.get(provider_name)
        if provider_val is not None:
            return provider_val
        return val.get("_default")
    return val
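The lookup rules, including the null override, are easiest to see with a few inputs. The sketch below repeats the function body listed above so it runs standalone; the config dicts and the provider name `other` are made up for illustration.

```python
from __future__ import annotations

from typing import Any


def resolve_provider_value(key: str, config: dict[str, Any], provider_name: str) -> Any | None:
    """Copy of the logic listed above, repeated so the example is self-contained."""
    val = config.get(key)
    if val is None:
        return None
    if isinstance(val, dict):
        provider_val = val.get(provider_name)
        if provider_val is not None:
            return provider_val
        return val.get("_default")
    return val


flat = {"model": "opus"}
per_provider = {"model": {"claude": "opus", "codex": "o3", "_default": "fast"}}
null_override = {"model": {"claude": None, "_default": "fast"}}
no_default = {"model": {"claude": "opus"}}

results = {
    "flat": resolve_provider_value("model", flat, "codex"),
    "match": resolve_provider_value("model", per_provider, "codex"),
    "fallback": resolve_provider_value("model", per_provider, "other"),
    "null_override": resolve_provider_value("model", null_override, "claude"),
    "no_match": resolve_provider_value("model", no_default, "codex"),
    "absent_key": resolve_provider_value("temperature", flat, "claude"),
}
for name, value in results.items():
    print(f"{name}: {value!r}")
```

Note that a provider mapped to `null` and a provider missing from the dict resolve identically: both fall through to `_default`, and to `None` when no `_default` is present.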

ensure_sandbox_ready(*, cfg=None, no_vault=False, **aggregator_kwargs)

Generate vault routes, then run the sandbox install aggregator.

Regenerating routes.json up front is what makes routing config current before any launch — the per-container supervisor reads it at container start. A bare aggregator call would leave a stale routes.json in place and credential fetch would break on the next terok-executor run until the operator remembers to run vault routes.

no_vault gates the routes pre-step (if vault isn't being touched, don't regenerate); other no_* flags flow through to the aggregator.

Routes regeneration renders a Vault routes stage line so it sits in the same column as the aggregator's own output rather than failing silently above it — a corrupt YAML roster is the most plausible reason for setup to fail before the aggregator even starts, and a stage-shaped failure beats an unframed traceback.

Source code in src/terok_executor/sandbox.py
def ensure_sandbox_ready(
    *,
    cfg: SandboxConfig | None = None,
    no_vault: bool = False,
    **aggregator_kwargs: Any,
) -> None:
    """Generate vault routes, then run the sandbox install aggregator.

    Regenerating ``routes.json`` up front is what makes routing config
    current before any launch — the per-container supervisor reads it
    at container start.  A bare aggregator call would leave a stale
    ``routes.json`` in place and credential fetch would break on the
    next ``terok-executor run`` until the operator remembers to run
    ``vault routes``.

    ``no_vault`` gates the routes pre-step (if vault isn't being
    touched, don't regenerate); other ``no_*`` flags flow through to
    the aggregator.

    Routes regeneration renders a ``Vault routes`` stage line so it
    sits in the same column as the aggregator's own output rather
    than failing silently above it — a corrupt YAML roster is the
    most plausible reason for setup to fail before the aggregator
    even starts, and a stage-shaped failure beats an unframed
    traceback.
    """
    from terok_executor.integrations.sandbox import _handle_sandbox_setup, stage_line
    from terok_executor.roster import AgentRoster

    if not no_vault:
        with stage_line("Vault routes") as s:
            AgentRoster.shared().ensure_vault_routes(cfg=cfg)
            s.ok("regenerated")
    _handle_sandbox_setup(cfg=cfg, no_vault=no_vault, **aggregator_kwargs)