token_broker

Vault token broker — HTTP/WebSocket reverse proxy with secret injection.

This module has zero terok imports. It is a self-contained security component embedded by VaultProxy inside each per-container supervisor: an aiohttp app that listens on a per-container Unix socket (or 127.0.0.1 TCP port), validates phantom tokens against a SQLCipher database, injects real credentials from the same database, and forwards requests to upstream API servers.

Task containers see only phantom API keys (worthless outside the broker). Real secrets never enter the container filesystem or environment.

Route config (JSON, generated by terok-executor from the YAML registry):

{
    "claude": {
        "upstream": "https://api.anthropic.com",
        "auth_header": "Authorization",
        "auth_prefix": "Bearer "
    }
}
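
As an illustration of how a route entry like the one above might be applied, the sketch below turns a route plus a real secret into an upstream URL and auth header. The `build_upstream_request` helper, the request path, and the secret value are assumptions made for the example; the broker's real forwarding logic is not shown on this page.

```python
import json

# Route table mirroring the example above.
ROUTES_JSON = """
{
    "claude": {
        "upstream": "https://api.anthropic.com",
        "auth_header": "Authorization",
        "auth_prefix": "Bearer "
    }
}
"""


def build_upstream_request(
    routes: dict, provider: str, path: str, real_secret: str
) -> tuple[str, dict[str, str]]:
    """Return (url, headers) for a forwarded request; illustrative only."""
    route = routes[provider]
    url = route["upstream"].rstrip("/") + "/" + path.lstrip("/")
    # The auth header is assembled from the route's header name and prefix.
    headers = {route["auth_header"]: route["auth_prefix"] + real_secret}
    return url, headers


routes = json.loads(ROUTES_JSON)
url, headers = build_upstream_request(routes, "claude", "/v1/messages", "sk-real-secret")
```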

UnixBind(socket_path) dataclass

Unix-domain bind for the vault HTTP proxy.

Used by the supervisor in socket mode. The proxy binds the socket with mode 0600 so that only the same-UID rootless container can connect to it through the bind-mount.

socket_path instance-attribute
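
A minimal sketch of binding a Unix socket with mode 0600, assuming a plain `socket` listener. The proxy's own `_bind_unix_socket` is not shown on this page, so the helper name `bind_unix_socket_0600` and the umask approach are illustrative choices rather than the actual implementation.

```python
import os
import socket
import stat
import tempfile
from pathlib import Path


def bind_unix_socket_0600(socket_path: Path) -> socket.socket:
    """Bind a listening Unix socket and restrict it to the owning user."""
    # Remove a stale socket left behind by a previous run, if any.
    socket_path.unlink(missing_ok=True)
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    # Tighten the umask so the socket file is never briefly accessible to others.
    old_umask = os.umask(0o177)
    try:
        sock.bind(str(socket_path))
    except OSError:
        sock.close()
        raise
    finally:
        os.umask(old_umask)
    # Set the mode explicitly as well, in case the umask was not honoured.
    os.chmod(socket_path, 0o600)
    sock.listen()
    return sock
```

The returned socket could then be handed to a listener such as aiohttp's `SockSite`, which is what `start()` does with the result of its own bind helper.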

TcpBind(host, port) dataclass

TCP bind for the vault HTTP proxy.

Used by the supervisor in TCP mode (krun / SELinux-restricted distros). Always pins to 127.0.0.1; the shield egress firewall would refuse loopback traffic on any other interface anyway.

host instance-attribute

port instance-attribute

VaultProxy(*, db_path, scope_id, bind, routes_path=None, audit_path=None, runtime_dir=None)

Embeddable aiohttp vault — one per container, owned by the supervisor.

Each per-container supervisor builds a fresh VaultProxy bound to a per-container socket (or TCP port); it serves only its own container and lives only as long as that container itself.

Two transport flavours, picked by bind at construction:

  • UnixBind: a single mode-0600 Unix socket the rootless container reaches via bind-mount. The supervisor places the socket under $XDG_RUNTIME_DIR/terok/vault/<container_id>.sock.
  • TcpBind: 127.0.0.1:<port> for the krun / SELinux-restricted path; the shield's loopback allowlist covers the container side.

The OAuth refresh task runs every _REFRESH_INTERVAL seconds inside the same loop. Cross-supervisor coordination uses a non-blocking flock on $XDG_RUNTIME_DIR/terok/vault/locks/refresh-<credential_set>-<provider>.lock: a contended lock means another supervisor is already refreshing the same row, so we skip and pick up the fresh value on the next read.

Wire the proxy to the credential DB and a transport.

scope_id is informational at present — the broker authorises per-token via _TokenDB.lookup_token, not by scope. Reserved for future per-supervisor scope-filtering (e.g. refusing tokens outside the supervisor's container scope).

routes_path defaults to $XDG_CONFIG_HOME/terok/vault/routes.json when omitted — the canonical location terok-executor writes the route table to.

runtime_dir — explicit runtime root for the cross-supervisor refresh flock files. Required when the proxy runs in crun's rootless userns (getuid() reports 0 there but the actual host runtime dir is under the operator's uid); if omitted, falls back to $XDG_RUNTIME_DIR / /run/user/<uid>.

Source code in src/terok_sandbox/vault/daemon/token_broker.py
def __init__(
    self,
    *,
    db_path: Path | str,
    scope_id: str | None,
    bind: UnixBind | TcpBind,
    routes_path: Path | str | None = None,
    audit_path: Path | None = None,
    runtime_dir: Path | None = None,
) -> None:
    """Wire the proxy to the credential DB and a transport.

    *scope_id* is informational at present — the broker authorises
    per-token via ``_TokenDB.lookup_token``,
    not by scope.  Reserved for future per-supervisor scope-filtering
    (e.g. refusing tokens outside the supervisor's container scope).

    *routes_path* defaults to ``$XDG_CONFIG_HOME/terok/vault/routes.json``
    when omitted — the canonical location ``terok-executor`` writes
    the route table to.

    *runtime_dir* — explicit runtime root for the cross-supervisor
    refresh ``flock`` files.  Required when the proxy runs in
    crun's rootless userns (``getuid()`` reports ``0`` there but
    the actual host runtime dir is under the operator's uid); if
    omitted, falls back to ``$XDG_RUNTIME_DIR`` / ``/run/user/<uid>``.
    """
    from terok_sandbox.config import SandboxConfig

    cfg = SandboxConfig()
    self._db_path = str(db_path)
    self._scope_id = scope_id
    self._bind = bind
    self._routes_path = str(routes_path) if routes_path else str(cfg.routes_path)
    self._audit_path = audit_path
    self._runtime_dir = runtime_dir
    self._app: web.Application | None = None
    self._runner: Any | None = None
    self._site: Any | None = None
    self._refresh_lock_dir: Path | None = None

bind property

Return the transport binding the proxy was constructed with.

scope_id property

Return the scope the proxy was constructed with (informational).

start() async

Build the aiohttp app and bring the listener up.

Initial-pass refresh runs at startup (same shape as _refresh_loop) so the first request never blocks on a slow token-endpoint round-trip — the periodic task continues from there.

Source code in src/terok_sandbox/vault/daemon/token_broker.py
async def start(self) -> None:
    """Build the aiohttp app and bring the listener up.

    Initial-pass refresh runs at startup (same shape as
    ``_refresh_loop``)
    so the first request never blocks on a slow token-endpoint
    round-trip — the periodic task continues from there.
    """
    from aiohttp.web_runner import AppRunner, SockSite, TCPSite

    self._app = _build_app(self._db_path, self._routes_path, audit_path=self._audit_path)
    self._refresh_lock_dir = self._compute_refresh_lock_dir()
    # The refresh loop reads the lock dir off the app, so pin the
    # one resolved here (from the injected runtime_dir) over the
    # ambient-env default _build_app seeded.
    self._app[_KEY_LOCK_DIR] = self._refresh_lock_dir

    self._runner = AppRunner(self._app, access_log=_logger)
    await self._runner.setup()

    if isinstance(self._bind, UnixBind):
        sock = self._bind_unix_socket(self._bind.socket_path)
        self._site = SockSite(self._runner, sock)
    else:
        self._site = TCPSite(self._runner, self._bind.host, self._bind.port)
    await self._site.start()
    _logger.info("VaultProxy started on %s", self._describe_bind())

stop() async

Tear down the listener and free the aiohttp resources.

Idempotent — a stop call on a never-started proxy is a no-op.

Source code in src/terok_sandbox/vault/daemon/token_broker.py
async def stop(self) -> None:
    """Tear down the listener and free the aiohttp resources.

    Idempotent — a stop call on a never-started proxy is a no-op.
    """
    if self._runner is not None:
        await self._runner.cleanup()
        self._runner = None
        self._site = None
        self._app = None
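
One way a caller might pair start() and stop() is a try/finally wrapper, sketched below. The `serve_until` helper and the `_FakeProxy` stand-in are hypothetical and exist only so the example runs without terok_sandbox installed; a real caller would pass a VaultProxy instead.

```python
import asyncio
from typing import Protocol


class Startable(Protocol):
    """Anything with the start()/stop() pair documented above."""

    async def start(self) -> None: ...
    async def stop(self) -> None: ...


async def serve_until(proxy: Startable, done: asyncio.Event) -> None:
    """Run the proxy until `done` is set, always tearing it down afterwards."""
    await proxy.start()
    try:
        await done.wait()
    finally:
        # stop() is documented as idempotent, so calling it here is always safe.
        await proxy.stop()


class _FakeProxy:
    """Hypothetical stand-in that records lifecycle calls."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    async def start(self) -> None:
        self.calls.append("start")

    async def stop(self) -> None:
        self.calls.append("stop")


async def _demo() -> list[str]:
    done = asyncio.Event()
    done.set()  # finish immediately for the demo
    proxy = _FakeProxy()
    await serve_until(proxy, done)
    return proxy.calls


calls = asyncio.run(_demo())
```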

acquire_refresh_lock(lock_dir, credential_set, provider)

Try to acquire an exclusive flock on the per-credential refresh file.

Returns the held file descriptor on success, or None when another supervisor is already refreshing the same row (the lock is non-blocking, LOCK_NB). Callers release the lock by closing the fd with os.close. Soft-fails to None on any I/O error; the periodic task will pick the credential up again on the next tick.

The lock file lives under $XDG_RUNTIME_DIR/terok/vault/locks/refresh-<credential_set>-<provider>.lock; the directory is created with mode 0700 on demand. The credential_set / provider components are sanitised (see _safe_lock_component) so the lock file always stays inside lock_dir.

Source code in src/terok_sandbox/vault/daemon/token_broker.py
def acquire_refresh_lock(lock_dir: Path, credential_set: str, provider: str) -> int | None:
    """Try to acquire an exclusive ``flock`` on the per-credential refresh file.

    Returns the held file-descriptor on success, ``None`` when another
    supervisor is already refreshing the same row (the lock is
    non-blocking, ``LOCK_NB``).  Callers ``os.close`` the fd to release.
    Soft-fails to ``None`` on any I/O error — the periodic task will
    pick the credential up again on the next tick.

    The lock file lives under
    ``$XDG_RUNTIME_DIR/terok/vault/locks/refresh-<credential_set>-<provider>.lock``;
    the directory is created with mode 0700 on demand.  The
    ``credential_set`` / ``provider`` components are sanitised (see
    ``_safe_lock_component``) so the lock file always stays inside
    *lock_dir*.
    """
    import fcntl
    import os as _os

    try:
        lock_dir.mkdir(parents=True, exist_ok=True)
        _os.chmod(lock_dir, 0o700)
    except OSError as exc:
        _logger.debug("refresh lock dir setup failed: %s", exc)
        return None
    safe_cs = _safe_lock_component(credential_set)
    safe_provider = _safe_lock_component(provider)
    lock_path = lock_dir / f"refresh-{safe_cs}-{safe_provider}.lock"
    try:
        fd = _os.open(lock_path, _os.O_RDWR | _os.O_CREAT | _os.O_CLOEXEC, 0o600)
    except OSError as exc:
        _logger.debug("refresh lock open failed: %s", exc)
        return None
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        _os.close(fd)
        return None
    return fd
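
The body of `_safe_lock_component` is not shown on this page. As a rough idea of what such sanitising might involve, the sketch below replaces every character outside a conservative allowlist, so a component can never contain a path separator. The function name `safe_lock_component`, the allowlist, and the example lock directory are all assumptions for illustration.

```python
import re
from pathlib import Path

# Conservative allowlist; anything else (including "/") becomes "_".
_UNSAFE = re.compile(r"[^A-Za-z0-9_.-]")


def safe_lock_component(value: str) -> str:
    """Return a string that is safe to embed in a single file name."""
    cleaned = _UNSAFE.sub("_", value)
    return cleaned or "_"


lock_dir = Path("/run/user/1000/terok/vault/locks")
name = f"refresh-{safe_lock_component('../../etc/passwd')}-{safe_lock_component('claude')}.lock"
lock_path = lock_dir / name
```

Because the separators are gone, the resulting path has `lock_dir` as its direct parent regardless of what the credential set or provider strings contained.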

release_refresh_lock(fd)

Release the flock held by fd (see acquire_refresh_lock).

Source code in src/terok_sandbox/vault/daemon/token_broker.py
def release_refresh_lock(fd: int) -> None:
    """Release the ``flock`` held by *fd* (see [`acquire_refresh_lock`][terok_sandbox.vault.daemon.token_broker.acquire_refresh_lock])."""
    import os as _os

    try:
        _os.close(fd)
    except OSError as exc:
        _logger.debug("refresh lock close failed: %s", exc)