Security & Design

Threat model

terok-shield provides egress network filtering for rootless Podman containers running untrusted workloads (AI coding agents).

Attacker model: arbitrary code execution inside the container, no host access. The container boundary (rootless Podman + user namespaces) is assumed intact. terok-shield adds the network layer: even with full container control, the workload cannot reach arbitrary internet hosts.

Host privilege model: no root required. The firewall operates within rootless Podman's user namespace.

Goals:

  • Default-deny outbound connectivity
  • Allowlist-based access to specific destinations (domains or IPs)
  • Block private-range traffic (RFC 1918 + RFC 4193) unless explicitly allowlisted (prevent lateral movement)
  • Log a notice when private addresses or large CIDRs are allowlisted
  • Audit log all firewall events
  • Fail closed on any hook failure

Non-goals:

  • Inbound filtering (containers are not exposed)
  • DNS exfiltration through allowed resolvers
  • Host kernel / container-escape attacks
  • Application-layer inspection

Security boundary: nft.py

All nftables rulesets are generated by nft.py. This is the auditable security boundary — it can be reviewed in isolation.

Import isolation: nft.py imports only standard-library modules (ipaddress, re, textwrap) and nft_constants.py, which supplies the literals NFT_TABLE, PASTA_DNS, PRIVATE_RANGES, and BYPASS_LOG_PREFIX. This is enforced by an AST import isolation test and a bandit SAST scan.
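
An import isolation check of this kind can be sketched with the standard ast module. This is a minimal illustration, not the project's actual test: the function name disallowed_imports and the ALLOWED_IMPORTS set are assumptions that simply mirror the modules named above.

```python
import ast

# Assumption: this allowlist mirrors the modules named in the text above.
ALLOWED_IMPORTS = {"ipaddress", "re", "textwrap", "nft_constants"}


def disallowed_imports(source, allowed=ALLOWED_IMPORTS):
    """Return the top-level module names imported by `source` that are not allowed."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import os.path" counts as an import of "os"
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            if node.module:
                # "from .nft_constants import X" has module == "nft_constants"
                found.add(node.module.split(".")[0])
            else:
                # "from . import name" has module == None; check the names instead
                found.update(alias.name for alias in node.names)
    return found - set(allowed)
```

A test would read the module's source text and assert that the returned set is empty. Because the check walks the whole syntax tree, imports hidden inside functions are caught as well.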

Input validation: all values are validated before string interpolation:

  • safe_ip() — validates and normalizes IPs/CIDRs via ipaddress.ip_address/ip_network (dual-stack); returns canonical string form to ensure reliable state-file comparisons
  • _safe_port() — validates port range, rejects bools
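
The two validators can be approximated as follows. This is a hedged sketch rather than the code in nft.py: the names safe_ip_sketch and safe_port_sketch, the strict network parsing, and the 1 to 65535 port range are all assumptions made for illustration.

```python
import ipaddress


def safe_ip_sketch(value):
    """Validate an IPv4/IPv6 address or CIDR and return its canonical string form."""
    try:
        if "/" in value:
            # Assumption: strict parsing, so networks with host bits set are rejected.
            return str(ipaddress.ip_network(value, strict=True))
        return str(ipaddress.ip_address(value))
    except ValueError as exc:
        raise ValueError(f"invalid IP or CIDR: {value!r}") from exc


def safe_port_sketch(value):
    """Validate a TCP/UDP port number (assumed range: 1 to 65535)."""
    # bool is a subclass of int, so True/False must be rejected explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(f"port must be an int: {value!r}")
    if not 1 <= value <= 65535:
        raise ValueError(f"port out of range: {value!r}")
    return value
```

Returning the canonical form means that two spellings of the same address (for example, an IPv6 address with and without zero compression) compare equal as strings, which is what makes state-file comparisons reliable. Anything that fails to parse never reaches string interpolation.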

Self-verification: verify_ruleset() checks post-apply invariants: drop policy present, reject type present, deny log prefix present, all private ranges (RFC 1918 + RFC 4193/4291) blocked, both allow sets declared. verify_bypass_ruleset() additionally checks private-range rules when allow_all=False.
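
The idea behind post-apply verification can be illustrated with a simple fragment check against the text of the loaded ruleset. Everything concrete here is an assumption for illustration: the function name missing_invariants, the table name and "DENY" log prefix in the sample, and the use of plain substring matching, which is weaker than whatever verify_ruleset() actually does.

```python
# RFC 1918 ranges plus the RFC 4193 unique local range.
PRIVATE_RANGES_SKETCH = ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "fc00::/7")

# Hypothetical ruleset text, shaped like `nft list ruleset` output.
SAMPLE_RULESET = """
table inet terok_shield {
    set allow_v4 { type ipv4_addr; flags interval; }
    set allow_v6 { type ipv6_addr; flags interval; }
    chain output {
        type filter hook output priority 0; policy drop;
        ip daddr @allow_v4 accept
        ip6 daddr @allow_v6 accept
        ip daddr { 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16 } reject with icmpx admin-prohibited
        ip6 daddr fc00::/7 reject with icmpx admin-prohibited
        log prefix "DENY " reject with icmpx admin-prohibited
    }
}
"""


def missing_invariants(ruleset_text, deny_log_prefix="DENY"):
    """Return the names of required fragments that are absent from the ruleset text."""
    required = {
        "drop policy": "policy drop",
        "reject type": "reject with icmpx admin-prohibited",
        "deny log prefix": f'log prefix "{deny_log_prefix}',
        "allow_v4 set": "set allow_v4",
        "allow_v6 set": "set allow_v6",
    }
    for net in PRIVATE_RANGES_SKETCH:
        required[f"private range {net}"] = net
    return [name for name, fragment in required.items() if fragment not in ruleset_text]
```

A hook following this pattern would treat any non-empty result as a failure and exit non-zero, so a ruleset that loaded but lacks an invariant never protects a running container by accident.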

Trust boundaries

┌──────────────────────────────────────────────────┐
│  Host                                            │
│                                                  │
│  ┌────────────────────────────────────────────┐  │
│  │  Container netns                           │  │
│  │                                            │  │
│  │  ┌──────────────────────────────────────┐  │  │
│  │  │  nftables rules                      │  │  │
│  │  │  (applied by OCI hook)               │  │  │
│  │  │                                      │  │  │
│  │  │  policy: DROP                        │  │  │
│  │  │  allow: DNS, lo, @allow_v4/v6        │  │  │
│  │  │  gateway: @gateway_v4/v6 (loopback)  │  │  │
│  │  │  reject: RFC1918, RFC4193            │  │  │
│  │  └──────────────────────────────────────┘  │  │
│  │                                            │  │
│  │  ┌──────────────────────────────────────┐  │  │
│  │  │  Workload (untrusted)                │  │  │
│  │  │  CAP_NET_ADMIN dropped               │  │  │
│  │  │  CAP_NET_RAW dropped                 │  │  │
│  │  └──────────────────────────────────────┘  │  │
│  └────────────────────────────────────────────┘  │
│                                                  │
│  Host services (loopback only)                   │
└──────────────────────────────────────────────────┘

The workload cannot modify nftables rules because CAP_NET_ADMIN is dropped. Dropping CAP_NET_RAW additionally prevents it from opening raw sockets.

Chain evaluation order

Hook mode (per-container netns, output chain):

loopback → established → DNS → gateway ports → loopback ports → allow_v4/v6 → private-range reject → deny all

Rule ordering rationale: the allow sets (@allow_v4, @allow_v6) are evaluated before private-range reject rules. This lets operators allowlist specific RFC 1918 or RFC 4193 addresses (e.g., local infrastructure) via allowlist profiles. Allowlisting private addresses or large CIDRs is logged with action "note" in the audit trail.
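
The effect of this ordering can be modelled in a few lines. The sketch below covers only the last three stages of the chain (allow sets, private-range reject, default deny) and is a simplified model, not the generated ruleset; the function name verdict and its return strings are illustrative assumptions.

```python
import ipaddress

# RFC 1918 ranges plus the RFC 4193 unique local range.
PRIVATE_NETS = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "fc00::/7")
]


def _matches(addr, nets):
    """True if addr falls inside any network of the same IP version."""
    return any(addr.version == net.version and addr in net for net in nets)


def verdict(dst, allowlist):
    """Model the tail of the output chain for one destination address."""
    addr = ipaddress.ip_address(dst)
    allow_nets = [ipaddress.ip_network(n) for n in allowlist]
    if _matches(addr, allow_nets):       # allow_v4 / allow_v6 come first
        return "accept"
    if _matches(addr, PRIVATE_NETS):     # then the private-range reject
        return "reject (private range)"
    return "reject (default deny)"       # finally, deny everything else
```

With "10.1.2.0/24" on the allowlist, a destination of 10.1.2.3 is accepted while 10.9.9.9 is still rejected as private. If the two checks were swapped, no private address could ever be allowlisted.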

Dual-stack (IPv4 + IPv6)

The firewall operates in dual-stack mode using nftables inet tables, which handle both IPv4 and IPv6 within the same ruleset. Two parallel allow sets are maintained:

  • allow_v4 (type ipv4_addr; flags interval;) — IPv4 allowlist
  • allow_v6 (type ipv6_addr; flags interval;) — IPv6 allowlist

DNS resolution queries both A and AAAA records, and resolved addresses are automatically routed to the correct set. Rejections use the cross-family icmpx reject type, which nftables translates to the matching ICMP or ICMPv6 error for each packet:

  • All private ranges use cross-family reject with icmpx admin-prohibited
  • Default deny also uses reject with icmpx admin-prohibited
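
Routing resolved addresses to the correct set comes down to checking each address's IP version. A minimal sketch, assuming a hypothetical helper named split_by_family that receives the combined A and AAAA results as strings:

```python
import ipaddress


def split_by_family(addresses):
    """Split address strings into (ipv4_list, ipv6_list) in canonical form."""
    v4, v6 = [], []
    for raw in addresses:
        addr = ipaddress.ip_address(raw)  # raises ValueError on malformed input
        (v4 if addr.version == 4 else v6).append(str(addr))
    return v4, v6
```

The first list would populate allow_v4 and the second allow_v6. Malformed input raises instead of being silently skipped, which is consistent with the fail-closed approach described in the next section.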

Fail-closed guarantees

After pre_start() installs hooks, the following invariant holds: there is no path from "firewall setup failed" to "container running unrestricted."

Failure                       Result
----------------------------  ---------------------------------------------
OCI hook raises an exception  Container torn down by podman (non-zero exit)
nft binary missing            ExecError → hook exits non-zero → torn down
state_dir annotation missing  Hook exits non-zero → torn down
Bundle version mismatch       Hook exits non-zero → torn down
Allowlist file unreadable     Hook exits non-zero → torn down
Ruleset fails to load         Hook exits → torn down
Self-verification fails       Hook exits → torn down
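
The failure cases above share one pattern: any error during setup becomes a non-zero exit status, and there is no branch that reports success after a failure. A minimal sketch of such a hook entry point follows; the name run_hook and the apply_firewall callable are hypothetical and not taken from the project.

```python
import sys


def run_hook(apply_firewall):
    """Run the firewall setup callable and map any failure to a non-zero exit code."""
    try:
        apply_firewall()
    except Exception as exc:
        # Covers a missing binary, an unreadable allowlist, a failed
        # verification, and any other error: all of them fail closed.
        print(f"firewall hook failed: {exc}", file=sys.stderr)
        return 1
    return 0
```

A hook script would end with sys.exit(run_hook(setup)), where setup is whatever function loads and verifies the ruleset. Success is reported only when setup returns without raising.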

The fail-closed guarantee applies once hooks are installed by pre_start(). Use terok-shield run (or call pre_start() via the Python API) before starting containers — without hooks, containers start without firewall rules and no egress filtering is applied.