Efficiency

Why typed capabilities beat shell tools.

For AI agents, typed APIs are 3–10× more token-efficient than shell parsing, with better error recovery, audit by default, and operator-supervisable policy gates.

01 · Token efficiency

Same operation. Half the tokens.

An agent calling a typed capability spends fewer tokens describing what it wants, gets a structured response back, and doesn't have to grep its own stdout. The numbers below are illustrative estimates based on ranges observed in our v0.0.1 traces.

A

Read a JSON file

/workspace/config.json
⊘ Shell tool (Bash): parse, hope, retry
# Agent emits:
{ "tool": "bash", "command": "cat /workspace/config.json" }
# Agent receives (stdout, untyped):
{"model":"claude-3-5","timeout":30,...}
# Then must parse, detect JSON vs error,
# handle "file not found" string by reading
# stderr separately, etc.
Input: ~140 tok
Output: ~180 tok
Error recovery: ~220 tok
✓ KruxOS typed capability: structured both ways
# Agent emits:
filesystem.read({ path: "/workspace/config.json" })
# Agent receives (typed result):
{ content: {...}, bytes: 412, mtime: "2026-05-12T08:14:02Z" }
# On miss: typed error {code, path, hint}
Input: ~38 tok
Output: ~80 tok
Error recovery: ~45 tok
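To make the "typed result or typed error" shape concrete, here is a minimal Python sketch. It is not the KruxOS implementation: `ReadResult`, `ReadError`, `read_json`, the error codes, and the hint strings are all hypothetical names, chosen only to mirror the `{content, bytes, mtime}` and `{code, path, hint}` shapes shown above.

```python
import json
import os
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Union


@dataclass
class ReadResult:
    """Hypothetical typed result, mirroring {content, bytes, mtime}."""
    content: Any
    bytes: int
    mtime: str


@dataclass
class ReadError:
    """Hypothetical typed error, mirroring {code, path, hint}."""
    code: str
    path: str
    hint: str


def read_json(path: str) -> Union[ReadResult, ReadError]:
    """Return a typed result or a typed error, never a string to re-parse."""
    try:
        with open(path, "rb") as f:
            raw = f.read()
        content = json.loads(raw)
    except FileNotFoundError:
        # Error codes and hints here are assumptions for illustration.
        return ReadError("not_found", path, "check the path or list the parent directory")
    except json.JSONDecodeError as exc:
        return ReadError("invalid_json", path, f"parse failed at character {exc.pos}")
    mtime = datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)
    return ReadResult(content, len(raw), mtime.strftime("%Y-%m-%dT%H:%M:%SZ"))
```

The caller branches on the returned type instead of scanning stdout and stderr for error strings, which is where the error-recovery savings in the estimates above would come from.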
B

HTTP POST with JSON response parsing

/api/agents/status
⊘ Shell tool (curl + jq): three tools, one job
curl -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOK" \
  -d '{"agent_id":"a-42"}' \
  https://api.example.com/agents/status \
  | jq '.health'
# Status code? Lost. Headers? Lost.
# Bad JSON? jq crashes. Retry logic?
# Agent writes it inline, every time.
Input: ~180 tok
Output: ~200 tok
Error recovery: ~280 tok
✓ KruxOS typed capability: one call, full envelope
network.http_request({
  method: "POST",
  url: "https://api.example.com/agents/status",
  json: { agent_id: "a-42" },
  secret_header: { Authorization: "Bearer ${api_token}" }
})
# Returns: {status, headers, json, latency_ms}
Input: ~52 tok
Output: ~95 tok
Error recovery: ~60 tok
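The value of the envelope is that status and headers survive even when the body is not valid JSON. A rough Python sketch of that idea follows, with no network call involved. `HttpEnvelope`, `to_envelope`, `should_retry`, and the `parse_error` field are assumptions for illustration; only the `{status, headers, json, latency_ms}` field names come from the example above.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class HttpEnvelope:
    """Hypothetical envelope, mirroring {status, headers, json, latency_ms}."""
    status: int
    headers: Dict[str, str]
    json: Optional[Any]
    latency_ms: float
    parse_error: Optional[str] = None  # assumption: not named in the source text


def to_envelope(status: int, headers: Dict[str, str], body: bytes,
                latency_ms: float) -> HttpEnvelope:
    """Wrap a raw response so a bad body does not discard status or headers."""
    try:
        return HttpEnvelope(status, headers, json.loads(body), latency_ms)
    except ValueError as exc:  # covers JSONDecodeError and undecodable bytes
        return HttpEnvelope(status, headers, None, latency_ms, parse_error=str(exc))


def should_retry(env: HttpEnvelope) -> bool:
    """Decide from typed fields; retrying on 429 and 5xx is a common convention."""
    return env.status == 429 or 500 <= env.status < 600
```

With the pipeline version, a 502 with an HTML body surfaces only as a jq parse failure; here the same response still carries `status == 502`, so the retry decision needs no string matching.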
C

Run a process with arguments

rm -rf /tmp/build
⊘ Shell tool (bash -c): string templating into rm
bash -c "rm -rf /tmp/build"
# Quoting? Whitespace in paths?
# Injection if $path is user-derived?
# Whose problem? The agent's prompt.
# No structured exit code, only stdout/err
# strings the agent must re-parse.
Input: ~110 tok
Output: ~70 tok
Error recovery: ~160 tok
✓ KruxOS typed capability: argv array, no shell expansion
process.run({ command: "rm", args: ["-rf", "/tmp/build"] })
# Returns: {exit_code, stdout, stderr,
#           duration_ms, killed}
# Policy checks argv BEFORE fork: "rm -rf /"
# would be blocked without burning a single syscall.
Input: ~32 tok
Output: ~70 tok
Error recovery: ~40 tok
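The two ideas in this example, an argv array with no shell expansion and a deterministic policy check before anything is spawned, can be sketched with Python's standard `subprocess` module. This is an illustration under stated assumptions, not KruxOS code: `policy_denies`, `run`, `RunResult`, the `denied` field, and the `PROTECTED` set are hypothetical, and the rule is deliberately simplistic.

```python
import subprocess
import sys
import time
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical deny rule: recursive rm aimed at a protected root.
PROTECTED = {"/", "/etc", "/usr", "/home"}


def policy_denies(command: str, args: List[str]) -> Optional[str]:
    """Deterministic argv check; returns a reason string when blocked."""
    if command != "rm":
        return None
    short_flags = [a for a in args if a.startswith("-") and not a.startswith("--")]
    recursive = "--recursive" in args or any("r" in f.lower() for f in short_flags)
    targets = [a.rstrip("/") or "/" for a in args if not a.startswith("-")]
    hits = [t for t in targets if t in PROTECTED]
    if recursive and hits:
        return f"recursive rm on protected path {hits[0]}"
    return None


@dataclass
class RunResult:
    exit_code: Optional[int]
    stdout: str
    stderr: str
    duration_ms: float
    killed: bool
    denied: Optional[str] = None  # assumption: how a policy block is reported


def run(command: str, args: List[str], timeout_s: float = 10.0) -> RunResult:
    """Check policy first, then exec argv directly (no shell, no expansion)."""
    reason = policy_denies(command, args)
    if reason is not None:
        return RunResult(None, "", "", 0.0, False, denied=reason)
    start = time.monotonic()
    try:
        proc = subprocess.run([command, *args], capture_output=True,
                              text=True, timeout=timeout_s)
        code, out, err, killed = proc.returncode, proc.stdout, proc.stderr, False
    except subprocess.TimeoutExpired:
        code, out, err, killed = None, "", "", True
    return RunResult(code, out, err, (time.monotonic() - start) * 1000.0, killed)


if __name__ == "__main__":
    # The argument reaches the child verbatim; "; echo pwned" is never interpreted.
    print(run(sys.executable, ["-c", "import sys; print(sys.argv[1])", "a b; echo pwned"]))
```

Because each argument is passed as its own list element, whitespace and shell metacharacters arrive in the child process unchanged, and the policy function sees exactly the argv that would be executed.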
3.5× Avg input token reduction across all three examples.
1.8× Avg output token reduction. Typed responses don't carry framing.
4.5× Avg error-recovery token reduction. Structured errors don't need parsing.
0 LLM tokens spent on policy decisions. The gate runs deterministically.
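The averages can be recomputed from the per-example estimates in examples A to C. A small sketch follows, using only the token counts listed there; the function names are illustrative. Averaging the per-example ratios and pooling the totals give slightly different answers, so either may differ from a rounded headline figure.

```python
# Token estimates copied from examples A, B and C: (shell, typed).
ESTIMATES = {
    "input": [(140, 38), (180, 52), (110, 32)],
    "output": [(180, 80), (200, 95), (70, 70)],
    "error_recovery": [(220, 45), (280, 60), (160, 40)],
}


def mean_ratio(pairs):
    """Average of the per-example shell/typed ratios."""
    return sum(shell / typed for shell, typed in pairs) / len(pairs)


def pooled_ratio(pairs):
    """Total shell tokens divided by total typed tokens."""
    return sum(shell for shell, _ in pairs) / sum(typed for _, typed in pairs)


if __name__ == "__main__":
    for name, pairs in ESTIMATES.items():
        print(f"{name}: mean {mean_ratio(pairs):.2f}x, pooled {pooled_ratio(pairs):.2f}x")
```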
02 · Safety

Built-in safety vs. bolt-on.

Shell tools are powerful because they trust the caller. KruxOS capabilities are powerful because they don't have to.

Destructive operations

The classic "agent typed the wrong path" failure mode.

⊘ Shell
rm -rf /important/dir

Gone. No trash, no recovery, no audit. The agent doesn't know it broke anything until the next test fails.

✓ KruxOS
filesystem.delete({ path: "/important/dir" })

Soft-deletes to per-principal trash. 168h retention for User, 24h for Agent. Recover with filesystem.restore().
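A soft delete of this kind can be pictured as a move into a per-principal trash directory plus an expiry timestamp. The sketch below assumes an in-memory index and a caller-supplied clock; the `Trash` class and its method signatures are hypothetical, and only the 168h and 24h retention windows come from the text above.

```python
import os
import shutil
import time
import uuid
from typing import Dict, Optional, Tuple

# Retention windows from the text: 168h for User, 24h for Agent.
RETENTION_HOURS = {"User": 168, "Agent": 24}


class Trash:
    """Hypothetical per-principal trash: delete moves, restore moves back."""

    def __init__(self, root: str):
        self.root = root
        # entry id -> (original path, principal, expires_at)
        self.index: Dict[str, Tuple[str, str, float]] = {}

    def delete(self, path: str, principal: str, kind: str,
               now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        entry_id = uuid.uuid4().hex
        dest = os.path.join(self.root, principal, entry_id)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        shutil.move(path, dest)  # nothing is unlinked at this point
        self.index[entry_id] = (path, principal, now + RETENTION_HOURS[kind] * 3600)
        return entry_id

    def restore(self, entry_id: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        original, principal, expires_at = self.index[entry_id]
        if now >= expires_at:
            return False  # past retention; a separate purge job would reclaim it
        shutil.move(os.path.join(self.root, principal, entry_id), original)
        del self.index[entry_id]
        return True
```

The design choice worth noting is that the destructive step (purging) is separated in time from the agent's call, which is what gives the operator a recovery window.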

Bulk operations

Where one mistake becomes thousands.

⊘ Shell
cat ./contacts.csv | mailx -s "Update" all@

Fires immediately. No cancellation window. Hits the SMTP server before the operator has read the agent's plan.

✓ KruxOS
email.bulk_delete({ count: 1247 })

Falls under the approval_required tier by default. Alternatively, email.send to lists is write-buffered with a 30s cancellation window.
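A write buffer with a cancellation window can be sketched as a queue of pending operations, each with a release time. The Python below is a minimal illustration that takes the clock as a parameter so the behaviour is deterministic; `WriteBuffer`, `submit`, `cancel`, and `flush` are hypothetical names, and only the 30-second window comes from the text above.

```python
from typing import Callable, Dict, Tuple


class WriteBuffer:
    """Hypothetical buffer: writes are held for window_s before being sent."""

    def __init__(self, send: Callable[[dict], None], window_s: float = 30.0):
        self._send = send
        self._window_s = window_s
        self._pending: Dict[int, Tuple[float, dict]] = {}  # ticket -> (release_at, payload)
        self._next_ticket = 1

    def submit(self, payload: dict, now: float) -> int:
        """Queue a write and return a ticket the operator can cancel."""
        ticket = self._next_ticket
        self._next_ticket += 1
        self._pending[ticket] = (now + self._window_s, payload)
        return ticket

    def cancel(self, ticket: int) -> bool:
        """True if the write was still pending and has been dropped."""
        return self._pending.pop(ticket, None) is not None

    def flush(self, now: float) -> int:
        """Send every write whose window has elapsed; return how many were sent."""
        due = [t for t, (release_at, _) in self._pending.items() if release_at <= now]
        for ticket in due:
            _, payload = self._pending.pop(ticket)
            self._send(payload)
        return len(due)
```

A cancel that arrives after the flush returns `False`, which gives the operator an unambiguous answer about whether the message went out.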

Secret handling

The most common way agents leak.

⊘ Shell
export API_KEY=... && curl -H "Authorization: $API_KEY"

Secret lives in the agent's env, its shell history, its logs, and its next prompt. Any of those leak the lot.

✓ KruxOS
secrets.use({ secret_id: "api_token_v1" })

Vault-backed, AES-256-GCM, use-not-read. The raw value never crosses the agent boundary — the gateway substitutes it server-side.
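"Use-not-read" means the agent only ever handles a reference such as `${api_token}`, and the component holding the vault fills in the value on the way out. A simplified Python sketch of that substitution is below. The `Gateway` class, its methods, and the redaction step are assumptions for illustration, and encryption at rest (AES-256-GCM in the text above) is left out entirely.

```python
import re
from typing import Dict

# Matches references of the form ${name}, as in "Bearer ${api_token}".
_REF = re.compile(r"\$\{([A-Za-z0-9_]+)\}")


class Gateway:
    """Hypothetical gateway: it holds secret values, the agent holds names."""

    def __init__(self, vault: Dict[str, str]):
        self._vault = vault  # stand-in for an encrypted vault

    def resolve_headers(self, headers: Dict[str, str]) -> Dict[str, str]:
        """Gateway-side only: expand references just before the outbound call."""
        return {
            key: _REF.sub(lambda m: self._vault[m.group(1)], value)
            for key, value in headers.items()
        }

    def redact(self, text: str) -> str:
        """Scrub secret values from anything returned to the agent."""
        for value in self._vault.values():
            text = text.replace(value, "[REDACTED]")
        return text


if __name__ == "__main__":
    gateway = Gateway({"api_token": "example-value"})
    agent_headers = {"Authorization": "Bearer ${api_token}"}  # all the agent ever sees
    outbound = gateway.resolve_headers(agent_headers)          # never returned to the agent
    print(gateway.redact(f"upstream echoed: {outbound['Authorization']}"))
```

Since the agent's prompt, logs, and history contain only the reference, a leak of any of them exposes a name rather than a credential.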

03 · Operator visibility

What the supervisor actually sees.

Audit isn't a feature you bolt on after the agent runs amok. It's the substrate.

⊘ Shell-tool agent

Whatever the agent decided to log.

  • No visibility into what the agent did unless logging was instrumented up-front.
  • Stdout/stderr only — if the agent suppressed output, it's gone.
  • Tampering surface = anything the agent could touch (which is everything).
  • No replay. No diff. No timeline.
✓ KruxOS

Every call. Tamper-evident.

  • Every capability call hash-chained into the audit log — CBOR, append-only.
  • Per-principal actor: which agent, which user, which session, which policy tier.
  • Bounded ring-buffer in memory + disk-full retry. Daily rotation, 90-day retention.
  • SIEM streaming on Enterprise. Replay tool ships v0.0.2.
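Hash chaining is what makes an append-only log tamper-evident: each entry's hash covers the previous entry's hash, so editing any record breaks every link after it. Here is a minimal Python sketch of the technique. It uses canonical JSON and SHA-256 as stand-ins (the text above specifies CBOR and does not name the hash function), and `AuditLog` and the record fields are hypothetical.

```python
import hashlib
import json
from typing import List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, record: dict) -> str:
    """Hash the previous link together with a canonical encoding of the record."""
    body = json.dumps(record, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(prev_hash.encode() + body).hexdigest()


class AuditLog:
    """Hypothetical append-only log with a SHA-256 hash chain."""

    def __init__(self) -> None:
        self.entries: List[dict] = []

    def append(self, record: dict) -> str:
        prev = self.entries[-1]["hash"] if self.entries else GENESIS
        entry_hash = _digest(prev, record)
        self.entries.append({"prev": prev, "record": record, "hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        """Recompute every link; any edited or reordered entry fails the check."""
        prev = GENESIS
        for entry in self.entries:
            if entry["prev"] != prev or entry["hash"] != _digest(prev, entry["record"]):
                return False
            prev = entry["hash"]
        return True
```

Verification needs only the log itself, so an operator or an external SIEM can check integrity without trusting the agent that produced the calls.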
04 · When not to use KruxOS

When a shell is still the right answer.

A policy gate is overhead. Sometimes the overhead isn't worth it — and we'd rather you know than learn the hard way.

01

One-off human-driven scripts.

You're not running make or cargo build through a typed capability. The cost of writing a policy for a thing you'll never audit is higher than the risk it mitigates.

02

Hot-path sysadmin tooling.

Some bespoke shell pipelines have been tuned for years. A typed capability is wider, friendlier, and slower. If your tooling outperforms what we'd ship — keep it.

03

Anything humans do directly.

Use KruxOS for what an AI is doing. The supervision surface, the audit surface, the policy surface — they all assume the actor is an agent. If the actor is you, you don't need them.

Sized right, KruxOS doesn't replace your shell. It replaces the shell your agent would have used — and gives you the supervisor's chair.
