Swiftbeard

CLI-Anything: Making Software Agent-Native

CLI-Anything wraps software in a command-line interface that agents can use. A look at what agent-native software means and why the CLI is a natural interface for AI.

Tags: cli, agents, developer-tools, automation

Software wasn't designed for AI agents. It was designed for humans — with GUIs, menus, confirmation dialogs, and visual feedback. Agents can't click menus. They need text-based, scriptable interfaces with predictable input/output contracts.

CLI-Anything is a tool that wraps arbitrary software in a CLI that agents can actually use. It's a hack, but a revealing one.

The Agent Interface Problem

Here's the problem in concrete terms. Say you want an AI agent to file a bug report in Jira. Options:

  1. Use the Jira API — Works, but you need to build a tool wrapper and handle auth
  2. Use Playwright/browser automation — Brittle, breaks when the UI updates
  3. Use the Jira CLI — Great, if Jira had a good CLI (it doesn't)
  4. Use CLI-Anything to wrap Jira — Creates a scriptable interface from the web UI

CLI-Anything sits in the middle: it wraps browser-based software in a CLI by combining browser automation with a defined schema for what operations are available.

# CLI-Anything wraps a web app
cli-anything define jira \
  --url "https://yourorg.atlassian.net" \
  --operations create-issue,list-issues,update-issue

# Now agents can call it
cli-anything run jira create-issue \
  --project "BACKEND" \
  --type "Bug" \
  --title "Payment webhook fails on retry"

Under the hood, CLI-Anything uses a browser automation layer to actually click through the UI. The CLI is the stable interface; the browser automation is the implementation.
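The split between a stable CLI contract and a replaceable implementation can be sketched roughly as follows. This is not CLI-Anything's actual code: the `Backend` protocol, `FakeBrowserBackend`, and `run_operation` are hypothetical names, and the fake backend stands in for real browser automation.

```python
import json
from typing import Protocol


class Backend(Protocol):
    """Anything that can carry out a named operation (hypothetical interface)."""

    def execute(self, operation: str, params: dict[str, str]) -> dict: ...


class FakeBrowserBackend:
    """Stand-in for a browser-automation layer; it clicks nothing."""

    def __init__(self) -> None:
        self.issues: list[dict] = []

    def execute(self, operation: str, params: dict[str, str]) -> dict:
        if operation == "create-issue":
            issue = {"id": f"{params['project']}-{len(self.issues) + 1}", **params}
            self.issues.append(issue)
            return issue
        if operation == "list-issues":
            return {"issues": list(self.issues)}
        raise ValueError(f"unsupported operation: {operation}")


def run_operation(backend: Backend, allowed: set[str], operation: str,
                  params: dict[str, str]) -> str:
    """The stable contract: operation name and params in, JSON text out."""
    if operation not in allowed:
        return json.dumps({"error": "UNKNOWN_OPERATION", "operation": operation})
    return json.dumps(backend.execute(operation, params))
```

Swapping `FakeBrowserBackend` for an API client or a real browser driver would leave `run_operation`, and therefore everything the agent sees, unchanged.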

Why CLI Is the Natural Agent Interface

This isn't arbitrary. The CLI has properties that make it ideal for agent use:

Text in, text out: Agents operate in tokens. CLI tools return plain text or structured output (JSON, CSV). No parsing GUI state, no interpreting visual layouts.
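From the agent's side, "text in, text out" can be as small as the sketch below. The `FAKE_TOOL` command is a stand-in (a Python one-liner that prints JSON) so the example runs anywhere; a real agent would call an actual CLI instead.

```python
import json
import subprocess
import sys

# Stand-in for a real CLI: a tiny Python one-liner that prints JSON.
FAKE_TOOL = [
    sys.executable, "-c",
    'import json; print(json.dumps({"issues": [{"id": "BACK-1", "type": "Bug"}]}))',
]


def call_tool(argv: list[str]) -> dict:
    """Run a command, raise on a non-zero exit, and parse stdout as JSON."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


issues = call_tool(FAKE_TOOL)["issues"]
```

There is no screen state to interpret here: the exit code signals success or failure, and stdout carries the data.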

Composability: Unix pipe philosophy, where small operations chain together. An agent can run list-issues | filter | create-summary as naturally as a shell script does.

Deterministic: Given the same command and the same underlying state, you get the same behavior. A CLI has no layout changes, no A/B tests, and no modal dialogs that appear only sometimes.

Inspectable: The command history is a log. You can audit what an agent did exactly.
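A minimal sketch of what that audit trail could look like, assuming the agent routes every command through one wrapper. `AuditedRunner` is a hypothetical helper, not part of any tool mentioned here.

```python
import shlex
import subprocess
import sys
from dataclasses import dataclass, field


@dataclass
class AuditedRunner:
    """Runs commands and keeps a record of each one (hypothetical helper)."""

    log: list[dict] = field(default_factory=list)

    def run(self, argv: list[str]) -> str:
        result = subprocess.run(argv, capture_output=True, text=True)
        self.log.append({
            "command": shlex.join(argv),  # copy-pasteable shell form
            "exit_code": result.returncode,
        })
        return result.stdout


runner = AuditedRunner()
runner.run([sys.executable, "-c", "print('ok')"])
```

Each log entry is a command a human can paste back into a shell to reproduce what the agent did.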

Greppable: Output is searchable. Agents can parse, filter, and transform CLI output naturally.

What Agent-Native Software Looks Like

CLI-Anything is patching software that wasn't designed for agents. But the more interesting question is: what does software look like when it's designed for agents from the start?

A few patterns that make software agent-native:

Structured output by default: Return JSON when called non-interactively. Think of a hypothetical ls --json in place of column-formatted text.
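One way to get JSON by default for non-interactive callers is to check whether stdout is a terminal. The sketch below assumes that convention; the `render` function and its `interactive` override are illustrative names.

```python
import json
import sys
from typing import Optional


def render(rows: list[dict], interactive: Optional[bool] = None) -> str:
    """Return aligned columns for a terminal, JSON for pipes and agents."""
    if interactive is None:
        # No explicit choice: a pipe or an agent's subprocess is not a TTY.
        interactive = sys.stdout.isatty()
    if not interactive:
        return json.dumps(rows)
    if not rows:
        return ""
    headers = list(rows[0])
    widths = {h: max(len(h), *(len(str(r[h])) for r in rows)) for h in headers}
    lines = ["  ".join(h.upper().ljust(widths[h]) for h in headers)]
    lines += ["  ".join(str(r[h]).ljust(widths[h]) for h in headers) for r in rows]
    return "\n".join(lines)
```

The human still gets a readable table at a terminal, while an agent capturing stdout gets JSON without passing any flag.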

Machine-readable errors: Error messages with error codes, not just prose strings. {"error": "NOT_FOUND", "resource": "issue", "id": "BACK-123"} not "Error: could not find that issue."
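A sketch of how a tool might produce that error shape and how an agent might branch on it. `ToolError`, `find_issue`, and `handle` are hypothetical names; only the JSON shape comes from the example above.

```python
import json


class ToolError(Exception):
    """Error carrying a stable code plus structured details."""

    def __init__(self, code: str, **details: str) -> None:
        super().__init__(code)
        self.code = code
        self.details = details

    def to_json(self) -> str:
        return json.dumps({"error": self.code, **self.details})


def find_issue(issues: dict[str, dict], issue_id: str) -> dict:
    try:
        return issues[issue_id]
    except KeyError:
        raise ToolError("NOT_FOUND", resource="issue", id=issue_id) from None


def handle(err_json: str) -> str:
    """Agent side: branch on the code, not on the wording."""
    payload = json.loads(err_json)
    if payload["error"] == "NOT_FOUND":
        return f"skip {payload['id']}"
    return "abort"
```

Because the agent matches on `NOT_FOUND` rather than on prose, the tool's authors can reword the human-facing message without breaking callers.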

Operation schemas: Document available operations in a machine-readable format (OpenAPI, JSON Schema) so agents can discover capabilities without reading docs.
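As an illustration, here is a JSON Schema-style description of a create-issue operation, with a deliberately tiny checker. The schema layout and the allowed issue types are assumptions for this sketch, and `check_args` covers only a small subset of what a real JSON Schema validator does.

```python
# JSON Schema-style description of one operation (shape is an assumption).
CREATE_ISSUE_SCHEMA = {
    "name": "create-issue",
    "parameters": {
        "type": "object",
        "properties": {
            "project": {"type": "string"},
            "type": {"type": "string", "enum": ["Bug", "Task", "Story"]},
            "title": {"type": "string"},
        },
        "required": ["project", "type", "title"],
    },
}


def check_args(schema: dict, args: dict) -> list[str]:
    """Report missing keys, unknown keys, and values outside an enum."""
    params = schema["parameters"]
    problems = [f"missing: {key}" for key in params["required"] if key not in args]
    for key, value in args.items():
        spec = params["properties"].get(key)
        if spec is None:
            problems.append(f"unknown: {key}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"invalid {key}: {value}")
    return problems
```

An agent can read the same schema to learn which operations exist and to catch a bad call before it ever reaches the tool.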

No confirmation prompts by default: Make them opt-in with an --interactive flag. An agent running a command non-interactively has no reliable way to answer "Are you sure? [Y/n]".
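A sketch of the opt-in version using Python's argparse. The `delete-issue` tool is hypothetical and nothing is actually deleted; the `ask` parameter exists so the prompt can be replaced in tests.

```python
import argparse
from typing import Callable


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="delete-issue")  # hypothetical tool
    parser.add_argument("issue_id")
    parser.add_argument("--interactive", action="store_true",
                        help="ask before deleting (off by default)")
    return parser


def delete_issue(argv: list[str], ask: Callable[[str], str] = input) -> str:
    args = build_parser().parse_args(argv)
    if args.interactive:
        # Only humans who asked for a prompt ever see one.
        answer = ask(f"Delete {args.issue_id}? [y/N] ")
        if answer.strip().lower() != "y":
            return "cancelled"
    return f"deleted {args.issue_id}"
```

Without the flag the command never blocks on stdin, so an agent's call cannot hang waiting for an answer.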

Idempotency: Agents retry. An operation that runs twice should leave the system in the same state as running it once; a retried create-issue, for example, should not file a duplicate.
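One common way to make a create operation safe to retry is an idempotency key supplied by the caller. The sketch below assumes that approach; `IssueStore` is an in-memory stand-in, not a real tracker client.

```python
class IssueStore:
    """In-memory stand-in for an issue tracker (hypothetical)."""

    def __init__(self) -> None:
        self.issues: list[dict] = []
        self._by_key: dict[str, dict] = {}

    def create_issue(self, idempotency_key: str, title: str) -> dict:
        # A retry with the same key returns the original issue
        # instead of creating a duplicate.
        if idempotency_key in self._by_key:
            return self._by_key[idempotency_key]
        issue = {"id": f"BACK-{len(self.issues) + 1}", "title": title}
        self.issues.append(issue)
        self._by_key[idempotency_key] = issue
        return issue
```

If the agent times out and calls again with the same key, it gets the first issue back rather than filing a second one.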

The Bigger Point

CLI-Anything is useful today for bridging the gap between existing software and AI agents. But the long-term shift is more fundamental: software designed in an agent-first world will have programmable interfaces as the primary interface, with human-friendly UIs layered on top — not the other way around.

The API-first movement in the 2010s was "design the API first, the UI is secondary." The agent-native equivalent is "design the CLI and tool schema first, the GUI is secondary."

We're in the early days of this, but the direction is clear.