Cost Efficiency

As your project grows, so does your knowledge base. Without smart context selection, you'd dump everything into every prompt — burning tokens on irrelevant information. Engrams solves this with intelligent, relevance-based selection that keeps costs down while keeping your AI informed.

The "$1 per MB" Problem

The concern is real: at typical LLM pricing, every megabyte of context you send costs on the order of a dollar per request.

The Math

  • Input tokens: ~$0.003 per 1K tokens (varies by model)
  • 1 MB of text: ~200K tokens
  • Cost per MB: ~$0.60 per request
  • Daily cost: 10 requests × $0.60 = $6/day = $180/month
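The arithmetic above can be sketched in a few lines, using the assumed rates from the list (real pricing varies by model):

```python
# Assumed rates from the list above: $0.003 per 1K input tokens,
# ~200K tokens per MB of text. Real pricing varies by model.
PRICE_PER_1K_TOKENS = 0.003
TOKENS_PER_MB = 200_000

def cost_per_request(context_mb: float) -> float:
    """Input-token cost of sending `context_mb` of context in one request."""
    return context_mb * TOKENS_PER_MB / 1000 * PRICE_PER_1K_TOKENS

per_request = cost_per_request(1.0)   # 1 MB of context per request
daily = 10 * per_request              # 10 requests per day
monthly = 30 * daily

print(f"${per_request:.2f}/request, ${daily:.2f}/day, ${monthly:.2f}/month")
# → $0.60/request, $6.00/day, $180.00/month
```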

If you're working on a large project with 50+ decisions, 20+ patterns, and extensive glossaries, dumping all of it into every prompt gets expensive fast.

The Traditional Approach (Expensive)

You: "Add a new API endpoint"

AI: [Loads entire Engrams database]
    - 50 decisions (all of them)
    - 20 patterns (all of them)
    - 100+ glossary terms (all of them)
    - Full knowledge graph (all relationships)

    Total: ~500KB of context
    Cost: ~$0.30 per request

    But only 5-10 items are actually relevant to the task!

The Smart Approach (Efficient)

You: "Add a new API endpoint"

AI: [Engrams relevance selection activates]
    1. Semantic search: Find items related to "API endpoint"
    2. Relevance scoring: Rank by relevance to the task
    3. Load only what matters:
       - Decision #7: JWT authentication
       - Decision #14: Rate limiting
       - Pattern #3: Error handling
       - Pattern #5: API response format
       - 2-3 glossary terms

    Total: ~50KB of context
    Cost: ~$0.03 per request

    90% cost reduction, same quality output!
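The selection flow above can be illustrated as a toy ranking loop. Engrams uses vector embeddings for this; the sketch below substitutes a simple bag-of-words cosine similarity so it stays self-contained, and the items are made up for the example:

```python
# Toy version of relevance-based selection: rank knowledge items against the
# task, keep only the top few. Engrams uses vector embeddings; this sketch
# substitutes bag-of-words cosine similarity to stay self-contained.
import math
from collections import Counter

def bag(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) \
         * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

items = [
    "Decision 7: use JWT authentication for API endpoints",
    "Decision 14: rate limiting on public API endpoints",
    "Decision 22: use PostgreSQL for the primary database",
    "Pattern 5: standard API response format",
]
task = "Add a new API endpoint"

ranked = sorted(items, key=lambda it: cosine(bag(task), bag(it)), reverse=True)
top = ranked[:3]   # load only the most relevant items, not the whole database
```

The unrelated database decision lands at the bottom of the ranking and is never sent, which is exactly where the cost savings come from.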

Scoring & Selection Algorithm

Engrams uses a multi-factor scoring system to rank items by relevance:

Relevance Factors

  • Semantic Similarity (40%): Vector embedding similarity to the current task. "API endpoint" matches decisions about REST, HTTP, and routing.
  • Tag Matching (25%): Exact tag matches. If you're working on "authentication", items tagged "auth" score higher.
  • Code Bindings (20%): Items bound to files you're editing. If you open src/auth/middleware.py, auth-related items surface automatically.
  • Recency (10%): Recently modified items score slightly higher (you're probably still thinking about them).
  • Governance Scope (5%): Team-level decisions are always included; individual decisions only if relevant.
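A weighted sum over these factors might look like the following sketch. The weights come from the table above; the `relevance_score` helper and its inputs are hypothetical, not Engrams' actual API:

```python
# Weighted-sum sketch of the scoring table above. The weights are from the
# table; `relevance_score` and its inputs are hypothetical, not Engrams' API.
WEIGHTS = {
    "semantic": 0.40,    # vector similarity to the task
    "tags": 0.25,        # exact tag matches
    "bindings": 0.20,    # bound to files being edited
    "recency": 0.10,     # recently modified
    "governance": 0.05,  # team-level scope
}

def relevance_score(factors: dict[str, float]) -> float:
    """Each factor is normalized to [0, 1]; missing factors count as 0."""
    return sum(WEIGHTS[name] * factors.get(name, 0.0) for name in WEIGHTS)

# An item with a strong semantic match, an exact tag match, and recent edits:
score = relevance_score({"semantic": 0.9, "tags": 1.0, "recency": 0.8})
# 0.40*0.9 + 0.25*1.0 + 0.10*0.8 = 0.69
```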

Real-World Cost Comparison

Let's compare costs for a realistic project over one month:

Scenario: Building a Task Management API

  • 50 architectural decisions
  • 20 system patterns
  • 100+ glossary terms
  • 30 active tasks
  • 10 development sessions per day
  • 5 requests per session (50 requests/day)

Cost Without Smart Selection

Approach: Dump entire database into every prompt

Per request (even with compact, summarized items):
  - 50 decisions: ~2,000 tokens
  - 20 patterns: ~1,500 tokens
  - 100+ glossary terms: ~1,000 tokens
  - Knowledge graph: ~500 tokens
  - Total: ~5,000 tokens per request
  - Cost per request: 5,000 × $0.003/1K = $0.015

Daily cost: 50 requests × $0.015 = $0.75
Monthly cost: $0.75 × 30 = $22.50

Cost With Engrams Smart Selection

Approach: Relevance-ranked context selection

Per request:
  - Semantic search finds relevant items
  - Scoring ranks by relevance
  - Only top items included
  - Average context: ~2,000 tokens
  - Cost per request: 2,000 × $0.003/1K = $0.006

Daily cost: 50 requests × $0.006 = $0.30
Monthly cost: $0.30 × 30 = $9.00

Savings: $22.50 - $9.00 = $13.50/month (60% reduction)
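The comparison works out as follows, using the same assumed rate of $0.003 per 1K input tokens:

```python
# The monthly comparison above, reproduced. Assumptions: $0.003 per 1K input
# tokens, 50 requests/day, 30 days/month.
PRICE_PER_TOKEN = 0.003 / 1000
REQUESTS_PER_DAY, DAYS = 50, 30

def monthly_cost(tokens_per_request: int) -> float:
    return tokens_per_request * PRICE_PER_TOKEN * REQUESTS_PER_DAY * DAYS

dump_all = monthly_cost(5_000)   # entire database in every prompt: $22.50
smart = monthly_cost(2_000)      # relevance-ranked selection: $9.00
reduction = (dump_all - smart) / dump_all   # ≈ 0.60, i.e. a 60% reduction
```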

Scaling Benefits

The larger your project, the bigger the savings:

  • Small (10 decisions): $5/month without → $4/month with smart selection (20% savings)
  • Medium (50 decisions): $22.50/month without → $9/month with smart selection (60% savings)
  • Large (200+ decisions): $90/month without → $15/month with smart selection (83% savings)
  • Enterprise (500+ decisions): $225/month without → $25/month with smart selection (89% savings)

Key insight: As your project grows, smart selection saves more money because irrelevant items are filtered out.

Optimization Strategies

Here are practical ways to get the most relevant context at the lowest cost:

1. Use Code Bindings

Bind decisions and patterns to specific code paths. When you edit a file, only relevant context loads:

engrams bind --decision 7 --pattern "src/auth/**/*.py"

Now when you edit src/auth/middleware.py:

  • Decision #7 (JWT auth) loads automatically
  • Unrelated decisions (database, caching) don't load
  • Context is smaller, and cost is lower
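A binding lookup could be approximated like this. `BINDINGS` and `items_for` are illustrative names, not Engrams' actual API; note that `fnmatch`'s `*` also crosses `/`, so these patterns behave roughly like the recursive glob in the command above:

```python
# Sketch of a binding lookup; `BINDINGS` and `items_for` are illustrative
# names, not Engrams' actual API. fnmatch's "*" also crosses "/", so these
# patterns behave roughly like recursive globs.
from fnmatch import fnmatch

BINDINGS = {
    "src/auth/*.py": ["decision-7"],    # JWT authentication
    "src/db/*.py": ["decision-22"],     # database choice
}

def items_for(path: str) -> list[str]:
    """Knowledge items whose binding pattern matches the edited file."""
    return [item
            for pattern, items in BINDINGS.items()
            if fnmatch(path, pattern)
            for item in items]

items_for("src/auth/middleware.py")   # → ["decision-7"]
```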

2. Tag Strategically

Use consistent tags so tag matching and search work better:

Decision: "Use PostgreSQL for primary database"
Tags: ["database", "architecture", "persistence"]

Decision: "Use Redis for caching"
Tags: ["caching", "performance", "infrastructure"]

Now searching for "persistence" finds the PostgreSQL decision.
Searching for "performance" finds the Redis decision.
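Exact-tag lookup over the two decisions above is simple to sketch (illustrative data and function names, not a real Engrams interface):

```python
# Exact-tag lookup over the two example decisions (illustrative data and
# function names, not a real Engrams interface).
decisions = {
    "Use PostgreSQL for primary database": {"database", "architecture", "persistence"},
    "Use Redis for caching": {"caching", "performance", "infrastructure"},
}

def find_by_tag(query_tags: set[str]) -> list[str]:
    """Decisions sharing at least one tag with the query, most overlap first."""
    hits = [(len(tags & query_tags), title)
            for title, tags in decisions.items() if tags & query_tags]
    return [title for _, title in sorted(hits, reverse=True)]

find_by_tag({"performance"})   # → ["Use Redis for caching"]
find_by_tag({"persistence"})   # → ["Use PostgreSQL for primary database"]
```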

3. Use Glossaries Efficiently

Glossary terms are cheap (low token count) but valuable. Use them for:

  • Domain-specific terminology
  • API schemas and data structures
  • Common abbreviations and acronyms
  • Team conventions

Comparison: Engrams vs. Alternatives

  • Manual Copy-Paste: no setup; ~$22.50/month (50 decisions); poor scalability (gets worse as the project grows); high flexibility (you control what's included)
  • Dump Everything: simple setup; ~$22.50/month (50 decisions); poor scalability (costs grow with the project); low flexibility (all or nothing)
  • Engrams Smart Selection: ~5 minutes of setup; ~$9.00/month (50 decisions); excellent scalability (costs stay flat); high flexibility (semantic search + bindings + tags)
  • Custom MCP Server: days or weeks of setup; ~$9.00/month (if you build it); scalability depends on implementation; very high flexibility (but requires coding)

FAQ: Cost Questions

Q: Does smart context selection reduce the quality of AI responses?

A: No. The scoring algorithm prioritizes relevance, so you get the most important context. In fact, less noise often leads to better responses because the AI isn't distracted by irrelevant information.

Q: Can I see what context was actually loaded for a request?

A: Yes. Engrams logs which items were selected and why. You can review this in the dashboard or export the logs.

Q: Does semantic search cost extra?

A: No. Embeddings are generated locally (using Ollama) and cached. There's no per-request cost.

Q: How is Engrams different from a RAG system?

A: Engrams is a project memory and governance platform, not a document retrieval pipeline. While it uses semantic search under the hood, what it actually provides is structured, linked project knowledge — decisions, patterns, progress, governance rules — that grows with your team and enforces standards. General RAG pipelines don't have governance, bindings, or the MCP-native tooling that makes Engrams work seamlessly inside your AI coding assistant.

Summary

Smart context selection is Engrams' answer to the "$1 per MB" problem:

  • Smart selection: Only relevant items are included
  • Cost reduction: 60-89% savings depending on project size
  • Scaling benefits: Larger projects save more money
  • No quality loss: Better focus, better responses
  • Flexible optimization: Code bindings, strategic tagging, and lean glossaries

With Engrams, your AI stays informed without burning tokens on irrelevant context.