Trace every AI cost to the feature, customer, or workflow behind it.
Every LLM call mapped to the customer it served.
Every customer mapped to the revenue they pay.
AI gross margin per customer — finally.
No credit card · open-source SDK · 5-minute install
Your AI bill went from $300 to $4,000 in six months.
You can't tell your CFO which customers earned it back and which are bleeding you dry. Aggregate spend hides the leak.
Your CFO asks: which customers are profitable on AI?
Your dashboard says "OpenAI: $4,012/mo." That's not an answer. You need cost per customer, joined to the revenue they pay.
Your bill jumps 40% and no one can explain why
Was it a new feature, a chatty customer, a retry loop, or a model you forgot to downgrade? A single monthly total can't tell you which.
Engineering says "swap to gpt-4o-mini" — but is it safe?
You need to know quality holds before you flip the switch in prod. "Trust me" is not a CFO-grade answer.
How it works
Three steps from npm install to your first dashboard.
Install
Add llm-cost-meter to your Node, Next.js, or worker project. Zero config to get started.
npm install llm-cost-meter
Configure
Drop in your workspace API key and pass a tenantId per call. The CloudAdapter streams events off-thread.
await track(
  { feature: "chat", tenantId: customer.id },
  () => openai.chat.completions.create({ ... })
);
See & act
Slice spend by feature, customer, and run. Apply validated swaps. Catch leaks before the invoice.
→ visit /dashboard
What's in the box
Four capabilities, one cohesive product. None of them are gateways.
Per-customer AI margin
Every LLM call tagged with a tenantId, joined to the revenue that customer pays. See who's profitable on AI and who's a subsidy in disguise.
Agent & workflow economics
Cost per run, cost per tool step, retry waste. Built for multi-step agents — not just per-call dashboards.
Cost-validated savings
We replay sampled events on cheaper candidate models, score quality with a judge, and only recommend swaps where quality holds. Then we measure whether you actually applied them.
Budgets & advisory enforcement
Per-feature, per-customer, per-workspace budgets with Slack/email alerts. SDK-side downgrade or block before the call fires — no gateway in your request path.
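To make "advisory" concrete, here is a minimal sketch of an SDK-side check that runs before the call fires. It is an illustration only, not the llm-cost-meter API: the `Budget` shape, the `advise` function, the 80% downgrade threshold, and the `gpt-4o` default model are all assumptions.

```typescript
// Hypothetical types and names for illustration; not the llm-cost-meter API.
type Advice = "allow" | "downgrade" | "block";

interface Budget {
  limitUsd: number;            // budget for one feature, customer, or workspace
  downgradeAtFraction: number; // assumed knob: 0.8 means "suggest a cheaper model at 80%"
}

// Decide what to advise before the LLM call is made.
function advise(spentUsd: number, estimatedCallUsd: number, budget: Budget): Advice {
  const projected = spentUsd + estimatedCallUsd;
  if (projected > budget.limitUsd) return "block";
  if (projected >= budget.limitUsd * budget.downgradeAtFraction) return "downgrade";
  return "allow";
}

// The calling code chooses whether to honor the advice.
const budget: Budget = { limitUsd: 100, downgradeAtFraction: 0.8 };
const advice = advise(79.5, 1, budget);
const model = advice === "downgrade" ? "gpt-4o-mini" : "gpt-4o";
if (advice !== "block") {
  console.log(`advice=${advice}, calling with model=${model}`);
}
```

Because the decision is made in your own process, nothing sits between your code and the model provider; ignoring the advice is a one-line change.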
We're deliberate about non-goals. See what trAIce doesn't do →
Built for
Three people, three different jobs, same dashboard.
Find your unprofitable customer in an hour.
Map tenantId → Stripe MRR. See AI gross margin per customer. Stop subsidizing the heavy user paying entry-tier pricing.
Cost per run, not just cost per call.
Multi-step agents need workflow-level cost. Top runs by spend, retry waste, cost by tool. Built for LangGraph-shaped systems.
AI gross margin a CFO can sign off on.
Imported revenue per customer joined to LLM cost. A blended margin number, top margin leaks ranked, exportable for the board deck.
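The join behind that margin number can be sketched in a few lines. This is a simplified illustration, assuming cost events carry a `tenantId` and a dollar cost, and revenue is imported as one MRR row per tenant; the type and function names are hypothetical, not trAIce's actual schema.

```typescript
// Hypothetical shapes for illustration; not trAIce's actual schema.
interface CostEvent { tenantId: string; costUsd: number; }
interface RevenueRow { tenantId: string; mrrUsd: number; }
interface MarginRow {
  tenantId: string;
  mrrUsd: number;
  aiCostUsd: number;
  marginUsd: number;
  marginPct: number | null; // null when the tenant pays nothing
}

function marginByCustomer(events: CostEvent[], revenue: RevenueRow[]): MarginRow[] {
  // Sum LLM cost per tenant.
  const cost = new Map<string, number>();
  for (const e of events) {
    cost.set(e.tenantId, (cost.get(e.tenantId) ?? 0) + e.costUsd);
  }
  // Index revenue per tenant.
  const mrr = new Map<string, number>();
  for (const r of revenue) mrr.set(r.tenantId, r.mrrUsd);

  // Include tenants that appear on either side, so free riders show up too.
  const tenantIds = Array.from(
    new Set(Array.from(cost.keys()).concat(Array.from(mrr.keys())))
  );
  const rows = tenantIds.map((tenantId): MarginRow => {
    const mrrUsd = mrr.get(tenantId) ?? 0;
    const aiCostUsd = cost.get(tenantId) ?? 0;
    const marginUsd = mrrUsd - aiCostUsd;
    return {
      tenantId, mrrUsd, aiCostUsd, marginUsd,
      marginPct: mrrUsd > 0 ? marginUsd / mrrUsd : null,
    };
  });
  // Worst margin first: the top of the list is the biggest leak.
  return rows.sort((a, b) => a.marginUsd - b.marginUsd);
}

const rows = marginByCustomer(
  [
    { tenantId: "acme", costUsd: 12 },
    { tenantId: "acme", costUsd: 30 },
    { tenantId: "globex", costUsd: 5 },
  ],
  [
    { tenantId: "acme", mrrUsd: 29 },
    { tenantId: "globex", mrrUsd: 99 },
  ]
);
console.log(rows);
```

In this made-up data, "acme" pays $29 and costs $42 in LLM calls, so it sorts to the top as the customer being subsidized.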
trAIce is the only tool that ties AI cost to customer revenue.
Gateways route. Evals score. Observability traces. None of them tell you which customer is unprofitable on AI. We do — and we deliberately don't do their jobs.
| What you want | trAIce | AI gateway (Portkey, LiteLLM) | Evals (Braintrust, LangSmith) | Observability (Datadog, Helicone) |
|---|---|---|---|---|
| AI cost per customer / tenant | ✓ | — | — | Partial |
| AI gross margin (cost vs revenue) | ✓ | — | — | — |
| Cost per agent run / tool step | ✓ | Partial | — | Partial |
| Cost-validated model swap recommendations | ✓ | — | Partial (quality only) | — |
| Budgets, alerts, advisory enforcement | ✓ | ✓ (gateway side) | — | — |
| In-request gateway / routing | ✗ (non-goal) | ✓ | — | — |
| Prompts playground / versioning | ✗ (non-goal) | — | ✓ | — |
| Full distributed tracing of LLM apps | ✗ (non-goal) | — | — | ✓ |
Full list: /what-we-dont-do.
Frequently asked
Why not just use Datadog / Helicone / what I already pay for?
They tell you AI cost; we tell you AI margin. Observability tools show spend by model, route, and latency. None of them join your LLM calls to the customer who pays you. Margin is the answer your CFO actually wants.
Do you sit in my request path?
No. The SDK fires off-thread events to trAIce after your LLM call. Pre-call advisory enforcement is opt-in and SDK-side — your code chooses whether to honor it.
What do you do with my prompts?
By default, prompt text never leaves your infrastructure. The optional Cost-Validated Savings feature stores sampled prompt+output for replay — explicit per-workspace opt-in, 14-day TTL, deleted on opt-out.
How does this compare to Braintrust / LangSmith?
They're evals platforms; we're a cost + margin product. We adopt the narrowest possible eval primitives only to power cost-validated savings — no datasets, no playground, no autoraters library. See /what-we-dont-do.
Is there a Python SDK?
Node-only today. Python is the next priority because half the AI-engineering audience needs it. In the meantime, the cloud accepts events over plain HTTP, so curl or Python's requests library works fine.
How accurate are the savings recommendations?
We require ≥30 sampled events at ≥90% equivalence before showing a recommendation, and we label confidence (green / yellow). We also measure adoption: when you apply a swap, we auto-detect whether your event stream actually moved, and revert the status if it flips back.
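As a rough sketch of that gate, assuming each replayed event yields a single equivalent / not-equivalent verdict from the judge (the `ReplayResult` type and `shouldRecommend` function are illustrative names, not trAIce internals, and the green / yellow labeling is left out because its thresholds are not stated here):

```typescript
// Hypothetical shape: one judge verdict per replayed event.
interface ReplayResult { equivalent: boolean; }

const MIN_SAMPLES = 30;      // from the FAQ: at least 30 sampled events
const MIN_EQUIVALENCE = 0.9; // from the FAQ: at least 90% judged equivalent

function shouldRecommend(results: ReplayResult[]): boolean {
  // Too few samples: no recommendation, however good they look.
  if (results.length < MIN_SAMPLES) return false;
  const equivalentCount = results.filter((r) => r.equivalent).length;
  return equivalentCount / results.length >= MIN_EQUIVALENCE;
}

// Helper for building example inputs: `ok` equivalent verdicts out of `total`.
function sample(total: number, ok: number): ReplayResult[] {
  return Array.from({ length: total }, (_, i) => ({ equivalent: i < ok }));
}

console.log(shouldRecommend(sample(29, 29))); // too few samples
console.log(shouldRecommend(sample(30, 27))); // exactly 90% of 30
```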
What's free?
1,000 events/month forever, no credit card. Starter at $9/mo for 10K events. Pro at $29/mo for 100K. Team at $99/mo for 1M.
Can I export my data?
Yes. Settings → Account → Export gives you a full NDJSON dump of your workspace (events, keys, budgets, revenue rows). Delete your account and the deletion cascades to everything in it.
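NDJSON is one JSON object per line, so the dump is easy to load without any special tooling. A minimal reader might look like the sketch below; the field names in the sample string are made up for illustration and are not the real export schema.

```typescript
// Parse NDJSON text: one JSON value per non-empty line.
function parseNdjson(text: string): unknown[] {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line));
}

// Made-up sample; the real export's record shape may differ.
const sampleExport =
  '{"kind":"event","tenantId":"acme","costUsd":0.42}\n' +
  '{"kind":"budget","limitUsd":100}\n';

const records = parseNdjson(sampleExport);
console.log(`${records.length} records loaded`);
```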
Find your money-losing customer in 10 minutes.
Open the live demo — no signup. See exactly how trAIce surfaces the customer who's costing you more in AI than they pay you in MRR.
1,000 events/month free, no credit card.