Investigation Demo | TokenGoblin

STEP 1

Monday morning. We ran a forensic report. Total spend was up.

The support_reply workflow runs hundreds of times a day to power a production chatbot. A quick GET /v1/forensic-report?period=last-7d returned a complete forensic report with classification, hypotheses, evidence log, and attribution. No dashboard, no configuration.
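The request above could be issued from a script roughly as follows. This is a minimal sketch: only the path and the period parameter come from the text, while the base URL, the bearer-token header, and both helper names are assumptions for illustration.

```python
import urllib.request
from urllib.parse import urlencode

# Hypothetical base URL; this page does not state the API host.
BASE_URL = "https://api.example.com"


def forensic_report_url(period: str, base_url: str = BASE_URL) -> str:
    """Build the report URL for a period such as "last-7d" or "last-24h"."""
    return f"{base_url}/v1/forensic-report?{urlencode({'period': period})}"


def fetch_forensic_report(period: str, api_key: str) -> str:
    """Fetch the report body as text. The auth scheme here is an assumption."""
    request = urllib.request.Request(
        forensic_report_url(period),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")
```

Keeping URL construction separate from the network call makes it easy to reuse the same helper for the last-24h check later in the investigation.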

forensic report — markdown

STEP 2

The attribution summary told us exactly which deploy and user drove the spike.

TokenGoblin automatically captured the deploy SHA from the CI environment. The forensic report shows that all anomalous events share deploy_sha = 8f3a2b1c, deployed Thursday evening. The first anomaly appeared 10 hours later — a textbook deploy-correlated regression.

The user breakdown showed user_789 accounted for $145.20 of the $203.07 delta, about 71.5% of the increase. This data is populated automatically by the SDK provider functions. No manual tracking. No custom headers.
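The share attributed to user_789 can be checked directly from the two figures in the report. A small worked calculation, using only the numbers quoted above (the variable names are illustrative):

```python
from decimal import Decimal

total_delta = Decimal("203.07")  # total spend increase in the report (USD)
user_delta = Decimal("145.20")   # portion attributed to user_789 (USD)

# Fraction of the increase driven by this one user, rounded to 0.1%.
share = user_delta / total_delta
share_pct = (share * 100).quantize(Decimal("0.1"))

print(f"user_789 share of the increase: {share_pct}%")
```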

STEP 3

We checked the `summarize_context` step. The forensic report was right.

Pulled up the workflow definition. A commit from last Thursday had bumped max_tokens from 2000 to 4000. Since max_tokens caps completion length, each call could now generate up to twice as many output tokens, and its output cost could double with it. The forensic report's top hypothesis was spot on: input tokens per call had exploded 60,000×. Git blame found the exact line. Deploy SHA correlation confirmed the timing.

support_reply.py — commit 8f3a2b1 (last Thursday)

# Step: summarize_context
with goblin.step("summarize_context") as step:
    response = openai.chat.completions.create(
        model="gpt-4.1-mini",
        messages=context_messages,
        # max_tokens changed from 2000 to 4000 last Thursday
        max_tokens=4000,  # <-- root cause
    )
    step.record_usage(
        input_tokens=response.usage.prompt_tokens,
        output_tokens=response.usage.completion_tokens,
        cost_usd=compute_cost(response.usage, "gpt-4.1-mini"),
    )
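The compute_cost helper called in the snippet is not shown on this page. As a rough illustration of how a doubled max_tokens changes the per-call cost ceiling, here is a hypothetical stand-in named estimate_cost. The rates are placeholders, not real pricing for any model, and the 7,600-token input is just an illustrative figure.

```python
from decimal import Decimal

# Placeholder per-million-token rates in USD. NOT real pricing for any model.
PLACEHOLDER_RATES = {"input": Decimal("1.00"), "output": Decimal("4.00")}
PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int,
                  rates: dict = PLACEHOLDER_RATES) -> Decimal:
    """Hypothetical simplified stand-in for a compute_cost-style helper."""
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / PER_MILLION


# Worst case: the model uses the entire completion budget on every call.
input_tokens = 7600
cost_at_2000 = estimate_cost(input_tokens, 2000)
cost_at_4000 = estimate_cost(input_tokens, 4000)

print(f"ceiling at max_tokens=2000: ${cost_at_2000}")
print(f"ceiling at max_tokens=4000: ${cost_at_4000}")
```

Under these placeholder rates the output portion of the ceiling doubles, while the total per-call ceiling grows by less than 2× because the input cost is unchanged.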

STEP 4

Reverted the config. Verified the fix with a fresh forensic report.

Changed max_tokens back to 2000 and deployed the fix. Ran GET /v1/forensic-report?period=last-24h the next day. summarize_context cost had returned to baseline. The report showed an unknown classification, meaning no significant drift. Incident closed.

terminal — post-fix verification

✓ Root cause found. Fix deployed. Incident closed.

BEHIND THE SCENES

How attribution works — no manual headers needed.

The deploy SHA and user/tenant data you saw in the report were captured automatically. Here's how:

Deploy SHA — automatic from your CI environment

The SDK checks 8 common CI/CD environment variables in priority order: GITHUB_SHA, CI_COMMIT_SHA, BITBUCKET_COMMIT, CIRCLE_SHA1, GIT_COMMIT, BUILD_SOURCEVERSION, TRAVIS_COMMIT, and CF_REVISION. If none of them is set, it falls back to git rev-parse HEAD. No code or headers required. Your deploy SHA appears in forensic reports automatically.
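The lookup order described above can be sketched as follows. This is not the SDK's actual code: the function name resolve_deploy_sha, the empty-value handling, and the 5-second timeout are assumptions, while the variable names and their order are taken from the list above.

```python
import os
import subprocess
from typing import Mapping, Optional

# Variable names and priority order as listed in the text.
CI_SHA_VARS = (
    "GITHUB_SHA", "CI_COMMIT_SHA", "BITBUCKET_COMMIT", "CIRCLE_SHA1",
    "GIT_COMMIT", "BUILD_SOURCEVERSION", "TRAVIS_COMMIT", "CF_REVISION",
)


def resolve_deploy_sha(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first non-empty CI commit variable, else ask git, else None."""
    env = os.environ if env is None else env
    for name in CI_SHA_VARS:
        value = env.get(name, "").strip()
        if value:
            return value
    # Fallback: read the current commit from the local git checkout.
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True, timeout=5,
        )
    except (OSError, subprocess.SubprocessError):
        return None  # git missing, not a repository, or the call timed out
    return result.stdout.strip() or None
```

Accepting the environment as a parameter keeps the priority logic testable without touching the real process environment.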

User / tenant context — set providers once in the SDK

Pass optional user_id_provider and tenant_id_provider callables when initializing TokenGoblin. They're called once per step() invocation and their return values are attached to every event.

Python SDK — one-time setup, automatic attribution

from tokengoblin import TokenGoblin

goblin = TokenGoblin(
    api_key="tgproj_...",
    workflow_name="support_reply",

    # deploy_sha: auto from CI/CD env vars, no code needed

    # user / tenant: one-time callable setup
    user_id_provider=lambda: request.user.id,
    tenant_id_provider=lambda: request.tenant.id,
)

with goblin.step("summarize_context") as step:
    step.record_usage(
        input_tokens=7600,
        output_tokens=420,
        cost_usd="0.00443",
    )

That's it. Every event is tagged automatically, and the forensic report surfaces the top users/tenants by cost delta. No manual tracking. No extra API calls.
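To make the "called once per step() invocation" behavior concrete, here is a toy model of how provider callables might be evaluated and attached to events. SketchTracker and _Step are hypothetical classes written for this illustration. They are not part of the TokenGoblin SDK, and the real implementation may differ.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, Optional


class _Step:
    """Holds the tags captured when the step started."""

    def __init__(self, tracker: "SketchTracker", tags: dict) -> None:
        self._tracker = tracker
        self._tags = tags

    def record_usage(self, input_tokens: int, output_tokens: int, cost_usd: str) -> None:
        # Every recorded event carries the tags resolved at step start.
        self._tracker.events.append({
            **self._tags,
            "input_tokens": input_tokens,
            "output_tokens": output_tokens,
            "cost_usd": cost_usd,
        })


class SketchTracker:
    """Hypothetical stand-in for the SDK client, for illustration only."""

    def __init__(self,
                 user_id_provider: Optional[Callable[[], str]] = None,
                 tenant_id_provider: Optional[Callable[[], str]] = None) -> None:
        self._user_id_provider = user_id_provider
        self._tenant_id_provider = tenant_id_provider
        self.events: list = []

    @contextmanager
    def step(self, name: str) -> Iterator[_Step]:
        # Providers run once here, when the step begins.
        tags = {
            "step": name,
            "user_id": self._user_id_provider() if self._user_id_provider else None,
            "tenant_id": self._tenant_id_provider() if self._tenant_id_provider else None,
        }
        yield _Step(self, tags)


# Usage: the provider reads whatever "current user" means in your application.
current_user = {"id": "user_789"}
tracker = SketchTracker(user_id_provider=lambda: current_user["id"])

with tracker.step("summarize_context") as step:
    step.record_usage(input_tokens=7600, output_tokens=420, cost_usd="0.00443")
```

Because the providers are plain callables, they can read from request-scoped state at the moment each step starts rather than at client construction time.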

How TokenGoblin finds the root cause.


Investigate your own cost regressions — ranked hypotheses included.
