weekly prompts 6 | token math isn’t the whole math

about paying for a moving target

jun 07, 2026

Narração de artigo

0:00

-13:31

some ppl on reddit claim claude’s subscription has a better effective $/token rate than the pay-as-you-go API. i wanted to see it with my own eyes.

so this post is two things: a little rant about what we’re actually paying for, and then the savvy cheap-but-still-high-quality experiment i ran: claude code pointed at deepseek’s anthropic-compatible endpoint, max effort, 1M context.

this is a tech section in experimental’s newsletter.

if you prefer follow weekly prompts on linkedin & unfollow this section here on substack.

the apple parallel

apple taught us the shape of the trick: your phone gets weird and slow, you update, suddenly it’s fine again, or you just buy the new one.

the motive’s contested (battery protection vs. planned obsolescence), nobody outside cupertino really knows, but the effect is undeniable, and it’s an inescapable business model you’re forced to participate in.

AI providers are doing a version of this, except clumsier and more public: users keep reporting claude or GPT “feels dumber,” and the companies eventually admit they tweaked an effort default, changed a system prompt, shipped a verbosity fix that broke reasoning.

fine. whatever. who cares.

except, we never get to verify any of it, and we have no power to not upgrade, the thing we’re paying for just stops working the way it did yesterday.

the reality is that they need us on the new models….

you pay, and you’re the product

we’re a decade into social media, so we know the rule:

if you’re not paying for something, you’re the product.

but the social media version is that you pay with attention (aka your time of life) and in exchange you get reach, connections, a whole bunch of fun new things…

with AI you hand over your hours for productivity, digested knowledge, automations (things that’ll keep changing too, the way feeds did),

so, you pay with attention, money, and you’re the product.

every prompt feeds the system that trains the next model and you’re paying for it.

so, two things we already recognize:

the product decays and you pay to “upgrade,” like the apple cycle
they hook you into using it more so they harvest your data, your behavior, your preferences, your weird specific use cases

and two things that are new:

when you upgrade, you apparently get less. the “smarter” model is more expensive to run, so your subscription burns through its limits faster. you pay more to chat less.

unlike a phone or a feed, the product is non-deterministic. you can’t audit it. there’s no fixed thing to compare against , so the “$/token” question i started with has no honest answer. which is exactly how a moving target stays a moving target.

that’s the real reality check, now whether our expectations are aligned…

let’s talk configs and tokens.

tokens are the wrong unit

nobody buys tokens. people buy hours of work. tokens are just the meter running underneath, what you actually want is a feature shipped, a bug fixed, an essay written.

and the two billing models don’t just cost different amounts. they behave differently:

on the subscription model, max effort costs nothing extra, you just hit your rate limit faster, wait for the window to reset and you’re back. annoying, but “free.”
on pay-per-token, you don’t hit a wall. you just bleed credits. no waiting, no reset… reload your balance or stop working.

the experiment: swapping claude code’s brain

claude code is anthropic’s CLI/agent tool, by default it talks to

https://api.anthropic.com

and runs claude models.

deepseek’s “anthropic-compatible endpoint” is just an API at https://api.deepseek.com/anthropic that speaks the same protocol, same request format, same response shape, same auth pattern. so any client built for anthropic’s API (claude code included) can talk to it without code changes.

the swap is done with env vars: claude code reads them, sends its requests to deepseek instead, and never knows the difference.

same agent loop, same tool-use logic, same VS Code integration, but a different brain doing the thinking. (same pattern as pointing an openai SDK at a local ollama server, or at groq, for example)

here’s the config:

#!/usr/bin/env sh

export ANTHROPIC_BASE_URL="https://api.deepseek.com/anthropic"
export ANTHROPIC_AUTH_TOKEN="<your-deepseek-api-key>"
export ANTHROPIC_MODEL="deepseek-v4-pro[1m]"
export ANTHROPIC_DEFAULT_OPUS_MODEL="deepseek-v4-pro[1m]"
export ANTHROPIC_DEFAULT_SONNET_MODEL="deepseek-v4-pro[1m]"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="deepseek-v4-flash[1m]"
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC="1"
export CLAUDE_CODE_EFFORT_LEVEL="max"

rememebr it’s pay-per-token, so you must buy some Deepseek credits.

(tip: run /status inside claude code after launching to confirm the base URL and model actually switched.)

CLAUDE_CODE_EFFORT_LEVEL=max tells the model to use maximum reasoning tokens per turn and combine that with deepseek-v4-pro[1m] it’s the best model you can get to think about long context tasks (many pages pdfs, large code database, etc). So each turn burns significantly more billable tokens than a “normal” call.

TL;DR max effort + 1M context = the most expensive config you can run.

the math, honestly

i tested it on a small session: 4 hours of coding, ~$5 burned on deepseek pay-per-token at max effort + 1M context.

~$5 for 4 hours = ~$1.25/hour
hour 20: 20 × $1.25 = $25 burned
40 hrs/week × 4 weeks ≈ ~$200/month, climbing

meanwhile the sub is flat at ~$100/month no matter how many hours i log, so somewhere around hour 20, the pay-per-token bill crosses the sub price, and from there every extra hour just widens the gap.

so yeah, reddit was right. como a gente diz no brasil: a voz do povo é a voz de deus kk

but.

the product underneath you shifts week to week, and it’s non-deterministic, there’s nothing fixed to measure against. subscription or pay-as-you-go, you’re paying for a moving target, you can cry on twitter about slop models, but it’s about as effective as crying over gravity.

and here’s the real math: those “extra” hours the sub keeps handing you aren’t generosity, they decide how fast you burn, how many hours you get, they hand you more hours because more usage is more data, and the data trains the next model you’ll pay to upgrade to.

you’re using the product, sure, but mostly the product is using you, you can’t audit a thing that’s being moved on purpose in the direction that extracts the most from you.

so yeah, that’s the actual math.

til next “weekly prompts”,

subscribe for more (:

Best regards,

Mica

Experimental

Discussão sobre este post

Pronto para mais?