“Heavy users” are a recurring issue for any computing service using fixed rate billing.
For consumer or enterprise users, a general rule of thumb is that customers prefer predictable billing, and hence set rates. That might not be an issue if usage is predictable and moderate over time.
But fixed rate billing can undermine a business model if usage and costs to provision are highly variable (marginal cost is high).
On the other hand, fixed costs encourage sampling and experimentation, encouraging market growth.
But there are customer hurdles when usage-based pricing is used. For suppliers, there is:
Revenue volatility
Usage spikes or dips can meaningfully swing revenue
Sales friction.
For buyers, the issues include”
Bill shocks
Complex buying process at scale
Loss of cost prediction.
Variable costs match “value” and “cost to supply,” as heavier users pay more. But that comes at the price of cost predictability, which tends to limit adoption.
Fixed cost pricing (generally with usage allowances) might include:
Per-seat: $X per user/month
Per-account: $Y per workspace or tenant/month
Tiered plans: Basic, Pro, Enterprise (cost increases with tier)
Metered or usage-based billing might feature prices based on:
Per API call: $0.001 per request
Per token: $/1,000 tokens in or out
Per compute hour: $/GPU-hour or vCPU-hour
Per output unit: per generated document, per image, per transcription minute.
The perhaps-obvious compromise are hybrid models including both fixed cost and variable cost charging. For suppliers and customers alike, this provides some predictability of revenue or cost; plus some ability to match usage volume with cost.
For most suppliers, usage-based billing makes most sense when:
You supply infrastructure
Usage and costs are highly variable
Customers are developers and customers experimenting
Cash and margin discipline are top of mind.
Fixed pricing will tend to work better when:
AI is a feature in a broader SaaS product
Usage per customer is relatively consistent or capped
Buyers want simple prices for yearly contracts
Model and cloud costs are low and covered at the plan price.
Hybrid appeals when:
You want predictable baseline revenue and upside from power users
You have meaningful unit costs but strong sales
You sell into finance or procurement buyers and need to protect margin.
Hybrid is generally preferable, if only because AI inference costs are highly variable and scale directly with usage.
Also, as a practical matter, hybrid helps assure some level of ability to plan for recurring revenue magnitudes, per account or per user.
And most suppliers will sell to consumers, developers, smaller and enterprise-sized customers, some preferring low entry costs; others preferring cost predictability at scale.
And a reasonable assumption is that usage will climb over time, for virtually every customer set.
All that noted, many prior computing products have shifted from flat rate to metered, or metered to flat rate. The shift to cloud computing involved another change from “own” to “rent,” with the emergence of subscriptions (flat rate, generally) and end of perpetual license (one payment, upfront).
And where “subscription” is the product payment structure, flat rate tends to be preferred, where possible. That tends to be the case where costs are not highly variable.
The issue with AI inference is simply that costs are fairly linear with usage.
Also, there should eventually be a stronger trend towards "billable units" that align with standard business workflows. In healthcare, that might be cost per-patient, per-test, or per-episode.
In other businesses, perhaps the shift is to per-user and per-task metrics. There also is much talk of pricing based on outcomes or value, but that always requires clear metrics, always a difficult task.
No comments:
Post a Comment