Per-Request Billing

Some models are billed directly per request, which suits scenarios where the cost of a single call needs to be clear.

Per-request billing means that each request is settled as a single call once it is sent, rather than being billed mainly in proportion to input and output text units, as text models are.
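The difference between the two billing styles can be sketched as simple arithmetic. This is a minimal illustration only: the prices, the per-1,000-unit pricing granularity, and the function names below are hypothetical assumptions, not values from any real price list. Actual prices are whatever the model page or console shows.

```python
from decimal import Decimal

# Hypothetical prices, for illustration only.
PRICE_PER_REQUEST = Decimal("0.04")           # e.g. one image generation call
PRICE_PER_1K_INPUT_UNITS = Decimal("0.001")   # usage-based: input text units
PRICE_PER_1K_OUTPUT_UNITS = Decimal("0.002")  # usage-based: output text units


def per_request_cost(num_requests: int) -> Decimal:
    """Per-request billing: cost depends only on how many calls are made."""
    return PRICE_PER_REQUEST * num_requests


def usage_based_cost(input_units: int, output_units: int) -> Decimal:
    """Usage-based billing: cost grows linearly with input and output text units."""
    input_cost = Decimal(input_units) / 1000 * PRICE_PER_1K_INPUT_UNITS
    output_cost = Decimal(output_units) / 1000 * PRICE_PER_1K_OUTPUT_UNITS
    return input_cost + output_cost


if __name__ == "__main__":
    print("3 per-request calls:", per_request_cost(3))
    print("1 usage-based call (2000 in, 500 out):", usage_based_cost(2000, 500))
```

Under per-request billing the prompt length does not appear in the formula at all, which is why the cost of a single call is easy to state up front.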

The Best Way to Understand It

You can think of it this way:

It does not primarily focus on how much text you input
It does not primarily focus on how much text the model outputs
Instead, it focuses on the billing rule associated with “the request itself”

This approach is more common in image generation, video generation, and some specialized generation tasks.

Which Scenarios Are More Likely to Use Per-Request Billing

The most common ones currently are:

Image generation
Video generation
Some fixed-action APIs

Whether a given model uses per-request billing ultimately depends on that model, as shown in the console.

Benefits of Per-Request Billing

1. Easier to Estimate the Cost of a Single Call

If your use case involves low-frequency calls, per-request billing is more intuitive, because you can see at a glance:

Roughly how much one call costs
Roughly how much it costs to generate one image
Roughly how much it costs to generate one video

2. Better for Business-Side Budgeting

For tasks where consumption under text-based billing is hard to estimate, per-request billing makes it easier to quote prices directly and to control budgets.
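As a rough sketch of how budgeting works under per-request billing, the two helpers below compute how many calls fit in a budget and what a quote for a fixed number of calls would be. The function names, the markup parameter, and the example price are hypothetical assumptions made for illustration.

```python
from decimal import Decimal


def max_calls_within_budget(budget: Decimal, price_per_request: Decimal) -> int:
    """Return how many whole calls a budget covers at a fixed per-request price."""
    if price_per_request <= 0:
        raise ValueError("price_per_request must be positive")
    return int(budget // price_per_request)


def quote(num_calls: int, price_per_request: Decimal,
          markup: Decimal = Decimal("0")) -> Decimal:
    """Return the quoted price for num_calls, with an optional fractional markup."""
    return num_calls * price_per_request * (1 + markup)


if __name__ == "__main__":
    price = Decimal("0.04")  # hypothetical price per image generation call
    print(max_calls_within_budget(Decimal("50"), price))
    print(quote(200, price, markup=Decimal("0.25")))
```

Because the price per call is fixed, both calculations need only a call count, with no estimate of text volume.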

When Per-Request Billing Is a Good Fit

Low-frequency calls
Image/video generation tasks
When you want to understand the cost per call at a glance
When you care more about “how much each action costs” than “how much each text unit costs”

Usage Recommendations

If you mainly work with:

Conversations
Text generation
Embeddings
Long-text processing

Then you usually need to focus on usage-based billing rather than per-request billing.

One-Sentence Rule of Thumb

Image and video generation tasks are more commonly billed per request, while text LLM calls are more commonly billed by usage.