Per-Request Billing

Some models are billed directly per request, which suits scenarios where the cost of a single call is clear up front.

Per-request billing means that each request is settled as a single call once it is sent. Unlike text models, it is not billed primarily in proportion to input and output text units.

The Best Way to Understand It

You can think of it as:

  • It does not primarily focus on how much text you input
  • It does not primarily focus on how much text the model outputs
  • Instead, it focuses on the billing rule associated with “the request itself”

This approach is more common in image generation, video generation, and some specialized generation tasks.
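
As a rough illustration of the difference, the following Python sketch contrasts the two calculations. The function names and every price in it are hypothetical, chosen only for this example; actual prices and billing rules come from the specific model or the console.

```python
from decimal import Decimal


def per_request_cost(num_requests: int, price_per_request: Decimal) -> Decimal:
    """Per-request billing: cost depends only on how many requests are sent."""
    return price_per_request * num_requests


def usage_based_cost(
    input_units: int,
    output_units: int,
    input_price_per_1k: Decimal,
    output_price_per_1k: Decimal,
) -> Decimal:
    """Usage-based billing: cost grows with input and output text units."""
    total = input_units * input_price_per_1k + output_units * output_price_per_1k
    return total / Decimal(1000)


# Hypothetical prices, for illustration only.
image_cost = per_request_cost(3, Decimal("0.04"))
text_cost = usage_based_cost(1000, 500, Decimal("0.5"), Decimal("1.5"))
```

In the first function the amount of text in the request never appears, which is the point of per-request billing. In the second, the same number of calls can cost very different amounts depending on how much text goes in and comes out.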

Which Scenarios Are More Likely to Use Per-Request Billing

The most common ones currently are:

  • Image generation
  • Video generation
  • Some fixed-action APIs

Whether a model uses per-request billing ultimately depends on the specific model, so check what is shown in the console.

Benefits of Per-Request Billing

1. Easier to Estimate the Cost of a Single Call

If your use case involves low-frequency calls, per-request billing is more intuitive, because you can see at a glance:

  • Roughly how much one call costs
  • Roughly how much it costs to generate one image
  • Roughly how much it costs to generate one video

2. Better for Business-Side Budgeting

For tasks whose text-based consumption is hard to estimate, per-request billing makes it easier to quote prices directly and control budgets.
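
A minimal budgeting sketch, assuming a flat price per request: with a fixed budget, the number of calls you can afford is a simple floor division. The helper name `max_requests` and the figures are hypothetical and not part of any MoleAPI interface.

```python
from decimal import Decimal


def max_requests(budget: Decimal, price_per_request: Decimal) -> int:
    """Return how many whole requests fit in the budget at a flat price."""
    if price_per_request <= 0:
        raise ValueError("price_per_request must be positive")
    # Floor division drops any leftover budget too small for one more request.
    return int(budget // price_per_request)


# Hypothetical figures: a budget of 10 at 0.04 per generated image.
affordable_images = max_requests(Decimal("10"), Decimal("0.04"))
```

The same arithmetic works in reverse for quoting: multiply the per-request price by the number of actions a customer needs.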

When Per-Request Billing Is a Good Fit

  • Low-frequency calls
  • Image/video generation tasks
  • When you want to understand the cost per call at a glance
  • When you care more about “how much each action costs” than “how much each text unit costs”

Usage Recommendations

If you mainly work with:

  • Conversations
  • Text generation
  • Embeddings
  • Long-text processing

Then what you usually need to focus on is usage-based billing, not per-request billing.

One-Sentence Rule of Thumb

Generation tasks such as images and videos are more commonly billed per request, while text LLM calls are more commonly billed based on usage.
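
The rule of thumb can be written down as a small lookup, sketched here in Python. The task labels and the function name are hypothetical, and the mapping only encodes what is more common; the model's own billing rule in the console always takes precedence.

```python
PER_REQUEST = "per-request"
USAGE_BASED = "usage-based"

# What is more commonly the case for each task type, not a guarantee.
LIKELY_BILLING = {
    "image_generation": PER_REQUEST,
    "video_generation": PER_REQUEST,
    "conversation": USAGE_BASED,
    "text_generation": USAGE_BASED,
    "embeddings": USAGE_BASED,
    "long_text_processing": USAGE_BASED,
}


def likely_billing_mode(task_type: str) -> str:
    """Return the more common billing mode, or a reminder to check the console."""
    return LIKELY_BILLING.get(task_type, "unknown, check the console")
```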
