
Cursor "Included-Request Usage" on Enterprise Plans

At a Glance

Cursor's "included-request usage" is not a count of requests — it's a dollar amount of API token consumption included in your plan. Each team seat ($40/mo) includes $20/mo of usage charged at model API rates per million tokens. Enterprise plans offer pooled usage (shared across the whole team) instead of per-user allocation. The single biggest lever to minimize usage is model selection: Claude 4.6 Opus output costs 8.3x as much per token as Gemini 3 Flash.


Quotes

"Each plan includes usage charged at model inference API prices."

— Cursor Docs, Pricing [1]

"We work hard to grant additional bonus capacity beyond the guaranteed included usage. Since different models have different API costs, your model selection affects token output and how quickly your included usage is consumed."

— Cursor Docs, Pricing [1]

"All non-Auto agent requests include a $0.25 per million tokens fee. This covers: Semantic search, Custom model execution (Tab, Apply, etc.), Infrastructure and processing costs."

— Cursor Docs, Team Pricing [2]

"Max Mode uses token-based pricing at the model's API rate plus a 20% upcharge."

— Cursor Docs, Models [3]

"Member spend limits on Enterprise pooled usage accounts apply to total usage, not just on-demand usage."

— Cursor Docs, Spend Limits [4]

Sam's TL;DR

"Request usage" = dollars of tokens consumed, not number of requests. Your plan includes a dollar budget that gets eaten by every agent interaction at the model's per-token API rate. Expensive models (Opus) burn through it fast; cheap models (Gemini Flash, Grok Code) stretch it. Enterprise's pooled usage is the key differentiator — your team shares one big bucket instead of each person having a tiny $20 bucket. To minimize: pick cheaper models for routine work, avoid Max Mode, keep conversations short, and use Auto mode for balanced cost/quality.

Key Points

What "Included Usage" Actually Is

Included usage is a dollar budget, not a request counter: every Agent interaction draws it down at the model's per-token API rate, so two identical "requests" can cost very different amounts depending on the model selected and the context attached [1].
How Usage Is Calculated

Every interaction with the AI consumes tokens across four categories, each priced differently [3]:

| Token Type | Description | Relative Cost |
| --- | --- | --- |
| Input | Your prompt, attached files, context | Base rate |
| Cache Write | First-time context processing | 1.25x input rate |
| Cache Read | Re-used context from prior turns | ~10% of input rate |
| Output | Model's response (code, text) | 3-5x input rate |

On Teams/Enterprise, there's also a Cursor Token Fee of $0.25 per million tokens (all token types) on non-Auto requests, covering semantic search, Apply, and infrastructure [2].
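As a rough sketch, a single request's cost follows directly from these four token counts plus the token fee. The rates below are the Claude 4.6 Sonnet figures from the pricing table that follows (cache write at 1.25x the input rate), the 20% multiplier is the Max Mode upcharge discussed below, and all function and variable names are illustrative, not part of any Cursor API.

```python
# Sketch: dollar cost of one agent request from per-category token counts.
# Rates are per million tokens (Claude 4.6 Sonnet figures from the table
# below); names are illustrative.

RATES = {
    "input": 3.00,
    "cache_write": 3.00 * 1.25,  # 1.25x the input rate
    "cache_read": 0.30,          # ~10% of the input rate
    "output": 15.00,
}
CURSOR_TOKEN_FEE = 0.25  # $/M tokens, all token types, non-Auto requests

def request_cost(tokens: dict, max_mode: bool = False, auto: bool = False) -> float:
    """Cost in dollars of one request, given per-category token counts."""
    model_cost = sum(RATES[k] * n / 1_000_000 for k, n in tokens.items())
    if max_mode:
        model_cost *= 1.20  # Max Mode: 20% upcharge on all token costs
    fee = 0.0 if auto else CURSOR_TOKEN_FEE * sum(tokens.values()) / 1_000_000
    return model_cost + fee

# One turn: 10k fresh input, 40k cached context re-read, 2k output
turn = {"input": 10_000, "cache_write": 0, "cache_read": 40_000, "output": 2_000}
print(f"${request_cost(turn):.4f}")                 # $0.0850
print(f"${request_cost(turn, max_mode=True):.4f}")  # $0.0994
```

Note how the cache-read discount matters: 40k cached tokens cost $0.012 here, versus $0.12 if re-sent at the full input rate.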

Model Pricing Comparison (per 1M tokens)

| Model | Input | Cache Read | Output | Output Cost vs Flash |
| --- | --- | --- | --- | --- |
| Claude 4.6 Opus | $5.00 | $0.50 | $25.00 | 8.3x |
| Claude 4.6 Sonnet | $3.00 | $0.30 | $15.00 | 5x |
| Composer 1.5 | $3.50 | $0.35 | $17.50 | 5.8x |
| Gemini 3.1 Pro | $2.00 | $0.20 | $12.00 | 4x |
| GPT-5.2 / 5.3 Codex | $1.75 | $0.175 | $14.00 | 4.7x |
| Auto mode | $1.25 | $0.25 | $6.00 | 2x |
| Gemini 3 Flash | $0.50 | $0.05 | $3.00 | 1x (baseline) |
| Grok Code | $0.20 | $0.02 | $1.50 | 0.5x (cheapest) |

Enterprise-Specific Features

Enterprise replaces per-user allocations with pooled usage shared across the team, and adds per-member spend limits, billing groups for departmental chargebacks, model access restrictions, and programmatic control via the Admin API [5].
Typical Usage Levels [1]

| User Type | Typical Monthly Usage |
| --- | --- |
| Daily Tab users only | Under $20 |
| Limited Agent users | Often within $20 |
| Daily Agent users | $60–$100/mo |
| Power users (multi-agent/automation) | $200+/mo |
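To put the $20 included budget in terms of the pricing table, here is a back-of-the-envelope count of how many agent turns it buys per model. The 10k-input / 2k-output "typical turn" is a hypothetical assumption, and caching and the token fee are ignored for simplicity.

```python
# Sketch: how many "typical" agent turns $20 of included usage buys.
# Assumes a hypothetical 10k-input / 2k-output turn; ignores caching
# and the Cursor Token Fee. Rates are from the pricing table ($/M tokens).
PRICES = {  # model: (input rate, output rate)
    "Claude 4.6 Opus": (5.00, 25.00),
    "Claude 4.6 Sonnet": (3.00, 15.00),
    "Auto mode": (1.25, 6.00),
    "Gemini 3 Flash": (0.50, 3.00),
    "Grok Code": (0.20, 1.50),
}
BUDGET = 20.00  # included usage per seat per month

for model, (inp, out) in PRICES.items():
    per_turn = inp * 10_000 / 1e6 + out * 2_000 / 1e6
    print(f"{model:18s} ${per_turn:.4f}/turn -> ~{BUDGET / per_turn:,.0f} turns")
```

Under these assumptions, $20 covers roughly 200 Opus turns but around 1,800 Flash turns, which is why a daily Opus user lands in the $60-$100/mo band while a Flash user may stay inside the included budget.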

How to Minimize Request Usage

1. Choose Cheaper Models for Routine Work

The single biggest lever. Grok Code output costs $1.50/M tokens vs Claude Opus at $25/M tokens — a 16.7x difference. Use Opus only for complex architectural decisions; use Flash or Grok for routine edits, refactoring, and simple tasks [3].

2. Use Auto Mode

At $1.25/M input and $6/M output, Auto mode is cheaper than manually selecting most premium models. Cursor dynamically picks the best model for each task, optimizing for both quality and cost [3].

3. Avoid Max Mode Unless Necessary

Max Mode extends context to a model's maximum (up to 1M tokens for Claude) but adds a 20% upcharge on all token costs. Default 200k context is sufficient for most tasks [3].

4. Keep Conversations Short

Each message in a conversation carries forward the full context of all previous messages. Starting new conversations resets context and avoids ballooning input token costs. Long conversations compound rapidly because every new message re-sends the entire history.
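The compounding above can be made concrete: if every turn re-sends the full history, cumulative input tokens grow quadratically with turn count. The per-turn token figure below is illustrative.

```python
# Sketch: why long conversations compound. Each turn re-sends all prior
# turns, so cumulative input tokens grow quadratically with turn count.
# The 1,500 tokens of new history per turn is an illustrative assumption.

def cumulative_input_tokens(turns: int, tokens_per_turn: int = 1_500) -> int:
    """Total input tokens sent over a conversation of `turns` messages."""
    # Turn k sends its own message plus the (k - 1) turns before it.
    return sum(k * tokens_per_turn for k in range(1, turns + 1))

for n in (5, 20, 50):
    print(f"{n:3d} turns -> {cumulative_input_tokens(n):,} input tokens")
```

In practice cache reads (at ~10% of the input rate) soften the bill, but the quadratic growth in re-sent context is why starting a fresh conversation is cheaper than continuing a long one.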

5. Write Specific, Targeted Prompts

Vague prompts cause the model to explore broadly and generate more output tokens. Precise instructions ("change line 42 of app.js to use async/await") produce less output than "refactor this file to be more modern."

6. Leverage Tab Completions

Tab completions are unlimited on Pro and above and don't consume request usage in the same way as Agent requests [1]. For small, predictable edits, Tab is essentially free.

7. Use Cursor Rules and Context Wisely

Good .cursorrules and project rules mean the model needs fewer rounds of clarification, reducing total token consumption per task.
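A minimal project rules file along these lines might look like the following; the contents are illustrative, not taken from Cursor's docs.

```
# .cursorrules (illustrative example)
- Use TypeScript strict mode; prefer async/await over raw Promise chains.
- Follow the existing layout under src/; do not create new top-level dirs.
- When asked for a fix, change only the named file unless told otherwise.
- Keep responses short; output diffs rather than whole files for large edits.
```

Front-loading conventions like these means the model needs fewer clarification rounds and generates fewer exploratory output tokens per task.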

8. Set Enterprise Spend Controls

Per-member spend limits cap what any individual can consume; on Enterprise pooled usage accounts they apply to total usage, not just on-demand usage [4]. Pair them with model access restrictions so routine work cannot land on the most expensive models.
9. Monitor Via Dashboard and API

Review consumption in the Analytics Dashboard to spot heavy users and expensive-model habits, and pull the same data programmatically via the Admin API for alerting or departmental chargeback reporting [5].
Full Summary

Cursor has moved from a request-count model to a token-based, dollar-denominated usage system. Every AI interaction (Agent chat, Composer, Cloud Agents) consumes tokens — input tokens for your prompt and context, output tokens for the model's response — priced at each model's public API rate per million tokens.

For Teams plan ($40/user/mo): each user gets $20/mo of included usage. When that's exhausted, on-demand usage continues automatically at the same rates, billed monthly. Usage is isolated per user — one person's heavy usage doesn't affect another's allocation.

For Enterprise plan (custom pricing): the key differentiator is pooled usage. Instead of each user having an individual $20 bucket, the entire team shares a negotiated pool. This means a team of 50 where only 15 are heavy Agent users can effectively subsidize those 15 with the unused capacity of the other 35. Enterprise also adds per-member spend limits, billing groups for departmental chargebacks, model access restrictions to prevent users from selecting expensive models, and full programmatic control via the Admin API.
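The pooled-usage arithmetic in that 50-seat example can be sketched directly. The heavy/light spend figures come from the "typical usage" table; the 15/35 split is hypothetical.

```python
# Sketch: per-user buckets (Teams) vs one shared pool (Enterprise) for a
# 50-seat team. Spend figures follow the "typical usage" table; the 15/35
# split of the team is a hypothetical assumption.
SEATS = 50
INCLUDED_PER_SEAT = 20.00  # $/user/mo of included usage on Teams

heavy_users, heavy_spend = 15, 80.00  # daily Agent users (~$60-$100/mo)
light_users, light_spend = 35, 10.00  # Tab-only / limited Agent users

pool = SEATS * INCLUDED_PER_SEAT  # Enterprise: one shared bucket
demand = heavy_users * heavy_spend + light_users * light_spend

# Teams: each heavy user overflows their own $20 bucket individually,
# and light users' unused capacity helps no one.
teams_on_demand = heavy_users * max(0.0, heavy_spend - INCLUDED_PER_SEAT)

print(f"pooled:   ${pool:.0f} pool vs ${demand:.0f} demand "
      f"-> ${max(0.0, demand - pool):.0f} on-demand")
print(f"per-user: ${teams_on_demand:.0f} on-demand")
```

Here the light users' unused $350 of capacity absorbs part of the heavy users' overage, so the pooled account pays $550 of on-demand usage instead of $900.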

The cost math is straightforward: usage = tokens × price per model. A single Agent turn with Claude 4.6 Opus using 10k input tokens and 2k output tokens costs roughly $0.10. The same turn with Gemini 3 Flash costs about $0.01. Over hundreds of daily interactions across a team, this 10x difference is material.

Additional costs include the Cursor Token Fee ($0.25/M tokens on non-Auto requests) covering infrastructure like semantic search and code application, and the optional Max Mode upcharge (20% on all token costs) for extended 1M-token context windows.

The recommended strategy for enterprise teams is: default to Auto or Gemini Flash for routine work, allow premium models only for complex tasks, enforce per-member spend limits, and monitor usage patterns via the Analytics Dashboard to identify optimization opportunities.

References

[1] Cursor Docs — Pricing. https://cursor.com/docs/account/pricing
[2] Cursor Docs — Team Pricing. https://cursor.com/docs/account/teams/pricing
[3] Cursor Docs — Models. https://cursor.com/docs/models
[4] Cursor Docs — Spend Limits. https://cursor.com/docs/account/billing/spend-limits
[5] Cursor Docs — Enterprise. https://cursor.com/docs/enterprise
[6] Cursor Pricing Page. https://cursor.com/pricing