
Cloud GPU Rental Price Index

Ekrem Sarı
updated on May 20, 2026

On-demand rates for the newest-generation cloud GPUs (B200, B300, MI300X, RTX 5090) roughly doubled over the past year, while mainstream cards (H100, H200, A100) held a tight band. We compile the GPU index monthly from 58 providers and 17 GPU models, covering on-demand, spot, and 1-year reserved tiers.

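The chart shows the monthly median posted price for three release-date buckets. We split the 17 GPU models into three categories by launch date: Last released (including B200, B300, MI300X, and RTX 5090), Modern (including H100, H200, A100, L40S, and RTX 4090), and Legacy (including V100, K80, and P40).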

Most of the increase in the Last released category came from B200 and B300 listings expanding from neocloud providers to hyperscaler price sheets. These hyperscaler listings are typically 2x-3x higher, so they raise the category median as they enter the dataset.

Modern GPUs crept ~25% higher, but the move is largely a statistical artifact. Google Cloud added its A3 Mega H100 variant to the standard-A3 listing, lifting the H100 cohort median from ~$2 to ~$3. Underneath, neocloud H100 prices trended down. We flag this in the next section.
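The composition effect behind these moves can be reproduced with a toy example. The prices below are hypothetical, not values from the index; the sketch only shows how a cohort median can rise when higher-priced listings enter the dataset, even though every existing listing gets cheaper.

```python
from statistics import median

# Hypothetical per-provider H100 medians ($/GPU-hour); not index data.
before = [1.90, 2.00, 2.10, 2.40, 3.00]

# Every existing provider lowers its price...
after_existing = [1.80, 1.90, 2.00, 2.30, 2.85]

# ...while four premium SKUs are newly added to the dataset.
new_premium_listings = [6.50, 8.00, 9.50, 11.00]
after = after_existing + new_premium_listings

median_before = median(before)
median_after = median(after)

print(f"median before: ${median_before:.2f}")
print(f"median after:  ${median_after:.2f}")
```

In this toy data the median moves from $2.10 to $2.85 although no provider raised a price, which is why a rising category line does not by itself mean buyers are paying more.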

Legacy GPUs slid from $1.78 to $0.97 over the window, driven by the V100 cohort losing its high-end hyperscaler anchors as enterprises retire the SKU. Apart from the V100, only one or two providers per Legacy card remain in our dataset: AWS lists the K80 at $0.90, Vast.ai lists the P40 at $0.11, and the rest are similar single-listing edge cases.

The contract market moved differently: 1-year H100 commitments trended up over the same window, while our on-demand H100 median was roughly flat. On-demand and 1-year committed pricing are therefore diverging.

See our GPU index methodology for how this is computed.

The chart below covers 10 GPUs: 5 Modern, 4 Last released, and the V100 as the Legacy reference.

Modern GPUs (H100, H200, A100, L40S, RTX 4090)

H100 is listed by 37 providers, the broadest of any current accelerator. The cohort median is now around $2.95/GPU-hour, down from above $7 in early 2024. Thunder Compute, Vast.ai, and RunPod sit at the bottom of the spread; Microsoft Azure and Google Cloud carry the upper tail past $10. The Google Cloud row is itself a mix of three SKUs (a3-highgpu, a3-megagpu, a3-edgegpu) collapsed under one nvidia-h100 label, which lifts its cohort median.

H200’s range runs from $2.29 (Theta EdgeCloud) to $13.78 (Microsoft Azure), with a cohort median around $3.39. The floor depends on whether you treat community-tier or instance-share listings as comparable to dedicated capacity. Once those are set aside, the working median sits in the $3-4 band.

A100 holds a tight neocloud band around $1.62, with one or two serverless-inference outliers (Replicate at $5.04) pulling the high tail up. Treat serverless rates separately when comparing IaaS providers.

L40S has settled around $1.55 median, with AWS at $7.58 setting the ceiling. RTX 4090 is the cheapest training-class card on the index at $0.44 median, with Salad at $0.18 and Beam at $1.61 bracketing the spread. Both target sub-100B inference and batch fine-tuning, where they often substitute for A100 at a fraction of the price.

Last released GPUs (B200, B300, MI300X, RTX 5090)

B200: median $5.24, range $3.75 (Packet AI) to $14.24 (AWS).

B300: median $6.99, range $6.10 (Nebius) to $18.00 (Oracle).

MI300X: median $1.99, range $1.99 (RunPod) to $7.86 (Azure).

RTX 5090: median $0.69, range $0.27 (Salad) to $1.34 (Vast.ai).

The pattern repeats from H100’s earlier curve: hyperscalers carry new accelerators at 3-5x neocloud floors during the first year. B300 is the only line on the chart still trending up, as additional hyperscaler listings keep raising the median. MI300X is the supply outlier; RunPod and TensorWave price it below the H100 floor, but it runs on ROCm and not every CUDA workload ports cleanly.

Legacy reference (V100)

V100 is the only Legacy card on the chart, included as a 2017-generation reference line. The cohort median dropped from $1.84 in mid-2024 to around $0.97 today across 18 providers. Hyperscalers maintain V100 SKUs for compliance customers whose workloads cannot be changed; neoclouds have mostly dropped them.

For the same GPU, hyperscaler posted prices are typically 3x-6x higher than the lowest neocloud listings in the dataset. Catalog depth varies by provider, GPU, region, and billing type.
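As a rough check, the top-to-bottom spread can be computed from the price ranges quoted earlier in this article. This is a minimal sketch over four GPUs only; the 3x-6x figure above is based on the full dataset, which is not reproduced here, and the dictionary name is illustrative.

```python
# (highest listing, lowest listing) in $/GPU-hour, as quoted in this article.
quoted_ranges = {
    "H200": (13.78, 2.29),    # Microsoft Azure vs. Theta EdgeCloud
    "B200": (14.24, 3.75),    # AWS vs. Packet AI
    "B300": (18.00, 6.10),    # Oracle vs. Nebius
    "MI300X": (7.86, 1.99),   # Azure vs. RunPod
}

# Ratio of the highest posted price to the lowest, rounded to one decimal.
premium = {gpu: round(high / low, 1) for gpu, (high, low) in quoted_ranges.items()}

for gpu, ratio in premium.items():
    print(f"{gpu}: {ratio}x")
```

The ratios come out at 6.0x for H200, 3.8x for B200, 3.0x for B300, and 3.9x for MI300X.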

Supply and availability

Supply varies more widely than headline pricing. The chart below shows the share of each GPU’s listings reporting confirmed stock today, sorted from tightest to most available.

B300 sits at 6% confirmed; the remaining 94% are listed, but providers do not yet confirm stock for the chip. MI300X and L40S land at 35-36%, tighter than the mainstream tier. H100, H200, A100, and B200 cluster near 41-51%, where roughly half the catalog is confirmed stock and half is provisioning-dependent. RTX 4090 and RTX 5090 reach 86%, reflecting deeper consumer-card supply and lower per-card enterprise demand.

If your project depends on a specific newest-generation chip, plan procurement lead time on top of budget. The waitlist share stays near zero because most unconfirmed listings are tracked as “unknown stock”, not “waitlist”: providers report stock state, not queue position.

Choosing a GPU and provider

GPU choice is shaped by three axes: workload, duration, and region. Spot vs. on-demand pricing layers on top of all three.

By workload

By duration

Under a week: use neocloud on-demand at the floor of the spread.

Multi-week: request a quote. Neoclouds typically discount 15-30% for 4-12 week commitments; hyperscalers offer 1-year reserved tiers.

Multi-year: negotiate directly with providers, since posted on-demand rates do not capture committed-term discounts.

Reservation savings

The 1-year reserved discount typically runs 9-32% off the posted on-demand rate, with the steeper savings on AMD MI300X and the inference-tier L40S, where providers compete harder for committed capacity.

H100 and H200 see modest single-digit-to-low-teens discounts; their on-demand market is already competitive enough that providers do not sacrifice much margin for commitments. B200 reserves at 20% off, MI300X at 32% off, and L40S at 29% off. The chart shows the cross-provider median for both billing tiers; individual provider quotes may go deeper for multi-year terms not reflected here.
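Whether a reservation pays off depends on utilization. The sketch below estimates the break-even point under one simplifying assumption that is not taken from the index: reserved capacity is billed for every hour of the 1-year term whether used or not, while on-demand is billed only for hours used. Actual billing terms vary by provider.

```python
HOURS_PER_YEAR = 8760

def break_even_utilization(discount: float) -> float:
    """Fraction of the term a GPU must be busy for reserved to beat on-demand.

    Assumes reserved is billed for every hour of the term at
    on_demand * (1 - discount), and on-demand only for hours actually used.
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    return 1 - discount

# Median 1-year reserved discounts quoted in this section.
discounts = {"B200": 0.20, "MI300X": 0.32, "L40S": 0.29}

break_even_hours = {
    gpu: round(break_even_utilization(d) * HOURS_PER_YEAR)
    for gpu, d in discounts.items()
}

for gpu, hours in break_even_hours.items():
    print(f"{gpu}: reserved wins above ~{hours} busy hours per year")
```

Under that assumption, a B200 reservation needs about 7,008 busy hours a year (80% utilization) to break even, against about 5,957 hours for MI300X and 6,220 for L40S.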

Spot vs on-demand

The spot discount tracker chart shows the median spot discount relative to on-demand by category. Over the past six months, Modern saves ~50%, Last released ~48%, and Legacy ~77%. The Legacy figure is noisier than it looks, because few providers still publish spot rates for these cards.

If your workload tolerates 5-15 minute interruptions, spot is the single biggest cost lever available. Toggle the billing dropdown in the explorer chart at the top to see the spot rate side-by-side with on-demand for any provider on your shortlist.
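A back-of-the-envelope way to size that lever is to ask how much work would have to be redone after interruptions before spot stops being cheaper. The sketch assumes the only cost of an interruption is recomputed GPU-hours; it ignores engineering time, storage, and restart latency, and `rework_fraction` is a hypothetical input you would measure for your own job.

```python
def spot_cost_ratio(spot_discount: float, rework_fraction: float) -> float:
    """Spot cost as a fraction of on-demand cost for the same job.

    rework_fraction is the extra GPU-hours spent redoing work lost to
    interruptions, as a share of the job's base GPU-hours (hypothetical input).
    """
    return (1 - spot_discount) * (1 + rework_fraction)

def break_even_rework(spot_discount: float) -> float:
    """Rework fraction at which spot costs the same as on-demand."""
    return spot_discount / (1 - spot_discount)

# ~50% Modern spot discount from this article, with 10% of the work redone.
ratio = spot_cost_ratio(0.50, 0.10)
limit = break_even_rework(0.50)

print(f"spot costs {ratio:.0%} of on-demand")
print(f"break-even rework: {limit:.0%}")
```

At a 50% discount, a job that redoes 10% of its work still costs about 55% of the on-demand price, and spot only loses its advantage once the job is effectively computed twice.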

GPU index methodology

The index covers posted hourly cloud GPU rental prices across on-demand, spot, and 1-year reserved tiers (where providers publicly list them). It does not cover multi-year contracts, enterprise-negotiated rates, spot-plus-savings-plan combinations, or total cost of ownership.

Our data is monthly snapshots over 23 months (July 2024 through May 2026), filtered to 17 curated GPU models across 58 providers. Each snapshot reports, for every (provider, GPU, billing type, month) cell, the min, max, mean, and median per-GPU hourly rate, plus the offering count behind those numbers.

H100, A100, H200, B200, B300 and V100 prices are medians taken across several physical versions of the card (PCIe, SXM or NVL interconnect; for A100 and V100, also 40/80 GB or 16/32 GB VRAM) that providers list under one name.

How each chart is calculated

We use median-of-medians throughout: providers and GPUs each enter the headline number with equal weight, so a 38-listing provider does not drown out a 5-listing newcomer.
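A minimal sketch of median-of-medians aggregation follows. The three levels (listing to provider, provider to GPU, GPU to category) and all data are assumptions made for illustration; they are not the index's exact pipeline, and the provider names are placeholders.

```python
from collections import defaultdict
from statistics import median

# Hypothetical listings for one month and one billing tier:
# (category, gpu, provider, price in $/GPU-hour). Not index data.
listings = [
    ("Modern", "H100", "provider_a", 2.00),
    ("Modern", "H100", "provider_a", 2.20),
    ("Modern", "H100", "provider_a", 2.40),
    ("Modern", "H100", "provider_b", 9.00),
    ("Modern", "A100", "provider_a", 1.50),
    ("Modern", "A100", "provider_c", 1.70),
]

def median_of_medians(rows):
    # Level 1: collect prices per (category, gpu, provider).
    by_provider = defaultdict(list)
    for category, gpu, provider, price in rows:
        by_provider[(category, gpu, provider)].append(price)

    # Level 2: one provider median per (category, gpu); each provider counts once.
    by_gpu = defaultdict(list)
    for (category, gpu, _provider), prices in by_provider.items():
        by_gpu[(category, gpu)].append(median(prices))

    # Level 3: one GPU median per category; each GPU counts once.
    by_category = defaultdict(list)
    for (category, _gpu), provider_medians in by_gpu.items():
        by_category[category].append(median(provider_medians))

    return {cat: median(gpu_medians) for cat, gpu_medians in by_category.items()}

result = median_of_medians(listings)
pooled_h100 = median(price for _, gpu, _, price in listings if gpu == "H100")
```

In this toy data, pooling all H100 listings gives a median of $2.30 because provider_a contributes three of the four rows, whereas weighting the two providers equally gives $5.60 for H100 and $3.60 for the category.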

Market summary (three category lines):

The billing dropdown re-runs Steps 2-3 against the selected tier (on-demand, spot, or reservation). A fourth “Average” option plots the arithmetic mean of the three-tier medians per category per month, restricted to months where all three tiers have data.
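The "Average" option can be sketched as below. The month keys, tier names, and prices are hypothetical; only the rule itself (arithmetic mean of the three tier medians, skipping months where a tier is missing) comes from the description above.

```python
from statistics import mean

# Hypothetical category medians per month and tier ($/GPU-hour); not index data.
tier_medians = {
    "2026-03": {"on_demand": 3.00, "spot": 1.50, "reserved": 2.40},
    "2026-04": {"on_demand": 3.10, "spot": 1.60},  # reserved tier missing
    "2026-05": {"on_demand": 2.90, "spot": 1.40, "reserved": 2.30},
}

TIERS = ("on_demand", "spot", "reserved")

def average_line(medians_by_month):
    """Mean of the three tier medians, only for months where all three exist."""
    return {
        month: mean([tiers[t] for t in TIERS])
        for month, tiers in medians_by_month.items()
        if all(t in tiers for t in TIERS)
    }

avg = average_line(tier_medians)
```

Here April is dropped from the line because it has no reserved median, which avoids a two-tier mean being plotted next to three-tier means.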

Provider × billing explorer:

For the provider and billing tier you select, each line traces one GPU’s monthly median over time. No cross-provider aggregation is applied: each month’s point is simply the median price across that provider’s listings for that GPU and that billing tier. The line ends where the offering disappears from the catalog.

Modern GPUs side-by-side:

Same Steps 1-2 as the market summary, scoped to on-demand pricing only. Each line is the cross-provider monthly median for one GPU. No cross-GPU aggregation. Eight series.

Spot discount tracker:

This pairs each spot price against its same-provider, same-GPU, same-month on-demand counterpart, so the discount reflects the actual spread a buyer at that provider would see, not a cross-market noise difference.
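The pairing can be sketched as follows. The keys, prices, and the final median across pairs are assumptions for illustration; the description above specifies only that each spot price is matched to the on-demand price for the same provider, GPU, and month.

```python
from statistics import median

# Hypothetical monthly medians keyed by (provider, gpu, month); not index data.
on_demand = {
    ("provider_a", "H100", "2026-05"): 3.00,
    ("provider_b", "H100", "2026-05"): 8.00,
    ("provider_c", "H100", "2026-05"): 2.50,  # publishes no spot price
}
spot = {
    ("provider_a", "H100", "2026-05"): 1.50,
    ("provider_b", "H100", "2026-05"): 2.00,
}

def paired_discounts(spot_prices, on_demand_prices):
    """Discount per key, computed only where both tiers exist for that key."""
    return {
        key: 1 - spot_price / on_demand_prices[key]
        for key, spot_price in spot_prices.items()
        if key in on_demand_prices and on_demand_prices[key] > 0
    }

discounts = paired_discounts(spot, on_demand)
median_discount = median(discounts.values())

# For contrast: comparing market-wide medians mixes in provider_c,
# which has no spot offer at all.
unpaired = 1 - median(spot.values()) / median(on_demand.values())
```

With these numbers the paired median discount is 62.5%, while the unpaired comparison of market medians gives about 41.7%, because it compares prices from different sets of providers.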

Availability snapshot:

Snapshot only, no time aggregation. Listings reported as unknown stock, waitlist, or unavailable are still counted in the denominator but not drawn separately on the chart, since the buyer-actionable signal is the confirmed-available share.
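A minimal sketch of the confirmed-available share, assuming each listing carries one stock-state label. The label strings and the sample data are hypothetical; the denominator rule (every listing counts, whatever its state) follows the description above.

```python
from collections import Counter

# Hypothetical stock states for one GPU's listings today; not index data.
stock_states = [
    "available", "unknown", "available", "waitlist",
    "unavailable", "unknown", "available", "unknown",
]

def confirmed_share(states):
    """Confirmed-available listings divided by all listings for the GPU."""
    counts = Counter(states)
    total = sum(counts.values())
    return counts["available"] / total if total else 0.0

share = confirmed_share(stock_states)
print(f"confirmed stock: {share:.1%}")
```

Three of the eight hypothetical listings are confirmed, so the share is 37.5%; the unknown, waitlist, and unavailable listings lower the share without appearing as separate bars.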

Reservation savings:

FAQs

We publish a refreshed median view each month. The numbers reflect data through the prior month.

The GPU is the same; the bundle is not. Hyperscalers price in compliance (HIPAA, SOC 2, FedRAMP), enterprise SLAs, identity and networking integration, and 24/7 support. Neoclouds price bare metal or VM access with optional managed orchestration. If you do not need the bundle, the neocloud price is the right comparison.

Yes, if your workload checkpoints and tolerates 5-15 minute interruptions. Modern GPU spot discount sits near 50% over the past six months, and savings compound over multi-day training. Spot is the wrong choice for latency-sensitive inference, single-replica services without failover, or evaluation runs that need a clean wall-clock comparison.

The billing dropdown on the price-trends-by-provider chart switches between on-demand, spot, and 1-year reserved tiers wherever providers publish those rates. Multi-year contracts and enterprise-negotiated discounts are not included; request a quote directly from the provider for those.

Cite this research

Ekrem Sarı (2026) - "Cloud GPU Rental Price Index". Published online at AIMultiple.com. Retrieved May 20, 2026, from: https://aimultiple.com/gpu-index [Online Resource]

Sarı, E. (2026, May 20). Cloud GPU Rental Price Index. AIMultiple. https://aimultiple.com/gpu-index

@misc{sar2026,
  author = {Sarı, Ekrem},
  title  = {{Cloud GPU Rental Price Index}},
  year   = {2026},
  month  = may,
  howpublished = {\url{https://aimultiple.com/gpu-index}},
  note   = {AIMultiple. Retrieved May 20, 2026}
}
Ekrem Sarı
AI Researcher
Ekrem is an AI Researcher at AIMultiple, focusing on intelligent automation, GPUs, AI Agents, and RAG frameworks.
