
Cloud GPU Rental Price Index

Ekrem Sarı
Updated May 13, 2026

Posted on-demand rates for the newest-generation cloud GPUs (B200, B300, MI300X, RTX 5090) roughly doubled over the past year, while mainstream cards (H100, H200, A100) held a tight band. We compile the GPU index monthly from 52 providers and 17 GPU models, covering on-demand, spot, and 1-year reserved tiers.


For the same GPU, hyperscalers list prices 3-6x above neocloud floors. Catalog depth varies from single-GPU specialists to 30+ SKUs at the major clouds.

See our GPU index methodology for how this is computed.

The chart shows the monthly median posted price across three release-date buckets.

We split the 17 GPU models into three categories by launch date:

Last released: B200, B300, MI300X, RTX 5090

Modern: H100, H200, A100, L40S, RTX 4090, A10G, T4, L4

Legacy: V100, P100, K80, M60, P40
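The category assignment can be sketched as a simple lookup, using the groupings described in this article:

```python
# Release-date buckets as described in the article.
CATEGORY = {
    gpu: cat
    for cat, gpus in {
        "Last released": ["B200", "B300", "MI300X", "RTX 5090"],
        "Modern": ["H100", "H200", "A100", "L40S",
                   "RTX 4090", "A10G", "T4", "L4"],
        "Legacy": ["V100", "P100", "K80", "M60", "P40"],
    }.items()
    for gpu in gpus
}

def bucket(gpu: str) -> str:
    """Return the release-date category for a GPU model."""
    return CATEGORY[gpu]
```

With this mapping, every monthly listing can be tagged with its category before the per-category medians are computed.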

Last released nearly doubled in 14 months. Most of the move came from B200 and B300 expanding out of neocloud-only listings and into hyperscaler price sheets, where the headline rate is 2-3x higher. Every new high-tier listing pulls the category median up.

Modern crept ~30% higher, but the move is largely statistical. Google Cloud added its A3 Mega H100 variant to the standard-A3 listing, lifting the H100 cohort median from ~$2 to ~$3. Underneath, neocloud H100 trended down. We flag this in the next section.

Legacy declined from $1.77 to $1.14 through 2025 and ticked back up to $1.34 by May. The reversal is sampling noise, not a real price hike. Only one or two providers per Legacy card remain in our dataset: AWS lists K80 at $0.90, P40 dropped out of the catalog after Vast.ai delisted it, and the rest are similar single-listing edge cases.

The contract market moved differently: 1-year H100 commitments trended up over the same window, while our on-demand H100 median was roughly flat. That gap is the price of paying month-to-month versus committing for a year.

The chart below covers the eight Modern GPUs.

Modern GPUs (H100, A100, L4 and peers)

H100 is the workhorse across 33 providers. The cohort median fell from $7+ in early 2024 to under $3 in 2026, except where high-end SKU listings (Azure ND, GCP A3 Mega) lift the high tail. Thunder Compute, Vast.ai, and RunPod consistently sit at the bottom of the spread; AWS, Azure, and Google Cloud charge multiples of that for the SLA, compliance, and bundled cross-service integration. The Google Cloud row is itself a mix of three SKUs (a3-highgpu, a3-megagpu, a3-edgegpu) collapsed under one nvidia-h100 label, which lifts its cohort median.

H200’s price floor looks too good to trust. RunPod lists capacity at a fraction of the cohort median; the next-cheapest provider is several multiples up. Either RunPod is clearing inventory, or the listing is a community-tier instance share with a misattributed per-GPU rate. Once outliers are set aside, the working median sits in the $3-4 band.

A100 holds a tight neocloud band, with one or two serverless-inference outliers pulling the high tail up. Treat serverless rates (Replicate) separately when comparing IaaS providers.

L40S, RTX 4090, A10G, T4, and L4 cover the inference tier. Their workloads overlap (sub-100B inference, generation, batch fine-tuning), so they compete on price. A10G’s narrow spread reflects that it is effectively an AWS-only SKU on our list.

Last released GPUs (B200, B300, MI300X, RTX 5090)

B200: median $5.50, range $3.75 (Packet AI) to $14.24 (AWS).

B300: median $7.82, range $6.10 (Nebius) to $17.80 (AWS).

MI300X: median $2.72, range $0.50 (RunPod) to $7.86 (Azure).

RTX 5090: median $0.67, range $0.27 (Salad) to $0.72 (Novita).

The pattern repeats from H100’s earlier curve: hyperscalers carry new accelerators at ~3x the neocloud price during the first year. MI300X is the supply outlier; RunPod and TensorWave price it below the H100 floor, but it runs on ROCm and not every CUDA workload ports cleanly.

Legacy GPUs (V100, P100, K80, M60, P40)

V100 still appears across 14 providers (median ~$1.67), P100 at 4 (median ~$1.55), K80 only at AWS ($0.90), and P40 was a single-provider Vast.ai listing (~$0.11) until it dropped out of the catalog in May. Hyperscalers maintain Legacy SKUs for compliance customers running unchangeable workloads; neoclouds dropped them. If you do not already have a legacy pipeline on these cards, there is nothing left to migrate to.

Choosing a GPU and provider

GPU choice is shaped by three axes: workload, duration, and region. Spot vs. on-demand pricing layers on top of all three.

By workload


By duration

Under a week: Neocloud on-demand at the floor of the spread.

Multi-week: Request a quote (Neoclouds typically discount 15-30% for 4-12 week commitments; hyperscalers offer 1-year reserved tiers).

Multi-year: Negotiate directly with providers, since posted on-demand rates do not capture committed-term discounts.

Spot vs on-demand

The spot discount tracker chart shows the median spot vs. on-demand discount by category. Over the past six months, Modern saves ~50%, Last released ~43%, and Legacy ~80% (Legacy is noisier than it looks; few providers still publish spot rates for these cards).

If your workload tolerates 5-15 minute interruptions, spot is the single biggest cost lever available. Toggle the billing dropdown in the explorer chart at the top to see the spot rate side-by-side with on-demand for any provider on your shortlist.

GPU index methodology

The index covers posted hourly cloud GPU rental prices across on-demand, spot, and 1-year reserved tiers (where providers publicly list them). It does not cover multi-year contracts, enterprise-negotiated rates, spot-plus-savings-plan combinations, or total cost of ownership.

Our data is monthly snapshots over 23 months (July 2024 through May 2026), filtered to 17 curated GPU models across 52 providers. Each snapshot reports, for every (provider, GPU, billing type, month) cell, the min, max, mean, and median per-GPU hourly rate, plus the offering count behind those numbers.
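The per-cell summary described above can be sketched with the standard library, assuming `rates` holds the posted per-GPU hourly prices observed for one (provider, GPU, billing type, month) cell:

```python
from statistics import mean, median

def summarize_cell(rates: list[float]) -> dict:
    """Aggregate one (provider, GPU, billing type, month) cell
    into the min/max/mean/median and offering count the index stores."""
    return {
        "min": min(rates),
        "max": max(rates),
        "mean": round(mean(rates), 2),
        "median": median(rates),
        "count": len(rates),  # offering count behind the numbers
    }
```

The offering count matters downstream: it flags which cells rest on a single listing and are therefore sensitive to one provider's catalog changes.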

How each chart is calculated

We use median-of-medians throughout: providers and GPUs each enter the headline number with equal weight, so a 38-listing provider does not drown out a 5-listing newcomer.
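A one-level version of that weighting, with hypothetical listing data, might look like:

```python
from statistics import median

def headline_median(listings: dict[str, list[float]]) -> float:
    """Median-of-medians: each provider enters the headline number
    with equal weight, regardless of how many listings it has."""
    return median(median(prices) for prices in listings.values())

# A 38-listing provider counts the same as a 2-listing newcomer
# (hypothetical prices for illustration):
listings = {
    "big_provider": [2.0] * 38,
    "small_newcomer": [5.0, 7.0],
}
# headline_median(listings) -> median([2.0, 6.0]) == 4.0
```

A plain median over all 40 listings would land near $2; the median-of-medians lands at $4 because each provider contributes exactly one value.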

Provider × billing explorer:

No cross-provider aggregation: for the selected (provider, billing) pair, each line is one GPU. Each provider’s offering count varies, and a line ends where the offering disappears from the catalog.

Market summary:

The billing dropdown re-runs steps 2-3 against the selected billing type (on-demand, spot, or reservation). 9 series in total.

Modern GPUs side-by-side:

Same Steps 1-2 as the market summary, scoped to billing = ON_DEMAND. Each line is the cross-provider monthly median for one GPU. No cross-GPU aggregation. 8 series.

Spot discount tracker:

This pairs each spot price against its same-provider, same-GPU, same-month on-demand counterpart, so the discount reflects the actual spread a buyer at that provider would see, not a cross-market noise difference.
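That same-cell pairing can be sketched as a keyed join, assuming both price tables are keyed by (provider, GPU, month) with hypothetical rates:

```python
def spot_discounts(spot: dict, on_demand: dict) -> list[float]:
    """Pair each spot price with its same-(provider, GPU, month)
    on-demand counterpart; spot listings without a matching
    on-demand price are skipped rather than compared cross-market."""
    return [
        1 - spot[key] / on_demand[key]
        for key in spot
        if key in on_demand
    ]

# Hypothetical rates keyed by (provider, gpu, month):
spot = {("acme", "H100", "2026-05"): 1.25}
on_demand = {("acme", "H100", "2026-05"): 2.50}
# spot_discounts(spot, on_demand) -> [0.5], i.e. a 50% discount
```

Skipping unmatched keys is what keeps the tracker from mixing one provider's spot floor against another provider's on-demand ceiling.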

FAQ

How often is the index updated?

We publish a refreshed monthly median view on the 15th of each month. The numbers reflect data through the prior month.

Why do hyperscalers charge more for the same GPU?

The GPU is the same; the bundle is not. Hyperscalers price in compliance (HIPAA, SOC 2, FedRAMP), enterprise SLAs, identity and networking integration, and 24/7 support. Neoclouds price bare metal or VM access with optional managed orchestration. If you do not need the bundle, the neocloud price is the right comparison.

Is spot pricing worth the interruption risk?

Yes, if your workload checkpoints and tolerates 5-15 minute interruptions. The Modern GPU spot discount sits near 50% over the past six months, and savings compound over multi-day training. Spot is the wrong choice for latency-sensitive inference, single-replica services without failover, or evaluation runs that need a clean wall-clock comparison.

Which billing types does the index cover?

The price trends by provider chart’s billing dropdown switches between on-demand, spot, and 1-year reserved tiers wherever providers publish those rates. Multi-year contracts and enterprise-negotiated discounts are not included; request a quote directly from the provider for those.

Further reading

Ekrem Sarı
AI Researcher
Ekrem is an AI researcher at AIMultiple, specializing in intelligent automation, GPUs, AI agents, and RAG frameworks.
