
Cloud GPU Rental Price Index

Ekrem Sarı
Updated May 13, 2026

Posted on-demand rates for the newest-generation cloud GPUs (B200, B300, MI300X, RTX 5090) roughly doubled over the past year, while mainstream cards (H100, H200, A100) held a tight band. We compile the GPU index monthly from 58 providers and 17 GPU models, covering on-demand, spot, and 1-year reserved tiers.


For the same GPU, hyperscalers list 3-6x above neocloud floors. Catalog depth varies from single-GPU specialists to 40+ SKUs at the largest aggregators (Vast.ai, Salad).

See our GPU index methodology for how this is computed.

The chart shows the monthly median posted price across three release-date buckets.

We split 17 GPU models into three categories by launch date:

Last released nearly doubled in 14 months. Most of the move came from B200 and B300 expanding out of neocloud-only listings and into hyperscaler price sheets, where the headline rate is 2-3x higher. Every new high-tier listing pulls the category median up.

Modern crept ~25% higher, but the move is largely statistical. Google Cloud added its A3 Mega H100 variant to the standard-A3 listing, lifting the H100 cohort median from ~$2 to ~$3. Underneath, neocloud H100 trended down. We flag this in the next section.

Legacy slid from $1.78 to $0.97 over the window, driven by the V100 cohort losing its high-end hyperscaler anchors as enterprises retire the SKU. Only one or two providers per Legacy card remain in our dataset: AWS lists K80 at $0.90, P40 sits at Vast.ai at $0.11, and the rest are similar single-listing edge cases.
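The three-way split above can be sketched as a simple date bucketing. The launch dates and cutoffs below are illustrative assumptions, not the index's actual curated list:

```python
from datetime import date

# Approximate launch dates (illustrative; the index maintains its own curated list)
LAUNCH = {
    "B200": date(2024, 3, 1), "MI300X": date(2023, 12, 1),
    "H200": date(2023, 11, 1), "H100": date(2022, 3, 1),
    "A100": date(2020, 5, 1), "V100": date(2017, 6, 1),
    "K80": date(2014, 11, 1),
}

def bucket(gpu: str) -> str:
    """Assign a GPU to a release-date category using two cutoff dates."""
    launched = LAUNCH[gpu]
    if launched >= date(2023, 6, 1):   # hypothetical "Last released" cutoff
        return "Last released"
    if launched >= date(2020, 1, 1):   # hypothetical "Modern" cutoff
        return "Modern"
    return "Legacy"
```

With these assumed cutoffs, `bucket("B200")` lands in Last released, `bucket("H100")` in Modern, and `bucket("V100")` in Legacy.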

The contract market moved differently: 1-year H100 commitments trended up over the same window, while our on-demand H100 median was roughly flat. That gap is the price of paying month-to-month versus committing for a year.

The chart below covers the eight Modern GPUs.

Modern GPUs (H100, A100, L4 and peers)

H100 is the workhorse across 36 providers. The cohort median fell from $7+ in early 2024 to under $3 in 2026, except where high-end SKU listings (Azure ND, GCP A3 Mega) lift the high tail. Thunder Compute, Vast.ai, and RunPod consistently sit at the bottom of the spread; AWS, Azure, and Google Cloud charge multiples of that for the SLA, compliance, and bundled cross-service integration. The Google Cloud row is itself a mix of three SKUs (a3-highgpu, a3-megagpu, a3-edgegpu) collapsed under one nvidia-h100 label, which lifts its cohort median.

H200’s price floor looks too good to trust. RunPod lists capacity at a fraction of the cohort median; the next provider is several multiples up. Either RunPod is clearing inventory, or the listing is a community-tier instance share misattributing the per-GPU rate. Once outliers are set aside, the working median sits in the $3-4 band.

A100 holds a tight neocloud band, with one or two serverless-inference outliers pulling the high tail up. Treat serverless rates (Replicate) separately when comparing IaaS providers.

L40S, RTX 4090, A10G, T4, and L4 cover the inference tier. Their workloads overlap (sub-100B inference, generation, batch fine-tuning), so they compete on price. A10G’s narrow spread reflects that it is effectively an AWS-only SKU on our list.

Last released GPUs (B200, B300, MI300X, RTX 5090)

B200: median $5.50, range $3.75 (Packet AI) to $14.24 (AWS).

B300: median $7.50, range $6.10 (Nebius) to $17.80 (AWS).

MI300X: median $2.72, range $0.50 (RunPod) to $7.86 (Azure).

RTX 5090: median $0.67, range $0.27 (Salad) to $0.72 (Novita).

The pattern repeats from H100’s earlier curve: hyperscalers carry new accelerators at ~3x the neocloud price during the first year. MI300X is the supply outlier; RunPod and TensorWave price it below the H100 floor, but it runs on ROCm and not every CUDA workload ports cleanly.

Legacy GPUs (V100, P100, K80, M60, P40)

V100 still appears across 16 providers (median ~$0.97), P100 at 5 (median ~$1.46), K80 only at AWS ($0.90), and P40 only at Vast.ai (~$0.11). Hyperscalers maintain Legacy SKUs for compliance customers running unchangeable workloads; neoclouds dropped them. If you do not already have a legacy pipeline on these cards, there is nothing left to migrate to.

Supply and availability

Supply varies more widely than headline pricing. The chart below shows the share of each GPU’s listings reporting confirmed stock today, sorted from tightest to most available.

B300 sits at 6% confirmed; the remaining 94% are listed but providers do not yet promise the chip. MI300X and L40S land at 35-36%, narrower than the mainstream tier. H100, H200, A100, and B200 cluster near 41-51%, where roughly half the catalog is confirmed stock and half is provisioning-dependent. RTX 4090 and RTX 5090 reach 86%, reflecting deeper consumer-card supply and lower per-card enterprise demand.

If your project depends on a specific newest-generation chip, plan procurement lead time on top of budget. The waitlist share stays near zero because most unconfirmed listings are tracked as “unknown stock”, not “waitlist”: providers report stock state, not queue position.

Choosing a GPU and provider

GPU choice is shaped by three axes: workload, duration, and region. Spot vs. on-demand pricing layers on top of all three.

By workload


By duration

Under a week: Neocloud on-demand at the floor of the spread.

Multi-week: Request a quote (Neoclouds typically discount 15-30% for 4-12 week commitments; hyperscalers offer 1-year reserved tiers).

Multi-year: Negotiate directly with providers, since posted on-demand rates do not capture committed-term discounts.

Reservation savings

The 1-year reserved discount typically runs 9-32% off the posted on-demand rate, with the steeper savings on AMD MI300X and the inference-tier L40S, where providers compete harder for committed capacity.

H100 and H200 see modest single-digit-to-low-teens discounts; their on-demand market is already competitive enough that providers do not sacrifice much margin for commitments. B200 reserves at 20% off, MI300X at 32%, L40S at 29%. The chart shows the cross-provider median for both billing tiers; individual provider quotes may go deeper for multi-year terms not reflected here.
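The discount figures above are straightforward percentage spreads between the two posted tiers. A minimal sketch, with hypothetical rates chosen only to illustrate the arithmetic:

```python
def reserved_discount(on_demand: float, reserved: float) -> float:
    """Percent saved by a 1-year reservation vs the posted on-demand rate."""
    return round(100 * (1 - reserved / on_demand), 1)

# Hypothetical hourly rates, picked to reproduce discounts in the ranges quoted above
print(reserved_discount(2.72, 1.85))  # 32.0 -> an MI300X-like ~32% off
print(reserved_discount(5.50, 4.40))  # 20.0 -> a B200-like ~20% off
```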

Spot vs on-demand

The spot discount tracker chart shows the median spot vs. on-demand discount by category. Over the past six months, Modern saves ~50%, Last released ~48%, and Legacy ~77% (Legacy is noisier than it looks; few providers still publish spot rates for these cards).

If your workload tolerates 5-15 minute interruptions, spot is the single biggest cost lever available. Toggle the billing dropdown in the explorer chart at the top to see the spot rate side-by-side with on-demand for any provider on your shortlist.

GPU index methodology

The index covers posted hourly cloud GPU rental prices across on-demand, spot, and 1-year reserved tiers (where providers publicly list them). It does not cover multi-year contracts, enterprise-negotiated rates, spot-plus-savings-plan combinations, or total cost of ownership.

Our data is monthly snapshots over 23 months, filtered to 17 curated GPU models across 58 providers. Each snapshot reports, for every (provider, GPU, billing type, month) cell, the min, max, mean, and median per-GPU hourly rate, plus the offering count behind those numbers.
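The per-cell record described above can be modeled as a small data structure. This is a sketch of an assumed schema, not the index's actual storage format; field names are illustrative:

```python
from dataclasses import dataclass

@dataclass
class SnapshotCell:
    """One (provider, GPU, billing type, month) cell in a monthly snapshot."""
    provider: str
    gpu: str
    billing: str         # "on-demand" | "spot" | "reserved-1y" (assumed labels)
    month: str           # "YYYY-MM"
    min_price: float     # per-GPU hourly rate, USD
    max_price: float
    mean_price: float
    median_price: float
    offering_count: int  # number of listings behind these statistics
```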

How each chart is calculated

We use median-of-medians throughout: providers and GPUs each enter the headline number with equal weight, so a 38-listing provider does not drown out a 5-listing newcomer.
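The equal-weighting step can be sketched in a few lines: take each provider's own median first, then the median across providers. Provider names and prices below are made up for illustration:

```python
from collections import defaultdict
from statistics import median

def median_of_medians(listings):
    """listings: iterable of (provider, price) pairs. Each provider enters
    the headline number with equal weight via its own median first."""
    by_provider = defaultdict(list)
    for provider, price in listings:
        by_provider[provider].append(price)
    return median(median(prices) for prices in by_provider.values())

# A 3-listing provider counts the same as a 1-listing newcomer:
rows = [("big", 1.0), ("big", 1.0), ("big", 1.0), ("newcomer", 5.0)]
print(median_of_medians(rows))  # 3.0, not the raw-listing median of 1.0
```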

Provider × billing explorer (opening chart):

For the provider and billing tier you select, each line traces one GPU’s monthly median over time. No cross-provider aggregation is applied: each month’s point is simply the median price across that provider’s listings for that GPU and that billing tier. The line ends where the offering disappears from the catalog.

Market summary (three category lines):

The billing dropdown re-runs Steps 2-3 against the selected tier (on-demand, spot, or reservation). Nine series in total.

Modern GPUs side-by-side:

Same steps 1-2 as the market summary, scoped to on-demand pricing only. Each line is the cross-provider monthly median for one GPU. No cross-GPU aggregation. Eight series.

Spot discount tracker:

This pairs each spot price against its same-provider, same-GPU, same-month on-demand counterpart, so the discount reflects the actual spread a buyer at that provider would see, not a cross-market noise difference.
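The pairing logic can be sketched as a lookup against the same cell key with the billing field swapped. Key layout and names are assumptions for illustration:

```python
def spot_discounts(cells):
    """cells: dict keyed by (provider, gpu, month, billing) -> median price.
    Pair each spot price with its same-provider, same-GPU, same-month
    on-demand counterpart; spot listings with no counterpart are skipped."""
    out = {}
    for (provider, gpu, month, billing), price in cells.items():
        if billing != "spot":
            continue
        on_demand = cells.get((provider, gpu, month, "on-demand"))
        if on_demand:
            out[(provider, gpu, month)] = 1 - price / on_demand
    return out

cells = {
    ("acme", "H100", "2026-04", "on-demand"): 2.00,
    ("acme", "H100", "2026-04", "spot"): 1.00,
}
print(spot_discounts(cells))  # {('acme', 'H100', '2026-04'): 0.5}
```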

Availability snapshot:

Snapshot only, no time aggregation. Listings reported as unknown stock, waitlist, or unavailable are still counted in the denominator but not drawn separately on the chart, since the buyer-actionable signal is the confirmed-available share.
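The confirmed-share computation reduces to counting one status value against the full denominator. Status labels below are assumed:

```python
def confirmed_share(listings):
    """Share of a GPU's listings reporting confirmed stock. Unknown-stock,
    waitlist, and unavailable listings all stay in the denominator."""
    if not listings:
        return 0.0
    confirmed = sum(1 for status in listings if status == "available")
    return confirmed / len(listings)

statuses = ["available", "unknown", "waitlist", "available"]
print(confirmed_share(statuses))  # 0.5
```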

Reservation savings:

Snapshot only, latest week. The reservation tier here pools 12-month and longer commitments where providers publicly list them; multi-year contracts negotiated off-list are not included.

FAQs

How often is the index updated?

We publish a refreshed monthly median view each month. The numbers reflect data through the prior month.

Why do hyperscalers charge more for the same GPU?

The GPU is the same; the bundle is not. Hyperscalers price in compliance (HIPAA, SOC 2, FedRAMP), enterprise SLAs, identity and networking integration, and 24/7 support. Neoclouds price bare metal or VM access with optional managed orchestration. If you do not need the bundle, the neocloud price is the right comparison.

Is spot worth the interruption risk?

Yes, if your workload checkpoints and tolerates 5-15 minute interruptions. The Modern GPU spot discount sits near 50% over the past six months, and savings compound over multi-day training. Spot is the wrong choice for latency-sensitive inference, single-replica services without failover, or evaluation runs that need a clean wall-clock comparison.

Which billing tiers does the index cover?

The billing dropdown on the price trends by provider chart switches between on-demand, spot, and 1-year reserved tiers wherever providers publish those rates. Multi-year contracts and enterprise-negotiated discounts are not included; request a quote directly from the provider for those.

Further reading

Ekrem Sarı
AI Researcher
Ekrem is an AI researcher at AIMultiple, focusing on intelligent automation, GPUs, AI agents, and RAG frameworks.
