Cloud GPU Rental Price Index

updated on Jul 6, 2026

On-demand rates for the newest-generation cloud GPUs (B200, B300, MI300X, RTX 5090) roughly doubled over the past year, while mainstream cards (H100, H200, A100) held a tight band. We compile the GPU index monthly from 63 providers and 17 GPU models, covering on-demand, spot, and 1-year reserved tiers.

Price trends by GPU generation

The chart shows the monthly median posted price across three release-date buckets. We split 17 GPU models into three categories by launch date:

Category	GPUs	Role
Last released (2024 and later)	B200, B300, MI300X, RTX 5090	Newest-generation cohort
Modern (2020 through 2023)	H100, H200, A100, L40S, RTX 4090, A10G, T4, L4	Mainstream workload-runners
Legacy (pre-2020)	V100, P100, K80, M60, P40	Still rentable, mostly community-tier neoclouds

Most of the increase came from B200 and B300 listings expanding from neocloud providers to hyperscaler price sheets. These hyperscaler listings are typically 2x-3x higher, raising the category median as they enter the dataset.

Modern GPUs crept ~25% higher, but the move is largely statistical. Google Cloud added its A3 Mega H100 variant to the standard A3 listing, lifting the H100 cohort median from ~$2 to ~$3. Underneath, neocloud H100 prices trended down. We flag this in the next section.
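To make the composition effect concrete, here is a minimal sketch with made-up numbers. The provider names and prices are hypothetical and are not taken from the index. It shows how a cross-provider median can rise while every existing neocloud price falls, simply because higher-priced listings enter the dataset.

```python
from statistics import median

# Hypothetical per-provider medians in USD per GPU-hour (not index data).
neocloud_before = {"neo_a": 3.40, "neo_b": 3.60, "neo_c": 3.80}
neocloud_after = {"neo_a": 3.30, "neo_b": 3.50, "neo_c": 3.70}
# Hypothetical hyperscaler listings that appear only in the later month.
hyperscaler_new = {"hyper_x": 9.00, "hyper_y": 10.50}

median_before = median(neocloud_before.values())
median_after = median({**neocloud_after, **hyperscaler_new}.values())

# Every neocloud price fell between the two months...
every_neocloud_fell = all(
    neocloud_after[p] < neocloud_before[p] for p in neocloud_before
)

# ...yet the cross-provider median went up.
print(median_before, median_after, every_neocloud_fell)  # 3.6 3.7 True
```

In this toy case the median moves only because the set of providers changed, which is why a rising category line does not by itself mean that any individual provider raised prices.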

Legacy GPUs slid from $1.78 to $0.99 over the window, driven by the V100 cohort losing its high-end hyperscaler anchors as enterprises retire the SKU. One or two providers per Legacy card remain in our dataset: AWS lists K80 at $0.90, P40 sits at Vast.ai at $0.11, and the rest are similar single-listing edge cases.

The contract market moved differently: 1-year H100 commitments trended up over the same window, while our on-demand H100 median was roughly flat. This points to a widening gap between on-demand and 1-year committed pricing.

See our GPU index methodology for how this is computed.

Price trends by GPU model

The chart below covers 10 GPUs: 5 Modern, 4 Last released, and the V100 as the Legacy reference.

Modern GPUs (H100, H200, A100, L40S, RTX 4090)

IONOS covers this tier from the EU: on-demand T4, A10, and RTX PRO 6000 Blackwell, plus dedicated H100 and H200 servers at a flat $3,990/month with EU data residency.

H100 is listed by 46 providers, the broadest of any current accelerator. The cohort median is now around $2.99/GPU-hour, down from above $7 in early 2024. Thunder Compute, Vast.ai, and RunPod sit at the bottom of the spread; Microsoft Azure and Google Cloud carry the upper tail past $10. The Google Cloud row is itself a mix of three SKUs (a3-highgpu, a3-megagpu, a3-edgegpu) collapsed under one nvidia-h100 label, which lifts its cohort median.

H200’s range runs from $2.30 (FluidStack) to $13.78 (Microsoft Azure), with a cohort median around $4.00. The floor depends on whether you treat community-tier or instance-share listings as comparable to dedicated capacity. Once those are set aside, the working median sits in the $3-4 band.

A100 holds a tight neocloud band around $1.79, with one or two serverless-inference outliers (Replicate at $5.04) pulling the high tail up. Treat serverless rates separately when comparing IaaS providers.

L40S has settled around $1.56 median, with AWS at $7.58 setting the ceiling. RTX 4090 is the cheapest training-class card on the index at $0.52 median, with Salad at $0.18 and Beam at $1.61 bracketing the spread. Both target sub-100B inference and batch fine-tuning, where they often substitute for A100 at a fraction of the price.

Last released GPUs (B200, B300, MI300X, RTX 5090)

B200 median $6.11, range $3.44 (Vast.ai) to $16.11 (Google Cloud). B300 median $7.92, range $5.44 (Vast.ai) to $18.00 (Oracle Cloud). MI300X median $2.72, range $1.99 (DigitalOcean) to $7.86 (Microsoft Azure). RTX 5090 median $0.66, range $0.27 (Salad) to $2.00 (Vast.ai).

The pattern repeats from H100’s earlier curve: hyperscalers carry new accelerators at 3-5x neocloud floors during the first year. B300 is still trending up on the chart, as additional hyperscaler listings keep raising the median. MI300X is the supply outlier; it lists below the H100 floor at DigitalOcean and TensorWave, but it runs on ROCm and not every CUDA workload ports cleanly.
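As a quick check of the 3-5x figure, the sketch below divides each ceiling by its floor, using the B200, B300, and MI300X ranges quoted above. Treating the top of each range as the hyperscaler price and the bottom as the neocloud floor is an assumption of this example.

```python
# (floor, ceiling) on-demand prices in USD per GPU-hour, copied from the
# ranges quoted above: Vast.ai/Google Cloud, Vast.ai/Oracle Cloud,
# DigitalOcean/Microsoft Azure.
ranges = {
    "B200": (3.44, 16.11),
    "B300": (5.44, 18.00),
    "MI300X": (1.99, 7.86),
}

# Ceiling divided by floor gives the hyperscaler-to-neocloud multiple.
multiples = {gpu: round(high / low, 2) for gpu, (low, high) in ranges.items()}

for gpu, multiple in multiples.items():
    print(f"{gpu}: {multiple}x")
```

All three multiples land between 3x and 5x. RTX 5090 is left out because both ends of its range are non-hyperscaler listings.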

Legacy reference (V100)

V100, the Legacy card on the chart, is included as a 2017-generation reference line. The cohort median dropped from $1.84 in mid-2024 to around $0.99 today across 17 providers. Hyperscalers maintain V100 SKUs for compliance customers running workloads that cannot be changed; neoclouds have mostly dropped them.

Price trends by provider

For the same GPU, hyperscaler posted prices are typically 3x-6x higher than the lowest neocloud listings in the dataset. Catalog depth varies by provider, GPU, region, and billing type.

Supply and availability

Supply varies more widely than headline pricing. The chart below shows the share of each GPU’s listings reporting confirmed stock today, sorted from tightest to most available.

MI300X and L40S are the tightest at 44%, with B200 and B300 behind at 52-54%. H100, A100, and H200 cluster near 63-70%, where roughly two-thirds of the catalog is confirmed stock and the rest is provisioning-dependent. RTX 4090 and RTX 5090 reach 93-97%, reflecting deeper consumer-card supply and lower per-card enterprise demand.

Choosing a GPU and provider

GPU choice is shaped by three axes: workload, duration, and region. Spot vs. on-demand pricing layers on top of all three.

By workload

Workload	Recommended GPU	Provider tier	Why
LLM inference, 7-13B models	L4, L40S	Neocloud	Sub-$2/hr, inference-optimized
LLM inference, 30-70B	A100 80GB, H100	Neocloud	VRAM fits, H100 for tight latency SLA
LLM inference, 70B+ memory-bound	H200, MI300X	Neocloud	141-192 GB HBM enables larger KV-cache
Fine-tuning 7-13B	A100, H100	Neocloud	Cost-efficient, widely available
Training large models from scratch	H100, B200 multi-node	Hyperscaler or large neocloud	Multi-GPU HBM and fast interconnect
Experimentation, prototyping	T4, A10G, L4, RTX 4090	Community-tier neocloud	Cheap hourly, fast to spin up
Regulated production (HIPAA/SOC2/FedRAMP)	Any above	Hyperscaler	Compliance certifications

By duration

Under a week: use neocloud on-demand at the floor of the spread.

Multi-week: request a quote. Neoclouds typically discount 15-30% for 4-12 week commitments; hyperscalers offer 1-year reserved tiers.

Multi-year: negotiate directly with providers, since posted on-demand rates do not capture committed-term discounts.

Reservation savings

The 1-year reserved discount typically runs 16-39% off the posted on-demand rate, with the steepest savings on B200, AMD MI300X, and the inference-tier L40S, where providers compete harder for committed capacity.

H100 and H200 see modest single-digit-to-low-teens discounts; their on-demand market is competitive enough that providers do not sacrifice margin for commitments. B200 reserves at 39% off, MI300X at 31%, and L40S at 30%. The chart shows the cross-provider median for both billing tiers; individual provider quotes may go deeper for multi-year terms not reflected here.
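One way to read these discounts is as a break-even utilization. The sketch below assumes a simplified billing model that is not part of the index: reserved capacity is billed for every hour of the term, while on-demand is billed only for hours actually used. Under that assumption a reservation pays off once utilization exceeds one minus the discount. The function name is illustrative.

```python
# 1-year reserved discounts (percent off on-demand) quoted in the text above.
reserved_discount_pct = {"B200": 39, "MI300X": 31, "L40S": 30}


def break_even_utilization(discount_pct: float) -> float:
    """Share of the term you must actually use for reserved to match on-demand.

    Assumes reserved is billed for every hour of the term and on-demand
    only for hours used, so the costs are equal when
    utilization == 1 - discount.
    """
    return 1 - discount_pct / 100


for gpu, pct in reserved_discount_pct.items():
    share = break_even_utilization(pct)
    print(f"{gpu}: reserved is cheaper above {share:.0%} utilization")
```

With a 39% discount, a B200 reservation comes out ahead once the card is busy more than about 61% of the time. With a low-teens discount, the card would need to run almost continuously.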

Spot vs on-demand

The spot discount tracker chart shows the median spot vs. on-demand discount by category. Over the past six months, Modern saves ~50%, Last released ~49%, and Legacy ~75%. Legacy is noisier than it looks, since few providers still publish spot rates for these cards.

If your workload tolerates 5-15 minute interruptions, spot is the single biggest cost lever available. Toggle the billing dropdown in the explorer chart at the top to see the spot rate side-by-side with on-demand for any provider on your shortlist.

GPU index methodology

The index covers posted hourly cloud GPU rental prices across on-demand, spot, and 1-year reserved tiers (where providers publicly list them). It does not cover multi-year contracts, enterprise-negotiated rates, spot-plus-savings-plan combinations, or total cost of ownership.

Our data consists of monthly snapshots over 24 months (July 2024 through June 2026), filtered to 17 curated GPU models across 63 providers. Each snapshot reports, for every (provider, GPU, billing type, month) cell, the min, max, mean, and median per-GPU hourly rate, plus the offering count behind those numbers.

H100, A100, H200, B200, B300 and V100 prices are medians taken across several physical versions of the card (PCIe, SXM or NVL interconnect; for A100 and V100, also 40/80 GB or 16/32 GB VRAM) that providers list under one name.

How each chart is calculated

We use median-of-medians throughout: providers and GPUs each enter the headline number with equal weight, so a 38-listing provider does not drown out a 5-listing newcomer.

Market summary (three category lines):

Step 1: For each provider + GPU + billing tier + month, take the median price.
Step 2: Take the median across providers, leaving one value per GPU + billing tier + month.
Step 3: Take the median across GPUs in the same category, leaving one value per category + billing tier + month.

The billing dropdown re-runs Steps 2-3 against the selected tier (on-demand, spot, or reservation). A fourth “Average” option plots the arithmetic mean of the three-tier medians per category per month, restricted to months where all three tiers have data.
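The three market-summary steps can be sketched in a few lines of Python. This is an illustration rather than the index's actual pipeline: the `collapse` helper, the provider names, and all prices are hypothetical. The toy data gives one provider four listings, to show that it still carries the same weight as a provider with one.

```python
from collections import defaultdict
from statistics import median

# Hypothetical listings: ((provider, gpu, category, tier, month), usd_per_gpu_hour).
listings = [
    (("neo_a", "H100", "Modern", "on-demand", "2026-06"), 2.00),
    (("neo_a", "H100", "Modern", "on-demand", "2026-06"), 2.50),
    (("neo_b", "H100", "Modern", "on-demand", "2026-06"), 2.75),
    (("hyper_x", "H100", "Modern", "on-demand", "2026-06"), 9.00),
    (("hyper_x", "H100", "Modern", "on-demand", "2026-06"), 9.50),
    (("hyper_x", "H100", "Modern", "on-demand", "2026-06"), 10.00),
    (("hyper_x", "H100", "Modern", "on-demand", "2026-06"), 10.50),
    (("neo_a", "A100", "Modern", "on-demand", "2026-06"), 1.50),
    (("neo_b", "A100", "Modern", "on-demand", "2026-06"), 2.00),
]


def collapse(cells, keep):
    """Group cells by the key positions in `keep`; take each group's median."""
    groups = defaultdict(list)
    for key, price in cells:
        groups[tuple(key[i] for i in keep)].append(price)
    return [(key, median(prices)) for key, prices in groups.items()]


# Step 1: one median per provider + GPU + tier + month.
step1 = collapse(listings, keep=(0, 1, 2, 3, 4))
# Step 2: median across providers -> key is (gpu, category, tier, month).
step2 = collapse(step1, keep=(1, 2, 3, 4))
# Step 3: median across GPUs -> key is (category, tier, month).
step3 = collapse(step2, keep=(1, 2, 3))

# For contrast: pooling all H100 listings lets the 4-listing provider dominate.
pooled_h100 = median(p for key, p in listings if key[1] == "H100")

print(dict(step2))
print(dict(step3))
print(pooled_h100)
```

In this toy data the H100 median-of-medians is 2.75, while the pooled listing median is 9.00. That gap is the equal-weighting effect described above.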

Provider × billing explorer:

For the provider and billing tier you select, each line traces one GPU’s monthly median over time. No cross-provider aggregation is applied: each month’s point is the median price across that provider’s listings for that GPU and that billing tier. The line ends where the offering disappears from the catalog.

Modern GPUs side-by-side:

Same Steps 1-2 as the market summary, scoped to on-demand pricing. Each line is the cross-provider monthly median for one GPU. No cross-GPU aggregation. Eight series.

Spot discount tracker:

Step 1: Keep the provider + GPU + month cells where both an on-demand and a spot price exist.
Step 2: Per cell, compute: discount % = (on-demand − spot) / on-demand × 100.
Step 3: Per GPU per month, take the median discount across providers.
Step 4: Per category per month, take the median discount across GPUs.

This pairs each spot price against its same-provider, same-GPU, same-month on-demand counterpart, so the discount reflects the actual spread a buyer at that provider would see, not a cross-market noise difference.
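A minimal sketch of the paired-discount calculation follows. The cell layout, provider names, and prices are hypothetical. The point is that a cell with no spot price is dropped before any discount is computed.

```python
from collections import defaultdict
from statistics import median

# Hypothetical cells: (provider, gpu, category, month) -> {tier: usd_per_gpu_hour}.
cells = {
    ("neo_a", "H100", "Modern", "2026-06"): {"on-demand": 2.00, "spot": 1.00},
    ("neo_b", "H100", "Modern", "2026-06"): {"on-demand": 4.00, "spot": 1.00},
    ("hyper_x", "H100", "Modern", "2026-06"): {"on-demand": 10.00},  # no spot
    ("neo_a", "A100", "Modern", "2026-06"): {"on-demand": 2.00, "spot": 1.50},
}

# Step 1: keep only cells that have both tiers.
paired = {k: v for k, v in cells.items() if "on-demand" in v and "spot" in v}

# Step 2: same-provider, same-GPU, same-month discount in percent.
discounts = {
    k: (v["on-demand"] - v["spot"]) / v["on-demand"] * 100
    for k, v in paired.items()
}

# Step 3: median across providers, per GPU per month.
by_gpu = defaultdict(list)
for (provider, gpu, category, month), pct in discounts.items():
    by_gpu[(gpu, category, month)].append(pct)
gpu_discount = {k: median(v) for k, v in by_gpu.items()}

# Step 4: median across GPUs, per category per month.
by_category = defaultdict(list)
for (gpu, category, month), pct in gpu_discount.items():
    by_category[(category, month)].append(pct)
category_discount = {k: median(v) for k, v in by_category.items()}

print(gpu_discount)
print(category_discount)
```

Here the hyperscaler's $10.00 on-demand rate never enters the result, because that provider lists no spot price to pair it with.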

Availability snapshot:

Step 1: Take the current snapshot of listings, one row per provider + SKU + billing tier.
Step 2: Per GPU, compute: % available = confirmed listings / total listings × 100.
Step 3: Sort GPUs ascending by % available, so tightest supply appears leftmost.

Snapshot only, no time aggregation. Listings reported as unknown stock, waitlist, or unavailable are still counted in the denominator but not drawn separately on the chart, since the buyer-actionable signal is the confirmed-available share.
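The availability share can be sketched as below. The status labels (`available`, `waitlist`, `unknown`, `unavailable`) and all rows are hypothetical stand-ins for whatever the real snapshot records. Every row counts in the denominator, and only confirmed rows count in the numerator.

```python
from collections import Counter

# Hypothetical snapshot rows: (provider, gpu, tier, stock_status).
snapshot = [
    ("neo_a", "MI300X", "on-demand", "available"),
    ("neo_b", "MI300X", "on-demand", "waitlist"),
    ("neo_c", "MI300X", "on-demand", "unknown"),
    ("hyper_x", "MI300X", "on-demand", "unavailable"),
    ("neo_a", "H100", "on-demand", "available"),
    ("neo_b", "H100", "on-demand", "available"),
    ("neo_b", "H100", "spot", "available"),
    ("hyper_x", "H100", "on-demand", "unknown"),
    ("neo_a", "RTX 4090", "on-demand", "available"),
    ("neo_c", "RTX 4090", "spot", "available"),
]

# Denominator: every listing, whatever its stock status.
total = Counter(gpu for _, gpu, _, _ in snapshot)
# Numerator: only listings with confirmed stock.
confirmed = Counter(gpu for _, gpu, _, status in snapshot if status == "available")

pct_available = {gpu: confirmed[gpu] / total[gpu] * 100 for gpu in total}

# Ascending sort puts the tightest supply first.
ordered = sorted(pct_available.items(), key=lambda item: item[1])
print(ordered)  # [('MI300X', 25.0), ('H100', 75.0), ('RTX 4090', 100.0)]
```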

Reservation savings:

Step 1: Filter the latest weekly snapshot to on-demand and reservation listings.
Step 2: Per GPU per tier, take the median of cross-provider monthly medians.
Step 3: Pair the two tiers per GPU and render as grouped bars.
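A sketch of the reservation-savings pairing is shown below. The input layout, provider names, and prices are hypothetical, and the final discount calculation is added here for illustration. Unlike the spot tracker, the two medians may be taken over different sets of providers.

```python
from collections import defaultdict
from statistics import median

# Hypothetical per-provider medians: (provider, gpu, tier) -> usd_per_gpu_hour.
provider_medians = {
    ("neo_a", "B200", "on-demand"): 4.00,
    ("neo_b", "B200", "on-demand"): 6.00,
    ("hyper_x", "B200", "on-demand"): 16.00,
    ("neo_a", "B200", "reservation"): 3.00,
    ("neo_b", "B200", "reservation"): 4.50,
    ("neo_a", "B200", "spot"): 2.00,  # removed by the Step 1 filter
}

# Step 1: keep only the two tiers being compared.
kept = {
    k: v for k, v in provider_medians.items() if k[2] in ("on-demand", "reservation")
}

# Step 2: cross-provider median per GPU per tier.
by_gpu_tier = defaultdict(list)
for (provider, gpu, tier), price in kept.items():
    by_gpu_tier[(gpu, tier)].append(price)
tier_median = {k: median(v) for k, v in by_gpu_tier.items()}

# Step 3: pair the tiers per GPU; the discount is derived from the pair.
pairs = {}
for gpu in {gpu for gpu, _ in tier_median}:
    on_demand = tier_median[(gpu, "on-demand")]
    reserved = tier_median[(gpu, "reservation")]
    pairs[gpu] = (on_demand, reserved, (on_demand - reserved) / on_demand * 100)

print(pairs)  # {'B200': (6.0, 3.75, 37.5)}
```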

FAQs

The index is refreshed monthly; each release reflects data through the prior month.

Hyperscalers charge more than neoclouds for the same GPU because the bundle differs. Hyperscalers price in compliance (HIPAA, SOC 2, FedRAMP), enterprise SLAs, identity and networking integration, and 24/7 support. Neoclouds price bare metal or VM access with optional managed orchestration. If you do not need the bundle, the neocloud price is the right comparison.

Spot is worth using if your workload checkpoints and tolerates 5-15 minute interruptions. The Modern GPU spot discount sits near 50% over the past six months, and savings compound over multi-day training. Spot is the wrong choice for latency-sensitive inference, single-replica services without failover, or evaluation runs that need a clean wall-clock comparison.

The billing dropdown on the Price trends by provider chart switches between on-demand, spot, and 1-year reserved tiers wherever providers publish those rates. Multi-year contracts and enterprise-negotiated discounts are not included. Request a quote directly from the provider for those.

Cite this research

Pick the format that matches where you're publishing. Pasting the link version into your CMS preserves the backlink.

Ekrem Sarı (2026) - "Cloud GPU Rental Price Index". Published online at AIMultiple.com. Retrieved July 6, 2026, from: https://aimultiple.com/gpu-index [Online Resource]

Sarı, E. (2026, July 6). Cloud GPU Rental Price Index. AIMultiple. https://aimultiple.com/gpu-index

@misc{sar2026,
  author = {Sarı, Ekrem},
  title  = {{Cloud GPU Rental Price Index}},
  year   = {2026},
  month  = jul,
  howpublished    = {\url{https://aimultiple.com/gpu-index}},
  note   = {AIMultiple. Retrieved July 6, 2026}
}

Ekrem Sarı

AI Researcher

Ekrem is an AI Researcher and Data Analyst at AIMultiple. He designs and runs hands-on benchmarks for AI and LLM systems.