Cloud GPU providers fall into three tiers. Hyperscalers run broad cloud platforms with GPU rental as one product among many. Specialist neoclouds focus on GPU and AI infrastructure as their core product. Community marketplaces aggregate inventory from many small operators, often at the floor of the published price spread.
We track 64 cloud GPU providers and 14 curated GPU model families spanning more than 2,500 distinct instance configurations.
Pricing by provider tier
Pick a GPU model and a provider tier to compare on-demand price trajectories within that tier over the past 23 months.
Provider comparison table
Column definitions:
- Models: distinct GPU model families offered across all vendors (NVIDIA + AMD + Intel). H100 and H100 NVL count as one family.
- Combinations: distinct (GPU model, GPU count) instance SKUs across the public catalog.
- Billing tiers: how many of On-demand, Spot, and 1-year Reserved the provider exposes (max 3).
- Leading-edge: Yes if the provider lists any of B200, B300, MI300X, or RTX 5090.
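As a rough illustration of how these four columns could be derived from a flat list of catalog rows, here is a minimal sketch. The `Listing` record, the billing-tier labels, and the suffix-folding rule are assumptions made for the example; the only family rule taken from the definitions above is that H100 and H100 NVL count as one family.

```python
from dataclasses import dataclass

# Taken from the "Leading-edge" column definition.
LEADING_EDGE = {"B200", "B300", "MI300X", "RTX 5090"}
# Hypothetical labels for the three billing tiers.
BILLING_TIERS = {"on-demand", "spot", "reserved-1y"}
# Assumption: only the " NVL" suffix is folded into its base family.
VARIANT_SUFFIXES = (" NVL",)


@dataclass(frozen=True)
class Listing:
    gpu_model: str   # model name as listed, e.g. "H100 NVL"
    gpu_count: int   # GPUs per instance
    billing: str     # one of BILLING_TIERS


def family(model: str) -> str:
    """Map a listed model name to its family (H100 NVL -> H100)."""
    for suffix in VARIANT_SUFFIXES:
        if model.endswith(suffix):
            return model[: -len(suffix)]
    return model


def summarize(listings):
    """Compute the four table columns for one provider's listings."""
    families = {family(item.gpu_model) for item in listings}
    combos = {(item.gpu_model, item.gpu_count) for item in listings}
    tiers = {item.billing for item in listings} & BILLING_TIERS
    return {
        "models": len(families),
        "combinations": len(combos),
        "billing_tiers": len(tiers),
        "leading_edge": "Yes" if families & LEADING_EDGE else "No",
    }


# Hypothetical catalog for one provider.
example = [
    Listing("H100", 1, "on-demand"),
    Listing("H100 NVL", 1, "on-demand"),
    Listing("H100", 8, "spot"),
    Listing("B200", 8, "on-demand"),
]
summary = summarize(example)
```

Whether Combinations should key on the listed model name or on the family is not stated in the definitions; the sketch keys on the listed name.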
Ranking: sponsors are linked and highlighted at the top of the table. The remaining providers are ranked by catalog depth (the Combinations column) in descending order.
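The ranking rule can be expressed as a two-part sort key. This is a sketch with hypothetical provider names and counts; how sponsors are ordered among themselves is not stated, so ordering them by Combinations as well is an assumption.

```python
def rank_providers(rows):
    """Sponsors first, then descending by the Combinations column.

    Each row is a dict with "name", "combinations", and "sponsor" keys
    (hypothetical field names chosen for this example).
    """
    # False sorts before True, so `not sponsor` puts sponsors on top.
    return sorted(rows, key=lambda r: (not r["sponsor"], -r["combinations"]))


# Hypothetical rows, not real catalog figures.
rows = [
    {"name": "Provider A", "combinations": 40, "sponsor": False},
    {"name": "Provider B", "combinations": 4, "sponsor": True},
    {"name": "Provider C", "combinations": 90, "sponsor": False},
]
ranked_names = [r["name"] for r in rank_providers(rows)]
```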
IONOS
IONOS is a European cloud platform headquartered in Germany. The public GPU catalog covers four single-GPU SKUs:
The RTX PRO 6000 Blackwell SKU is one of the few publicly listed Blackwell-generation cards at sub-$2/GPU/hr on the EU side. IONOS does not list H100 publicly. Billing is on-demand only, with a posted monthly maximum cap on each SKU. Hosting and data centers are in the EU/EEA, which matters for buyers with GDPR data-residency requirements.
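To show how a monthly maximum cap interacts with hourly on-demand billing, here is a minimal sketch. It assumes the cap acts as a ceiling on the monthly charge; the rate and cap values are hypothetical, not IONOS prices.

```python
def monthly_charge(hourly_rate: float, hours_used: float, monthly_cap: float) -> float:
    """Hourly billing with a ceiling: pay for hours used, never more than the cap."""
    return min(hourly_rate * hours_used, monthly_cap)


def hours_until_cap(hourly_rate: float, monthly_cap: float) -> float:
    """Usage level beyond which additional hours add nothing to the charge."""
    return monthly_cap / hourly_rate


# Hypothetical SKU: $1.50/GPU/hr with an $800 monthly cap.
light_month = monthly_charge(1.50, 100, 800.0)   # below the cap
full_month = monthly_charge(1.50, 720, 800.0)    # capped
threshold = hours_until_cap(1.50, 800.0)
```

Under these assumed numbers, a buyer running the instance around the clock pays the cap rather than the full hourly total, which is why the cap matters most for always-on workloads.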
Hyperscaler providers
Hyperscalers run broad cloud platforms with GPU rental as one product among many, alongside compute, storage, networking, identity, and managed services. They typically price 3-6x above specialist neoclouds for the same GPU because rented capacity comes bundled with enterprise SLA, compliance certifications, and cross-service integration.
Amazon Web Services
AWS is the largest hyperscaler. The catalog spans 15 GPU model families and posts H100 through the p5 instance family. Two billing tiers are publicly listed (on-demand and 1-year reserved); spot pricing exists but routes through a separate request flow. EC2 G7e, built on NVIDIA RTX PRO 6000 Blackwell, was added in early 2026, initially in the us-east-1 and us-east-2 regions.1 2 3
AWS also offers its own AI accelerators (Trainium for training, Inferentia for inference), which sit outside the GPU rental scope of this comparison. SageMaker, Redshift, and the broader managed-services catalog are common reasons enterprises pick AWS despite the GPU rate premium.
Quota approval is required for most GPU instance types. In our trial, quota for all H100 and A100 instance types was approved within a day of applying.
Microsoft Azure
Azure posts H100 through the ND H100 v5 series (H100 SXM); smaller H100 PCIe configurations are available through the NC-series. The catalog spans 10 GPU model families and includes B200 (ND B200) and AMD MI300X (ND MI300X v5).4 5
All three billing tiers are publicly listed. Azure has also been building its own AI accelerator program (Maia) for in-house training workloads; those chips are not rentable through the standard GPU instance API.6
Google Cloud Platform
GCP posts the cheapest H100 in the hyperscaler tier, but the listing collapses three SKUs (a3-highgpu, a3-megagpu, a3-edgegpu) under one row in public catalog snapshots. The A3 Mega variant typically lists at ~$14.19/GPU/hr while A3 Standard sits at ~$11.06, and the visible median moves as one variant enters or leaves the public listing. The catalog spans 10 GPU model families and includes B200 through the A3 Ultra family.7
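The effect of a variant entering or leaving the listing can be seen with the two approximate prices quoted above. This is a small sketch: it assumes the visible figure is a plain median over whichever variants are listed, and it omits the third SKU because no price is given for it.

```python
from statistics import median

# Approximate per-GPU hourly list prices quoted in the text.
listed = {"A3 Standard": 11.06, "A3 Mega": 14.19}

# Median when both variants appear in the snapshot.
median_both = median(listed.values())

# Median when A3 Mega drops out of the public listing.
median_standard_only = median([listed["A3 Standard"]])

# Size of the apparent price move caused purely by the listing change.
apparent_shift = median_both - median_standard_only
```

The shift comes from catalog composition, not from any change in the price of either variant.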
All three billing tiers are publicly listed. GCP also offers TPU accelerators (v5p and v6e, the latter also known as Trillium) as a separate product line outside the GPU rental scope of this comparison.
Oracle Cloud Infrastructure
OCI uses a bare-metal-first approach: most GPU offerings run directly on the host hardware without a hypervisor layer. The catalog spans 13 GPU model families, including AMD MI300X and MI355X. Among hyperscalers, OCI’s bare-metal-by-default and RoCE v2 cluster networking are differentiators for tightly-coupled multi-node training workloads. Cohere, an early customer, runs LLM training on OCI clusters; Oracle has also invested in Cohere as a strategic backer.8
Other hyperscaler-tier general clouds
OVHcloud (France-based), Scaleway (France, with the Nabu 2023 supercomputer holding 1,016 H100 GPUs), DigitalOcean, Vultr, and Linode/Akamai round out the hyperscaler tier. These are general-purpose cloud platforms with GPU rental as one component. The European-headquartered ones (OVHcloud, Scaleway, IONOS) are positioned for EU data residency and sustainability claims; Scaleway operates entirely on renewable energy across three EU regions.9 10 11 12 13 14
Alibaba Cloud is the only major hyperscaler with Chinese mainland availability. The catalog is narrower (4 GPU model families) and US/EU enterprises with regulated workloads typically rule it out on jurisdiction grounds.15
Neocloud providers
Neoclouds focus on GPU and AI infrastructure as their core product. They typically undercut hyperscaler pricing by 50-80% for the same GPU because they skip the broad-platform overhead. The trade-off is a narrower service catalog: compute and basic storage are well covered; identity, managed databases, and cross-service integration are not.
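The price gap is quoted two ways in this comparison: hyperscalers at 3-6x neocloud prices, and neoclouds at 50-80% below hyperscaler prices. The sketch below converts between the two framings so the ranges can be compared on one scale; the function names are illustrative.

```python
def discount_from_multiple(multiple: float) -> float:
    """If price A is `multiple` times price B, B sits this fraction below A."""
    return 1 - 1 / multiple


def multiple_from_discount(discount: float) -> float:
    """If price B is `discount` (0-1) below price A, A is this multiple of B."""
    return 1 / (1 - discount)


# "3-6x above" expressed as a discount range (about 0.667 to 0.833).
discount_range = (
    round(discount_from_multiple(3), 3),
    round(discount_from_multiple(6), 3),
)

# "50-80% below" expressed as a multiple range (about 2x to 5x).
multiple_range = (
    round(multiple_from_discount(0.50), 2),
    round(multiple_from_discount(0.80), 2),
)
```

The two ranges overlap without being identical, which is expected for rounded rules of thumb drawn from different GPU models and dates.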
Lambda Labs
Lambda Labs claims to serve over 10,000 research teams. The catalog spans 8 GPU model families and is GPU-only by design. Lambda Cloud comes pre-equipped with PyTorch, TensorFlow, CUDA drivers, and a Jupyter notebook per instance, which puts it closer to “click-and-train” than other neoclouds. Lambda also sells GPU hardware directly (desktops, servers), the business in which the company has its historical roots. Pricing is on-demand only in the public listing; multi-week and multi-year commitments are quote-based.16
CoreWeave
CoreWeave is the largest specialist neocloud and was selected as NVIDIA’s first Elite cloud services provider. The company claims 45,000 GPUs across its data centers and counts NVIDIA among its investors. Two billing tiers are exposed (on-demand and spot). The catalog spans 10 GPU model families, including B200 and B300.17
CoreWeave’s ARENA program (AI-Ready Native Applications) lets customers benchmark production-scale workloads against real infrastructure before committing to capacity. The pricing sits closer to the hyperscaler tier than other neoclouds, reflecting the higher-end enterprise positioning.
RunPod
RunPod operates two tiers: Secure Cloud (dedicated bare-metal) and Community Cloud (shared bare-metal at lower rates with no SLA). The catalog spans 18 GPU model families, including AMD MI300X. Three billing tiers are publicly exposed (on-demand, spot, and reserved). Instance startup is sub-minute, the fastest in our measurements.18
Recent updates include GitHub-release rollback for Serverless endpoints, load-balancing endpoints in beta, and Vercel AI SDK integration via the @runpod/ai-sdk-provider package. The Public Endpoints catalog covers text, image, video, and audio models with pre-built deployment templates.
Crusoe
Crusoe runs data centers on stranded and flared natural gas, a cost and emissions arbitrage that funds aggressive H100 and B200 capacity buildout. The catalog spans 9 GPU model families, including AMD MI300X.19
FluidStack, Hyperstack, Nebius
FluidStack aggregates GPU capacity from multiple data center operators. Hyperstack is one of the lowest-priced H100 sources in the neocloud tier with a UK-based footprint and three billing tiers. Nebius is European-headquartered (Netherlands) with leading-edge B200 and B300 in its catalog.20 21 22
Paperspace by DigitalOcean
Paperspace was acquired by DigitalOcean and claims to serve over 650,000 users. The catalog spans 12 GPU model families. The pre-loaded Jupyter notebook interface and visual instance management are the historical differentiators; advanced users typically replace the GUI with native Jupyter or SSH workflows.23
Other specialist neoclouds
TensorDock, CUDO Compute, Hot Aisle (AMD MI300X focus), Sesterce, Lyceum, Cirrascale (reserved-only, includes Cerebras and Graphcore options), Together (inference-tier), and Replicate (serverless) round out the specialist tier. Most run on-demand-only and target small-to-mid AI development teams.24 25 26 27 28 29 30 31
Datacrunch and Seeweb (European specialists)
Datacrunch is a Finland-based neocloud running on 100% renewable energy with H100, A100, RTX 6000, and V100 in groups of 1, 2, 4, or 8. Seeweb is an Italian neocloud also running on 100% renewable energy with five GPU model families and Terraform support for infrastructure-as-code workflows.32 33
Community marketplaces
Community marketplaces aggregate GPU capacity from many small operators, often at the floor of the publicly listed price spread. The trade-off is variability: shared bare metal, less consistent uptime SLA, and inventory that depends on how many host operators are online at request time.
Vast.ai
Vast.ai aggregates 42 GPU model families and 106 distinct multi-GPU configurations, the deepest catalog among all providers we track. Three billing tiers are exposed. The marketplace bids inventory across many small host operators, which means a price quote on the dashboard reflects current availability and may not hold five minutes later. The catalog also includes legacy hardware (GTX 1080, K80 era) that no other tracked provider lists, useful for cost-driven experimentation and academic workloads.34
Salad
Salad runs on distributed consumer hardware (gaming PCs that join the network during idle time). H100 is not publicly listed; the catalog leans toward RTX 4090, RTX 5090, and other consumer-tier cards, priced at the floor for those GPU classes across all providers.35
Theta EdgeCloud
Theta EdgeCloud spans 28 GPU model families across an edge-network footprint. The edge-distributed architecture is the differentiator for region-aware inference; pricing is on-demand only, and inventory varies by edge node.36
Deployment models
GPU rental services arrive in three deployment shapes. Each shape trades off control for convenience.
Serverless GPU
Serverless GPU services manage provisioning, scaling, and tear-down on the buyer’s behalf. The provider charges per-second or per-millisecond of actual GPU use; idle time is free. The shape suits sporadic workloads, batch inference, and bursty generative-AI applications where average utilization is low.
Common serverless GPU providers include Replicate, RunPod Serverless, Modal, Fal.ai, and Together. Throughput per dollar typically beats provisioned GPU when utilization is under 30-40%; above that threshold, on-demand or reserved GPU instances are cheaper.37 38 39 40 41
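The utilization threshold follows from comparing per-second serverless billing with a flat hourly rate. The sketch below computes the break-even point under simplifying assumptions (no cold-start cost, no minimum billing increment); both prices are hypothetical and were picked only so the result lands inside the 30-40% band.

```python
SECONDS_PER_HOUR = 3600


def serverless_hourly_cost(per_second_rate: float, utilization: float) -> float:
    """Cost per wall-clock hour when only `utilization` (0-1) of it is billed."""
    return per_second_rate * SECONDS_PER_HOUR * utilization


def break_even_utilization(per_second_rate: float, provisioned_hourly: float) -> float:
    """Utilization at which serverless and provisioned cost the same."""
    return provisioned_hourly / (per_second_rate * SECONDS_PER_HOUR)


def cheaper_option(per_second_rate: float, provisioned_hourly: float, utilization: float) -> str:
    """Name the cheaper shape at a given average utilization."""
    serverless = serverless_hourly_cost(per_second_rate, utilization)
    return "serverless" if serverless < provisioned_hourly else "provisioned"


# Hypothetical prices: $0.0015 per GPU-second serverless, $2.00 per GPU-hour provisioned.
threshold = break_even_utilization(0.0015, 2.00)   # roughly 0.37
at_20_percent = cheaper_option(0.0015, 2.00, 0.20)
at_60_percent = cheaper_option(0.0015, 2.00, 0.60)
```

With real quotes, the same calculation gives a provider-specific threshold rather than the generic 30-40% band.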
Virtual GPU (vGPU)
Virtual GPUs are the most common shape. A hypervisor partitions a physical GPU into one or more virtual slices, each running inside a virtual machine. All major hyperscalers and most neoclouds default to this shape. The trade-offs: predictable cost, broad availability across providers, and slight latency overhead from the virtualization layer.
Bare-metal GPU
Bare-metal GPU services deliver a dedicated physical GPU server with no virtualization layer. The buyer gets direct hardware access for maximum performance and minimum latency. The shape fits large training runs, HPC workloads, and any case where virtualization overhead matters. OCI, CoreWeave, and Lambda Labs all offer bare-metal options. AWS and Azure expose it through specific instance families (p5d on AWS, ND-series on Azure).
FAQs
Hyperscalers run broad cloud platforms with GPU rental as one product line among many. Specialist neoclouds focus on GPU and AI infrastructure as their core product. Hyperscalers price 3-6x above neoclouds for the same GPU; the gap reflects bundled enterprise services rather than raw silicon. For sustained price-trend comparison across tiers, see the Cloud GPU Rental Price Index.
Use serverless when average GPU utilization is under 30-40%, when workloads are bursty, or when operational overhead costs more than the difference in per-hour rate. Provisioned GPU on a neocloud is cheaper at sustained high utilization.
EU-headquartered providers are worth choosing for workloads with EU data-residency requirements and for buyers serving EU customers on latency-sensitive paths. IONOS, OVHcloud, Scaleway, Nebius, Datacrunch, and Seeweb are EU-headquartered options. Prices typically match or slightly exceed US-based neoclouds; the premium is for residency, jurisdiction, and sustainability claims rather than for raw compute.
Further reading
- Multi-GPU Benchmark: B200 vs H200 vs H100 vs MI300X
- Top 30 Cloud GPU Providers & Their GPUs
- GPU Concurrency Benchmark
- Top 25+ AI Chip Makers: NVIDIA & Its Competitors
- Cloud GPU Rental Price Index