Name: NVIDIA H200 141GB GPU
Brand: NVIDIA
Availability: InStock

About the NVIDIA H200 141GB

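The NVIDIA H200 is the memory-expanded successor to the H100, released in late 2024. It has the same Hopper compute as the H100 but 141 GB of faster HBM3e memory (vs. the H100's 80 GB of HBM3). The extra memory is the killer feature: with weights quantized to FP8 (about 70 GB), Llama 3 70B inference fits on a single H200 without sharding and still leaves roughly half the card free for the KV cache, which dramatically simplifies serving infrastructure. At FP16 the weights alone take about 140 GB, so a single H200 has almost no headroom at that precision.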
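To make the "does it fit" question concrete, here is a rough back-of-envelope sketch. It counts weight memory only and ignores KV cache, activations, and runtime overhead. The round 70e9 parameter count and the helper names `weight_gb` and `headroom_gb` are illustrative assumptions, not figures or code from any vendor.

```python
# Rough weight-memory estimate for a dense model.
# Ignores KV cache, activations, and framework overhead, so real headroom is smaller.
GB = 1e9  # decimal gigabytes, the unit GPU memory is marketed in

BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_gb(n_params: float, dtype: str) -> float:
    """Memory needed for the weights alone, in GB."""
    return n_params * BYTES_PER_PARAM[dtype] / GB


def headroom_gb(n_params: float, dtype: str, gpu_gb: float) -> float:
    """Memory left after loading weights; negative means the weights do not fit."""
    return gpu_gb - weight_gb(n_params, dtype)


if __name__ == "__main__":
    n_params = 70e9  # assumed round parameter count for a 70B-class model
    for gpu, capacity in (("H100", 80.0), ("H200", 141.0)):
        for dtype in BYTES_PER_PARAM:
            print(
                f"{gpu} {dtype}: weights {weight_gb(n_params, dtype):.0f} GB, "
                f"headroom {headroom_gb(n_params, dtype, capacity):.0f} GB"
            )
```

Under these assumptions, FP16 weights leave about 1 GB free on an H200, while FP8 weights leave about 71 GB, which is why the precision matters as much as the card.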

Specs

Memory: 141 GB HBM3e
Bandwidth: 4.8 TB/s
Tensor cores: 528 (4th gen)
FP16 (peak, with sparsity): 1,979 TFLOPS
Architecture: Hopper
Released: Q4 2024

What's it good for?

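Single-GPU inference for 70B-class models: Llama 3 70B fits without tensor parallelism at FP8 or lower precision (FP16 weights alone are about 140 GB).
Long-context inference: larger KV caches fit in the 141 GB of HBM3e.
Mixture-of-Experts training: MoE models keep every expert's weights resident while routing activates only a few per token, so they benefit from the H200's extra capacity and bandwidth.
Memory-bound workloads: anything that ran out of memory (OOM) on an H100 is the H200's sweet spot.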
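The long-context point can be sized with a short sketch. It assumes the published Llama 3 70B shape (80 layers, 8 KV heads under grouped-query attention, head dimension 128) and an FP16 KV cache. The function names and the `reserve_gb` parameter are hypothetical, and real serving stacks add overhead this estimate leaves out.

```python
def kv_bytes_per_token(n_layers: int, n_kv_heads: int, head_dim: int,
                       bytes_per_elem: int = 2) -> int:
    """KV-cache bytes stored per token: one K and one V vector per layer."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem


def max_cached_tokens(gpu_gb: float, weights_gb: float, per_token_bytes: int,
                      reserve_gb: float = 0.0) -> int:
    """Tokens of KV cache that fit after weights and a reserved margin (0 if none)."""
    free_bytes = (gpu_gb - weights_gb - reserve_gb) * 1e9
    return max(0, int(free_bytes // per_token_bytes))


if __name__ == "__main__":
    # Assumed Llama 3 70B shape: 80 layers, 8 KV heads (GQA), head_dim 128.
    per_token = kv_bytes_per_token(n_layers=80, n_kv_heads=8, head_dim=128)
    print(f"KV cache per token: {per_token / 1024:.0f} KiB")
    for gpu, capacity in (("H100", 80.0), ("H200", 141.0)):
        # Assumes FP8 weights of about 70 GB and no reserved margin.
        tokens = max_cached_tokens(capacity, weights_gb=70.0, per_token_bytes=per_token)
        print(f"{gpu}: about {tokens:,} cached tokens across all sequences")
```

The token budget is shared by every sequence in the batch, so the same headroom can serve one very long context or many shorter ones.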

When to use H200 vs alternatives

H200 vs H100: The H200 has 76% more memory (141 GB vs 80 GB) and about 43% more bandwidth (4.8 TB/s vs 3.35 TB/s on the H100 SXM), with the same compute. It is worth the ~20% price premium if your workload is memory-bound.
H200 vs B200: The B200 is the next-gen Blackwell card with 2.5× compute and 192 GB. The H200 is cheaper and adequate for most workloads; the B200 is overkill unless you're training frontier models.
H200 vs 2× H100: Two H100s give 160 GB in total but need tensor parallelism over NVLink to serve a single model quickly. A single H200 is simpler and often cheaper.
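The memory and bandwidth ratios in the H100 comparison can be checked from spec-sheet numbers. The sketch below assumes the H100 SXM variant (80 GB, 3.35 TB/s). It also applies a common rule of thumb for memory-bound decoding: each generated token reads all weights once, so bandwidth divided by weight size gives an upper bound on single-stream tokens per second. The helper names are illustrative, and the bound ignores KV-cache reads, batching, and kernel efficiency.

```python
# Spec-sheet figures; the H100 numbers assume the SXM variant.
H100 = {"mem_gb": 80.0, "bw_tbs": 3.35}
H200 = {"mem_gb": 141.0, "bw_tbs": 4.8}


def pct_more(new: float, old: float) -> float:
    """Percentage by which `new` exceeds `old`."""
    return (new / old - 1.0) * 100.0


def decode_tokens_per_s_bound(bw_tbs: float, weights_gb: float) -> float:
    """Upper bound on single-stream decode speed when every token reads all weights."""
    return (bw_tbs * 1e12) / (weights_gb * 1e9)


if __name__ == "__main__":
    print(f"Memory:    +{pct_more(H200['mem_gb'], H100['mem_gb']):.0f}%")
    print(f"Bandwidth: +{pct_more(H200['bw_tbs'], H100['bw_tbs']):.0f}%")
    # Assumes about 70 GB of FP8 weights for a 70B-class model.
    for name, gpu in (("H100", H100), ("H200", H200)):
        bound = decode_tokens_per_s_bound(gpu["bw_tbs"], weights_gb=70.0)
        print(f"{name}: at most about {bound:.0f} tokens/s per stream")
```

Because the compute is the same on both cards, the bandwidth ratio is a reasonable first estimate of the speedup for memory-bound decoding. Compute-bound work such as large-batch prefill should see little change.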
