NVIDIA Rubin
NVIDIA names its major GPU architectures after scientists (e.g., Kepler, Pascal, Volta, Ampere, Hopper). Rubin follows this convention and is named after Vera Rubin, the astronomer.
- Rubin = architecture / generation
- Actual GPUs will have different product names built on the Rubin architecture
Analogy with earlier generations
To make this concrete:
- Hopper → Architecture
  - Products: H100, H200
- Blackwell → Architecture
  - Products: B100, B200
- Rubin → Architecture
  - Products: R100, R200, etc. (exact names not finalized publicly)
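To see the architecture-vs-product distinction in running code, here is a minimal sketch using PyTorch's CUDA device query (an assumption: a machine with torch installed and an NVIDIA GPU visible; the compute-capability mapping covers shipped generations, and Rubin's value is not yet public):

```python
import torch

# Compute capability (major version) -> architecture, for shipped generations.
# Rubin's compute capability has not been published, so it is left out.
CC_TO_ARCH = {7: "Volta/Turing", 8: "Ampere/Ada", 9: "Hopper", 10: "Blackwell"}

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    # props.name is the *product* (SKU), e.g. "NVIDIA H100 80GB HBM3";
    # props.major/props.minor is the compute capability, which tracks the
    # *architecture* generation (e.g. 9.0 for Hopper).
    arch = CC_TO_ARCH.get(props.major, "unknown / newer than this table")
    print(f"Product (SKU):      {props.name}")
    print(f"Compute capability: {props.major}.{props.minor}")
    print(f"Architecture:       {arch}")
else:
    print("No CUDA device visible.")
```

The driver reports a product name, while the compute capability is what identifies the architecture generation, which is exactly the distinction in the list above.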
What “Rubin” represents technically
Rubin refers to an entire compute platform, not just a chip:
- New GPU microarchitecture
- New tensor core generation
- Likely paired with next-gen CPUs, networking (NVLink), and memory
- Targeted primarily at AI training/inference and HPC
NVIDIA has also indicated that Rubin will come in multiple configurations (e.g., standard Rubin and Rubin Ultra), similar to the Blackwell and Blackwell Ultra split.
Summary
- Rubin is an architecture name
- Not a specific GPU SKU
- Individual GPUs based on Rubin will have separate model names
NVIDIA Rubin is the GPU microarchitecture scheduled for release in 2026, succeeding Blackwell. It was first announced at Computex 2024 and later detailed at CES 2026. It’s named after astrophysicist Vera Rubin.
Rubin is part of a six‑chip AI supercomputing platform that includes:
- Rubin GPUs
- Vera CPUs
- HBM4 memory
- A next‑generation interconnect fabric
Rubin is designed for the new era of AI factories—always‑on systems that continuously generate intelligence at scale, with massive context windows and multimodal reasoning workloads.
⚙️ Key Technical Highlights of Rubin
1. Massive Performance Jump
- 50 PFLOPS FP4 performance for Rubin
- 100 PFLOPS FP4 for Rubin Ultra

Compared to Blackwell’s ~20 PFLOPS FP4, this is a 2.5× to 5× increase, as the quick check below confirms.
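A back-of-envelope check of those ratios, using only the FP4 figures quoted above (the Blackwell number is approximate):

```python
# FP4 throughput figures quoted above, in PFLOPS (Blackwell is approximate).
blackwell_fp4 = 20
rubin_fp4 = 50
rubin_ultra_fp4 = 100

print(f"Rubin vs Blackwell:       {rubin_fp4 / blackwell_fp4:.1f}x")       # 2.5x
print(f"Rubin Ultra vs Blackwell: {rubin_ultra_fp4 / blackwell_fp4:.1f}x") # 5.0x
```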
2. Built on TSMC 3nm (N3P)
- More transistors
- Better power efficiency
- Higher density
3. HBM4 Memory
- Higher bandwidth
- Larger capacity
- Lower power per bit

HBM4 is a major leap over Blackwell’s HBM3e.
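Why bandwidth matters so much: autoregressive decoding is usually memory-bound, so single-stream token rate scales roughly with HBM bandwidth. A rough sketch with illustrative, assumed numbers (the bandwidth figures and model size below are placeholders, not published Rubin specs):

```python
def decode_tokens_per_s(bandwidth_tb_s: float, model_gb: float) -> float:
    """Upper bound on single-stream decode rate for a memory-bound model:
    every generated token must stream all weights from HBM once."""
    return (bandwidth_tb_s * 1e12) / (model_gb * 1e9)

# Illustrative assumptions only -- not published specs.
hbm3e_bw = 8.0   # TB/s, roughly Blackwell-class HBM3e
hbm4_bw = 13.0   # TB/s, hypothetical HBM4-class figure
model = 140.0    # GB of weights (e.g. a 70B-parameter model at FP16)

for name, bw in [("HBM3e", hbm3e_bw), ("HBM4", hbm4_bw)]:
    print(f"{name}: ~{decode_tokens_per_s(bw, model):.0f} tokens/s (bandwidth bound)")
```

Under these assumptions the ceiling moves from roughly 57 to roughly 93 tokens/s per stream, purely from the memory upgrade.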
4. Six‑Chip Architecture
The Rubin platform integrates:
- 1 Vera CPU
- 2 Rubin GPUs
- HBM4 stacks
- High‑speed interconnect
- System controller
This design dramatically improves latency, bandwidth, and scalability for AI supercomputing workloads.
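The scalability claim can be made concrete with a standard ring all-reduce estimate, the collective that dominates multi-GPU training (link speeds and gradient size below are illustrative assumptions, not Rubin specs):

```python
def ring_allreduce_s(size_gb: float, n_gpus: int, bus_gb_s: float) -> float:
    """Ideal ring all-reduce time: each GPU moves 2*(n-1)/n of the payload
    over its interconnect link (communication only, zero latency assumed)."""
    return 2 * (n_gpus - 1) / n_gpus * size_gb / bus_gb_s

# Gradient payload for a large training step, with assumed link speeds.
grads_gb = 140.0  # e.g. 70B params at FP16 (illustrative)
for name, bw in [("PCIe 5.0 x16 (~64 GB/s)", 64.0),
                 ("NVLink-class link (~900 GB/s)", 900.0)]:
    t = ring_allreduce_s(grads_gb, n_gpus=8, bus_gb_s=bw)
    print(f"{name}: ~{t:.2f} s per all-reduce")
```

The same payload takes seconds over PCIe-class links but a fraction of a second over an NVLink-class fabric, which is why interconnect bandwidth is as central to the platform as the GPUs themselves.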
5. Designed for Long‑Context AI
Rubin is optimized for:
- 100k+ token context windows (see the KV-cache sizing sketch after this list)
- Agentic reasoning
- Multimodal pipelines
- Real‑time inference at scale
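To see why 100k+ token contexts stress memory, here is a back-of-envelope KV-cache calculation (the model dimensions are illustrative assumptions, roughly a 70B-class transformer with grouped-query attention, not any specific model's spec):

```python
def kv_cache_gb(seq_len: int, layers: int, kv_heads: int,
                head_dim: int, bytes_per_elem: int = 2) -> float:
    """KV cache per sequence: 2 (K and V) * layers * kv_heads * head_dim
    * seq_len * bytes_per_elem."""
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem / 1e9

# Illustrative 70B-class config (assumed).
cfg = dict(layers=80, kv_heads=8, head_dim=128)

for ctx in (8_000, 100_000, 1_000_000):
    print(f"{ctx:>9,} tokens -> {kv_cache_gb(ctx, **cfg):6.1f} GB of KV cache")
```

Under these assumptions a single 100k-token sequence needs about 33 GB of KV cache, and a million tokens needs over 300 GB, which is why long-context serving leans so heavily on HBM capacity and bandwidth.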
🔥 How Rubin Improves on Blackwell (Side‑by‑Side)
| Feature | Blackwell (2024–25) | Rubin (2026) | Improvement |
|---|---|---|---|
| Process node | TSMC 4NP | TSMC 3nm (N3P) | Smaller, more efficient |
| FP4 performance | ~20 PFLOPS | 50 PFLOPS (100 PFLOPS Ultra) | 2.5×–5× |
| Memory | HBM3e | HBM4 | Higher bandwidth & capacity |
| Architecture | 2‑chip GPU module | 6‑chip AI platform | Higher integration & throughput |
| AI focus | Training + inference | AI factories, long‑context, multimodal | Next‑gen workloads |
| Power efficiency | High | Significantly improved | Lower cost per token |
🧠 Why Rubin Matters
Rubin isn’t just a faster GPU—it’s a full AI supercomputing architecture built for the next wave of AI:
- Autonomous agents
- Long‑context LLMs
- Multimodal reasoning
- Continuous training + inference
- Enterprise‑scale AI factories
It’s designed to deliver more intelligence per watt, per dollar, and per rack than any previous NVIDIA platform.