Fifth-Generation Tensor Cores (Blackwell)

Date: March 18, 2024
Company: NVIDIA
Category: Hardware & Infrastructure

Narrative

FP4 precision support. 20 petaflops of FP4 compute per GPU. Second-generation Transformer Engine. 2.5x the performance of Hopper Tensor Cores.

NVIDIA

Reality

FP4 performance delivered. Inference cost reductions verified. Scaling down to lower precision required software tuning. Peak sparse throughput rarely achieved in practice.
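Why low precision needs tuning can be illustrated with a minimal sketch of block-scaled 4-bit quantization. It assumes the E2M1 layout (1 sign, 2 exponent, 1 mantissa bit), whose representable magnitudes are 0, 0.5, 1, 1.5, 2, 3, 4 and 6. The function name `quantize_block_fp4` and the "map the block peak onto 6.0" scaling rule are hypothetical choices for illustration, not the Transformer Engine's actual algorithm.

```python
# Magnitudes representable by a 4-bit E2M1 float (1 sign, 2 exponent, 1 mantissa bit).
FP4_E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def quantize_block_fp4(values):
    """Round a block of floats to scaled FP4 values; return (scale, dequantized)."""
    peak = max(abs(v) for v in values)
    if peak == 0.0:
        return 1.0, [0.0 for _ in values]
    # Hypothetical scaling rule: map the block's largest magnitude onto FP4's maximum (6.0).
    scale = peak / FP4_E2M1_MAGNITUDES[-1]
    dequantized = []
    for v in values:
        target = abs(v) / scale
        # Pick the closest representable magnitude, then restore sign and scale.
        nearest = min(FP4_E2M1_MAGNITUDES, key=lambda m: abs(m - target))
        dequantized.append((nearest if v >= 0 else -nearest) * scale)
    return scale, dequantized


scale, approx = quantize_block_fp4([6.0, 3.0, -1.0, 0.2])
print(scale, approx)
```

In this block the small value 0.2 rounds to zero because the scale is set by the largest element. Choices such as block size and scaling rule are the kind of thing that has to be tuned per model.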

Implication

Ultra-low precision validated for inference. Models requiring 8 H100s ran on 2 B200s. Inference economics fundamentally changed. Precision progression continues: FP32 to FP16 to FP8 to FP4.
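The effect of the precision progression on hardware requirements can be sketched with simple arithmetic on weight storage. The 70-billion-parameter model below is a hypothetical example, and the estimate covers weights only: it ignores KV cache, activations, and any per-block scale factors, so real footprints will be larger.

```python
# Bits used to store one value at each precision in the progression.
BITS_PER_VALUE = {"FP32": 32, "FP16": 16, "FP8": 8, "FP4": 4}


def weight_memory_gb(num_params, precision):
    """Approximate weight storage in gigabytes (10^9 bytes), weights only."""
    return num_params * BITS_PER_VALUE[precision] / 8 / 1e9


# Hypothetical model size, chosen only for illustration.
HYPOTHETICAL_PARAMS = 70e9

for precision in BITS_PER_VALUE:
    gb = weight_memory_gb(HYPOTHETICAL_PARAMS, precision)
    print(f"{precision}: {gb:.0f} GB")
```

Each step down the progression halves weight storage, which is one reason a model can fit on fewer GPUs at lower precision.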

Tags

  • nvidia
  • gpu
  • chip-design
  • inference