POSITRON
ATLAS
Transformer Inference Appliance
5x Better Performance per Dollar vs DGX-H100
Highlights:
Batch 1 → 480 tokens/sec/user
Batch 8 → 160 tokens/sec/user
Available Spring 2024
*All performance estimates are subject to change. All numbers assume BF16 computation, without speculative decoding or paged attention.
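
The per-user rates in the highlights can be converted to a rough aggregate figure. The sketch below assumes each batch slot serves one user at the quoted per-user rate, so total throughput is batch size times tokens/sec/user. That reading is an assumption, not something stated above, and the function and variable names are illustrative only.

```python
# Per-user decode rates quoted in the highlights (tokens/sec/user), keyed by batch size.
per_user_rate = {1: 480, 8: 160}


def aggregate_throughput(batch_size: int, rate_per_user: float) -> float:
    """Total tokens/sec, ASSUMING one user per batch slot at the quoted rate."""
    return batch_size * rate_per_user


for batch, rate in per_user_rate.items():
    total = aggregate_throughput(batch, rate)
    print(f"batch {batch}: {total:.0f} tokens/sec total")
```

Under that assumption, batch 8 gives 1280 tokens/sec in total against 480 tokens/sec at batch 1: each user sees a third of the single-user rate while the appliance delivers roughly 2.7x the total output.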