Positron Software Release Competitiveness
| Release Date   | Software Release | Relative Perf vs H100 | Perf/Watt vs H100 | Perf/$ vs H100 | Perf/$ vs B200 (Estimated) |
|----------------|------------------|-----------------------|-------------------|----------------|----------------------------|
| September 2024 | v1.1 (Atlas)     | 1.43x                 | 3.6x              | 3.4x           | -                          |
| December 2024  | v1.2 (Atlas)     | 1.77x                 | 6.0x              | 4.3x           | 3.3x                       |
| Q1 2025        | v2.0 (Atlas)     | 2.37x                 | 7.3x              | 5.7x           | 3.5x                       |
| Q2 2025        | v2.1 (Atlas)     | 3.15x                 | 8.9x              | 7.6x           | 4.7x                       |
* Nvidia performance is based on vLLM 0.6.3, averaged across tests of Mixtral 8x7B, Llama 3.2 3B, Llama 3.1 8B, and Llama 3.1 70B.
Every Transformer Runs on Positron
Supports all Transformer models seamlessly, with zero time and zero effort
Positron maps any trained HuggingFace Transformers Library model directly onto hardware for maximum performance and ease of use
Step 1
Develop or procure a model using the HuggingFace Transformers Library (.pt or .safetensors)
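As a concrete sketch of this step (the model ID and output path below are illustrative, not requirements):

from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative model ID; any model from the HuggingFace Transformers Library works
model_id = "meta-llama/Llama-3.1-8B"

model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# save_pretrained writes .safetensors weight files by default (safe_serialization=True)
model.save_pretrained("./my_model", safe_serialization=True)
tokenizer.save_pretrained("./my_model")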
Step 2
Upload (drag & drop) or link the trained model file (.pt or .safetensors) to Positron Model Manager
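For programmatic uploads, a minimal sketch is shown below; the Model Manager endpoint and request fields are assumptions for illustration, not a documented Positron API:

import requests

# Hypothetical programmatic upload; the Model Manager URL and form fields below
# are assumptions for illustration, not a documented Positron API
with open("./my_model/model.safetensors", "rb") as f:
    response = requests.post(
        "https://modelmanager.positron.ai/upload",  # assumed endpoint
        files={"file": f},
        data={"name": "my_model"},
    )
response.raise_for_status()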
Step 3
from openai import OpenAI

# Point the standard OpenAI client at Positron's endpoint
client = OpenAI(base_url="https://api.positron.ai")
client.chat.completions.create(
    model="my_model",
    messages=[{"role": "user", "content": "Hello"}],
)
Update client applications to use Positron's OpenAI API-compliant endpoint
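Because the endpoint is OpenAI API-compliant, existing applications built on the OpenAI SDK can often be redirected without code changes; a minimal sketch, assuming the endpoint and key are supplied via the SDK's standard environment variables:

import os
from openai import OpenAI

# Assumption: an existing application built on the OpenAI SDK. The SDK reads
# OPENAI_BASE_URL and OPENAI_API_KEY from the environment, so pointing it at
# Positron requires no changes to the application code itself.
os.environ["OPENAI_BASE_URL"] = "https://api.positron.ai"

client = OpenAI()  # picks up the Positron endpoint from the environment
response = client.chat.completions.create(
    model="my_model",
    messages=[{"role": "user", "content": "Hello from an unchanged client app"}],
)
print(response.choices[0].message.content)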