About Positron

We're building the future of AI acceleration, making advanced machine learning accessible and efficient for everyone with leading performance per dollar and energy efficiency.

Designed, fabricated, and assembled in the United States.

Our Team

Positron's team has more than 400 years of combined AI, systems, silicon, and cloud experience.

Visit Careers

Our Investors

These are the people who had the guts to back us despite facing trillion-dollar incumbents.

A bit about our progress

Positron prides itself on accomplishing remarkable things quickly and efficiently. Since our founding in the spring of 2023:

Month 8:

First prototype running Llama-2 7B on an FPGA, with fewer than 10 people and $6M raised

Month 15:

Built and shipped our first-generation product, Atlas, with 15 people and less than $12M raised

Month 18:

Ranked #3 on The Information's 50 Most Promising Startups in 2024

Month 21:

Recruited a new CEO who built one of the largest GPU neoclouds from $0 to over $500M in ARR

Month 22:

Deployed our first full-scale production rack to a major cloud provider

Month 24:

Positron Atlas is now used by leading networking, gaming, content moderation, CDN, and Token-as-a-Service companies

Month 26:

Raised $50M+ Series A from world-class investors (see above), including some of the biggest backers of the largest AI infrastructure and foundation model companies

Month 32:

Demonstrated 3x lower end-to-end latency for trading inference workloads versus comparable H100 systems while consuming one-third of the power

Month 34:

Raised $230M+ Series B from world-class investors, including new strategic investors, to scale our roadmap to next-generation Asimov silicon and Titan systems

What gets us out of bed every morning (besides coffee)

We started Positron for one reason: to stop GPUs from bankrupting everyone trying to deploy AI.

Turns out, inference—the unsexy side of AI that powers billions of daily model queries—is now devouring budgets, frying electrical grids, and giving CTOs (and CFOs) migraines.

Our mission is simple and urgent: deliver inference that’s ridiculously efficient, surprisingly affordable, proudly made in America—and (finally!) makes GPUs optional.

A visionary, an applied mathematician, and an engineer walk into a bar
(our founding story)

The founders of Positron, and everyone who has joined since, can see that GPUs for inference are, well, CRAP*, and that there is a better, more efficient way to run the LLMs and AIs that we are all increasingly depending on. We see an exciting and significant opportunity before us to address this.

* our lawyers would like to point out that CRAP simply means: Chips Reaping Absurd Profits and has nothing to do with subjective quality or barnyard slang.

Everyone at Positron has their own unique intuition for this, and so over many nights, in many bars and hotels across the western U.S., through many napkin diagrams, countless disagreements (no fist fights; one pen thrown), and a few really key executable technical insights, Positron’s first product, Atlas, was created on one of those napkins. We have since lost the napkin, but you can buy Atlas today.

Atlas is the world’s first LLM-inference-first accelerator, built in just 18 months from conception, and manufactured and fully supported right here in the U.S. It was designed with efficiency in mind at a time when the entire AI sector (which, let’s face it, is most of the tech industry right now) is grappling with cost and energy constraints that are stalling growth. Unless, that is, you are lucky enough to be one of a few trillion-dollar companies and know how to reboot an old nuclear power plant.

But this is just the introduction to our story. We are building an even faster, more efficient second-generation system called Titan. The ‘napkin’ for this is Atlas itself (note: it's harder to lose deployed Atlas servers). We are taking every piece of learning and insight Atlas is giving us to deliver an inference solution that unlocks near-limitless context and the ability to hold large numbers of models and AI agents concurrently, ready to query instantly. All of this is enabled by terabytes of per-accelerator memory. And we will deliver it in a form factor that doesn’t require expensive liquid cooling or ridiculously over-provisioned boutique networking, and doesn’t force our customers to upgrade their infrastructure.

That last bit does sound like bar talk - a too-good-to-be-true elevator pitch. To ground this, we aren't claiming that we invented a fundamental new technology: “let me tell you all about in-memory-analog-wafer-sized-optical-quantum-mram-deterministic-approximate-inference”. No, that’s not us. But we are claiming that we have launched Positron on a trajectory that has earned us the right to crack open the inference market. By rapidly iterating on Atlas, we are taking a systematic approach to reassess and redefine what it takes to run LLMs and transformers. We are building inference-focused silicon and a technology platform, informed by customers, that creates allies and partners, enabling all of us to break the vertical stranglehold that a certain pair of closely related GPUs forces upon the market to overcome their fundamental inefficiencies.

As we have often explained to many bartenders, “we are just trying to save businesses from drowning in GPU invoices”. And they get it, because the mocktail they just served was from a recipe made by ChatGPT, and their cousin in Florida told them to invest in Nvidia.

Check Out Our Socials

We don't post that much because we spend most of our time focused on building the technological revolution that will reshape the trajectory of our civilization.