Data centers rarely change overnight. But when they do, you notice the hum. The racks get denser. Latency drops. Costs get rearranged. Nvidia is betting that hum will soon have a new name: Vera.
Nvidia says Vera delivers roughly 1.8 times the performance of leading x86 chips. That claim is the headline. The hardware behind it is the conversation starter. Vera is the CPU half of the Vera Rubin platform, pairing an Arm-based CPU with a Rubin GPU for workloads that need huge memory bandwidth and tight CPU-GPU coordination.
Why Vera reshapes AI server thinking
Vera is built around 88 Olympus cores with Spatial Multithreading, offering 176 threads per socket. Memory is not an afterthought: a single CPU can be paired with up to 1.5 terabytes of LPDDR5X, delivering about 1.2 terabytes per second of bandwidth. For AI inference and agentic models that chew through context and weights, memory bandwidth often sets the ceiling on throughput.
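To see why bandwidth matters so much, consider a rough, memory-bound estimate. The sketch below uses only the two per-CPU figures quoted above. The function names and the 0.14 TB example working set are hypothetical, and the model ignores compute time, caches, and access patterns, so it gives a bound rather than a prediction.

```python
# Back-of-envelope sketch: how memory bandwidth bounds throughput.
# The two constants are the per-CPU figures quoted in the article; everything
# else (function names, the example working-set size) is hypothetical.

MEMORY_CAPACITY_TB = 1.5     # maximum LPDDR5X per Vera CPU, per the article
MEMORY_BANDWIDTH_TBPS = 1.2  # approximate bandwidth per CPU, per the article


def min_sweep_seconds(working_set_tb: float,
                      bandwidth_tbps: float = MEMORY_BANDWIDTH_TBPS) -> float:
    """Lower bound on the time to read a working set from memory once."""
    if working_set_tb < 0 or bandwidth_tbps <= 0:
        raise ValueError("working set must be >= 0 and bandwidth must be > 0")
    return working_set_tb / bandwidth_tbps


def max_sweeps_per_second(working_set_tb: float,
                          bandwidth_tbps: float = MEMORY_BANDWIDTH_TBPS) -> float:
    """Upper bound on full passes over the working set per second."""
    if working_set_tb <= 0 or bandwidth_tbps <= 0:
        raise ValueError("working set and bandwidth must be > 0")
    return bandwidth_tbps / working_set_tb


if __name__ == "__main__":
    sweep = min_sweep_seconds(MEMORY_CAPACITY_TB)
    print(f"Full 1.5 TB sweep: at least {sweep:.2f} s")
    example_tb = 0.14  # hypothetical working set, not a figure from the article
    passes = max_sweeps_per_second(example_tb)
    print(f"{example_tb} TB working set: at most {passes:.1f} passes/s")
```

Under these assumptions, a single pass over a fully populated 1.5 TB of memory takes at least about 1.25 seconds. Any workload that has to touch its whole working set per step inherits that kind of ceiling.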

Think scale. Nvidia showed a Vera CPU Rack that packs 256 CPUs into a single rack. At 88 cores and 176 threads per CPU, that equals 22,528 cores and 45,056 threads. It is the kind of density cloud providers crave when trying to move large models out of expensive GPU-only islands and into more flexible, CPU-forward architectures.
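The rack totals follow directly from the per-socket figures. Here is a minimal sketch of that arithmetic, with hypothetical names, assuming every CPU in the rack exposes all 88 cores and two threads per core (176 threads divided by 88 cores):

```python
# Sanity check of the rack-level totals quoted in the article.
# Constants come from the article; names are hypothetical.

CPUS_PER_RACK = 256
CORES_PER_CPU = 88
THREADS_PER_CPU = 176
THREADS_PER_CORE = THREADS_PER_CPU // CORES_PER_CPU  # 2 threads per core


def rack_totals(cpus: int = CPUS_PER_RACK,
                cores_per_cpu: int = CORES_PER_CPU,
                threads_per_core: int = THREADS_PER_CORE) -> tuple[int, int]:
    """Return (total_cores, total_threads) for a rack of identical CPUs."""
    total_cores = cpus * cores_per_cpu
    total_threads = total_cores * threads_per_core
    return total_cores, total_threads


if __name__ == "__main__":
    cores, threads = rack_totals()
    print(f"{cores:,} cores, {threads:,} threads")
```

The result matches the 22,528 cores and 45,056 threads cited for the rack.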
Vera also plays well with Rubin GPUs. The NVL72 configuration pairs 36 Vera CPUs with 72 Rubin GPUs, and Nvidia touts a 1.8 terabyte per second NVLink-C2C interconnect between them. The point is not to replace GPUs but to rework the host-accelerator relationship so data flows faster and software sees fewer bottlenecks.
Use cases are familiar but growing: agentic AI, reinforcement learning, heavy analytics, and inference at scale. Vera can act as a standalone compute node for those jobs, or as the host that keeps Rubin GPUs fed and synchronized.

Adoption is already underway. Anthropic, OpenAI and SpaceXAI have committed to the platform for their model workloads, and hyperscalers such as ByteDance, CoreWeave and Oracle Cloud Infrastructure are on board. On the systems side, Dell, HP, Lenovo and Supermicro will offer Vera-based servers. Major manufacturers including Asus, Compal, Foxconn, Gigabyte, Pegatron, Quanta Cloud Technology, Wistron and Wiwynn will produce hardware built around the chip.
Even nontraditional customers are taking notice. The New York Stock Exchange, which processes roughly 1.1 trillion messages a day, is exploring Vera with partners Redpanda and HP to rethink latency-sensitive infrastructure. That kind of interest shows the platform is being evaluated for more than just model training. It is also being considered for real-time, high-throughput systems where every microsecond matters.
For Nvidia, Vera extends a familiar playbook: take lessons from GPU-first AI deployments and apply them to CPU design. The company previously folded its AI work into products like RTX Spark, which brought Grace CPUs and Blackwell GPUs with LPDDR5X memory into the spotlight. Now the conversation has shifted from single-node GPU performance to system-wide balance and throughput.
Will Vera unseat x86 in the data center? Not overnight. But the architecture targets specific pain points for AI workloads: memory bandwidth, thread density, and fast CPU-GPU interconnect. For engineers and architects wrestling with model costs and throughput, that is a practical beginning.
Source: gsmarena
Comments
Eren
Is that 1.8x vs x86 on real models or just paper benchmarks? Vendors love cherry picking, so I'm skeptical, but also wanna see proper tests.
coreflux
Whoa, Vera racks sound like datacenter porn - 22k cores in one chassis?? If bandwidth is really this high, latency nightmares might melt away. hyped, but curious