MatX, Led by Google TPU Veteran, Secures $500M to Revolutionize AI Chips for LLMs

Reiner Pope, Co-founder and CEO of MatX

AI Chips, MatX, LLM Hardware, Reiner Pope, Semiconductor Industry, Deep Learning

MatX aims to build chips with industry-leading throughput and latency for large language models.
The startup's architecture combines HBM and SRAM, alongside large systolic arrays and low-precision arithmetic.
A recent $500 million Series B round, led by Jane Street and Situational Awareness, will fund manufacturing and supply chain ramp-up.

Reiner Pope, co-founder and CEO of MatX and a former Google TPU architect, is designing chips optimized specifically for large language models (LLMs). His company recently announced a $500 million Series B funding round to bring these chips to market.

Pope, a veteran of Google's foundational AI period, including the development of the original TPU, highlights Google's early foresight in designing chips for neural nets rather than graphics. That understanding of AI workloads underpins MatX's strategy. The company's core mission is to overcome a trade-off in existing AI chips, which typically force a choice between high throughput (cost-efficiency) and low latency (responsiveness). MatX claims to achieve both, an advantage in an LLM market where cost per token and response time both matter.
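The throughput-versus-latency tension described above can be illustrated with a toy roofline-style model of one LLM decode step. This is a minimal sketch, not MatX's method: the function names and every hardware number below are hypothetical, and the model deliberately ignores the KV cache, interconnect, and software overheads. It assumes each step must read all weights once (shared by the whole batch) and that compute scales linearly with batch size.

```python
def decode_step_time(batch, weight_bytes, mem_bw, flops_per_token, peak_flops):
    """Seconds for one decode step under a simple roofline assumption."""
    # Weights are streamed from memory once per step, regardless of batch size.
    memory_time = weight_bytes / mem_bw
    # Arithmetic work grows linearly with the number of sequences in the batch.
    compute_time = batch * flops_per_token / peak_flops
    # The slower of the two resources bounds the step.
    return max(memory_time, compute_time)


def tradeoff(batch, hw):
    """Return (per-token latency in seconds, throughput in tokens/second)."""
    step = decode_step_time(batch, **hw)
    return step, batch / step


# Hypothetical chip and model, chosen only to make the arithmetic easy to follow.
HYPOTHETICAL_CHIP = {
    "weight_bytes": 100e9,     # 100 GB of weights (assumed)
    "mem_bw": 2e12,            # 2 TB/s memory bandwidth (assumed)
    "flops_per_token": 2e11,   # about 2 FLOPs per parameter (assumed)
    "peak_flops": 1e15,        # 1 PFLOP/s peak compute (assumed)
}

for batch in (1, 64, 250, 1000):
    latency, throughput = tradeoff(batch, HYPOTHETICAL_CHIP)
    print(f"batch={batch:5d}  latency={latency * 1e3:7.1f} ms  "
          f"throughput={throughput:8.0f} tok/s")
```

In this toy model, small batches leave compute idle while the chip waits on memory, so throughput per chip is low. Large batches raise throughput but eventually raise per-token latency too. Faster weight access shortens the memory term, which is one way to read the motivation for serving weights from faster memory.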

MatX's architecture puts both High Bandwidth Memory (HBM) and Static Random-Access Memory (SRAM) in the same chip. Weights are stored in SRAM and inference data in HBM, which provides both the large memory capacity needed for throughput and the fast access required for low latency. MatX also uses large systolic arrays for efficient matrix multiplication and is developing new approaches to low-precision arithmetic, moving toward 4-bit precision to maximize efficiency without sacrificing model quality.
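To make "4-bit precision" concrete, here is a minimal sketch of one common low-precision scheme, symmetric per-tensor integer quantization. The article does not say which number format MatX uses, so this is an illustrative assumption rather than the company's approach. The helper names and sample values are hypothetical.

```python
def quantize_int4(values):
    """Map floats to signed 4-bit integers in [-7, 7] with one shared scale."""
    max_abs = max(abs(v) for v in values)
    # The scale maps the largest magnitude onto the largest int4 level.
    scale = max_abs / 7 if max_abs > 0 else 1.0
    quantized = [max(-7, min(7, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from int4 codes and the shared scale."""
    return [q * scale for q in quantized]


def dot(a, b):
    """Plain dot product, used to compare exact and quantized results."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical weights and activations for illustration only.
weights = [0.7, -0.28, 0.12, 0.0]
activations = [1.0, 2.0, -1.0, 0.5]

codes, scale = quantize_int4(weights)
approx_weights = dequantize(codes, scale)

print("int4 codes:", codes)
print("exact dot:    ", dot(weights, activations))
# Integer codes are multiplied first; the scale is applied once at the end.
print("quantized dot:", dot(codes, activations) * scale)
```

Each weight now needs 4 bits instead of 16 or 32, so more of the model fits in fast memory and each multiply is cheaper. The cost is a rounding error of at most half a quantization step per weight, which is why preserving model quality at this precision is treated as a research problem.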

The path to market, however, has significant obstacles. Pope candidly discusses the intricate global supply chain for AI chips, identifying potential bottlenecks in logic dies (TSMC), HBM (SK Hynix, Samsung, Micron), and even basic rack infrastructure and data center power. As a startup competing with tech giants, MatX plans to secure firm buyer contracts to demonstrate demand to suppliers, a move bolstered by its substantial new funding. Pope also touches on the unique economics of the LLM market, where a few frontier labs with massive budgets are willing to invest heavily in custom software optimization for new hardware, unlike the broad compatibility required in the gaming industry.

Looking ahead, Pope foresees continued innovation in model architecture, especially as hardware capabilities evolve. He emphasizes the importance of making AI cheaper and faster, not just for scaling existing applications but for enabling entirely new ones. MatX's internal ML research team, which works on numerics and attention mechanisms from scratch, reflects the company's commitment to co-designing hardware and software. Pope's overarching philosophy, rooted in a love for optimization, points to a future in which AI's potential depends on carefully engineered hardware and a deep understanding of computational efficiency.

“It is actually possible to do both in the same chip. You take the HBM, you take the SRAM, put them together in the same chip. You put the weights in SRAM and you put the inference data in HBM. That is what we are doing in fact.”
- Reiner Pope, Co-founder and CEO of MatX
