“It is actually possible to do both in the same chip. You take the HBM, you take the SRAM, put them together in the same chip. You put the weights in SRAM and you put the inference data in HBM. That is what we are doing in fact.”
- Reiner Pope, Co-founder and CEO of MatX
Reiner Pope, a former Google TPU architect, explains how MatX is designing chips specifically for LLMs and how that design tackles the bottlenecks in both latency and throughput.
Achieving both latency and throughput
Startups take bigger risks
Solving every puzzle piece
LLM market is different
TPUs vs. Nvidia GPUs
Labs want your IP
Peak performance is crazy