OpenAI and Broadcom Unveil Jalapeño, a Custom LLM Inference Chip

OpenAI and Broadcom on June 24, 2026, unveiled Jalapeño, OpenAI’s first custom-designed silicon: an “Intelligence Processor” built from scratch for large language model inference. The chip is the first product of the companies’ 10-gigawatt accelerator partnership announced in October 2025, and OpenAI says it went from initial design to manufacturing tape-out in just nine months, a cycle the company believes is the fastest ever for high-performance semiconductors.
What Jalapeño Is
Jalapeño is a “blank-slate” accelerator designed specifically for LLM inference — the act of running a trained model to serve answers — rather than a general-purpose AI chip adapted from earlier workloads. OpenAI designed the architecture around its own understanding of how frontier models behave: the kernels, memory-movement patterns, networking, and serving systems behind ChatGPT, Codex, and its API. Broadcom (NASDAQ: AVGO) handled the silicon implementation and networking, including its Tomahawk networking silicon, while Canadian manufacturer Celestica contributed board, rack, and system integration.
The stated goal is to combine the throughput of today’s leading accelerators with latency closer to specialized inference systems — making the chip well suited for interactive products at scale. Crucially, OpenAI says Jalapeño is built to run all LLMs across the industry, not just its own. Engineering samples are already running real ML workloads in the lab at production target frequency and power, including the GPT-5.3-Codex-Spark model.
Performance and the Nine-Month Sprint
OpenAI says early testing shows Jalapeño will deliver “performance per watt substantially better than current state-of-the-art,” with a detailed technical report promised in the coming months. The company attributes the architecture’s efficiency to reducing data movement and balancing compute, memory, and networking so that realized utilization lands much closer to theoretical peak. The gap between realized and peak utilization is a recurring bottleneck for GPU-based inference.
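To make the two metrics in this paragraph concrete, here is a minimal Python sketch of how realized utilization and performance per watt are conventionally computed. The function names and every number are hypothetical placeholders chosen for illustration. They are not Jalapeño specifications or OpenAI figures, which have not been published.

```python
def utilization(realized_tflops: float, peak_tflops: float) -> float:
    """Fraction of theoretical peak compute actually achieved."""
    if peak_tflops <= 0:
        raise ValueError("peak_tflops must be positive")
    return realized_tflops / peak_tflops


def perf_per_watt(tokens_per_second: float, watts: float) -> float:
    """Inference throughput delivered per watt of power drawn."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return tokens_per_second / watts


# Hypothetical accelerator: none of these values describe a real chip.
print(utilization(realized_tflops=400.0, peak_tflops=1000.0))   # 0.4
print(perf_per_watt(tokens_per_second=5000.0, watts=1000.0))    # 5.0
```

Under this framing, raising utilization while holding power roughly constant yields more work per watt, which is the relationship the paragraph describes.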
The most striking claim is the development speed. A nine-month design-to-tape-out cycle for a high-performance ASIC is extraordinarily fast; such projects typically take years. OpenAI attributes this to deep software-hardware co-development with Broadcom and — notably — the use of its own AI models to accelerate parts of the design and optimization process. As the company frames it, the same models served to users are now helping build the infrastructure that will run future models.
“Jalapeño was designed from the ground up for LLM inference using detailed insights from our close collaboration with OpenAI researchers,” said Richard Ho, who leads OpenAI’s hardware program. “We optimized the architecture around the kernels, memory movement, networking, and serving patterns that matter most for frontier AI models.”
What This Means
Jalapeño extends OpenAI’s strategy of owning its full stack — from products to models and now to chips. By designing the hardware itself, OpenAI can co-optimize every layer toward the same goal: faster, cheaper, more reliable inference. President and co-founder Greg Brockman called the chip “part of our long-term full-stack infrastructure strategy to make compute more abundant.”
For the broader market, the move adds pressure on Nvidia’s pricing power in AI accelerators by giving a major buyer a custom alternative for inference. Jalapeño is the first step in a multi-generation platform targeting initial deployment by the end of 2026, scaling to gigawatt-class data centers with Microsoft and other partners. Broadcom CEO Hock Tan described it as “just the beginning of a multi-generation roadmap.” If AI-assisted chip design continues to compress development timelines, it could lower the cost of compute across the industry — and reshape who controls the hardware underneath frontier AI.
Related Coverage
- Nvidia Introduces Lower-Cost Blackwell AI Chip for China Amid Export Restrictions — context on the GPU market Jalapeño aims to challenge.
- OpenAI Opens GPT-5.5-Cyber to Vetted Defenders via Trusted Access — recent OpenAI product news.
Sources
- OpenAI — OpenAI and Broadcom unveil LLM-optimized inference chip (June 24, 2026)
- OpenAI — OpenAI and Broadcom announce 10-gigawatt strategic collaboration (October 13, 2025)
- StartupHub.ai — OpenAI Unveils Custom AI Chip
- Data Center Dynamics — OpenAI and Broadcom to develop and deploy 10GW of custom AI accelerators


