Cerebras’ dinner plate-sized chip challenges Nvidia GPU

| Akhil George and Sujit John | TNN | Oct 18, 2023, 09:14 IST

A computer chip is usually the size of a fingernail. Cerebras, a company founded in 2016 in the heart of Silicon Valley, has built a chip the size of a dinner plate – the largest chip ever built. It’s called the Wafer Scale Engine (WSE). “We created the largest square that you can cut out of a 300-millimetre wafer,” says the company’s founder and CEO Andrew Feldman.

Cerebras did it with the idea of building dedicated AI chips, rather than just repurposing graphics processing units (GPUs) for AI. GPUs, particularly Nvidia’s, are among the most sought-after technological products of our time due to their ability to process multiple computations simultaneously – invaluable for training deep-learning models. But, as Feldman says, GPUs were not originally meant for this narrow use case; they were built to accelerate the rendering of 3D graphics.

One advantage of a large chip for AI, says Feldman, is that it makes running large models easier: you don’t have to write tens of thousands of lines of code to spread a model with billions of parameters across many small chips, which saves months of engineering work. Big chips also simply process more data, faster.

Historically, a single wafer has been split into thousands of smaller chips, and many such chips are then linked together to create a supercomputer. But why create small chips and painstakingly combine them with expensive, slow, power-hungry switches? Why haven’t others created a single large chip?

“It was an extraordinarily hard technical problem,” says Feldman. Many indeed had tried it before. “Gene Amdahl, one of the fathers of the field (and who was an IBM scientist for many years), started a company in the mid-80s to try and do this, and he failed,” says Feldman.

Cerebras had to solve a number of complex, technical problems, one of which was the problem of yield – which refers to the percentage of good chips produced in a manufacturing process. “In every wafer there are a collection of flaws in the crystal and structure of the wafer. When the lithographic process puts transistors on top of that flaw, the transistor fails. Previous designs solved this problem by cutting small chunks of the wafer and finding those that didn’t have flaws. Ones that did have flaws were thrown away. As your chip got bigger, the probability that you hit one of these flaws increased. So, the cost of big chips was super linear to size. We invented a technique to manage that problem.”
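The relationship Feldman describes between chip size and cost can be sketched with the textbook Poisson yield model, in which the chance that a chip contains no flaw falls exponentially with its area. This is a generic illustration, not Cerebras’ technique, which the article does not describe; the defect density and the function names below are hypothetical.

```python
import math


def poisson_yield(area_cm2: float, defects_per_cm2: float) -> float:
    """Probability that a chip of the given area contains zero defects,
    assuming defects land independently and uniformly (Poisson model)."""
    return math.exp(-area_cm2 * defects_per_cm2)


def relative_cost_per_good_chip(area_cm2: float, defects_per_cm2: float) -> float:
    """Silicon area consumed per working chip: area divided by yield.
    Flawed chips are discarded, so their area is charged to the good ones."""
    return area_cm2 / poisson_yield(area_cm2, defects_per_cm2)


# Hypothetical defect density, chosen only to make the trend visible.
DEFECT_DENSITY = 0.1  # defects per square centimetre

for area in (1.0, 2.0, 4.0, 8.0):
    y = poisson_yield(area, DEFECT_DENSITY)
    cost = relative_cost_per_good_chip(area, DEFECT_DENSITY)
    print(f"area={area:4.1f} cm^2  yield={y:.3f}  cost per good chip={cost:.2f}")
```

Under these assumptions, doubling the area more than doubles the cost per good chip, which is the superlinear growth the quote refers to.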

Earlier this year, Cerebras announced a 4 exaFLOP supercomputer. And today, it has many customers, including the largest pharma and oil companies, on its cloud using multiple exaFLOPs at a time. GlaxoSmithKline, Feldman says, has published that what they used to do (modelling of epigenomics) in 24 days on 16 GPUs, they were able to do with one Cerebras system in two and a half days.

Building LLMs
One big customer is the UAE consortium G42, which has used the Cerebras supercomputer to build Jais – a large language model (LLM) in Arabic. Most large language models today are in English, but countries are eager to build local-language LLMs for strategic reasons. “We trained the largest Arabic language model in the world, with about 13 billion parameters, and we made it available worldwide under an open-source licence. This is our view of how you democratise AI,” he says.

To create an LLM for Indian languages, he says, the first thing to do is learn a little bit about Indic languages. “Like Arabic and Hebrew, Indic languages can’t be encoded cleanly with a single bit, you need what’s called a two-bit encoding. And that’s because there are vowels that change the sound of consonants. And we have spent a great deal of time in finding the right encoding structure and embedding structure for non-English language LLMs,” he says.
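The quote speaks loosely of “bits”. One concrete way to see the underlying point, offered as an interpretation and not as Cerebras’ encoding scheme, is to look at how Unicode stores an Indic syllable: a consonant and the vowel sign that modifies it are separate code points, and each takes several bytes in UTF-8, whereas an English letter takes one. The helper name `describe` is hypothetical.

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str, str, int]]:
    """For each character, return its code point, Unicode name,
    general category, and size in bytes when encoded as UTF-8."""
    return [
        (
            f"U+{ord(ch):04X}",
            unicodedata.name(ch),
            unicodedata.category(ch),
            len(ch.encode("utf-8")),
        )
        for ch in text
    ]


# Devanagari syllable "ki": the letter KA followed by the vowel sign I.
syllable = "\u0915\u093f"

for row in describe(syllable):
    print(row)

print("code points:", len(syllable))
print("UTF-8 bytes:", len(syllable.encode("utf-8")))
print("UTF-8 bytes for Latin 'k':", len("k".encode("utf-8")))
```

The single syllable is stored as two code points and six UTF-8 bytes, and the vowel is a combining mark attached to the consonant. That is why byte-level or English-centric tokenisers tend to fragment Indic text.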

Once that’s done, you have to figure out ways to augment the datasets. And once the dataset is ready, you have to figure out what algorithms would take advantage of the uniqueness and the differences between English and Indic languages, says Feldman.

Don’t use only Nvidia
Feldman warns India against using only Nvidia GPUs to build what many experts believe is critical – a sovereign cloud. “A sovereign Nvidia cloud is an oxymoron. You are then entirely dependent and not sovereign at all,” he says, adding, “When it was Intel alone, it was bad for consumers. If it’s Nvidia alone, it will be bad for consumers.” There’s a massive battle that’s underway to beat Nvidia, he says, noting that Google and Amazon have built their own chips, and companies like Cerebras are winning hundreds of millions of dollars of business. “In the US, where the government subsidises and purchases some of the largest supercomputers in the world, they had a policy of alternating between Nvidia and AMD, so that they were not dependent on a single vendor. And then they began adding others. And so we are now deployed in the super-compute infrastructure as well,” he says.
