On March 28th, Cerebras released “Cerebras-GPT” on HuggingFace, a new Open Source model trained on The Pile dataset with GPT-3-like performance. (Link to press release)
What makes Cerebras interesting?
While Cerebras-GPT isn’t as capable a model as LLaMA, ChatGPT, or GPT-4 when compared directly on tasks, it has one important quality that sets it apart: It’s been released under the Apache 2.0 license, a fully permissive Open Source license, and the weights are available for anybody to download and try out.
This is different from models like LLaMA: while its weights are freely available, its license restricts LLaMA’s usage to “Non-Commercial” use cases like academic research or personal tinkering.
That means if you’d like to check out LLaMA you’ll have to get access to a powerful GPU and run it yourself, or use a volunteer-run service like KoboldAI. You can’t just go to a website like you can with ChatGPT and expect to start feeding it prompts. (At least not without running the risk of Meta sending you a DMCA takedown request.)
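That “download and try out” part is literal: because the weights sit on HuggingFace under a permissive license, you can pull them down with the standard transformers library. Here’s a minimal sketch, assuming the cerebras/Cerebras-GPT-1.3B checkpoint name as listed on HuggingFace (a smaller sibling of the 13B model); the prompt and generation settings are just illustrative:

```python
# Minimal sketch: download a Cerebras-GPT checkpoint from HuggingFace
# and generate a short completion. Requires `pip install transformers torch`.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cerebras/Cerebras-GPT-1.3B"  # swap in the 13B model if you have the VRAM

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Generative AI is", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    temperature=0.8,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

No gated access form, no license click-through forbidding commercial use, and no risk of a takedown notice if you host it somewhere public.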
Proof-of-Concept to demonstrate Cerebras Training Hardware
The real reason this model is being released is to showcase the crazy silicon that Cerebras has spent years building.
These new chips are impressive because they use a silicon architecture that hasn’t been deployed in production for AI training before: instead of networking together a bunch of computers that each have a handful of NVIDIA GPUs, Cerebras has “networked” the compute together at the die level itself, building each accelerator as a single enormous wafer-scale chip.
By releasing Cerebras-GPT and showing that the results are comparable to existing OSS models, Cerebras is able to
“prove” that their product is competitive with what NVIDIA and AMD have on the market today. (And healthy
competition benefits all of us!)
Cerebras vs LLaMA vs ChatGPT vs GPT-J vs NeoX
To put it in simple terms: Cerebras-GPT isn’t as advanced as either LLaMA or ChatGPT (gpt-3.5-turbo). It’s a
much smaller model at 13B parameters and it’s been intentionally “undertrained” relative to the other models.
At 13B parameters, Cerebras-GPT is roughly 7% of the size of GPT-3 (175B parameters) and about 20% of the size of LLaMA’s full-size, 65B-parameter model, and Cerebras
intentionally limited how many tokens the model was trained on in order to reach a “training compute optimal” state.
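“Compute optimal” here is the Chinchilla-style rule of thumb of roughly 20 training tokens per parameter, which is the recipe Cerebras says it followed. A rough back-of-the-envelope comparison (the 20x ratio and the per-model token counts below are approximations, not official figures):

```python
# Rough sketch of the Chinchilla "compute optimal" rule of thumb:
# train a model on roughly 20 tokens per parameter, then stop.
TOKENS_PER_PARAM = 20  # approximate Chinchilla ratio

models = {
    # name: (parameters, tokens it was actually trained on -- approximate)
    "Cerebras-GPT-13B": (13e9, 260e9),
    "LLaMA-13B":        (13e9, 1e12),
    "LLaMA-65B":        (65e9, 1.4e12),
}

for name, (params, trained_tokens) in models.items():
    optimal_tokens = params * TOKENS_PER_PARAM
    print(f"{name}: compute-optimal ~{optimal_tokens / 1e9:.0f}B tokens, "
          f"actually trained on ~{trained_tokens / 1e9:.0f}B tokens")
```

The 13B Cerebras-GPT stops right around its ~260B-token optimum, while the similarly sized LLaMA-13B saw roughly 1T tokens, several times past its own. That’s what “intentionally undertrained” means here: Cerebras trained just long enough to hit the scaling-law sweet spot, rather than continuing to push for maximum downstream quality.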
That doesn’t mean that