Open-source research on large language models (LLMs) is crucial for democratizing this powerful technology.
Although open-source LLMs are now widely used and studied, they faced initial challenges and criticism: early attempts such as OPT and BLOOM performed poorly compared to closed-source models. This pushed researchers toward higher-quality base models pre-trained on far larger datasets, measured in hundreds of billions to trillions of tokens:
- OPT: 180 billion tokens
- BLOOM: 341 billion tokens
- LLaMA: 1.4 trillion tokens
- MPT: 1 trillion tokens
- Falcon: 1.5 trillion tokens
- LLaMA 2: 2 trillion tokens
However, pre-training these models is expensive, so only well-funded organizations can afford to train them and release them freely to the community.
This article focuses on high-performing open-source base models that have significantly advanced the field. A great graphic of the historical context of open-source LLMs is presented on the LangChain page:
How can we determine the best of those? Easy: with chatbot leaderboards like this one on Hugging Face:
By the way, feel free to check out my article on Claude 2, which has proven to be one of the most powerful free but closed-source LLMs:
The introduction of LLaMA 1 and 2 was a significant step in improving the quality of open-source LLMs. LLaMA is a suite of different LLMs with sizes ranging from 7 billion to 65 billion parameters. These models strike a balance between performance and inference efficiency.
LLaMA models are pre-trained on a corpus containing over 1.4 trillion tokens of text, one of the largest pre-training corpora used for an open model at the time. The release of LLaMA models sparked an explosion of open-source research and development in the LLM community.
LLaMA 2, the latest release, sets a new state of the art among open-source LLMs. These models are pre-trained on 2 trillion tokens of publicly available data, and the larger variants use grouped-query attention (GQA) to improve inference efficiency.
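The idea behind GQA is simple: instead of giving every query head its own key/value head, several query heads share one key/value head, which shrinks the KV cache and speeds up inference. Below is a toy NumPy sketch of this grouping, not the actual LLaMA 2 implementation; all array shapes and names are illustrative assumptions.

```python
import numpy as np

def grouped_query_attention(q, k, v, n_kv_heads):
    """Toy grouped-query attention (GQA) sketch.

    q: (n_q_heads, seq, d)   -- one query projection per head
    k, v: (n_kv_heads, seq, d) -- fewer key/value heads, shared across groups
    """
    n_q_heads, seq, d = q.shape
    group_size = n_q_heads // n_kv_heads   # query heads per shared KV head
    out = np.empty_like(q)
    for h in range(n_q_heads):
        kv = h // group_size               # map query head -> its KV group
        scores = q[h] @ k[kv].T / np.sqrt(d)
        # numerically stable softmax over the key dimension
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out[h] = weights @ v[kv]
    return out

rng = np.random.default_rng(0)
q = rng.normal(size=(8, 4, 16))  # 8 query heads
k = rng.normal(size=(2, 4, 16))  # only 2 KV heads -> 4 query heads per group
v = rng.normal(size=(2, 4, 16))
print(grouped_query_attention(q, k, v, n_kv_heads=2).shape)  # (8, 4, 16)
```

With `n_kv_heads` equal to the number of query heads this reduces to standard multi-head attention, and with `n_kv_heads=1` it becomes multi-query attention (MQA); GQA sits between the two.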
MPT, another commercially usable open-source LLM suite, was released by MosaicML. The MPT-7B and MPT-30B models gained popularity due to their performance and their permissive commercial licensing. While these models perform slightly worse than proprietary GPT variants, they outperform most other open-source models.
Falcon, an open-source alternative to proprietary models, was among the first open models to approach the quality of closed-source LLMs. The Falcon-7B and Falcon-40B models are commercially licensed and perform exceptionally well. They are pre-trained on a custom-curated web corpus called RefinedWeb, which contains over 5 trillion tokens of text.
You can currently try the Falcon-180B Demo here.
📈 TLDR: Open-source LLMs include OPT, BLOOM, LLaMA, MPT, and Falcon, each pre-trained on an extensive token corpus. LLaMA 2 and Falcon stand out for their innovative approaches and extensive training data.
👉 For the best open-source LLM, consider using Vicuna-33B for its superior performance among non-commercial options.
Also, make sure to check out my other article on the Finxter blog: 👇
🔗 Recommended: Six Best Private & Secure LLMs in 2023
While working as a researcher in distributed systems, Dr. Christian Mayer found his love for teaching computer science students.
To help students reach higher levels of Python success, he founded the programming education website Finxter.com that has taught exponential skills to millions of coders worldwide. He’s the author of the best-selling programming books Python One-Liners (NoStarch 2020), The Art of Clean Code (NoStarch 2022), and The Book of Dash (NoStarch 2022). Chris also coauthored the Coffee Break Python series of self-published books. He’s a computer science enthusiast, freelancer, and owner of one of the top 10 largest Python blogs worldwide.
His passions are writing, reading, and coding. But his greatest passion is to serve aspiring coders through Finxter and help them to boost their skills. You can join his free email academy here.