You can engage with LLMs in three ways:
- Hosted: Using platforms hosted by AI experts like OpenAI.
- Embedded: Integrating chatbots into tools like Google Docs or Office365.
- Self-hosted: Building your own LLM or adapting open-source ones like Alpaca or Vicuna.
If you’re using a hosted or embedded solution, you sacrifice privacy and security because your chats are sent to an external server that performs inference, i.e., runs the model to produce an output. Once your data is on that external server, its operator controls what happens to it.
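To make the difference concrete, here is a minimal sketch of a guard that only lets prompts go to an inference endpoint on your own machine. The function names and the example URLs are my own assumptions for illustration, not part of any particular LLM tool:

```python
import ipaddress
from urllib.parse import urlparse


def is_local_endpoint(url: str) -> bool:
    """Return True if the URL's host is a loopback address (this machine)."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # A domain name such as api.example.com: treat it as external.
        return False


def assert_private(url: str) -> None:
    """Raise before any prompt is sent to a host other than this machine."""
    if not is_local_endpoint(url):
        raise RuntimeError(f"Refusing to send prompts to external host: {url}")
```

A self-hosted model served on `http://127.0.0.1:8000` would pass this check, while any hosted API would be rejected. It does not cover servers elsewhere on your internal network, which you would have to allow explicitly.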
In this article, I’ll give you the six best LLMs that preserve your privacy and security by allowing you to download them and run them on your own machine. Let’s get started! 👇
Model 1: Llama 2
Llama 2 is open-source, so researchers and hobbyists can build their own applications on top of it. If you download the model and self-host it on your computer or your internal servers, you’ll get a 100% private and relatively secure LLM experience – no data shared with external parties such as Meta (Facebook)!
Llama 2 is trained on a massive dataset of text and code. Here’s a detailed benchmark. I highlighted the best Llama 2 model in red and the best model for each test in yellow. You can see that it outperforms even sophisticated open models such as MPT and Falcon:
In Meta’s own evaluations, Llama 2-Chat is even rated as competitive with ChatGPT, both by human raters and with GPT-4 acting as the judge:
Here are some initial references in case you’re interested: 👇
- Application: You can download and play with the model by completing a questionnaire here.
- Model Card: The model card is available on GitHub.
- Demo: You can try chatting with Llama 2 on Huggingface. However, this isn’t private and secure because it’s an online external model hosting service, so your prompts leave your machine.
⚡ Note: Only if you download the powerful model to your computer or your internal servers can you achieve privacy and security!
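Once the weights are on your disk, you can also tell the Hugging Face libraries not to contact the Hub at all. The following is a minimal sketch, assuming you have the `transformers` library installed and a model already downloaded into a local folder. The function names and the `model_dir` argument are hypothetical:

```python
import os


def enable_offline_mode() -> None:
    """Tell the Hugging Face libraries not to contact the Hub."""
    os.environ["HF_HUB_OFFLINE"] = "1"
    os.environ["TRANSFORMERS_OFFLINE"] = "1"


def load_local_model(model_dir: str):
    """Load tokenizer and model from a local directory only."""
    enable_offline_mode()
    # Imported lazily so the offline switches are set before transformers loads.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # local_files_only=True makes loading fail instead of downloading anything.
    tokenizer = AutoTokenizer.from_pretrained(model_dir, local_files_only=True)
    model = AutoModelForCausalLM.from_pretrained(model_dir, local_files_only=True)
    return tokenizer, model
```

If a file is missing, loading fails with an error instead of silently fetching it from the internet, which is what you want for a private setup.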
Model 2: MPT Series (MPT-7B and MPT-30B)
MPT-30B, the successor to MPT-7B, is an open-source, commercially usable large language model (LLM) developed by MosaicML.
It is private and secure! 👇
“The size of MPT-30B was also specifically chosen to make it easy to deploy on a single GPU—either 1xA100-80GB in 16-bit precision or 1xA100-40GB in 8-bit precision.” — MosaicML
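The arithmetic behind this quote is easy to check yourself. Here is a rough sketch that only counts the memory needed for the weights. It assumes exactly 30 billion parameters and ignores activations and other runtime overhead, so treat the numbers as a lower bound:

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 30e9  # assumption: "30B" taken as exactly 30 billion parameters

# (precision in bits, GPU memory in GB) for the two setups in the quote
for bits, gpu_gb in [(16, 80), (8, 40)]:
    need = weight_memory_gb(N_PARAMS, bits)
    print(f"{bits}-bit: ~{need:.0f} GB of weights, {gpu_gb} GB GPU, fits: {need < gpu_gb}")
```

At 16-bit precision the weights take about 60 GB, and at 8-bit about 30 GB, which leaves some headroom on an 80 GB and a 40 GB A100, respectively.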
With nearly 7 billion parameters, MPT-7B offers impressive performance and has been trained on a diverse dataset of 1 trillion tokens, including text and code. MPT-30B significantly improves on MPT-7B and, according to MosaicML, even outperforms the original GPT-3!
As a part of the MosaicPretrainedTransformer (MPT) family, it utilizes a modified transformer architecture, optimized for efficient training and inference, setting a new standard for open-source, commercially usable language models.
Some interesting resources:
- Non-Private Demo (MPT-30B): https://huggingface.co/spaces/mosaicml/mpt-30b-chat
- Non-Private Demo (MPT-7B): https://huggingface.co/mosaicml/mpt-7b
- MPT-7B: https://www.mosaicml.com/blog/mpt-7b
- MPT-30B: https://huggingface.co/mosaicml/mpt-30b
- GitHub: https://github.com/mosaicml/llm-foundry/
Model 3: Alpaca.cpp
Alpaca.cpp offers a unique opportunity to run a ChatGPT-like model directly on your local device, ensuring enhanced privacy and security. By leveraging the LLaMA foundation model, it integrates the open reproduction of Stanford Alpaca, which fine-tunes the base model to follow instructions, similar to the RLHF used in ChatGPT’s training.
The process to get started is straightforward. Users can download the appropriate zip file for their operating system, followed by the model weights.
Once these are placed in the same directory, the chat interface can be initiated with a simple command. The underlying weights are derived from the alpaca-lora’s published fine-tunes, which are then converted back into a PyTorch checkpoint and quantized using
🧑‍💻 Note: This project is a collaborative effort, combining the expertise and contributions from Facebook’s LLaMA, Stanford Alpaca, alpaca-lora, and llama.cpp by various developers, showcasing the power of open-source collaboration.
- GitHub: https://github.com/antimatter15/alpaca.cpp
- Original Alpaca: https://crfm.stanford.edu/2023/03/13/alpaca.html
- Stanford GitHub: https://github.com/tatsu-lab/stanford_alpaca
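The setup steps described above (executable and weights in the same directory, then one command) can be sketched as a small Python launcher. The file names `chat`, `chat.exe`, and `ggml-alpaca-7b-q4.bin` are assumptions based on my reading of the project’s README, so double-check them against the release you download:

```python
import subprocess
import sys
from pathlib import Path

# Assumed file names; verify them against the release you downloaded.
CHAT_BINARY = "chat.exe" if sys.platform == "win32" else "chat"
WEIGHTS_FILE = "ggml-alpaca-7b-q4.bin"


def chat_command(folder: Path) -> list[str]:
    """Return the command that starts the chat, or raise if a file is missing."""
    binary = folder / CHAT_BINARY
    weights = folder / WEIGHTS_FILE
    missing = [p.name for p in (binary, weights) if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"Missing in {folder}: {', '.join(missing)}")
    return [str(binary.resolve())]


def run_chat(folder: Path) -> None:
    """Start the chat from inside the folder so it finds the weights next to it."""
    subprocess.run(chat_command(folder), cwd=folder, check=True)
```

Calling `run_chat(Path("alpaca"))` would start the interactive chat if both files are in a folder named `alpaca`. The folder name is just an example.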
Model 4: Falcon-40B-Instruct (Not Falcon-180B, Yet!)
Falcon-40B-Instruct, built by TII, is a causal decoder-only model with an impressive 40 billion parameters, fine-tuned on a mixture of Baize. Because you can download it and run it yourself, it also shows the potential of local processing for privacy and security.
Running the Falcon-40B locally ensures that user data never leaves the device, thereby significantly enhancing user privacy and data security. This local processing capability, combined with its top-tier performance that surpasses other models like LLaMA and StableLM, makes it a prime choice for those who prioritize both efficiency and confidentiality.
- For those who are privacy-conscious and looking to delve into chat or instruction-based tasks, Falcon-40B-Instruct is a perfect fit.
- While it’s optimized for chat/instruction tasks, you might consider the base Falcon-40B model if you want to do further fine-tuning.
- And if you have significant computational constraints (e.g., on a Raspberry Pi) but still want to maintain data privacy, the Falcon-7B offers a compact yet secure alternative.
The integration with the transformers library makes the model easy to use for text generation. Because generation runs on your own hardware, your prompts and outputs stay on your machine, so you can use Falcon-40B-Instruct knowing your data remains private.
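Here is a minimal sketch of what this could look like with the transformers `pipeline` API. It assumes `torch`, `transformers`, and `accelerate` are installed and that your hardware has enough memory for the chosen model. The helper names and the sampling settings are my own choices, not official recommendations:

```python
MODEL_IDS = {
    "small": "tiiuae/falcon-7b-instruct",
    "medium": "tiiuae/falcon-40b-instruct",
}


def build_generator(size: str = "medium"):
    """Create a local text-generation pipeline for the chosen Falcon model."""
    model_id = MODEL_IDS[size]  # raises KeyError for unknown sizes
    # Heavy imports happen here so the module loads without them.
    import torch
    from transformers import AutoTokenizer, pipeline

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    return pipeline(
        "text-generation",
        model=model_id,
        tokenizer=tokenizer,
        torch_dtype=torch.bfloat16,  # half the memory of 32-bit floats
        device_map="auto",  # spread layers over available devices (needs accelerate)
    )


def ask(generator, prompt: str, max_new_tokens: int = 200) -> str:
    """Generate a completion on your own hardware and return the text."""
    result = generator(prompt, max_new_tokens=max_new_tokens, do_sample=True, top_k=10)
    return result[0]["generated_text"]
```

The first call to `build_generator` downloads the weights from the Hugging Face Hub. After that, generation itself happens locally. Older transformers versions may additionally need `trust_remote_code=True` for Falcon.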
So to summarize, you can choose among those three options, ordered by performance and overhead (low to high):
- Falcon-7B (Small Overhead)
- Falcon-40B (Medium Overhead) ⭐ Recommended
- Falcon-180B (High Overhead, cannot yet run locally –> No Privacy!)
You can currently try the Falcon-180B Demo here — it’s fun!
Model 5: Vicuna
What sets Vicuna apart is its ability to write code even though it is compact enough to run on a single-GPU machine (GitHub), which is less common in other open-source LLM chatbots 💻. This, along with reaching more than 90% of ChatGPT’s quality in its authors’ GPT-4-based evaluation, makes it stand out among ChatGPT alternatives.
💡 Reference: Original Website
Don’t worry about compatibility, as Vicuna is available for use on your local machine or with cloud services like Microsoft’s Azure, ensuring you can access and collaborate on your writing projects wherever you are.
With Vicuna, you can expect the AI chatbot to deliver text completion tasks such as poetry, stories, and other content similar to what you would find on ChatGPT or YouChat. Thanks to its user-friendly interface and robust feature set, you’ll likely find this open-source alternative quite valuable.
Model 6: h2oGPT
h2oGPT is an open-source generative AI framework that builds on many of the models discussed before (e.g., Llama 2) and gives you a user-friendly way to run your own LLMs while preserving data ownership. Thus, it’s privacy-friendly and more secure than most solutions on the market.
H2O.ai, like most other organizations in the space, is a for-profit organization, so let’s see how it develops during the next couple of years. For now, it’s a fun little helper tool, and it’s free and open-source!
5 Common Security and Privacy Risks with LLMs
⚡ Risk #1: Firstly, there’s the enigma of Dark Data Misuse & Discovery.
Imagine LLMs as voracious readers, consuming every piece of information they come across. This includes the mysterious dark data lurking in files, emails, and forgotten database corners. The danger? Exposing private data, intellectual property from former employees, and even the company’s deepest secrets. The shadows of dark Personally Identifiable Information (PII) can cast long-lasting financial and reputational scars. What’s more, LLMs have the uncanny ability to connect the dots between dark data and public information, opening the floodgates for potential breaches and leaks. And if that wasn’t enough, the murky waters of data poisoning and biases can arise, especially when businesses are in the dark about the data feeding their LLMs.
⚡ Risk #2: Next, we encounter the specter of Biased Outputs.
LLMs, for all their intelligence, can sometimes wear tinted glasses. Especially in areas that tread on thin ice like hiring practices, customer service, and healthcare. The culprit often lies in the training data. If the data leans heavily towards a particular race, gender, or any other category, the LLM might inadvertently tilt that way too. And if you’re sourcing your LLM from a third party, you’re essentially navigating blindfolded, unaware of any lurking biases.
⚡ Risk #3: It gets even murkier with Explainability & Observability Challenges.
Think of public LLMs as magicians with a limited set of tricks. Tracing their outputs back to the original inputs can be like trying to figure out how the rabbit got into the hat. Some LLMs even have a penchant for fiction, inventing sources and making observability a Herculean task. However, there’s a silver lining for custom LLMs. If businesses play their cards right, they can weave in observability threads during the training phase.
⚡ Risk #4: But the plot thickens with Privacy Rights & Auto-Inferences.
As LLMs sift through data, they’re like detectives connecting the dots, often inferring personal details from seemingly unrelated data points. Businesses, therefore, walk a tightrope, ensuring they have the green light to make these Sherlock-esque deductions. And with the ever-evolving landscape of privacy rights, keeping track is not just a Herculean task but a Sisyphean one.
⚡ Risk #5: Lastly, we arrive at the conundrum of Unclear Data Stewardship.
In the current scenario, asking LLMs to “unlearn” data is like asking the sea to give back its water. This makes data management a puzzle, with every piece of sensitive data adding to a business’s legal baggage. The beacon of hope? Empowering security teams to classify, automate, and filter data, ensuring that every piece of information has a clear purpose and scope.
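As a tiny illustration of the “filter” part, here is a sketch that redacts obvious email addresses and phone numbers before text is used for training or sent to a model. The two regular expressions are simplistic assumptions of mine and will miss many real-world formats, so a production setup would need a proper PII detection tool:

```python
import re

# Deliberately simple patterns; real PII detection needs far more than this.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def redact(text: str) -> str:
    """Replace every match of each pattern with a placeholder like [EMAIL]."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


print(redact("Mail jane.doe@example.com or call +1 (555) 123-4567."))
```

Because LLMs can’t easily “unlearn” data, filtering like this is best applied before the data ever reaches the model.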
While working as a researcher in distributed systems, Dr. Christian Mayer found his love for teaching computer science students.
To help students reach higher levels of Python success, he founded the programming education website Finxter.com that has taught exponential skills to millions of coders worldwide. He’s the author of the best-selling programming books Python One-Liners (NoStarch 2020), The Art of Clean Code (NoStarch 2022), and The Book of Dash (NoStarch 2022). Chris also coauthored the Coffee Break Python series of self-published books. He’s a computer science enthusiast, freelancer, and owner of one of the top 10 largest Python blogs worldwide.
His passions are writing, reading, and coding. But his greatest passion is to serve aspiring coders through Finxter and help them to boost their skills. You can join his free email academy here.