Open-Source LLMs for Local Use:
- Llama 2: A family of models from Meta with varying sizes (7B to 70B parameters), known for good performance.
- Mistral 7B: A relatively small but powerful model from Mistral AI.
- Mixtral 8x7B: A "mixture-of-experts" model from Mistral AI, considered very powerful.
- Falcon: Models from the Technology Innovation Institute in Abu Dhabi, available in different sizes.
- GPT-NeoX: A 20-billion-parameter model from EleutherAI.
- OPT: Meta's Open Pre-trained Transformer models, ranging from 125M to 175B parameters.
- BLOOM: A multilingual model developed by BigScience, with 176 billion parameters.
- BERT: Google's Bidirectional Encoder Representations from Transformers, a popular model for various NLP tasks.
Tools for Running LLMs Locally:
To actually run these models locally, you'll need a tool that supports local LLM inference. Some popular choices include:
- Ollama: A tool that bundles model weights and configurations for easy local deployment.
- LM Studio: A GUI-based tool for discovering, downloading, and running local LLMs.
- GPT4All: An open-source ecosystem for training and deploying local LLMs.
- llama.cpp: A C/C++ inference implementation for Llama-family models, known for its efficiency.
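As a concrete illustration of how these tools are driven programmatically, Ollama serves a local REST API (by default at http://localhost:11434) with a /api/generate endpoint. The sketch below builds and sends a non-streaming generation request using only the standard library; the model name "llama2" and a running `ollama serve` instance are assumptions.

```python
import json
import urllib.request

# Ollama's default local generation endpoint.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_generate_request(model: str, prompt: str) -> dict:
    """Build the JSON payload Ollama's /api/generate endpoint expects."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    """Send a prompt to a locally running Ollama server and return the response text."""
    payload = json.dumps(build_generate_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example usage (assumes `ollama serve` is running and `ollama pull llama2` was done):
#   print(generate("llama2", "Why is the sky blue? Answer in one sentence."))
```

Other tools expose similar local endpoints (LM Studio, for instance, serves an OpenAI-compatible API), so the same request-building pattern carries over.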
Important Considerations:
- Hardware Requirements: Running LLMs locally can be demanding on hardware. Consider your GPU, CPU, RAM, and storage capabilities.
- Model Size and Performance: Larger models generally offer better performance but require more resources.
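A rough rule of thumb for the memory a model's weights occupy is parameters times bytes per parameter, which depends on the precision or quantization level (FP16 is 2 bytes per parameter, 4-bit quantization is 0.5). The sketch below illustrates the arithmetic; the ~20% overhead multiplier for the KV cache and runtime buffers is an assumption, not an exact figure.

```python
def weight_memory_gb(params_billions: float, bits_per_param: float) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    # billions of parameters * bytes per parameter
    return params_billions * bits_per_param / 8

# Assumed overhead multiplier for KV cache and runtime buffers (ballpark only).
OVERHEAD = 1.2

for name, params in [("7B", 7), ("13B", 13), ("70B", 70)]:
    fp16 = weight_memory_gb(params, 16) * OVERHEAD
    q4 = weight_memory_gb(params, 4) * OVERHEAD
    print(f"{name}: ~{fp16:.1f} GB at FP16, ~{q4:.1f} GB at 4-bit")
```

By this estimate, a 7B model needs roughly 4-5 GB at 4-bit quantization but around 17 GB at FP16, which is why quantized formats are the usual choice for consumer hardware.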