Ollama allows you to run large language models like Llama 2 or Code Llama locally. You can download and install Ollama from https://ollama.ai/download.
I first tried the installation script provided by Ollama, but it took quite some time on my computer because the Ollama download server was slow. Instead, I went with the manual installation option and downloaded the binary directly:
# manual installation: download the Linux binary and make it executable
sudo curl -L https://ollama.ai/download/ollama-linux-amd64 -o /usr/bin/ollama
sudo chmod +x /usr/bin/ollama
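To confirm the install worked, you can ask the binary for its version (assuming your build supports the --version flag):

# sanity check: print the installed version
ollama --version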
You can get a list of supported models at https://ollama.ai/library. There you will see popular models like llama2, codellama, falcon, mistral, phind-*, wizard-*, etc.
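If you want to fetch a model without immediately starting a chat session, you can pull it ahead of time and then list what is available locally:

# download a model without running it
ollama pull llama2
# list models already downloaded
ollama list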
I started the Ollama server and then ran a model separately:
# start the Ollama server without an interactive shell
ollama serve
# running mistral downloads the instruct model by default and opens a command-line chat
ollama run mistral
# or run other model variants
ollama run mistral:text
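If you prefer a one-shot completion instead of an interactive chat, passing the prompt as an argument should also work (the prompt text here is just an example):

# single prompt, non-interactive; prints the completion and exits
ollama run mistral "Explain what an instruct-tuned model is in one sentence."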
You may use the following to get text completions from the REST API (the server listens on localhost port 11434 by default):
curl -X POST http://localhost:11434/api/generate -d '{
"model": "mistral",
"prompt":"How many neurons are there in the average mammalian brains?"
}'
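By default the generate endpoint streams the response as a series of JSON objects; as far as I know, setting the stream parameter to false returns a single JSON response instead:

# same request, but ask for one complete response rather than a token stream
curl -X POST http://localhost:11434/api/generate -d '{
"model": "mistral",
"prompt":"How many neurons are there in the average mammalian brain?",
"stream": false
}'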