Using Ollama on Hoffman2

Creation date: 12/19/2024 4:11 PM    Updated: 12/19/2024 4:11 PM

Ollama is a tool for running and managing Large Language Models (LLMs). On Hoffman2, you can run Ollama via an Apptainer container prepared by the system administrators. Ollama provides a command-line interface and can also serve a web-based interface (Open WebUI) for interacting with LLMs through a browser. The prepared container can run the Open WebUI interface on top of the Ollama service, all locally on Hoffman2. This is a great option for running LLMs locally on Hoffman2, without needing a commercial/enterprise API license that runs the models on a vendor's servers.


Getting Started

1. Allocate a GPU node

To use Ollama, you will need access to a GPU node. While Ollama can run on CPU nodes, GPU nodes will provide significantly better performance. The following command requests one A100 GPU:

qrsh -l h_data=10G,gpu,A100,cuda=1
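
Once the interactive session starts on the GPU node, you can optionally confirm that a GPU is visible before continuing (assuming the NVIDIA drivers are available on the node):

nvidia-smi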


2. Load the Apptainer module


module load apptainer

Apptainer is the software used to run containerized software. The Ollama (and Open WebUI) setup used here is packaged in a container that will be run with Apptainer.
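
If you want to verify that the module loaded correctly, you can check the Apptainer version (the exact version reported will depend on the Hoffman2 installation):

apptainer --version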

3. Start the Ollama container

The prepared container is located at $H2_CONTAINER_LOC/h2-ollama.sif. Starting the container as an “instance” will also start the Ollama server in the background.

apptainer instance run --nv $H2_CONTAINER_LOC/h2-ollama.sif myollama

Here, myollama is the name you assign to the instance. You can choose any name.

4. Check the running instance


apptainer instance list

This will show all running instances. In this case, the instance we started, 'myollama', should appear in the output.

5. View instance logs


apptainer instance list --logs

This will show you the locations of the stdout and stderr logs for debugging if needed. These logs report the ports on which the Ollama services are running, along with other information you may need.
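
For example, you could follow the stderr log to watch the Ollama service start and see which port it selected. The path below is only illustrative of where Apptainer typically writes instance logs; use the actual paths printed by 'apptainer instance list --logs':

tail -f ~/.apptainer/instances/logs/$(hostname)/$USER/myollama.err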

Using Ollama

Once the instance is running, you can execute Ollama commands inside the container using apptainer exec.
- Pull a model (e.g., llama3.2):

apptainer exec instance://myollama ollama pull llama3.2


- Pull a model from Hugging Face:

apptainer exec instance://myollama ollama pull hf.co/lmstudio-community/Llama-3.1-Nemotron-70B-Instruct-HF-GGUF

- List downloaded models:

apptainer exec instance://myollama ollama list


- Start a chat session with a model:

apptainer exec instance://myollama ollama run llama3.2


- Run a single prompt (e.g., summarizing a file):
Suppose you have a file test.txt. You can have Ollama summarize its contents as follows:

apptainer exec instance://myollama ollama run llama3.2 "Summarize this file: $(cat test.txt)"


In general, you can run any ollama command by adding "apptainer exec instance://myollama" in front of it:

apptainer exec instance://myollama ollama [command]
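
If you find the long prefix tedious to type, one optional convenience (just a suggestion, assuming the instance is named myollama) is to define a shell alias for your current session:

alias ollama='apptainer exec instance://myollama ollama'
ollama list    # now equivalent to: apptainer exec instance://myollama ollama list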



Stopping the Ollama Instance

When you are done, stop the Ollama instance:


apptainer instance stop myollama



Model Storage

By default, models are stored in $HOME/ollama_models. Once pulled, they remain there even after you stop the Ollama instance. To change this directory, set the OLLAMA_MODELS environment variable before starting the container instance:


export OLLAMA_MODELS=$SCRATCH/ollama_models
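
The key point is that the variable must be set in the same shell session before the instance is started. For example (the directory name is just an example, and creating it ahead of time is optional):

export OLLAMA_MODELS=$SCRATCH/ollama_models
mkdir -p "$OLLAMA_MODELS"
apptainer instance run --nv $H2_CONTAINER_LOC/h2-ollama.sif myollama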



Configuring Ports

Ollama uses port 11434 by default. If 11434 is unavailable, the Ollama container will try subsequent ports until it finds an open one. To specify a port manually, set the ollama_port environment variable before starting the instance:



export ollama_port=11434




Using the Open Web UI Interface

To use Ollama with an Open Web UI, start the instance with the openwebui option:


apptainer instance run --nv $H2_CONTAINER_LOC/h2-ollama.sif myollama openwebui


By default, the Open Web UI listens on port 8081, but it will try other ports if 8081 is not available. Check the instance logs to see the exact port used.
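
For example, you can search the instance's stderr log (at the path shown by 'apptainer instance list --logs') for the port Open WebUI bound to; the log path and grep pattern below are only suggestions, and the exact log wording may differ:

grep -i port ~/.apptainer/instances/logs/$(hostname)/$USER/myollama.err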


SSH Port Forwarding

To access Open WebUI from your local machine, your local machine needs access to the port on the compute node that is running Open WebUI. The best way to do this is with SSH tunneling:

1. In a new terminal, set up SSH port forwarding to the compute node:


ssh -L PORT:node_name:PORT username@hoffman2.idre.ucla.edu


Replace 'PORT' with the port on which Open WebUI is running, and 'node_name' with the compute node running Open WebUI. You may want to run the command 'hostname' on the Hoffman2 compute node to get the node_name.

For example:


ssh -L 8081:gX:8081 username@hoffman2.idre.ucla.edu


2. Once this SSH connection is established, open a web browser on your local machine and go to:


http://localhost:PORT


where 'PORT' is the port number on which Open WebUI is running. The page that loads will be your running Open WebUI!

The first time you connect, you will need to create a username/email and password. These credentials will be stored in your $HOME/webui/data directory on Hoffman2. Once set, no additional accounts can be created by default.


Changing the Data Directory

By default, Open Web UI stores its data at $HOME/webui/data. You can change this by setting the DATA_DIR environment variable before starting the container:


export DATA_DIR=$SCRATCH/webui/data



Changing the Open Web UI port

Similarly, you can change the default Open WebUI port by setting the webui_port environment variable before starting the container:


export webui_port=8081


If the port is unavailable, the UI will try other ports automatically.

You will want to check the apptainer instance logs to see the exact port numbers used.


Creating a custom Ollama container

The Ollama container provided on Hoffman2 was built using Apptainer by the Hoffman2 staff. You can also create your own custom Ollama container if you require additional software or customization. The definition file used to create the Hoffman2 Ollama container is available on our HPC GitHub page.

- Download this definition file.

- Customize the definition file as needed to install additional packages or modify the environment.

- Build your new container:


apptainer build my-new-ollama.sif h2-ollama.def


This will create a new container named my-new-ollama.sif. You can then use this new container with the same commands described above, simply replacing $H2_CONTAINER_LOC/h2-ollama.sif with my-new-ollama.sif.
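
For example, starting an instance from the custom container would look like this (using the container and instance names from above):

apptainer instance run --nv my-new-ollama.sif myollama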


Summary (TL;DR)

Start Ollama:


apptainer instance run --nv $H2_CONTAINER_LOC/h2-ollama.sif myollama


Pull a Model:


apptainer exec instance://myollama ollama pull llama3.2


Run a Model:

apptainer exec instance://myollama ollama run llama3.2

Stop Ollama:


apptainer instance stop myollama


Start Ollama with Open Web UI:


apptainer instance run --nv $H2_CONTAINER_LOC/h2-ollama.sif myollama openwebui


Then, port forward to access the UI:


ssh -L PORT:node_name:PORT username@hoffman2.idre.ucla.edu


Open in browser:


http://localhost:8081



Set environment variables before starting the instance to customize behavior:


export DATA_DIR=$SCRATCH/webui/data

export OLLAMA_MODELS=$HOME/ollama_models

export webui_port=8081

export ollama_port=11434



For help with using Ollama on Hoffman2, please contact Hoffman2 support.