How to run Ollama in Windows via WSL

After completing the steps in the CUDA in WSL post, you’re ready to install Ollama and run large language models directly in your WSL distro.

Installing Ollama in WSL

Ollama provides a simple installation script. Open your WSL terminal and run:

curl -fsSL https://ollama.com/install.sh | sh

This will download and install Ollama, setting up everything you need to get started.
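Once the script finishes, you can quickly confirm the binary is on your PATH by printing its version (the exact version number will differ on your machine):

# Print the installed Ollama version
ollama --version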

Running Your First Model

To verify your installation, try running a lightweight model. The 0.6B version of Qwen3 is small enough to download quickly and works well for a first test:

ollama run qwen3:0.6b

Ollama will automatically download the model and start an interactive session. You can now chat with Qwen3 directly from your terminal.
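Inside the interactive session, type /bye to exit. If you prefer a one-shot answer instead of a chat, you can also pass the prompt directly on the command line (the prompt text here is just an illustration):

# Ask a single question without entering the interactive chat
ollama run qwen3:0.6b "Summarize what WSL is in one sentence."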

Verifying GPU Usage in Ollama

To confirm that Ollama is utilizing your GPU (after a successful CUDA installation), run the following command in your WSL terminal:

ollama ps

This lists the currently loaded models; the PROCESSOR column shows whether a model is running on the GPU or the CPU, so run it while a model is loaded (for example, in a second terminal during an ollama run session). If it reports CPU only, double-check your CUDA setup and ensure your WSL distro has access to the GPU.
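When CUDA is working you should see a value like 100% GPU in that column. You can also watch GPU memory and utilization from another terminal while the model is answering a prompt; nvidia-smi is available inside WSL through the Windows NVIDIA driver:

# Refresh GPU memory and utilization stats every second
nvidia-smi -l 1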

Tips

  • Make sure your WSL distro has access to your GPU for best performance (see the CUDA setup post).
  • Explore other models available in Ollama’s model library.
  • For advanced usage, check out Ollama’s documentation for API integration and custom model management; a minimal API example is sketched below.
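As a starting point for API integration, Ollama exposes an HTTP API on localhost port 11434 by default. Here is a minimal sketch that sends one prompt to the generate endpoint and returns the full response as a single JSON object (the model name and prompt are just examples):

# Query the local Ollama API with a single, non-streaming request
curl http://localhost:11434/api/generate -d '{
  "model": "qwen3:0.6b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'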

Enjoy running LLMs locally in WSL!