Self-host Ollama with Open WebUI Online

Introduction

As AI technology continues to evolve, deploying large language models (LLMs) like Meta's Llama 3, Google's Gemma, and Mistral on local systems provides unmatched benefits in terms of data privacy and customization. Self-hosting these models and enabling secure online access unlocks even greater possibilities, whether for developers showcasing prototypes, researchers collaborating remotely, or businesses integrating AI into customer-facing applications.

This guide offers detailed instructions on securely sharing Ollama’s API and Open WebUI online using Pinggy, a straightforward tunneling service. Discover how to effortlessly make your local AI environment accessible globally without relying on cloud infrastructure or dealing with complicated configurations.

Summary of the Steps:

  1. Install Ollama & download a model:

    ollama run llama3:8b

  2. Deploy Open WebUI via Docker:

    docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway ghcr.io/open-webui/open-webui:main

  3. Expose Open WebUI online by tunneling port 3000:

    ssh -p 443 -R0:localhost:3000 a.pinggy.io

Share the generated URL for ChatGPT-like access to your LLMs.

Why Share Ollama API and Open WebUI Online?

The Rise of Local AI Deployments:

Due to growing concerns about data privacy and API expenses, running LLMs locally using tools like Ollama and Open WebUI has become a popular choice. However, keeping access limited to your local network restricts their usability. Sharing these tools online enables:

  • AI integration into web and mobile applications.

  • Project demonstrations without cloud deployment.

  • Lower latency than cloud APIs while keeping inference on local hardware.

Why Use Pinggy for Tunneling?

Pinggy simplifies the process of port forwarding by providing secure tunnels. Its standout features include:

  • Free HTTPS URLs without requiring signup.

  • No rate limitations on the free plan.

  • SSH-based encrypted connections for enhanced security.

Prerequisites for Sharing Ollama and Open WebUI

A. Install Ollama

  1. Download and install Ollama based on your operating system:

    • Windows: Run the .exe installer.

    • macOS/Linux: Execute:

     curl -fsSL https://ollama.com/install.sh | sh
  2. Verify the installation:

     ollama --version
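
    On macOS and Windows the desktop app typically starts the Ollama server for you, and the Linux install script usually registers it as a background service. As an optional sanity check (assuming the server is already running on its default port), you can query it directly:

     # The root endpoint should respond with a short "Ollama is running" message when the server is up
     curl http://localhost:11434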
    

B. Download a Model

Ollama supports a wide range of models. Start with a lightweight one:

ollama run qwen:0.5b

For multimodal models:

ollama run llava:13b
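
Before moving on, it can be useful to see which models are available locally and to fetch a model without opening an interactive chat (both are standard Ollama CLI subcommands):

# List models that have already been downloaded
ollama list

# Download a model without starting a chat session
ollama pull qwen:0.5b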

C. Install Open WebUI

Open WebUI offers a ChatGPT-like interface for Ollama. Install it via Docker:

docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

Access the interface at http://localhost:3000 and set up an admin account.
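
If the page does not load, it can help to confirm the container started correctly and to review its logs (the container name open-webui comes from the docker run command above):

# Check that the container is running and port 3000 is mapped to 8080
docker ps --filter "name=open-webui"

# Follow the container logs to spot startup errors
docker logs -f open-webui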


Sharing Ollama API Online: Detailed Steps

  1. Start Ollama Locally

    By default, Ollama runs on port 11434. Launch the server:

     ollama serve
    
  2. Create a Public URL with Pinggy

    Run this SSH command to tunnel the Ollama API:

     ssh -p 443 -R0:localhost:11434 -t qr@a.pinggy.io "u:Host:localhost:11434"
    

    After executing, you will receive a URL such as https://abc123.pinggy.link.

  3. Verify API Access

    Test the shared API using curl:

     curl https://abc123.pinggy.link/api/tags
    

Alternatively, use a browser to verify access.
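
Beyond listing models, you can send a full generation request through the tunnel. The sketch below assumes the example URL https://abc123.pinggy.link from step 2 and that the qwen:0.5b model has already been downloaded; substitute your own URL and model name:

# POST a non-streaming generation request to the shared Ollama API
curl https://abc123.pinggy.link/api/generate \
  -d '{
    "model": "qwen:0.5b",
    "prompt": "Explain reverse SSH tunneling in one sentence.",
    "stream": false
  }'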


Sharing Open WebUI Online: Step-by-Step

  1. Expose Open WebUI via Pinggy

    To share port 3000, execute:

     ssh -p 443 -R0:localhost:3000 a.pinggy.io
    

    You will receive a unique URL, such as
    https://xyz456.pinggy.link.

  2. Access WebUI Remotely

    1. Open the provided URL in a browser.

    2. Log in using your Open WebUI credentials.

    3. Utilize features such as:

      • Chatting with various models.

      • Uploading documents for Retrieval-Augmented Generation (RAG).

      • Switching between different models.
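
Before handing the link to someone else, a quick request from the command line confirms that the tunnel is answering (https://xyz456.pinggy.link is the placeholder URL from step 1):

# Fetch only the response headers; a 200 or a redirect to the login page means the WebUI is reachable
curl -I https://xyz456.pinggy.link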


Advanced Security and Optimization Tips

  1. Enhance Security

    Add basic authentication to your Pinggy tunnel by appending
    username/password credentials:

     ssh -p 443 -R0:localhost:3000 user:pass@a.pinggy.io
    
  2. Utilize Custom Domains

    Upgrade to Pinggy Pro to configure custom domains:

     ssh -p 443 -R0:localhost:3000 -T yourdomain.com@a.pinggy.io

Real-World Applications for Remote AI Access

Collaborative Development
Facilitate collaborative code reviews and documentation creation by sharing an Ollama instance, and experiment with custom model configurations together in Open WebUI.

Customer-Facing Applications
Enhance customer support with AI-driven chatbots, or automate content creation for blogs and social media.

Academic and Research Projects
Securely share proprietary models with research collaborators to advance academic pursuits.

Troubleshooting Common Issues

  • Connection Refused:
    Ensure Ollama is running with ollama serve, and check firewall settings for ports 11434 and 3000. The quick checks after this list can help narrow down the cause.

  • Model Loading Failures:
    Verify model compatibility with your Ollama version and free up system memory for larger models like llama3:70b.
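
For the connection-refused case, a couple of local checks (sketched here assuming a Linux host with the ss utility available) help establish whether the problem is Ollama itself or the tunnel:

# Confirm the Ollama API answers locally before debugging the tunnel
curl http://localhost:11434/api/tags

# Check that the Ollama and Open WebUI ports have listeners
ss -ltn | grep -E '11434|3000'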

Conclusion

Integrating Ollama, Open WebUI, and Pinggy allows you to turn your local AI environment into a secure and shareable platform without relying on cloud services. This setup is ideal for startups, researchers, and anyone focused on data privacy and performance.

This guide offers step-by-step instructions to securely expose Ollama’s API and Open WebUI online using Pinggy, enabling global accessibility for your AI setup without complex configurations or cloud dependency.