Introduction
As AI technology continues to evolve, deploying large language models (LLMs) like Meta's Llama 3, Google's Gemma, and Mistral on local systems provides unmatched benefits in terms of data privacy and customization. Self-hosting these models and enabling secure online access unlocks even greater possibilities, whether for developers showcasing prototypes, researchers collaborating remotely, or businesses integrating AI into customer-facing applications.
This guide offers detailed instructions on securely sharing Ollama’s API and Open WebUI online using Pinggy, a straightforward tunneling service. Discover how to effortlessly make your local AI environment accessible globally without relying on cloud infrastructure or dealing with complicated configurations.
Summary of the Steps:
Install Ollama & Download a Model:
- Get Ollama from ollama.com and run a model:
ollama run llama3:8b
Deploy Open WebUI
- Run via Docker
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway ghcr.io/open-webui/open-webui:main
Expose WebUI Online
- Tunnel port 3000:
ssh -p 443 -R0:localhost:3000 a.pinggy.io
Share the generated URL for ChatGPT-like access to your LLMs.
Why Share Ollama API and Open WebUI Online?
The Rise of Local AI Deployments:
Due to growing concerns about data privacy and API expenses, running LLMs locally using tools like Ollama and Open WebUI has become a popular choice. However, keeping access limited to your local network restricts their usability. Sharing these tools online enables:
AI integration into web and mobile applications.
Project demonstrations without cloud deployment.
Lower latency while keeping inference local.
Why Use Pinggy for Tunneling?
Pinggy simplifies the process of port forwarding by providing secure tunnels. Its standout features include:
Free HTTPS URLs without requiring signup.
No rate limitations on the free plan.
SSH-based encrypted connections for enhanced security.
Prerequisites for Sharing Ollama and Open WebUI
A. Install Ollama
Download and install Ollama based on your operating system:
Windows: Run the .exe installer.
macOS/Linux: Execute:
curl -fsSL https://ollama.com/install.sh | sh
Verify the installation:
ollama --version
B. Download a Model
Ollama supports a wide range of models. Start with a lightweight one:
ollama run qwen:0.5b
For multimodal models:
ollama run llava:13b
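To confirm which models have already been downloaded to your machine, you can list them:
ollama list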
C. Install Open WebUI
Open WebUI offers a ChatGPT-like interface for Ollama. Install it via Docker:
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
Access the interface at http://localhost:3000 and set up an admin account.
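If the interface does not load, a quick sanity check is to confirm the container is running and inspect its logs with standard Docker commands:
docker ps --filter name=open-webui
docker logs open-webui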
Sharing Ollama API Online: Detailed Steps
Start Ollama Locally
By default, Ollama runs on port 11434. Launch the server:
ollama serve
Create a Public URL with Pinggy
Run this SSH command to tunnel the Ollama API:
ssh -p 443 -R0:localhost:11434 -t qr@a.pinggy.io "u:Host:localhost:11434"
After executing, you will receive a URL such as https://abc123.pinggy.link.
Verify API Access
Test the shared API using curl:
curl https://abc123.pinggy.link/api/tags
Alternatively, use a browser to verify access.
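Beyond listing models, you can send a test generation request through the tunnel. The sketch below assumes the example URL above and that llama3:8b has already been pulled locally; substitute your own URL and model name:
curl https://abc123.pinggy.link/api/generate -d '{"model": "llama3:8b", "prompt": "Why is the sky blue?", "stream": false}'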
Sharing Open WebUI Online: Step-by-Step
Expose Open WebUI via Pinggy
To share port 3000, execute:
ssh -p 443 -R0:localhost:3000 a.pinggy.io
You will receive a unique URL, such as https://xyz456.pinggy.link.
Access WebUI Remotely
Open the provided URL in a browser.
Log in using your Open WebUI credentials.
Utilize features such as:
Chatting with various models.
Uploading documents for Retrieval-Augmented Generation (RAG).
Switching between different models.
Advanced Security and Optimization Tips
Enhance Security
Add basic authentication to your Pinggy tunnel by appending username/password credentials:
ssh -p 443 -R0:localhost:3000 user:pass@a.pinggy.io
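Once basic authentication is enabled, clients need to supply the same credentials. Assuming the tunnel enforces standard HTTP basic auth and using the example user:pass pair and URL above, a curl request would look like:
curl -u user:pass https://xyz456.pinggy.link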
Utilize Custom Domains
Upgrade to Pinggy Pro to configure custom domains:
ssh -p 443 -R0:localhost:3000 -T yourdomain.com@a.pinggy.io
Real-World Applications for Remote AI Access
Collaborative Development
Facilitate collaborative code reviews and documentation creation by sharing an Ollama instance. Train custom models together using Open WebUI.
Customer-Facing Applications
Enhance customer support with AI-driven chatbots, or automate content creation for blogs and social media.
Academic and Research Projects
Securely share proprietary models with research collaborators to advance academic pursuits.
Troubleshooting Common Issues
Connection Refused: Ensure Ollama is running with ollama serve, and check firewall settings for ports 11434 and 3000.
Model Loading Failures: Verify model compatibility with your Ollama version and free up system memory for larger models like llama3:70b.
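A quick way to narrow down connection issues is to confirm that Ollama responds locally before testing the tunnel:
curl http://localhost:11434/api/tags
ollama list
If these work locally but the public URL does not, the problem lies with the tunnel rather than with Ollama itself.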
Conclusion
Integrating Ollama, Open WebUI, and Pinggy allows you to turn your local AI environment into a secure and shareable platform without relying on cloud services. This setup is ideal for startups, researchers, and anyone focused on data privacy and performance.
By following the steps above, you can securely expose Ollama's API and Open WebUI online using Pinggy, making your AI setup globally accessible without complex configurations or cloud dependency.