Local
@sails-ai/local connects sails-ai to Ollama, which runs open-source LLMs directly on your machine. No API keys, no usage costs, no data leaving your computer — perfect for development and prototyping.
Why use the local adapter?
- Free — no API costs during development. Run as many requests as you want.
- Private — your prompts and data never leave your machine.
- Offline — works without an internet connection once models are pulled.
- Fast iteration — no network latency, no rate limits.
- Model variety — access to thousands of open-source models: Llama, Qwen, Mistral, Gemma, Phi, and more.
Prerequisites
Install and start Ollama:
# macOS
brew install ollama
# Or download from https://ollama.com
# Start the Ollama server
ollama servePull a model to get started:
# Small and fast — good for development
ollama pull qwen2.5:1.5b
# Llama 3.2 3B — good balance of quality and speed
ollama pull llama3.2:3b
# Llama 3.1 8B — higher quality, needs more RAM
ollama pull llama3.1:8bTIP
Ollama downloads models on first pull. After that, they're cached locally and start instantly.
Installation
npm i @sails-ai/localQuick start
// config/ai.js
module.exports.ai = {
provider: 'local',
providers: {
local: {
adapter: '@sails-ai/local',
baseUrl: process.env.OLLAMA_BASE_URL || 'http://localhost:11434'
}
}
}When Sails lifts, you'll see:
info: @sails-ai/local: Connected to Ollama (3 models available)If Ollama isn't running, you'll see a warning instead — the app still lifts, but chat requests will fail until you start Ollama:
warn: @sails-ai/local: Could not reach Ollama at http://localhost:11434.
Chat will fail until Ollama is running. Start it with: ollama serveUsage
Chat
// Simple string — uses the default model
const reply = await sails.ai.chat('What is the capital of France?')
console.log(reply.content)
// With a specific model
const reply = await sails.ai.chat({
prompt: 'Explain quantum computing in simple terms',
model: 'llama3.2:3b'
})
// Full conversation with system prompt
const reply = await sails.ai.chat({
messages: [
{ role: 'user', content: 'What is risotto?' },
{ role: 'assistant', content: 'Risotto is a creamy Italian rice dish...' },
{ role: 'user', content: 'How do I make it?' }
],
system: 'You are a helpful cooking assistant.',
model: 'llama3.1:8b'
})Streaming
const stream = sails.ai.stream({
prompt: 'Tell me a short story about a clever fox',
model: 'llama3.2:3b'
})
let fullText = ''
for await (const chunk of stream) {
if (chunk.text) {
fullText += chunk.text
process.stdout.write(chunk.text)
}
if (chunk.done) {
console.log('\nModel:', chunk.model)
}
}Streaming over WebSockets
A common pattern during development — stream AI responses to the browser in real time:
// api/controllers/chat/send.js
module.exports = {
inputs: {
content: { type: 'string', required: true }
},
fn: async function ({ content }) {
const stream = sails.ai.stream({
prompt: content,
system: 'You are a helpful assistant.',
model: 'llama3.2:3b'
})
let fullContent = ''
for await (const chunk of stream) {
if (chunk.text) {
fullContent += chunk.text
if (this.req.isSocket) {
sails.sockets.broadcast(sails.sockets.getId(this.req), 'ai:chunk', {
text: chunk.text
})
}
}
}
return { content: fullContent }
}
}Setting a default model
If you don't want to pass model on every call, set a default in your provider config:
// config/ai.js
module.exports.ai = {
provider: 'local',
providers: {
local: {
adapter: '@sails-ai/local',
baseUrl: 'http://localhost:11434',
model: 'llama3.2:3b'
}
}
}Now you can call sails.ai.chat('Hello') without specifying a model. If no default is set and no model is passed, Ollama falls back to qwen2.5:1.5b.
Using with a cloud provider in production
A common setup is using the local adapter during development and switching to a cloud provider like Together AI in production:
// config/ai.js
module.exports.ai = {
provider: process.env.AI_PROVIDER || 'local',
providers: {
local: {
adapter: '@sails-ai/local',
baseUrl: 'http://localhost:11434'
},
together: {
adapter: '@sails-ai/openai',
apiKey: process.env.TOGETHER_API_KEY,
baseURL: 'https://api.together.xyz/v1'
}
}
}# .env (production)
AI_PROVIDER=together
TOGETHER_API_KEY=your-keyYour application code doesn't change — sails.ai.chat() and sails.ai.stream() work the same regardless of provider.
Popular Ollama models
| Model | Pull command | Size | Notes |
|---|---|---|---|
| Qwen 2.5 1.5B | ollama pull qwen2.5:1.5b | ~1 GB | Fast, good for testing |
| Llama 3.2 3B | ollama pull llama3.2:3b | ~2 GB | Good balance of speed and quality |
| Llama 3.1 8B | ollama pull llama3.1:8b | ~4.7 GB | High quality, needs 8GB+ RAM |
| Mistral 7B | ollama pull mistral | ~4.1 GB | Strong reasoning |
| Gemma 2 9B | ollama pull gemma2:9b | ~5.4 GB | Google's open model |
| Qwen 2.5 32B | ollama pull qwen2.5:32b | ~19 GB | Needs 32GB+ RAM |
Browse the full catalog at ollama.com/library.
WARNING
Larger models need more RAM. A 7B model typically needs 8GB, a 13B needs 16GB, and 32B+ needs 32GB or more. If a model is too large for your machine, Ollama will run slowly or fail to load.
Custom Ollama server
If Ollama is running on a different machine or port, set the baseUrl:
// config/ai.js
providers: {
local: {
adapter: '@sails-ai/local',
baseUrl: 'http://192.168.1.100:11434' // Remote machine on your network
}
}Or use an environment variable:
OLLAMA_BASE_URL=http://192.168.1.100:11434Configuration reference
| Option | Type | Default | Description |
|---|---|---|---|
adapter | string | — | Must be '@sails-ai/local' |
baseUrl | string | 'http://localhost:11434' | Ollama server URL |
model | string | 'qwen2.5:1.5b' | Default model to use when none is specified |
Error handling
try {
const reply = await sails.ai.chat('Hello')
} catch (err) {
switch (err.code) {
case 'E_PROVIDER_UNAVAILABLE':
// Ollama is not running
// Start it with: ollama serve
break
case 'E_MODEL_NOT_FOUND':
// Model not pulled yet
// Pull it with: ollama pull <model>
break
case 'E_PROVIDER_ERROR':
// Something else went wrong with Ollama
break
}
}| Code | When | Fix |
|---|---|---|
E_PROVIDER_UNAVAILABLE | Can't connect to Ollama | Run ollama serve |
E_MODEL_NOT_FOUND | Model not installed | Run ollama pull <model> |
E_PROVIDER_ERROR | Ollama returned an error | Check Ollama logs |