AI Provider Configuration¶
Configure language model providers to power MAESTRO's research and writing capabilities.
Overview¶
MAESTRO supports multiple AI providers and allows flexible configuration:
- Advanced Mode: Configure separate providers and credentials for each model type
- Custom Provider: Connect to any OpenAI-compatible API endpoint
- Local LLMs: Connect to self-hosted models via custom endpoints
Supported Providers¶
Custom Provider (Recommended)¶
Connect to any OpenAI-compatible API endpoint. This is the most flexible option.
Configuration:
- Provider: Select "Custom Provider" from dropdown
- API Key: Enter your API key (or leave blank for local models)
- Base URL: Your endpoint URL
Common Endpoints:
- OpenRouter: `https://openrouter.ai/api/v1/`
- OpenAI: `https://api.openai.com/v1/`
- Local Ollama: `http://host.docker.internal:11434/v1/` (or check your Ollama config)
- Local vLLM: `http://host.docker.internal:8000/v1/` (or check your vLLM config)
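Any of these endpoints can be sanity-checked by requesting its model list (`GET {base_url}/models`). A minimal sketch of building that request with Python's standard library; the base URL and key are placeholders, and the helper name is illustrative:

```python
# Sketch: build a "list models" request for any OpenAI-compatible endpoint.
import urllib.request

def models_request(base_url: str, api_key: str = "") -> urllib.request.Request:
    """Build a GET {base_url}/models request; a blank key suits local servers."""
    url = base_url.rstrip("/") + "/models"
    headers = {}
    if api_key:  # local servers often accept any key, or none at all
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(url, headers=headers)

req = models_request("https://openrouter.ai/api/v1/", "sk-or-...")
print(req.full_url)  # https://openrouter.ai/api/v1/models
# urllib.request.urlopen(req) returns JSON with a "data" list of model ids --
# the same model list the "Test" button fetches to fill the dropdowns.
```

If this request fails from inside the MAESTRO container but works from the host, the base URL (e.g. `host.docker.internal` vs. `localhost`) is the usual culprit.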
OpenRouter¶
Access to 100+ models through a unified API.
Setup:
- Select "OpenRouter" as the AI Provider
- API Key: Get from OpenRouter Dashboard
- Base URL: `https://openrouter.ai/api/v1/`
- Click "Test" to verify and load models
Available Models:
- Claude models (Anthropic)
- GPT models (OpenAI)
- Llama models (Meta)
- Mistral models
- Many more open and commercial models
Pricing: Pay-per-token, varies by model. Check OpenRouter Pricing
OpenAI¶
Direct access to OpenAI's GPT models.
Setup:
- Select "OpenAI" as the AI Provider
- API Key: Get from OpenAI Platform
- Base URL: `https://api.openai.com/v1/`
- Click "Test" to verify and load models
Available Models:
- gpt-5-chat
- gpt-5-mini
- gpt-5-nano
- gpt-4o
- gpt-4o-mini
Pricing: Check OpenAI Pricing
Configuration Mode¶
Advanced Configuration¶
MAESTRO uses Advanced Mode to configure separate providers and credentials for each model type:
- Check "Advanced Configuration" checkbox
- For each model type (Fast, Mid, Intelligent, Verifier):
- Select "Custom Provider" from dropdown
- Enter API key (if required)
- Enter base URL for your provider
- Click "Test" to load available models
- Select model from dropdown
- Click "Save & Close" to apply settings
Example Setup:
- Fast Model:
  - Provider: Custom Provider
  - Base URL: `https://openrouter.ai/api/v1/`
  - Model: `meta-llama/llama-3.2-3b-instruct`
- Mid Model:
  - Provider: Custom Provider
  - Base URL: `https://openrouter.ai/api/v1/`
  - Model: `anthropic/claude-3.5-haiku`
- Intelligent Model:
  - Provider: Custom Provider
  - Base URL: `https://api.openai.com/v1/`
  - Model: `gpt-5-chat-latest`
- Verifier Model:
  - Provider: Custom Provider
  - Base URL: `http://host.docker.internal:11434/v1/`
  - Model: `llama3.2` (local)
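For reference, the tiered example above collapses to a simple mapping. The key and field names here are illustrative, not MAESTRO's internal schema:

```python
# The example tiered setup above, written as plain data.
# Key and field names are illustrative, not MAESTRO's internal schema.
MODEL_CONFIG = {
    "fast": {
        "base_url": "https://openrouter.ai/api/v1/",
        "model": "meta-llama/llama-3.2-3b-instruct",
    },
    "mid": {
        "base_url": "https://openrouter.ai/api/v1/",
        "model": "anthropic/claude-3.5-haiku",
    },
    "intelligent": {
        "base_url": "https://api.openai.com/v1/",
        "model": "gpt-5-chat-latest",
    },
    "verifier": {
        "base_url": "http://host.docker.internal:11434/v1/",  # local Ollama
        "model": "llama3.2",
    },
}
```

The point of Advanced Mode is visible here: each tier carries its own base URL and credentials, so one setup can mix cloud APIs and local servers freely.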
Model Types and Agent Usage¶
MAESTRO uses four model categories, automatically assigned to different agents and tasks:
Fast Model¶
Agents using Fast Model:
- Planning Agent - Creates research plans and outlines
- Note Assignment Agent - Distributes information to sections
- Query Strategy Agent - Determines search strategies
- Router Agent - Routes tasks to appropriate agents
Use Cases: Quick decisions, simple formatting, routing logic
Recommended Models: Smaller, faster models (GPT-4o-mini, Claude Haiku, GPT-5-nano)
Mid Model (Default)¶
Agents using Mid Model:
- Research Agent - Main research and information gathering
- Writing Agent - Document composition
- Simplified Writing Agent - Streamlined writing tasks
- Messenger Agent - User interaction and messaging
- Default fallback - Any undefined agent modes
Use Cases: General research, standard writing, analysis, user interaction
Recommended Models: Balanced models (GPT-5-mini, Claude-4-Sonnet, Qwen-2.5-72b-instruct)
Intelligent Model¶
Agents using Intelligent Model:
- Reflection Agent - Critical analysis and feedback
- Query Preparation Agent - Complex query reformulation
- Research Agent (critical tasks) - When explicitly needed for complex analysis
Use Cases: Deep analysis, complex reasoning, quality assessment
Recommended Models: Most capable models (GPT-5-chat, Claude-4.1-Opus, Qwen3-235b-a22b-2507)
Verifier Model¶
Agents using Verifier Model:
- Verification tasks - Fact-checking and validation
- Quality control - Ensuring accuracy of information
Use Cases: Fact verification, consistency checking
Recommended Models: Accurate, reliable models (typically same as Intelligent)
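The agent-to-tier assignment described above, including the Mid-model fallback for undefined agent modes, can be sketched as a lookup with a default (agent names are shorthand for this illustration, not MAESTRO identifiers):

```python
# Sketch of the tier assignment described above: each agent resolves to a
# model tier, and any undefined agent falls back to "mid".
AGENT_TIERS = {
    "planning": "fast",
    "note_assignment": "fast",
    "query_strategy": "fast",
    "router": "fast",
    "research": "mid",
    "writing": "mid",
    "messenger": "mid",
    "reflection": "intelligent",
    "query_preparation": "intelligent",
    "verification": "verifier",
}

def tier_for(agent: str) -> str:
    """Resolve an agent to its model tier; unknown agents use the Mid model."""
    return AGENT_TIERS.get(agent, "mid")
```

This is why the Mid model is the one worth choosing most carefully: it handles both the bulk of the work and anything not explicitly assigned elsewhere.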
Local LLM Setup¶
Using Ollama¶
- Install and run Ollama
- Configure in MAESTRO:
  - Provider: Custom Provider
  - API Key: (use a dummy value unless you configured authentication)
  - Base URL: `http://host.docker.internal:11434/v1/`
  - Click "Test" to load available models
  - Model: Select from dropdown (e.g., "llama3.2", "mistral")
Using LM Studio¶
- Start LM Studio server on port 1234
- Configure in MAESTRO:
- Provider: Custom Provider
- API Key: (use a dummy value unless you configured authentication)
- Base URL: `http://host.docker.internal:1234/v1/`
- Click "Test" to verify connection
Using vLLM¶
- Start vLLM server:

```bash
python -m vllm.entrypoints.openai.api_server \
  --model "/home/user/models/Qwen_Qwen3-32B-AWQ" \
  --tensor-parallel-size 2 \
  --port 5000 \
  --host 0.0.0.0 \
  --gpu-memory-utilization 0.90 \
  --served-model-name "localmodel" \
  --disable-log-requests \
  --disable-custom-all-reduce \
  --enable-prefix-caching \
  --guided-decoding-backend "xgrammar" \
  --chat-template /home/user/vllm/qwen3_nonthinking.jinja
```

- Configure in MAESTRO:
- Provider: Custom Provider
- API Key: (use a dummy value unless you configured authentication)
- Base URL: `http://192.168.xxx.xxx:5000/v1/`
- Click "Test" to load available model (will show as localmodel)
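Once the server is up, any OpenAI-compatible client talks to it via `/v1/chat/completions`. A sketch of the request body (the model name matches `--served-model-name` above; the host placeholder is from this example):

```python
# Sketch: the JSON body an OpenAI-compatible client POSTs to the vLLM
# server above at http://192.168.xxx.xxx:5000/v1/chat/completions.
import json

payload = {
    "model": "localmodel",  # must match --served-model-name
    "messages": [{"role": "user", "content": "Say hello"}],
    "max_tokens": 64,
}
body = json.dumps(payload).encode("utf-8")
# Send with header Content-Type: application/json (and an Authorization
# header only if the server was started with an API key).
```

If the model field doesn't match the served model name, vLLM rejects the request, which is why the dropdown in MAESTRO shows `localmodel` rather than the on-disk path.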
Testing Your Configuration¶
Connection Test¶
- Enter your API credentials (API Key and Base URL)
- Click the "Test" button next to the API Key field
- Wait for verification (this fetches available models)
- Success will populate the model dropdowns
Model Selection¶
After successful connection test:
- Fast Model: Select from dropdown for rapid, simple tasks
- Mid Model: Select for balanced performance tasks
- Intelligent Model: Select for complex analysis (defaults to same as Mid if not set)
- Verifier Model: Select for fact-checking (defaults to same as Mid if not set)
Saving Configuration¶
- After selecting all models, click "Save & Close"
- Settings are saved per user
- Changes take effect immediately for new research sessions
Troubleshooting Connection Issues¶
"Connection failed" error:
- Verify API key is correct and active
- Check internet connection
- Ensure billing is set up with provider
- For custom endpoints, verify server is running
"Models not loading":
- Wait a moment and try refreshing
- Check if API key has correct permissions
- Verify base URL format (should end with `/v1/` for OpenAI-compatible APIs)
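A frequent cause of both errors is a base URL missing the `/v1/` suffix. A small normalizer shows the expected format; this is an illustration, not part of MAESTRO:

```python
# Sketch: normalize a base URL so it ends with /v1/, the format
# OpenAI-compatible APIs expect.
def normalize_base_url(url: str) -> str:
    url = url.strip().rstrip("/")
    if not url.endswith("/v1"):
        url += "/v1"
    return url + "/"

print(normalize_base_url("https://api.openai.com"))  # https://api.openai.com/v1/
print(normalize_base_url("http://host.docker.internal:11434/v1"))
```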
Best Practices¶
Model Selection¶
- Match complexity to task:
  - Don't use GPT-5-chat for simple formatting
  - Don't use GPT-5-nano for complex analysis
- Consider cost:
  - Fast models for high-volume tasks
  - Intelligent models for critical analysis
- Test different combinations to find the optimal balance for your use case
Performance Optimization¶
- Use local models for:
  - Sensitive data
  - High-volume processing
  - Offline operation
- Use cloud models for:
  - Best quality
  - Latest capabilities
  - No infrastructure management
Cost Management¶
Monitoring Usage¶
- Track usage through provider dashboards
- Monitor job costs in the UI
- Set up billing alerts
- Monitor token consumption in MAESTRO logs
Common Configurations¶
Budget-Conscious Setup¶
- All models: OpenRouter with open models (Mistral, Llama)
- Cost: Very low
Quality-Focused Setup¶
- Fast: GPT-5-nano, GPT-4o-mini, or Claude Haiku
- Mid: GPT-5-mini or Claude Sonnet
- Intelligent: GPT-5-chat or Claude Opus
- Verifier: same as Intelligent
Privacy-Focused Setup¶
- All models: Local LLMs via vLLM or SGLang with structured generation
- No external API calls
Hybrid Setup¶
- Fast/Mid: Local models
- Intelligent: Cloud model for complex tasks
- Balance of privacy and capability
Provider Comparison¶
| Provider | Pros | Cons | Best For |
|---|---|---|---|
| OpenRouter | 100+ models, unified billing | Adds small overhead | Flexibility |
| OpenAI | Direct access, latest models | Single vendor | GPT users |
| Local LLMs | Privacy, no costs | Requires hardware | Sensitive data |
Troubleshooting¶
API Key Issues¶
Invalid API key:
- Double-check key from provider dashboard
- Ensure no extra spaces
- Verify key is active
Rate limiting:
- Check provider rate limits
- Implement retry logic
- Consider upgrading plan
Model Selection Issues¶
Model not available:
- Verify model name spelling
- Check if model is available in your region
- Ensure API key has access to model
Wrong model behavior:
- Verify correct model selected
- Check model parameters
- Test with different model
Next Steps¶
- Search Provider Configuration - Set up web search
- Environment Variables - System configuration
- First Login - Initial setup guide