# AI Model Troubleshooting
Quick fixes for AI model configuration and API issues.
## Configuration Issues

### No Models in Dropdown
Solution:
- Go to Settings → AI Config
- Select provider (OpenAI, OpenRouter, or Custom Provider)
- Enter API key
- Click "Test" button
- Models should populate automatically
If still empty:

```bash
# Check logs for errors
docker compose logs maestro-backend | grep -i "api\|model"

# Restart backend
docker compose restart maestro-backend
```
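If you want to check outside the UI whether the key can list models at all, here is a minimal Python sketch against the OpenAI-compatible `/v1/models` route (the base URL, key, and the `requests` dependency are yours to substitute; this is not MAESTRO code):

```python
import requests

BASE_URL = "https://api.openai.com/v1"   # or https://openrouter.ai/api/v1, or your custom base URL
API_KEY = "YOUR_API_KEY"

# An empty or error response here usually explains an empty model dropdown.
resp = requests.get(
    f"{BASE_URL}/models",
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
resp.raise_for_status()
models = resp.json().get("data", [])
print(f"{len(models)} models available")
for model in models[:10]:
    print("-", model.get("id"))
```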
### Wrong Model Being Used
Check current configuration:
- Go to Settings → AI Config
- Verify each model type is set correctly:
    - Fast Model: For quick tasks
    - Mid Model: For balanced performance
    - Intelligent Model: For complex analysis
    - Verifier Model: For verification
## API Issues

### Authentication Failed
Error: Invalid API key
Solution:
```bash
# Test OpenAI key
curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"

# Test OpenRouter key
curl https://openrouter.ai/api/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"
```
### Rate Limit Errors
Error: Rate limit exceeded
Solution:
```bash
# Configure retry settings in .env
MAX_RETRIES=3
RETRY_DELAY=5
MAX_CONCURRENT_REQUESTS=2

# Restart
docker compose restart maestro-backend
```
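These variables are read from `.env`; the sketch below only illustrates the retry-with-delay pattern that settings like these typically control (the helper and its behavior are hypothetical, not MAESTRO's implementation):

```python
import time

MAX_RETRIES = 3    # mirrors MAX_RETRIES in .env
RETRY_DELAY = 5    # seconds, mirrors RETRY_DELAY in .env

def call_with_retries(make_request):
    """Retry a rate-limited call a few times, waiting between attempts.

    `make_request` is any zero-argument callable that raises when the
    provider answers with a rate-limit error (HTTP 429).
    """
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            return make_request()
        except Exception as exc:
            if "rate limit" not in str(exc).lower() or attempt == MAX_RETRIES:
                raise
            # Wait before retrying; real code might back off exponentially or
            # honor a Retry-After header. MAX_CONCURRENT_REQUESTS limits how
            # many such calls run in parallel in the first place.
            time.sleep(RETRY_DELAY * attempt)
```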
### Context Too Large
Error: Context length exceeded
Solution in Settings → Research:
- Reduce `writing_agent_max_context_chars` (e.g., 100,000) — see the sketch below
- Reduce `main_research_doc_results` (e.g., 3)
- Reduce `main_research_web_results` (e.g., 3)
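As a rough illustration of the first setting, `writing_agent_max_context_chars` acts as a character budget: context beyond it has to be dropped before the request is sent. The trimming helper below is hypothetical, not MAESTRO's actual code:

```python
def trim_to_budget(chunks, max_context_chars=100_000):
    """Keep whole context chunks (notes, document excerpts, web results)
    in order until the character budget is spent; drop the rest."""
    kept, used = [], 0
    for chunk in chunks:
        if used + len(chunk) > max_context_chars:
            break
        kept.append(chunk)
        used += len(chunk)
    return kept

# Lowering the budget (or the number of doc/web results feeding `chunks`)
# keeps the final prompt under the model's context limit.
```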
## Provider-Specific Issues

### OpenAI
Common issues:
- Wrong API key format
- Insufficient credits
- Model not available in your region
Check account: verify your remaining credits and usage in the OpenAI dashboard (platform.openai.com).
### OpenRouter
Model not found:
```bash
# Use the full model path
# Correct: anthropic/claude-3-sonnet
# Wrong:   claude-3-sonnet

# Check available models
curl https://openrouter.ai/api/v1/models \
  -H "Authorization: Bearer YOUR_KEY"
```
### Custom Provider
Connection failed:
```bash
# Test endpoint
curl YOUR_CUSTOM_BASE_URL/v1/models

# Common base URLs:
# Local vLLM:    http://localhost:5000/v1
# Local SGLang:  http://localhost:30000/v1
# Local Ollama:  http://localhost:11434/v1
# LM-Studio:     http://localhost:1234/v1
```
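If you are not sure which port your local server actually listens on, a small probe over the common base URLs above can save time (assumes `requests`; run it on the same host as the server, since `localhost` inside the MAESTRO container refers to the container itself):

```python
import requests

CANDIDATES = {
    "vLLM":      "http://localhost:5000/v1",
    "SGLang":    "http://localhost:30000/v1",
    "Ollama":    "http://localhost:11434/v1",
    "LM-Studio": "http://localhost:1234/v1",
}

for name, base in CANDIDATES.items():
    try:
        r = requests.get(f"{base}/models", timeout=5)
        print(f"{name:10s} {base}: HTTP {r.status_code}")
    except requests.RequestException as exc:
        print(f"{name:10s} {base}: unreachable ({exc.__class__.__name__})")
```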
## Performance Issues

### Slow Response
Quick fixes:
- Use faster models (e.g., gpt-4o-mini)
- Reduce context sizes in Research settings
- Check network latency to API
### High Costs
Reduce costs:
- Use cheaper models for Fast/Mid tiers
- Reduce research parameters:
    - `initial_research_max_questions`
    - `structured_research_rounds`
    - `writing_passes`
## Cost Tracking Discrepancies

### Why Don't My Tracked Costs Match My API Provider Dashboard?
This is a known issue with some API providers' pricing and billing. MAESTRO correctly tracks costs based on providers' advertised pricing, but actual charges often differ significantly.
#### The Problem
Some API providers, particularly aggregators/routers, have inconsistent billing:
- API aggregators route to different providers - each backend provider may have different actual costs
- Dynamic routing affects pricing - aggregators choose providers based on availability and latency, not just price
- The API `usage.cost` field may be unreliable - it sometimes returns values ~100x lower than actual charges
- Dashboard charges don't match advertised pricing - they can be 0.4x to 4x the calculated cost
#### Real Example
Here's actual data from testing with a popular model:
| Prompt Tokens | Completion Tokens | Our Calculation | API `usage.cost` | Dashboard Charge | Variance |
|---|---|---|---|---|---|
| 4,405 | 200 | $0.000601 | $0.000006 | $0.000594 | 0.99x |
| 349 | 300 | $0.000275 | $0.000003 | $0.000272 | 0.99x |
| 52 | 1,000 | $0.000805 | $0.000003 | $0.000312 | 0.39x |
| 243 | 200 | $0.000184 | $0.000003 | $0.000331 | 1.80x |
| 64 | 124 | $0.000106 | $0.000002 | $0.000157 | 1.48x |
Key Findings:
- Token counts are usually accurate across API and dashboard
- MAESTRO's pricing calculation is correct based on advertised rates
- Dashboard charges can be inconsistent with advertised rates
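The calculation behind the "Our Calculation" column is simply token counts times advertised rates; the sketch below shows the comparison with placeholder numbers (the rates and the dashboard figure are made up, not real pricing):

```python
# All numbers are placeholders for illustration, not real pricing.
prompt_tokens, completion_tokens = 4_405, 200

price_in  = 0.12 / 1_000_000   # advertised $ per prompt token (assumed)
price_out = 0.30 / 1_000_000   # advertised $ per completion token (assumed)

calculated = prompt_tokens * price_in + completion_tokens * price_out
dashboard_charge = 0.0006      # whatever your provider dashboard shows (assumed)

print(f"calculated: ${calculated:.6f}")
print(f"dashboard:  ${dashboard_charge:.6f}")
print(f"variance:   {dashboard_charge / calculated:.2f}x")
```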
#### Testing Your Own Costs
We provide a test script to verify pricing discrepancies:
```bash
# Run the pricing test (example for OpenRouter)
python scripts/test_openrouter_pricing.py --api-key YOUR_API_KEY

# Example output will show:
# - Token counts from API
# - Calculated costs based on advertised pricing
# - API reported costs (usually wrong)
# - Comparison with dashboard charges
```
#### What This Means for You
- Your tracked costs may differ from actual charges - typically 40-60% of dashboard values
- This is NOT a bug in MAESTRO - we calculate correctly based on advertised prices
- Some providers' billing is inconsistent - they may charge differently than advertised
#### Workarounds
- Apply a multiplier to displayed costs based on your provider (see the sketch after this list)
- Monitor your actual provider dashboard for true costs
- Use providers with consistent pricing if cost accuracy is critical:
    - Some providers have more predictable pricing than others
    - Local models have zero API costs
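For the first workaround, the adjustment is just a per-provider factor derived by comparing a few tracked costs against your dashboard; the factors below are made up for illustration:

```python
# Hypothetical correction factors, measured by comparing MAESTRO's tracked
# costs against actual dashboard charges over a few research sessions.
COST_MULTIPLIERS = {
    "openrouter": 1.5,   # example: dashboard tends to run ~1.5x tracked cost
    "openai": 1.0,       # example: direct providers usually match
}

def adjusted_cost(tracked_cost, provider):
    return tracked_cost * COST_MULTIPLIERS.get(provider, 1.0)

print(adjusted_cost(0.42, "openrouter"))   # 0.63
```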
#### Technical Details
The discrepancy may be caused by:
- Aggregator routing: Services like OpenRouter route to different backend providers with varying costs
- Dynamic provider selection: Aggregators optimize for availability and latency, not just price
- Hidden tokens: Providers may count system prompts or special tokens not reported in API
- Different tokenizers: Billing tokenizer may differ from API response tokenizer
- Overhead charges: Routing or processing overhead not disclosed
- Minimum charges or rounding: Some providers may have minimum charge amounts
Note: Direct providers (like OpenAI, Anthropic) typically have more consistent pricing than aggregators/routers, as they don't route between multiple backends.
For more technical details and provider-specific test scripts, see our scripts directory.
## Debugging

### Enable Debug Logging
```bash
# In .env
LOG_LEVEL=DEBUG

# Restart
docker compose restart maestro-backend

# Watch logs
docker compose logs -f maestro-backend | grep -i "model\|api"
```
### Test Models Directly
```bash
# Check configured models
docker exec maestro-backend python -c "
from ai_researcher.dynamic_config import get_fast_model_name, get_mid_model_name, get_intelligent_model_name
print('Fast:', get_fast_model_name())
print('Mid:', get_mid_model_name())
print('Intelligent:', get_intelligent_model_name())
"
```
## Structured Outputs & Provider Compatibility

### What are Structured Outputs?
Maestro uses OpenAI's structured outputs feature (the `json_schema` response format) to ensure LLM responses match exact Pydantic model schemas. However, not all providers support this advanced feature.
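Concretely, "structured outputs" means attaching a JSON Schema (here derived from a Pydantic model) to the request's `response_format`. Below is a minimal sketch of that request shape against an OpenAI-compatible endpoint; the model class and field names are illustrative, not MAESTRO's:

```python
import requests
from pydantic import BaseModel

class ResearchNote(BaseModel):   # illustrative schema, not one of MAESTRO's models
    topic: str
    summary: str
    relevance: float

payload = {
    "model": "gpt-4o-2024-08-06",
    "messages": [{"role": "user", "content": "Summarize transformer attention."}],
    # json_schema response format: the provider is asked to return JSON that
    # validates against the supplied schema.
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "research_note",
            "schema": ResearchNote.model_json_schema(),
        },
    },
}

resp = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json=payload,
    timeout=60,
)
note = ResearchNote.model_validate_json(resp.json()["choices"][0]["message"]["content"])
print(note)
```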
### Provider Support Status
Provider | json_object | json_schema (Structured) | Notes |
---|---|---|---|
OpenAI | ✅ | ✅ | Full support (gpt-4o 2024-08-06+) |
Azure OpenAI | ✅ | ⚠️ | Needs model 2024-08-06+, API 2024-10-21+ |
Anthropic Claude | ✅ | ✅ | Full support via different API |
Google Gemini | ✅ | ⚠️ | Limited support, different implementation |
DeepSeek | ✅ | ❌ | Only basic json_object mode |
Moonshot/Kimi | ⚠️ | ❌ | Own incompatible JSON schema format |
Local (Ollama) | ✅ | ❌ | Most models only support json_object |
Local (LM-Studio) | ✅ | ❌ | Basic JSON mode only |
Local (vLLM) | ✅ | ✅ | Full support with guided_json parameter (docs) |
Local (SGLang) | ✅ | ✅ | Full support with grammar constraints (docs) |
### Automatic Fallback Mechanism
Maestro automatically handles incompatible providers:
- First attempts structured outputs for maximum reliability
- Detects errors and falls back to `json_object` mode
- Enhances prompts with schema instructions in fallback mode
- Validates responses against Pydantic models regardless (see the sketch below)
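A simplified sketch of that flow (not MAESTRO's actual code; `complete` stands in for whichever OpenAI-compatible chat call you use and is assumed to return the message content as a string):

```python
import json
from pydantic import BaseModel

class Answer(BaseModel):   # illustrative target schema
    verdict: str
    confidence: float

def structured_call(complete, messages):
    try:
        # 1. Prefer strict structured outputs where the provider supports them.
        raw = complete(messages, response_format={
            "type": "json_schema",
            "json_schema": {"name": "answer", "schema": Answer.model_json_schema()},
        })
    except Exception:
        # 2. Fallback: basic JSON mode, with the schema spelled out in the prompt.
        hint = {
            "role": "system",
            "content": "Reply only with JSON matching this schema:\n"
                       + json.dumps(Answer.model_json_schema()),
        }
        raw = complete([hint, *messages], response_format={"type": "json_object"})
    # 3. Validate against the Pydantic model regardless of which path was taken.
    return Answer.model_validate_json(raw)
```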
### Common Compatibility Errors
Error Message | Provider | Meaning |
---|---|---|
"This response_format type is unavailable" | DeepSeek | No json_schema support |
"Invalid moonshot flavored json schema" | Moonshot/Kimi | Incompatible schema format |
"keyword 'default' is not allowed" | Moonshot | Schema validation error |
"structured outputs are not supported" | Azure (older) | Need newer model/API |
### Configuring Local Model Servers

#### vLLM with Structured Outputs
```bash
# Start vLLM with guided decoding
python -m vllm.entrypoints.openai.api_server \
  --model your-model \
  --guided-decoding-backend outlines \
  --port 5000

# In Maestro Settings → AI Config:
# Provider: Custom Provider
# Base URL: http://localhost:5000/v1
# Model: your-model
```
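Once the server is up, you can exercise guided decoding per request through vLLM's OpenAI-compatible API; with the official `openai` Python client the schema is passed via `extra_body` (a sketch; the model name and schema are placeholders):

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:5000/v1", api_key="not-needed")

schema = {   # placeholder JSON Schema for illustration
    "type": "object",
    "properties": {"title": {"type": "string"}, "year": {"type": "integer"}},
    "required": ["title", "year"],
}

resp = client.chat.completions.create(
    model="your-model",
    messages=[{"role": "user", "content": "Give one paper title and year as JSON."}],
    # vLLM's guided decoding parameter, sent outside the standard OpenAI fields.
    extra_body={"guided_json": schema},
)
print(resp.choices[0].message.content)
```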
#### SGLang with Grammar Support
```bash
# Start SGLang server
python -m sglang.launch_server \
  --model-path your-model \
  --port 30000

# In Maestro Settings → AI Config:
# Provider: Custom Provider
# Base URL: http://localhost:30000/v1
# Model: your-model
```
For detailed local LLM deployment, see Local LLM Deployment Guide.
## Common Error Messages
Error | Solution |
---|---|
"API key invalid" | Check key in Settings → AI Config |
"Model not found" | Use full model path (provider/model) |
"Rate limit exceeded" | Wait or upgrade API plan |
"Context length exceeded" | Reduce context in Research settings |
"Connection timeout" | Check network/firewall |
"Invalid json schema" | Provider doesn't support structured outputs (auto-fallback) |
"response_format unavailable" | Provider limitation (auto-fallback) |
## Quick Fixes

### Reset AI Configuration
```bash
# Clear settings and reconfigure
docker exec maestro-postgres psql -U maestro_user -d maestro_db -c "
UPDATE users SET settings = settings - 'ai_endpoints' WHERE username = 'admin';
"

# Then reconfigure in the web UI
```
### Switch to Local Models
```bash
# In Settings → AI Config
# Select "Custom Provider"
# Base URL: http://your-server:5000/v1
# Model: localmodel
```
## Still Having Issues?
- Check logs: `docker compose logs maestro-backend`
- Verify API keys are valid
- Check provider status pages
- Try different models