CLI Commands Reference¶
MAESTRO provides powerful command-line tools for bulk document processing, user management, and system administration. The CLI features direct processing with real-time progress feedback.
Overview¶
The MAESTRO CLI offers:
- Direct Processing - Documents processed immediately with live feedback
- Real-time Progress - See each step with timestamps
- Multi-format Support - PDF, Word, Markdown files
- GPU Control - Specify which GPU device to use
- Bulk Operations - Process entire directories
- User Management - Create and manage users
- Document Organization - Create and manage document groups
Getting Started¶
Platform-Specific Usage¶
Linux/macOS¶
Windows PowerShell¶
Windows Command Prompt¶
Direct Docker Execution¶
User Management Commands¶
create-user¶
Create a new user account.
Syntax:
Options:
--full-name "Name"
- Set user's full name--admin
- Create admin user
Examples:
# Create regular user
./maestro-cli.sh create-user researcher pass123 --full-name "Research User"
# Create admin user
./maestro-cli.sh create-user admin adminpass --admin --full-name "Administrator"
Document Group Management¶
create-group¶
Create a document group for organization.
Syntax:
Options: - --description "Text"
- Add group description
Example:
./maestro-cli.sh create-group researcher "AI Papers" \
--description "Machine Learning Research Papers"
list-groups¶
List document groups.
# List all groups
./maestro-cli.sh list-groups
# List groups for specific user
./maestro-cli.sh list-groups --user researcher
Output:
Group ID: abc123-def456
Name: AI Papers
Owner: researcher
Documents: 42
Description: Machine Learning Research Papers
Note: Document groups can be created and listed via CLI. Documents can be added to groups during ingestion with the --group
flag. For other group management operations, use the web interface.
Document Processing Commands¶
ingest¶
Process documents with live feedback. Primary command for adding documents.
Syntax:
Options:
--group <group_id>
- Add to specific group--force-reembed
- Force re-processing (default behavior skips already processed files)--device <device>
- GPU device (cuda:0, cpu)--delete-after-success
- Remove source files after processing--batch-size <num>
- Parallel processing count
Supported Formats:
- PDF files (
.pdf
) - Word documents (
.docx
,.doc
) - Markdown files (
.md
,.markdown
)
Mounting Directories for Batch Document Processing¶
The CLI service needs access to your documents. By default, only ./pdfs
is mounted. To process documents from other directories, you need to mount them in docker-compose.yml:
Edit the cli
service in docker-compose.yml:
cli:
# ... existing configuration ...
volumes:
# ... existing volumes ...
- ./pdfs:/app/pdfs # Default PDF directory
- ./documents:/app/documents # Add your custom document directory
- ./research-papers:/app/research-papers # Another example
Then use the mounted path:
./maestro-cli.sh ingest researcher /app/documents
./maestro-cli.sh ingest researcher /app/research-papers
Batch Document Processing Examples¶
Examples:
# Basic document ingestion from default directory
./maestro-cli.sh ingest researcher ./pdfs
# Process documents from custom mounted directory
./maestro-cli.sh ingest researcher ./documents
# Add to group with GPU selection
./maestro-cli.sh ingest researcher ./research-papers \
--group abc123 --device cuda:0
# Process and cleanup temporary files
./maestro-cli.sh ingest researcher ./temp-docs \
--delete-after-success
# Force re-processing of updated documents
./maestro-cli.sh ingest researcher ./updated-docs \
--force-reembed
# Process with custom batch size for large collections
./maestro-cli.sh ingest researcher ./large-collection \
--batch-size 10
# Make sure you have enough VRAM; adjust to lower batch size if you see out of memory errors
Progress Output:
Processing: paper1.pdf
[12:34:56] Converting to markdown...
[12:35:02] Extracting metadata...
[12:35:05] Generating embeddings...
[12:35:12] ✓ Successfully processed: paper1.pdf
Processing: paper2.pdf
[12:35:13] Converting to markdown...
Search Commands¶
search¶
Search documents using semantic search.
Options:
--limit <num>
- Result count (default: 10)--group <group_id>
- Search within group--threshold <float>
- Similarity threshold (0-1)
Examples:
# Basic search
./maestro-cli.sh search admin "quantum computing applications"
# Search with options
./maestro-cli.sh search researcher "machine learning" \
--limit 20 \
--group abc123 \
--threshold 0.7
Note: Metadata search is available through the web interface. The CLI search
command provides semantic search capabilities.
System Management Commands¶
status¶
Check document processing status (not system status).
# Check status for a user
./maestro-cli.sh status --user researcher
# Check status for a group
./maestro-cli.sh status --group <group_id>
Note: For system-wide statistics, use reset-db --stats
command.
cleanup¶
Clean up documents with specific status.
# Clean up failed documents
./maestro-cli.sh cleanup --user researcher --status failed
# Clean up for specific group
./maestro-cli.sh cleanup --group <group_id>
# Skip confirmation
./maestro-cli.sh cleanup --confirm
cleanup-cli¶
Clean up documents stuck in CLI processing.
# Dry run to see what would be deleted
./maestro-cli.sh cleanup-cli --dry-run
# Force cleanup without confirmation
./maestro-cli.sh cleanup-cli --force
reset-db¶
Database reset operations.
# Check database status
./maestro-cli.sh reset-db --stats
# Check consistency
./maestro-cli.sh reset-db --check
# Reset with backup
./maestro-cli.sh reset-db --backup
# Force reset (skip confirmation)
./maestro-cli.sh reset-db --force
Note: Backup and restore operations are handled through reset-db --backup
or manual database operations. See the Database Reset Guide for details.
Performance Tips¶
GPU Utilization¶
# Check available GPUs
nvidia-smi
# Use specific GPU
./maestro-cli.sh ingest researcher ./docs --device cuda:0
# Use multiple GPUs (process in parallel)
./maestro-cli.sh ingest researcher ./docs1 --device cuda:0 &
./maestro-cli.sh ingest researcher ./docs2 --device cuda:1 &
wait
Batch Size Optimization¶
# Small files, increase batch size
./maestro-cli.sh ingest researcher ./small-docs --batch-size 10
# Large files, decrease batch size
./maestro-cli.sh ingest researcher ./large-pdfs --batch-size 2
Memory Management¶
# Limit memory usage
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512
./maestro-cli.sh ingest researcher ./documents
Troubleshooting¶
Common Issues¶
Permission Denied:
Docker Not Running:
GPU Not Available:
Out of Memory:
Log Files¶
Check logs for errors:
Next Steps¶
- Database Management - Database operations
- User Guide - Using the web interface
- Troubleshooting - Common issues
- API Reference - REST API documentation