Welcome to Speakr
Speakr is a powerful self-hosted transcription platform that helps you capture, transcribe, and understand your audio content. Whether you're recording meetings, interviews, lectures, or personal notes, Speakr transforms spoken words into valuable, searchable knowledge.
Release Highlight: v0.6.3 - API Token Authentication
New Feature - Programmatic API access for automation tools
- API Tokens - Create personal access tokens for programmatic API access (n8n, Zapier, scripts)
- Multiple Auth Methods - Bearer token, X-API-Token header, API-Token header, or query parameter
- Token Management - Create, revoke, and track token usage from Account Settings
- Flexible Expiration - Set custom expiration periods or create non-expiring tokens
- Secure Storage - Tokens are hashed (SHA-256) and never stored in plaintext
✅ Fully backward compatible with v0.6.x. No configuration changes required. View full release notes
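As a rough illustration of the four authentication methods listed above, the sketch below builds (but does not send) a request for each one using only the Python standard library. The base URL, the endpoint path, and the query parameter name are placeholders chosen for this example, not documented Speakr values; check the API documentation for the real ones.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumptions for illustration only: none of these three values come from
# the Speakr documentation.
BASE_URL = "https://speakr.example.com"
ENDPOINT = "/api/example"
QUERY_PARAM = "token"


def build_request(token: str, method: str = "bearer") -> Request:
    """Build, without sending, a request that carries the token in one of four ways."""
    url = BASE_URL + ENDPOINT
    if method == "bearer":
        return Request(url, headers={"Authorization": f"Bearer {token}"})
    if method == "x-api-token":
        return Request(url, headers={"X-API-Token": token})
    if method == "api-token":
        return Request(url, headers={"API-Token": token})
    if method == "query":
        return Request(f"{url}?{urlencode({QUERY_PARAM: token})}")
    raise ValueError(f"unknown auth method: {method}")


if __name__ == "__main__":
    for name in ("bearer", "x-api-token", "api-token", "query"):
        req = build_request("my-secret-token", name)
        # urllib normalizes header capitalization; HTTP header names are
        # case-insensitive, so this does not change their meaning.
        print(name, req.full_url, req.header_items())
```

When there is a choice, a header is usually the safer option than a query parameter, because URLs tend to end up in server logs and browser history.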
Core Features
🎙️ Smart Recording
- Audio capture from mic or system
- Take notes while recording
- Generate smart summaries
🤖 AI Transcription
- Multi-language support
- Speaker identification
- Voice profiles with AI recognition
- Custom vocabularies
🔍 Intelligent Search
- Semantic search
- Natural language queries
- Cross-recording search
📊 Organization
🌍 International
- 5+ languages supported
- Automatic UI translation
- Localized summaries
🔒 Privacy First
🔑 API Access
- Personal access tokens
- Automation tool integration
- Secure token management
Interactive Audio Synchronization
Experience seamless bidirectional synchronization between your audio and transcript. Click any part of the transcript to jump directly to that moment in the audio, or watch as the system automatically highlights the currently spoken text as the audio plays. Enable auto-scroll follow mode to keep the active segment centered in view, creating an effortless reading experience for even the longest recordings.
Real-time transcript highlighting synchronized with audio playback, with auto-scroll follow mode
Learn more about audio synchronization features in the user guide.
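The two directions of synchronization described in this section can be sketched as a lookup over timestamped segments. This is a minimal illustration, not Speakr's implementation: the `Segment` type and both function names are hypothetical, and the segments are assumed to be sorted by start time and non-overlapping.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    start: float  # seconds from the beginning of the audio
    end: float
    text: str


def active_index(segments: List[Segment], t: float) -> Optional[int]:
    """Audio -> transcript: index of the segment being spoken at time t, if any."""
    starts = [s.start for s in segments]
    i = bisect_right(starts, t) - 1  # last segment that starts at or before t
    if i >= 0 and t < segments[i].end:
        return i
    return None  # t falls before the first segment or in a gap (silence)


def seek_time(segments: List[Segment], index: int) -> float:
    """Transcript -> audio: playback position to jump to when a segment is clicked."""
    return segments[index].start


if __name__ == "__main__":
    segs = [
        Segment(0.0, 2.5, "Hello everyone."),
        Segment(2.5, 6.0, "Let's get started."),
        Segment(7.0, 9.0, "First item."),
    ]
    print(active_index(segs, 3.2))  # segment to highlight during playback
    print(seek_time(segs, 2))       # position to seek to on click
```

A follow mode would then scroll the segment returned by `active_index` into view whenever the index changes.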
Transform Your Recordings with Custom Tag Prompts
Tags aren't just for organization - they transform content. Create a "Recipe" tag to convert cooking narration into formatted recipes. Use "Study Notes" tags to turn lecture recordings into organized outlines. Stack tags like "Client Meeting" + "Legal Review" for combined analysis. Learn more in the Custom Prompts guide.
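One way to picture tag stacking is as a merge of per-tag prompts. The sketch below is an assumption about the general idea only: the prompt texts, the `build_prompt` helper, and the choice to join prompts in tag order are all hypothetical, and Speakr may combine stacked tags differently.

```python
from typing import Dict, List

# Hypothetical prompt texts, keyed by tag name.
TAG_PROMPTS: Dict[str, str] = {
    "Client Meeting": "Summarize decisions, owners, and follow-up dates.",
    "Legal Review": "List statements that may carry contractual or legal risk.",
}

DEFAULT_PROMPT = "Summarize the recording."


def build_prompt(tags: List[str], tag_prompts: Dict[str, str], default: str) -> str:
    """Join the prompts of all tags that define one, in tag order.

    Tags without a custom prompt are skipped; if none apply, fall back
    to the default prompt.
    """
    parts = [tag_prompts[tag] for tag in tags if tag in tag_prompts]
    return "\n\n".join(parts) if parts else default


if __name__ == "__main__":
    print(build_prompt(["Client Meeting", "Legal Review"], TAG_PROMPTS, DEFAULT_PROMPT))
```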
Latest Updates
Version 0.6.5 - Separate Chat Model Configuration
New Feature - Configure different AI models for chat vs background tasks
- Separate Chat Model - Use different service tiers for chat and summarization (#143)
- Custom Datetime Picker - New themed calendar and time selection modal
- Bug Fixes - Audio chunking after refactor (#140), username display (#138)
✅ Fully backward compatible. Optional CHAT_MODEL_* environment variables.
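Because the `CHAT_MODEL_*` variables are optional, a natural pattern is "use the chat-specific value when it is set, otherwise fall back to the general one". The sketch below illustrates that pattern generically. Only the `CHAT_MODEL_` prefix comes from the release notes; the helper names and the variable suffix used in the demo are hypothetical.

```python
import os
from typing import Dict, Mapping


def prefixed_settings(env: Mapping[str, str], prefix: str) -> Dict[str, str]:
    """Collect non-empty variables starting with prefix, keyed by lowercased suffix."""
    return {
        key[len(prefix):].lower(): value
        for key, value in env.items()
        if key.startswith(prefix) and value
    }


def chat_settings(env: Mapping[str, str], defaults: Mapping[str, str]) -> Dict[str, str]:
    """Start from the general model settings and override with any CHAT_MODEL_* values."""
    return {**defaults, **prefixed_settings(env, "CHAT_MODEL_")}


if __name__ == "__main__":
    # Hypothetical defaults standing in for the general (background-task) model settings.
    general = {"name": "background-model"}
    print(chat_settings(os.environ, general))
```

With no `CHAT_MODEL_*` variables set, the result equals the defaults, which matches the backward-compatible behaviour described above.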
Version 0.6.3 - API Token Authentication
New Feature - Programmatic API access for automation tools
- API Tokens - Create personal access tokens for programmatic API access
- Multiple Auth Methods - Bearer token, X-API-Token header, API-Token header, or query parameter
- Token Management - Create, revoke, and track token usage from Account Settings
- Flexible Expiration - Set custom expiration periods or create non-expiring tokens
- Secure Storage - Tokens are hashed (SHA-256) and never stored in plaintext
✅ Fully backward compatible with v0.6.x. No configuration changes required.
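The secure storage point can be illustrated with a small, generic sketch of the hash-then-compare pattern: the plaintext token is shown to the user once, and only its SHA-256 digest is kept. The function names are hypothetical and this is not Speakr's actual code.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def new_token() -> Tuple[str, str]:
    """Return (plaintext_token, sha256_hex). Only the hash would be persisted."""
    token = secrets.token_urlsafe(32)
    digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
    return token, digest


def verify(presented: str, stored_hash: str) -> bool:
    """Hash the presented token and compare it to the stored digest in constant time."""
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)


if __name__ == "__main__":
    token, stored = new_token()
    print("show once to the user:", token)
    print("store in the database:", stored)
    print("valid:", verify(token, stored))
```

A consequence of this design is that a lost token cannot be recovered from storage; it can only be revoked and replaced.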
Version 0.6.2 - UX Polish & Bug Fixes
- Standardized modal UX with backdrop click and consistent X button placement
- Recording disclaimer markdown support
- IndexedDB crash recovery fixes
- Processing queue cleanup on delete
Version 0.6.1 - Offline Ready
- HuggingFace Model Caching - Embedding model persists across container restarts
- Offline Deployment - Run once with internet, then works fully offline
Version 0.6.0 - Queue Control
- Multi-User Job Queue - Fair round-robin scheduling with automatic retry for failed jobs
- Unified Progress Tracking - Single view merging uploads and backend processing
- Media Support - Added video format support and fixed Firefox system audio recording
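To make "fair round-robin scheduling with automatic retry" concrete, here is a minimal in-memory sketch. It is an illustration under stated assumptions rather than Speakr's queue: the class and method names are hypothetical, one job is served per user per turn, and a failed job is re-queued until it reaches a configurable attempt limit.

```python
from collections import OrderedDict, deque
from dataclasses import dataclass
from typing import Deque, Optional


@dataclass
class Job:
    user: str
    name: str
    attempts: int = 0


class RoundRobinQueue:
    def __init__(self, max_attempts: int = 3) -> None:
        self.max_attempts = max_attempts
        # One FIFO per user; the dict order is the turn order.
        self._queues: "OrderedDict[str, Deque[Job]]" = OrderedDict()

    def submit(self, job: Job) -> None:
        self._queues.setdefault(job.user, deque()).append(job)

    def next_job(self) -> Optional[Job]:
        """Take one job from the user at the front, then move that user to the back."""
        if not self._queues:
            return None
        user, queue = next(iter(self._queues.items()))
        job = queue.popleft()
        del self._queues[user]
        if queue:  # users with no remaining work leave the rotation
            self._queues[user] = queue
        return job

    def report_failure(self, job: Job) -> bool:
        """Re-queue a failed job; return False once the attempt limit is reached."""
        job.attempts += 1
        if job.attempts < self.max_attempts:
            self.submit(job)
            return True
        return False


if __name__ == "__main__":
    q = RoundRobinQueue()
    for user, name in [("alice", "a1"), ("alice", "a2"), ("alice", "a3"), ("bob", "b1")]:
        q.submit(Job(user, name))
    while (job := q.next_job()) is not None:
        print(job.user, job.name)
```

In this sketch, a user who uploads many files at once cannot starve another user who uploads a single file later.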
Version 0.5.9 - Major Release
⚠️ Major architectural changes - Backup data before upgrading!
- Internal Sharing System - Share recordings with granular permissions (view/edit/reshare)
- Group Management - Create groups with leads, group tags, custom retention policies
- Speaker Voice Profiles - AI-powered recognition with embeddings (requires WhisperX)
- Audio-Transcript Sync - Click-to-jump, auto-highlight, and follow mode
- Auto-Deletion & Retention - Global and group-level policies with tag protection
- Modular Architecture - Backend refactored into blueprints, frontend composables
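The interaction between global policies, group-level policies, and tag protection can be sketched as a single eligibility check. Everything below is an assumption made for illustration: the `Recording` fields, the rule that a group policy overrides the global one, and the rule that any protected tag blocks deletion are not taken from Speakr's code.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional, Set


@dataclass
class Recording:
    created: datetime
    tags: Set[str] = field(default_factory=set)
    group_retention_days: Optional[int] = None  # None means no group policy


def is_expired(
    rec: Recording,
    now: datetime,
    global_days: Optional[int],
    protected_tags: Set[str],
) -> bool:
    """Return True if the recording is eligible for auto-deletion."""
    if rec.tags & protected_tags:
        return False  # assumed: any protected tag exempts the recording
    # Assumed precedence: a group policy, when present, overrides the global one.
    days = rec.group_retention_days if rec.group_retention_days is not None else global_days
    if days is None:
        return False  # no policy applies, keep indefinitely
    return now - rec.created > timedelta(days=days)


if __name__ == "__main__":
    now = datetime(2025, 1, 31)
    old = Recording(created=datetime(2024, 12, 1), tags={"Archive"})
    print(is_expired(old, now, global_days=30, protected_tags={"Archive"}))
```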
Version 0.5.8
- Inline Transcript Editing - Edit speaker assignments and text directly in the speaker identification modal
- Add Speaker Functionality - Dynamically add new speakers during transcript review
- Enhanced Speaker Modal - Improved UX with hover-based edit controls and real-time updates
Version 0.5.7
- GPT-5 Support - Full support for OpenAI's GPT-5 model family with automatic parameter detection
- Custom Summary Prompts on Reprocessing - Experiment with different prompts when regenerating summaries
- PWA Enhancements - Service worker for wake lock to prevent screen sleep on mobile
Version 0.5.6
- Event extraction for automatically identifying calendar-worthy events
- Transcript templates for customizable download formats
- Enhanced export options and improved mobile UI
Getting Help
Need assistance? We're here to help:
📖 Documentation
You're already here! Browse our comprehensive guides.
Ready to transform your audio into actionable insights? Get started now →