Introducing WisperCode
WisperCode Team · February 7, 2026 · 12 min read
TL;DR: WisperCode is a free, open-source voice dictation app that runs OpenAI's Whisper model entirely on your machine. No audio leaves your device, no account is required, and no internet connection is needed after the initial model download. It works on macOS and Windows, supports four hotkey modes, and includes features like context-aware styling, filler word removal, vocabulary hints, and voice snippets.
We built WisperCode because we wanted voice dictation that respects your privacy. Most mainstream dictation tools send your audio to the cloud by default. We thought there had to be a better way.
Why Local Matters
When you speak into WisperCode, your audio never leaves your machine. There is no server receiving your words, no company storing your transcriptions, no network request carrying your voice across the internet.
This is not just a privacy feature. It is the architecture. For a deeper look at why this matters, read our privacy-first voice dictation guide.
- No internet required: works offline, on a plane, anywhere
- No network latency: transcription starts on your machine the moment you stop speaking, with no round trip to a server
- No accounts: download, install, and start talking
- No data collection: we literally cannot see what you dictate
For professionals handling sensitive information, including medical records, legal documents, and financial data, local processing is not a nice-to-have. It is often a compliance requirement. Cloud-based dictation creates third-party data-handling risks under HIPAA, GDPR, and financial regulations that local processing avoids by design. Our guide on voice dictation for sensitive documents covers these considerations in detail.
How It Works
WisperCode runs OpenAI's Whisper speech recognition model directly on your hardware. Press a hotkey, speak, and your words appear wherever your cursor is.
The entire pipeline (audio capture, transcription, text processing, and insertion) happens locally. The only network request WisperCode makes is the model download: once on first launch, and again only if you later switch to a model that is not yet cached.
Here is what happens under the hood when you press the hotkey and speak:
- Audio capture. WisperCode activates your microphone and records raw audio using your system's default input device. The audio stays in memory on your machine. You can use any microphone, from a built-in laptop mic to a dedicated USB condenser. See our microphone recommendations for the best results.
- Voice activity detection. Before passing audio to Whisper, WisperCode runs voice activity detection (VAD) to trim silence from the beginning and end of your recording. This reduces the chance of Whisper hallucinating text during silent segments, a known failure mode of the model.
- Whisper transcription. The trimmed audio is processed by the Whisper model running on your CPU or GPU. Whisper converts a log-mel spectrogram of your speech into text using an encoder-decoder transformer architecture. The original Whisper release was trained on 680,000 hours of multilingual audio data. The model you choose determines the speed and accuracy trade-off. See our model size comparison for details.
- Text processing. The raw transcription passes through several processing stages. Filler word removal strips "um," "uh," and similar disfluencies. Vocabulary hints correct domain-specific terms. Context-aware styling adjusts formatting based on the active application. Snippets are expanded if triggered.
- Text insertion. The processed text is typed into whatever application has focus, at your cursor position. WisperCode simulates keystrokes at the OS level, so it works with any application that accepts text input, including code editors, email clients, chat apps, word processors, browsers, and terminal emulators.
The result is that you press a key, speak naturally, release the key, and clean, formatted text appears where you need it. The entire process takes one to three seconds for a typical sentence, depending on the model you use.
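To make the voice activity detection step more concrete, here is a minimal sketch of energy-based silence trimming. It is an illustration only, not WisperCode's actual VAD: the function names, the frame length, the energy threshold, and the one-frame padding are all assumptions chosen for the example.

```python
from typing import List, Sequence


def frame_energies(samples: Sequence[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each consecutive, non-overlapping frame."""
    energies = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(s * s for s in frame) / len(frame))
    return energies


def trim_silence(samples: Sequence[float],
                 frame_len: int = 160,       # assumed: 10 ms at 16 kHz
                 threshold: float = 1e-4,    # assumed energy cutoff
                 pad_frames: int = 1) -> List[float]:
    """Drop leading and trailing frames whose energy is below the threshold.

    Keeps `pad_frames` quiet frames on each side so word edges are not
    clipped. Returns an empty list when no frame is loud enough.
    """
    energies = frame_energies(samples, frame_len)
    voiced = [i for i, e in enumerate(energies) if e >= threshold]
    if not voiced:
        return []
    first = max(voiced[0] - pad_frames, 0)
    last = min(voiced[-1] + pad_frames, len(energies) - 1)
    return list(samples[first * frame_len:(last + 1) * frame_len])
```

A production VAD is usually more robust than a fixed energy threshold, but the idea is the same: only the span that contains speech is handed to the model.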
Key Features at a Glance
WisperCode is more than a simple transcription tool. Here is what sets it apart from other voice dictation options.
Local AI transcription. All speech recognition runs on your device using OpenAI's Whisper model. Choose from five model sizes, from tiny (75 MB, near-instant) to large-v3 (3 GB, maximum accuracy). No cloud processing, no API costs, no internet dependency. See Whisper model sizes compared for help choosing the right model for your hardware.
Four hotkey modes. WisperCode supports four ways to trigger dictation, so you can match your workflow:
- Hold mode — hold the hotkey to record, release to stop. Best for quick dictation bursts like commit messages, chat replies, and short notes.
- Toggle mode — press once to start recording, press again to stop. Better for longer dictation sessions like drafting emails, documentation, or articles.
- Press mode — press the hotkey to start, and WisperCode stops automatically when you pause speaking. Hands-free after the initial keypress.
- Double-press mode — double-tap the hotkey to start. Useful for avoiding accidental triggers in applications where the hotkey might conflict.
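The four modes can be pictured as a small state machine. The sketch below is hypothetical: the class and method names, the 0.4 second double-press window, and the choice that a single press stops a double-press recording are assumptions for illustration, not WisperCode's documented behavior.

```python
from enum import Enum


class Mode(Enum):
    HOLD = "hold"
    TOGGLE = "toggle"
    PRESS = "press"
    DOUBLE_PRESS = "double-press"


class HotkeyController:
    """Tracks whether recording is active for one hotkey mode."""

    def __init__(self, mode: Mode, double_press_window: float = 0.4):
        self.mode = mode
        self.double_press_window = double_press_window  # seconds (assumed)
        self.recording = False
        self._last_press = None  # time of an unpaired first tap, if any

    def key_down(self, now: float) -> None:
        if self.mode is Mode.HOLD:
            self.recording = True
        elif self.mode is Mode.TOGGLE:
            self.recording = not self.recording
        elif self.mode is Mode.PRESS:
            self.recording = True  # stopped later by silence_detected()
        elif self.mode is Mode.DOUBLE_PRESS:
            if self.recording:
                # Assumption: one press ends a double-press recording.
                self.recording = False
                self._last_press = None
            elif (self._last_press is not None
                  and now - self._last_press <= self.double_press_window):
                self.recording = True
                self._last_press = None
            else:
                self._last_press = now  # first tap, wait for the second

    def key_up(self, now: float) -> None:
        if self.mode is Mode.HOLD:
            self.recording = False

    def silence_detected(self) -> None:
        """Called by the audio layer when the speaker pauses."""
        if self.mode is Mode.PRESS:
            self.recording = False
```

The sketch ignores keyboard auto-repeat, which a real implementation would need to filter out so that holding the key does not look like repeated presses.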
Context-aware styling. WisperCode detects which application is active and adjusts formatting to match. Dictating into Slack? Casual tone, lowercase starts. Writing an email? Professional formatting with proper capitalization. Working in your IDE? Code-friendly output that respects technical casing. Read the full context-aware styling guide for customization options.
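As a rough sketch of the idea, context-aware styling can be modeled as a lookup from the active application to a style profile, followed by a small formatting pass. The application names, profile names, default profile, and formatting rules below are hypothetical examples rather than WisperCode's actual rules.

```python
# Hypothetical mapping from the focused application to a style profile.
STYLE_BY_APP = {
    "Slack": "casual",
    "Mail": "professional",
    "Visual Studio Code": "code",
}


def apply_style(text: str, app_name: str) -> str:
    """Format a transcription according to the active application's profile."""
    style = STYLE_BY_APP.get(app_name, "professional")  # assumed default
    text = text.strip()
    if not text or style == "code":
        # Code profile: leave technical casing exactly as transcribed.
        return text
    if style == "casual":
        first_word = text.split(maxsplit=1)[0]
        # Lowercase an ordinary capitalized word, but leave "I" and
        # mixed-case terms such as "PostgreSQL" alone.
        if first_word[0].isupper() and first_word[1:].islower():
            text = text[0].lower() + text[1:]
        return text
    # Professional profile: capitalized start and terminal punctuation.
    text = text[0].upper() + text[1:]
    if text[-1] not in ".!?":
        text += "."
    return text
```

Under these assumed rules, `apply_style("Sounds good to me.", "Slack")` gives a lowercase start, while the same sentence dictated into a mail client keeps its capital and gains a period if one is missing.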
Filler word removal. Natural speech includes "um," "uh," "like," "you know," and other hesitation markers. WisperCode automatically removes these from the transcription during text processing, giving you clean text without manual editing. You speak naturally; WisperCode delivers polished output.
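A simple way to picture filler removal is a regular-expression pass over the transcribed text. This is a minimal sketch under assumptions: the filler list and the comma handling are illustrative, and "like" is deliberately left out because a plain pattern cannot tell the filler from the ordinary word.

```python
import re

# Assumed filler list for illustration only.
FILLERS = ("um", "uh", "you know")

# Match a filler as a whole word, plus the commas that set it off.
_FILLER_RE = re.compile(
    r"(?:,\s*)?\b(?:" + "|".join(re.escape(f) for f in FILLERS) + r")\b,?",
    flags=re.IGNORECASE,
)


def remove_fillers(text: str) -> str:
    """Strip hesitation markers and tidy the whitespace they leave behind."""
    cleaned = _FILLER_RE.sub("", text)
    cleaned = re.sub(r"\s{2,}", " ", cleaned)         # collapse doubled spaces
    cleaned = re.sub(r"\s+([,.!?])", r"\1", cleaned)  # no space before punctuation
    return cleaned.strip()
```

The word boundaries matter: they keep "um" from matching inside words like "umbrella". The sketch does not re-capitalize a sentence whose first word was a filler; a later styling pass would have to handle that.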
Vocabulary hints. Teach WisperCode your technical terms, brand names, medical jargon, or legal terminology. These hints are passed to Whisper as an initial prompt, which biases the model toward those spellings and can substantially improve accuracy for specialized vocabulary. Without hints, Whisper might hear "Kubernetes" as "cube and eighties." With hints, it is far more likely to get it right. The vocabulary hints guide walks through setup and best practices.
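To illustrate how hints could become a prompt, the sketch below joins a list of terms into a single string. The function name and the character cap are assumptions; the cap is only a crude stand-in for a token budget, since Whisper keeps a limited number of prompt tokens.

```python
from typing import Iterable


def build_initial_prompt(terms: Iterable[str], max_chars: int = 400) -> str:
    """Join unique hint terms into a short prompt string, in the order given.

    Duplicates are dropped case-insensitively; terms that would push the
    prompt past `max_chars` (an assumed cap) are left out.
    """
    seen = set()
    kept = []
    length = 0
    for term in terms:
        term = term.strip()
        key = term.lower()
        if not term or key in seen:
            continue
        extra = len(term) + (2 if kept else 0)  # account for the ", " separator
        if length + extra > max_chars:
            break
        seen.add(key)
        kept.append(term)
        length += extra
    return ", ".join(kept)
```

With the open-source `openai-whisper` Python package, a string like this can be supplied through the `initial_prompt` argument of `model.transcribe(...)`. Whether WisperCode uses that package internally is not stated here, so treat the pairing as an example.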
Voice snippets. Define short trigger phrases that expand into full text blocks. Say "sign off" and get your complete email signature. Say "bug template" and get a formatted bug report. Say "lgtm" and get your standard code review approval message. Our snippets guide covers everything from basic setup to advanced patterns.
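Snippet expansion can be thought of as a dictionary lookup on a normalized transcription. In this sketch the snippet contents, the function names, and the rule that a snippet fires only when the whole utterance equals the trigger are assumptions made for the example.

```python
import string

# Hypothetical snippet table: trigger phrase -> expansion.
SNIPPETS = {
    "sign off": "Best regards,\nAlex",
    "lgtm": "Looks good to me. Approving.",
}


def normalize(phrase: str) -> str:
    """Lowercase, collapse whitespace, and strip surrounding punctuation."""
    words = phrase.lower().split()
    return " ".join(words).strip(string.punctuation + " ")


def expand_snippet(transcript: str, snippets: dict = SNIPPETS) -> str:
    """Return the expansion if the whole utterance is a trigger, else the text."""
    return snippets.get(normalize(transcript), transcript)
```

Matching the whole utterance keeps a trigger phrase from expanding when it appears inside an ordinary sentence, such as "please sign off on this."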
Voice notes. Capture quick thoughts, reminders, and ideas by voice. Voice notes are stored locally in WisperCode's built-in notes manager, searchable and organized, without needing a separate app. Think of it as a private, voice-powered scratch pad that lives alongside your dictation tool.
Who Is WisperCode For
WisperCode is designed for anyone who types regularly and values their privacy. Here are the groups that benefit most.
Software developers. You spend a surprising amount of time writing natural language, not code. Documentation, commit messages, PR descriptions, code reviews, Slack threads, meeting notes, architecture decision records. Voice dictation handles all of that roughly 3x faster than typing, and vocabulary hints ensure technical terms like Kubernetes, PostgreSQL, and gRPC come through correctly. WisperCode's context-aware styling automatically adjusts formatting for your IDE versus Slack versus email, so you do not need to think about tone or capitalization. Read our complete developer setup guide.
Writers and content creators. First drafts flow faster when you speak them. Voice dictation helps you overcome writer's block by lowering the barrier to getting words on the page. Instead of staring at a blank screen, you just talk. WisperCode's filler removal and context styling produce drafts that need minimal editing, letting you focus on revision rather than raw output. See voice dictation for writers for workflow tips and techniques.
Medical, legal, and financial professionals. These fields require dictation tools that never transmit audio to external servers. Patient health records, attorney-client privileged communications, and financial data carry specific legal protections that cloud processing puts at risk. WisperCode's local-only architecture removes the third-party transmission risk by design. Because no audio or text reaches a vendor, there is no Business Associate Agreement or data processing addendum to negotiate for dictation. Our guide covers voice dictation for sensitive documents in depth.
Remote workers. If you work from home, text is your primary communication medium. Slack messages, emails, documentation, meeting notes, Jira tickets, pull request comments — the typing volume is relentless. Voice dictation cuts typing time by 60-70% and reduces RSI risk. And because WisperCode runs locally, you do not need to worry about your employer's sensitive information passing through a third-party cloud service. Read voice dictation for remote workers for home office setup tips.
People with accessibility needs or RSI concerns. For users with repetitive strain injury, carpal tunnel syndrome, tendinitis, or other conditions that make extended typing painful, voice dictation provides an alternative input method that gives hands and wrists time to recover. You do not need to replace typing entirely. Alternating between voice and keyboard throughout the day is enough to reduce strain significantly. See our RSI prevention guide for strategies that combine voice and keyboard input.
Getting Started
Getting from download to your first dictation takes about five minutes.
- Download. Visit the download page and grab the installer for macOS or Windows.
- Install and grant permissions. On macOS, grant microphone access and accessibility permissions when prompted. Accessibility access is required so WisperCode can type text into your active application. On Windows, run the installer and allow microphone access when prompted.
- Choose a model. WisperCode downloads a Whisper model on first launch. The base model (150 MB) is the default and works well on any modern machine. If you want better accuracy and have RAM to spare, the small model (500 MB) is a worthwhile upgrade. You can change models at any time in Settings.
- Configure your hotkey. The default is Ctrl+Space. Choose the hotkey mode that matches your workflow: hold, toggle, press, or double-press. If the default conflicts with your IDE's autocomplete, remap it to something like Ctrl+Shift+Space or a function key.
- Start talking. Open any application, place your cursor where you want text to appear, activate your hotkey, and speak naturally. Release (or press again, depending on mode), and your words appear.
For a detailed walkthrough with platform-specific instructions and screenshots, see our setup guide for Mac and Windows or the 5-minute getting started guide.
Why Not Just Use macOS Dictation or Google Voice Typing?
Built-in dictation tools work, but they come with trade-offs that matter for serious use.
Privacy. Apple Dictation and Google Voice Typing both send audio to the cloud by default. Apple has added on-device processing for some languages on newer hardware, but it is not guaranteed, and the user has limited visibility into when audio is being sent versus processed locally. WisperCode is local-only, always.
Features. Built-in dictation tools are basic. They transcribe speech and insert text. They do not offer vocabulary hints for technical terms, filler word removal, context-aware styling that adapts to your application, voice snippets, or configurable hotkey modes. These features make the difference between a tool you try once and a tool you use all day.
Consistency. WisperCode works the same way on macOS and Windows. If you switch between platforms or use both, you get the same workflow, the same hotkeys, the same features. Built-in tools vary significantly between operating systems.
For a detailed head-to-head, see WisperCode vs macOS Dictation. For a broader view of the landscape, our best voice dictation software for 2026 roundup covers every major option.
What Is Next
WisperCode is in beta for macOS and Windows. We are actively working on:
- More Whisper model options for different hardware profiles
- Expanded vocabulary hint capabilities
- Additional hotkey customization
- Performance optimizations for older hardware
We are building WisperCode in the open and listening closely to feedback. The features on this list are shaped by what beta users tell us matters most.
Frequently Asked Questions
What Whisper models does WisperCode support?
WisperCode supports all five Whisper model sizes: tiny (39M parameters, 75 MB), base (74M, 150 MB), small (244M, 500 MB), medium (769M, 1.5 GB), and large-v3 (1.55B, 3 GB). You can switch between models at any time in Settings. Models are downloaded once and cached locally. See Whisper model sizes compared for a detailed breakdown of accuracy, speed, and hardware requirements for each model.
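The sizes above can be written down as a small table and queried, for example to find the largest model that fits a download budget. The helper name is hypothetical, and the megabyte figures for medium and large-v3 are rounded conversions of the 1.5 GB and 3 GB values quoted above.

```python
# (name, parameters in millions, approximate download size in MB)
MODELS = [
    ("tiny", 39, 75),
    ("base", 74, 150),
    ("small", 244, 500),
    ("medium", 769, 1500),
    ("large-v3", 1550, 3000),
]


def largest_model_within(budget_mb: int):
    """Return the biggest model whose download fits the budget, or None."""
    fitting = [name for name, _params, size in MODELS if size <= budget_mb]
    return fitting[-1] if fitting else None
```

Download size is only one constraint. Memory use and transcription speed on your hardware matter as well, and the model size comparison covers those.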
Is WisperCode really free?
Yes. WisperCode is free during the beta period on both macOS and Windows. There are no usage limits, no premium tiers, and no per-minute transcription costs. Because all processing happens locally using the open-source Whisper model, there are no cloud API costs to pass along. The Whisper model itself is released under the MIT license and is free for any use.
What platforms does WisperCode run on?
WisperCode runs on macOS (Apple Silicon and Intel) and Windows (10 and 11). It works with any application that accepts text input, including code editors, browsers, email clients, chat apps, word processors, and terminal emulators. There is no plugin or extension required for any application. WisperCode types text at the OS level, so from your application's perspective, it looks like normal keyboard input.
Does WisperCode work completely offline?
Yes. After the one-time model download (which requires an internet connection), WisperCode runs entirely offline. No internet connection is needed for recording, transcription, or text insertion. You can use it on an airplane, in a secure facility, or anywhere without connectivity. For a deeper explanation of why local processing matters, read why local speech recognition changes everything.
How does WisperCode compare to macOS Dictation or Google Voice Typing?
Built-in dictation tools from Apple and Google send your audio to the cloud by default, although Apple offers on-device processing for some languages on newer hardware. WisperCode always processes everything locally. This means better privacy, offline capability, and no usage limits. WisperCode also offers features that built-in dictation tools lack: vocabulary hints, context-aware styling, filler word removal, voice snippets, and configurable hotkey modes. For a detailed comparison with macOS Dictation specifically, see WisperCode vs macOS Dictation. For a broader comparison across dictation tools, see our best voice dictation software for 2026 roundup.
Download WisperCode today and let us know what you think. New to voice dictation? Start with our setup guide for Mac and Windows, or explore free voice-to-text tools to see how WisperCode fits into the broader landscape.