Ghost Pepper: Local Hold-to-Talk Speech-to-Text for macOS

Ghost Pepper is a newly released open-source utility for macOS that provides fully local hold-to-talk speech-to-text. Aimed at privacy-conscious users, the application processes all audio and transcription on-device on Apple Silicon, so no data leaves the machine. Users hold the Control key to record; on release, the speech is transcribed and pasted into the active text field. A cleanup step powered by local Large Language Models (LLMs) removes filler words and resolves spoken self-corrections. Ghost Pepper supports macOS 14.0 and later, is built on WhisperKit and LLM.swift, and runs from the menu bar without cloud APIs or external data logging.

Key Takeaways

Privacy-First Architecture: Ghost Pepper runs 100% locally on macOS, ensuring no data is sent to cloud APIs or external servers.
Hold-to-Talk Workflow: Users can hold the Control key to record and release it to automatically transcribe and paste text into any field.
Local AI Processing: Utilizes WhisperKit for speech-to-text and local LLMs (like Qwen 3.5) for smart cleanup and self-correction.
Hardware Optimized: Specifically designed for Apple Silicon (M1+) and requires macOS 14.0 or later.
Customizable Experience: Offers various model sizes for both speech and cleanup, allowing users to balance speed and accuracy.

In-Depth Analysis

Local Processing and Privacy Framework

Ghost Pepper distinguishes itself by operating entirely on the user's local machine. By running on Apple Silicon, the application eliminates the need for cloud-based transcription services, which often raise privacy concerns. Transcriptions are never written to disk; debug logs are held in memory only and are cleared when the application quits. The models themselves are served via Hugging Face: they are downloaded on first use and cached locally, after which transcription and cleanup run on-device.
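
To make the in-memory logging idea concrete, here is a minimal Python sketch of a debug log that never touches disk. It is an illustration only, not Ghost Pepper's actual code (the app is a native macOS application); the class name InMemoryDebugLog and the max_entries bound are assumptions introduced for this example.

```python
from collections import deque


class InMemoryDebugLog:
    """Hypothetical debug log that exists only in process memory."""

    def __init__(self, max_entries: int = 500) -> None:
        # A deque with maxlen drops the oldest entry once the bound is reached.
        # The bound is an assumption of this sketch, not a documented app setting.
        self._entries = deque(maxlen=max_entries)

    def add(self, message: str) -> None:
        self._entries.append(message)

    def snapshot(self) -> list[str]:
        # Return a copy so callers cannot mutate the log.
        return list(self._entries)

    def clear(self) -> None:
        # Would be called on quit; nothing was ever written to a file.
        self._entries.clear()


log = InMemoryDebugLog(max_entries=3)
for event in ["recording started", "recording stopped", "transcribed", "pasted"]:
    log.add(event)

recent = log.snapshot()  # only the three most recent events remain
```

Because the entries live in an ordinary in-process collection, they vanish when the process exits even if clear() is never called.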

Technical Implementation and Model Options

The application uses a two-stage, dual-model pipeline. For speech recognition, it builds on WhisperKit and offers speech models ranging from the ~75 MB "Whisper tiny.en" for maximum speed to the ~1.4 GB "Parakeet v3" for multilingual support. After transcription, a second "Cleanup" stage runs. Powered by LLM.swift and Qwen 3.5 models (ranging from 0.8B to 4B parameters), this stage removes filler words and handles self-corrections. Users can customize the cleanup prompt, select a specific microphone, and choose model sizes that suit their hardware.
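
The two-stage flow (transcribe, then clean up) can be sketched in a few lines of Python. This is a rough illustration under stated assumptions: the function names are hypothetical, fake_transcribe stands in for the speech model, and the regex-based rule_based_cleanup is a toy substitute for the LLM cleanup stage. It removes a few filler words but does not attempt self-correction handling.

```python
import re
from typing import Callable

# Toy filler list; the real app delegates this step to a local LLM.
FILLERS = re.compile(r"\b(?:um+|uh+|er+)\b,?\s*", re.IGNORECASE)


def rule_based_cleanup(text: str) -> str:
    """Stand-in for the cleanup stage: strips standalone filler words."""
    cleaned = FILLERS.sub("", text)
    cleaned = re.sub(r"\s{2,}", " ", cleaned).strip()
    # Re-capitalize in case a leading filler was removed.
    return cleaned[:1].upper() + cleaned[1:] if cleaned else cleaned


def run_pipeline(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    cleanup: Callable[[str], str],
) -> str:
    """Stage 1 turns audio into raw text; stage 2 refines that text."""
    raw = transcribe(audio)
    return cleanup(raw)


def fake_transcribe(audio: bytes) -> str:
    # Hypothetical placeholder for a speech model; ignores its input.
    return "um so I think uh we should ship it on Friday"


result = run_pipeline(b"", fake_transcribe, rule_based_cleanup)
print(result)
```

Keeping the two stages behind plain function interfaces mirrors the article's point that speech and cleanup models can be swapped independently to trade speed for accuracy.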

User Interface and Accessibility

Designed as a lightweight menu bar app, Ghost Pepper avoids dock clutter and can be set to launch at login. It delivers text by simulating keystrokes, which requires Accessibility permissions and lets it work in virtually any text-entry field on macOS. Combined with the global hold-to-talk hotkey, this removes the manual copy-and-paste step between speaking and written text.
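
The hold-to-talk mechanic is essentially a two-state machine: idle until the key goes down, recording until it comes up. The Python sketch below models only that control flow. The HoldToTalk class, its method names, and the injected transcribe and paste callbacks are all hypothetical; real key monitoring and keystroke simulation on macOS are outside the scope of this example.

```python
from enum import Enum, auto
from typing import Callable, List


class State(Enum):
    IDLE = auto()
    RECORDING = auto()


class HoldToTalk:
    """Hypothetical controller: key down records, key up transcribes and pastes."""

    def __init__(
        self,
        transcribe: Callable[[List[bytes]], str],
        paste: Callable[[str], None],
    ) -> None:
        self.state = State.IDLE
        self._chunks: List[bytes] = []
        self._transcribe = transcribe
        self._paste = paste

    def key_down(self) -> None:
        # Key-repeat events while already recording are ignored.
        if self.state is State.IDLE:
            self._chunks = []
            self.state = State.RECORDING

    def audio_chunk(self, chunk: bytes) -> None:
        # Audio arriving while the key is not held is dropped.
        if self.state is State.RECORDING:
            self._chunks.append(chunk)

    def key_up(self) -> None:
        if self.state is not State.RECORDING:
            return  # stray release with no matching press
        self.state = State.IDLE
        chunks, self._chunks = self._chunks, []
        if chunks:  # nothing captured means nothing to paste
            self._paste(self._transcribe(chunks))


pasted: List[str] = []
controller = HoldToTalk(
    transcribe=lambda chunks: f"{len(chunks)} chunks transcribed",
    paste=pasted.append,
)
controller.audio_chunk(b"ignored")  # key not held yet
controller.key_down()
controller.key_down()               # repeat event, no effect
controller.audio_chunk(b"a")
controller.audio_chunk(b"b")
controller.key_up()
controller.key_up()                 # stray release, no effect
```

Guarding every transition on the current state is what keeps repeated key events and stray releases from producing duplicate or empty pastes.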

Industry Impact

The launch of Ghost Pepper highlights a growing trend toward on-device AI tools that prioritize user privacy over cloud convenience. By building on specialized frameworks like WhisperKit and LLM.swift, the project demonstrates that consumer-grade hardware (Apple Silicon) can increasingly handle AI tasks such as near-real-time transcription and LLM-based text refinement. This shift could influence how developers approach productivity tools, moving away from subscription-based API models toward open-source, on-device solutions that avoid network round trips and keep data on the user's machine.

Frequently Asked Questions

Question: What are the system requirements for Ghost Pepper?

Ghost Pepper requires a Mac with Apple Silicon (M1 or newer) running macOS 14.0 or later. It also needs Microphone permission to record audio and Accessibility permission to paste transcriptions.

Question: Does Ghost Pepper store any of my voice recordings or transcripts?

No. The application is designed with a privacy-first approach where no data leaves your machine. Transcriptions are never written to files, and debug logs are stored only in-memory, disappearing when the app is closed.

Question: Can I use Ghost Pepper for languages other than English?

Yes. While the default models are optimized for English, Ghost Pepper supports multilingual options, including the Whisper small (multilingual) model and the Parakeet v3 model, which supports 25 languages.
