hotmic
CLI tool that keeps a rolling mic buffer and saves the last N minutes of audio on demand — with transcription, speaker diarization, and AI-powered summaries.
When something worth keeping happens, press a hotkey (or run a CLI command) and the last X minutes are written to a WAV file. Optionally, hotmic can transcribe the audio, identify speakers, and generate meeting notes automatically.
Start hotmic listen and it keeps the microphone recording into a fixed-size rolling buffer. As new audio comes in, the oldest audio is discarded, so the buffer always holds the most recent N minutes. At any point, you can save the last X minutes (where X <= N) to a WAV file. No audio is written to disk until you explicitly ask for it.
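The rolling buffer described above can be sketched in a few lines. This is an illustrative sketch only, not hotmic's actual implementation: the RollingBuffer class and its method names are hypothetical, and it leans on collections.deque with maxlen to discard the oldest samples.

```python
from collections import deque


class RollingBuffer:
    """Keep only the most recent `capacity_s` seconds of samples (hypothetical sketch)."""

    def __init__(self, capacity_s: float, rate: int = 44100) -> None:
        self.rate = rate
        # A deque with maxlen silently drops the oldest items once it is full.
        self._samples = deque(maxlen=int(capacity_s * rate))

    def write(self, chunk) -> None:
        """Append new samples; the oldest ones fall off the front when full."""
        self._samples.extend(chunk)

    def last(self, seconds: float) -> list:
        """Return the newest `seconds` of audio, or everything held if that is less."""
        n = min(int(seconds * self.rate), len(self._samples))
        return list(self._samples)[len(self._samples) - n:]


buf = RollingBuffer(capacity_s=3, rate=4)  # tiny numbers for illustration
buf.write(range(20))                       # 20 samples into a 12-sample buffer
tail = buf.last(1)                         # newest 1 second = 4 samples
```

Nothing here touches the disk; a save would take the result of last() and write it out as a WAV file.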
Note
This project was built and tested for macOS. It most likely works only on macOS; system audio capture in particular depends on macOS Core Audio taps.
Install (macOS)
brew install portaudio
pip install -e .
With optional features:
pip install -e '.[transcribe]' # + mlx-whisper for transcription
pip install -e '.[vad]' # + silero-vad for live transcription / speech detection
pip install -e '.[diarize]' # + transcription + speaker diarization
pip install -e '.[all]' # everything
Usage
# Start listening with a 30-minute rolling buffer
hotmic listen --buffer 30
# In another terminal (or via hotkey):
hotmic save 5 # save last 5 minutes
hotmic save 5 --name "Weekly Review"
hotmic save # save entire buffer
hotmic buffer 120 # increase live buffer to 120 minutes
hotmic pause # mute mic
hotmic resume # unmute
hotmic status # buffer stats (prints in listen terminal)
Interactive commands also work directly in the listen terminal: save [min] --name "Meeting Name", buffer <min>, pause, resume, status, q.
System audio capture (meeting recording)
Capture both your mic and system audio (Zoom/Meet/Teams) without touching your audio routing:
hotmic listen --buffer 60 --system-audio --diarize --summarize
Uses audiotee to passively tap system audio via macOS Core Audio taps (macOS 14.2+). Your meeting runs normally — no virtual audio drivers, no aggregate devices, no interference.
First run will prompt for "System Audio Recording" permission in System Settings.
When system audio capture is enabled, each save writes:
- audio.wav: mixed mono mic + system audio
- mic.wav: microphone only
- system.wav: system audio only
- audio_stereo.wav: stereo split, mic on left and system audio on right
- metadata.json: meeting name, save time, duration, sample rate, and file list
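To make the difference between the mixed mono file and the stereo split concrete, here is a minimal sketch using only the standard library. It assumes 16-bit PCM and mixes by averaging the two tracks; both are assumptions for illustration, and the function names are hypothetical rather than taken from hotmic.

```python
import io
import struct
import wave


def mix_mono(mic, system):
    """Average two equal-length int16 tracks into one mono track (assumed mixing rule)."""
    return [(m + s) // 2 for m, s in zip(mic, system)]


def interleave_stereo(mic, system):
    """Interleave samples as L, R, L, R, ... with the mic on the left channel."""
    out = []
    for m, s in zip(mic, system):
        out.extend((m, s))
    return out


def write_wav(target, samples, channels, rate=44100):
    """Write int16 samples to a path or file-like object as 16-bit PCM."""
    with wave.open(target, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wf.setframerate(rate)
        wf.writeframes(struct.pack(f"<{len(samples)}h", *samples))


mic = [1000, -2000, 3000]
system = [3000, 2000, -1000]
mono = mix_mono(mic, system)
stereo = interleave_stereo(mic, system)

stereo_file = io.BytesIO()  # in-memory stand-in for audio_stereo.wav
write_wav(stereo_file, stereo, channels=2)
```

Keeping mic.wav and system.wav as separate files means either track can be transcribed on its own later.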
Transcription & summarization
Transcription is on by default, gated by Silero VAD:
- Live transcript: while listening, speech is detected and segmented into utterances, each transcribed as it finishes and appended to a per-session live_<timestamp>.txt in the output directory. Silence is never sent to whisper.
- On save: every save is transcribed (.txt + .srt), unless VAD finds no speech in it, in which case transcription is skipped.
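The VAD gating above amounts to turning per-frame speech decisions into utterance ranges. The sketch below shows that idea under stated assumptions: the input is a plain list of 0/1 speech flags (a stand-in for real VAD output, not the Silero VAD API), and speech_segments and min_silence are hypothetical names.

```python
def speech_segments(flags, min_silence=2):
    """Group per-frame speech flags into (start, end) frame ranges, end exclusive.

    An utterance ends once `min_silence` consecutive non-speech frames follow it.
    """
    segments, start, silence = [], None, 0
    for i, is_speech in enumerate(flags):
        if is_speech:
            if start is None:
                start = i          # a new utterance begins
            silence = 0
        elif start is not None:
            silence += 1
            if silence >= min_silence:
                # Close the utterance at the first silent frame of this gap.
                segments.append((start, i - silence + 1))
                start, silence = None, 0
    if start is not None:
        # Input ended mid-utterance; trim any trailing silent frames.
        segments.append((start, len(flags) - silence))
    return segments


flags = [0, 1, 1, 0, 1, 0, 0, 0, 1, 1]
segments = speech_segments(flags)
silent = speech_segments([0, 0, 0, 0])
```

Only the frames inside each range would be handed to the transcriber. An empty result corresponds to the "no speech" case, where transcription is skipped.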
# Default: live VAD-gated transcription + transcribe every save
hotmic listen --buffer 30
# Disable all transcription
hotmic listen --buffer 30 --no-transcribe
# Auto-transcribe with speaker diarization
hotmic listen --buffer 30 --diarize
# Auto-transcribe + diarize + generate meeting notes
hotmic listen --buffer 30 --diarize --summarize
# Transcribe an existing file
hotmic transcribe recording.wav
hotmic transcribe hotmic_20260429_103000/mic.wav
hotmic transcribe hotmic_20260429_103000/system.wav
hotmic transcribe recording.wav --diarize
# Summarize an existing transcript
hotmic summarize recording.txt
Transcription uses mlx-whisper (Apple Silicon optimized). Diarization runs locally via diarize, or is offloaded to a remote GPU service (see below). Summarization uses claude -p.
Remote diarization (offload to a GPU box)
Diarization is the most compute-heavy step. Instead of running it on the Mac,
you can offload it to a self-hosted pyannote
service — the diarization-service
project (a separate repo, meant for a GPU machine). Point hotmic at it:
export HOTMIC_DIARIZE_URL=https://diarize.example.com
export HOTMIC_DIARIZE_TOKEN=<shared secret>
hotmic listen --diarize # diarization now runs on the remote GPU
When set, hotmic downsamples each save to 16 kHz mono and POSTs it to the
service (bearer-token auth), running the request concurrently with local
transcription so recording and live transcription are never blocked. If the
service is unreachable the transcript is still written, just without speaker
labels — there is no silent local fallback. The service, its API, and full
deployment/security details live in the separate diarization-service project
(DEPLOY.md there).
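The concurrency pattern described above (a remote request in flight while local transcription runs, with the transcript still written if the service fails) can be sketched as follows. Everything here is hypothetical: transcribe_locally and diarize_remotely are stubs, not hotmic or diarization-service APIs, and the turn format is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def transcribe_locally(audio):
    # Hypothetical stand-in for the local transcription step.
    return [{"start": 0.0, "end": 2.0, "text": "hello"}]


def diarize_remotely(audio):
    # Hypothetical stand-in for the HTTP request; here it simulates an outage.
    raise ConnectionError("diarization service unreachable")


def label_speakers(segments, turns):
    """Attach the speaker whose turn contains each segment's start time."""
    labelled = []
    for seg in segments:
        speaker = next((t["speaker"] for t in turns
                        if t["start"] <= seg["start"] < t["end"]), None)
        labelled.append({**seg, "speaker": speaker})
    return labelled


def process_save(audio, diarize=diarize_remotely):
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(diarize, audio)   # remote call runs in the background
        segments = transcribe_locally(audio)   # local work proceeds meanwhile
        try:
            turns = future.result()
        except OSError:                        # network failures (ConnectionError, timeouts)
            return segments                    # transcript without speaker labels
    return label_speakers(segments, turns)


unlabelled = process_save(b"")
labelled = process_save(
    b"", diarize=lambda a: [{"start": 0.0, "end": 5.0, "speaker": "SPEAKER_00"}]
)
```

The failure branch returns the plain transcript rather than running a local diarizer, which mirrors the "no silent local fallback" behavior.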
Bookmarks
Drop timestamp markers during recording, then save specific segments:
# From another terminal (or via hotkey):
hotmic mark meeting-start # drop a named bookmark
hotmic mark meeting-end # drop another
hotmic marks # list all marks (in listen terminal)
hotmic save --since-mark --name "Customer Call" # save from last mark to now
hotmic save --between-marks --name "Design Review" # save between last two marks
Interactive commands: mark [label], marks, save [min] --name "Meeting Name", save --since-mark, save --between-marks.
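As a rough illustration of how --since-mark and --between-marks could map to time ranges, the sketch below stores marks as (label, seconds) pairs in the order they were dropped. The data layout and function names are assumptions for illustration, not hotmic internals.

```python
def since_mark(marks, now):
    """Time range from the most recent mark up to `now`."""
    if not marks:
        raise ValueError("no marks set")
    return marks[-1][1], now


def between_marks(marks):
    """Time range between the last two marks."""
    if len(marks) < 2:
        raise ValueError("need at least two marks")
    return marks[-2][1], marks[-1][1]


# (label, seconds since listening started), oldest first
marks = [("meeting-start", 120.0), ("meeting-end", 900.0)]

call_range = since_mark(marks, now=1000.0)
review_range = between_marks(marks)
```

A real save would also have to clamp the range to what the rolling buffer still holds, since audio older than the buffer length is gone.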
Growing the live buffer
Increase retention at runtime without restarting listen:
hotmic buffer 120
This preserves audio that is still in the current rolling buffer and allows future audio to fill the larger capacity. Audio already overwritten before the resize cannot be recovered. Shrinking is not supported while recording.
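The resize semantics above can be illustrated with a deque-based buffer: growing copies whatever is still held into a larger container, and samples dropped earlier stay lost. This is a sketch under the assumption of a deque-backed buffer; the grow function is hypothetical.

```python
from collections import deque


def grow(buffer: deque, new_maxlen: int) -> deque:
    """Return a larger rolling buffer holding the same samples, oldest first."""
    if buffer.maxlen is not None and new_maxlen < buffer.maxlen:
        raise ValueError("shrinking is not supported")
    return deque(buffer, maxlen=new_maxlen)


small = deque(range(10), maxlen=4)  # holds 6, 7, 8, 9; samples 0-5 are already gone
large = grow(small, 8)              # same contents, more room
large.extend([10, 11])              # new audio fills the extra capacity
```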
skhd integration
cmd + shift - s : hotmic save 5
cmd + shift - a : hotmic save
cmd + shift - m : hotmic mark
cmd + shift - p : hotmic pause
cmd + shift - r : hotmic resume
Options
-b --buffer=<min> Buffer size in minutes [default: 5]
-o --output=<dir> Output directory [default: ./recordings]
-r --rate=<hz> Sample rate in Hz [default: 44100]
--system-audio Capture system audio via audiotee (macOS 14.2+)
--name=<name> Meeting name to prefix the save directory
--no-transcribe Disable transcription (live and on save); on by default
--diarize Identify speakers (requires diarize package)
--summarize Generate meeting notes (requires claude CLI)
Use --rate 16000 if you only care about voice: it cuts memory use by about 2.75x (44100 / 16000).
Memory
RAM use grows lazily as the buffer fills. Peak usage when the buffer is full:
| Buffer | 44100 Hz | 16000 Hz |
|---|---|---|
| 5 min | 26 MB | 9 MB |
| 30 min | 159 MB | 58 MB |
| 60 min | 317 MB | 115 MB |
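The figures in the table are consistent with 16-bit mono samples and decimal megabytes. That sample format is an inference from the numbers, not something stated above, so treat this calculation as a sketch under that assumption; buffer_mb is a hypothetical helper.

```python
def buffer_mb(minutes: float, rate: int, bytes_per_sample: int = 2, channels: int = 1) -> float:
    """Peak buffer size in megabytes (10^6 bytes), assuming 16-bit mono by default."""
    return minutes * 60 * rate * bytes_per_sample * channels / 1e6


for minutes in (5, 30, 60):
    at_44k = buffer_mb(minutes, 44100)
    at_16k = buffer_mb(minutes, 16000)
    print(f"{minutes:>3} min: {at_44k:6.1f} MB at 44100 Hz, {at_16k:6.1f} MB at 16000 Hz")
```

The same function gives a quick estimate for other buffer sizes, for example buffer_mb(120, 16000) for a two-hour voice-only buffer.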