mirror of https://github.com/vipul-sharma20/hotmic.git synced 2026-06-28 07:13:01 +00:00

CLI to record rolling mic buffer and save the last N minutes of audio on demand.

Python 100%

Vipul Sharma 44622a8cfa feat: speaker diarization using remote diarization service		2026-06-13 15:55:11 +05:30

hotmic

CLI tool that keeps a rolling mic buffer and saves the last N minutes of audio on demand — with transcription, speaker diarization, and AI-powered summaries.

When something worth keeping happens, press a hotkey (or run a CLI command) and the last X minutes are written to a WAV file. Optionally transcribe it, identify speakers, and generate meeting notes automatically.

Run hotmic listen and the microphone records continuously into a fixed-size rolling buffer. As new audio comes in, the oldest audio is discarded, so the buffer always holds the most recent N minutes. At any point, you can save the last X minutes (where X <= N) to a WAV file. No audio is written to disk until you explicitly ask.
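The rolling buffer described above can be sketched as a ring buffer. This is a minimal illustration, not hotmic's actual implementation: the class name RollingBuffer, its methods, and the choice of 16-bit mono samples are all assumptions. It also preallocates its storage for simplicity, whereas the Memory section says hotmic's RAM grows lazily.

```python
from array import array


class RollingBuffer:
    """Fixed-capacity ring buffer of 16-bit mono samples (hypothetical sketch)."""

    def __init__(self, minutes, rate=44100):
        self.rate = rate
        self.capacity = int(minutes * 60 * rate)      # capacity in samples
        self._data = array("h", [0]) * self.capacity  # preallocated storage
        self._write = 0                               # next index to write
        self._filled = 0                              # number of valid samples

    def write(self, samples):
        """Append samples, overwriting the oldest ones once the buffer is full."""
        for s in samples:
            self._data[self._write] = s
            self._write = (self._write + 1) % self.capacity
        self._filled = min(self.capacity, self._filled + len(samples))

    def last(self, minutes=None):
        """Return the most recent `minutes` of audio (everything if None)."""
        n = self._filled
        if minutes is not None:
            n = min(n, int(minutes * 60 * self.rate))
        start = (self._write - n) % self.capacity
        if start + n <= self.capacity:
            return self._data[start:start + n]
        # The requested span wraps around the end of the storage.
        return self._data[start:] + self._data[:(start + n) % self.capacity]
```

Calling last(5) on a 30-minute buffer returns only the newest 5 minutes; nothing in this sketch touches the disk until the caller writes the returned samples out.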

Note

This project was built and tested on macOS and most likely works only there. System audio capture in particular depends on macOS Core Audio taps.

Install (macOS)

brew install portaudio
pip install -e .

With optional features:

pip install -e '.[transcribe]'   # + mlx-whisper for transcription
pip install -e '.[vad]'          # + silero-vad for live transcription / speech detection
pip install -e '.[diarize]'      # + transcription + speaker diarization
pip install -e '.[all]'          # everything

Usage

# Start listening with a 30-minute rolling buffer
hotmic listen --buffer 30

# In another terminal (or via hotkey):
hotmic save 5           # save last 5 minutes
hotmic save 5 --name "Weekly Review"
hotmic save             # save entire buffer
hotmic buffer 120       # increase live buffer to 120 minutes
hotmic pause            # mute mic
hotmic resume           # unmute
hotmic status           # buffer stats (prints in listen terminal)

Interactive commands also work directly in the listen terminal: save [min] --name "Meeting Name", buffer <min>, pause, resume, status, q.

System audio capture (meeting recording)

Capture both your mic and system audio (Zoom/Meet/Teams) without touching your audio routing:

hotmic listen --buffer 60 --system-audio --diarize --summarize

Uses audiotee to passively tap system audio via macOS Core Audio taps (macOS 14.2+). Your meeting runs normally — no virtual audio drivers, no aggregate devices, no interference.

First run will prompt for "System Audio Recording" permission in System Settings.

When system audio capture is enabled, each save writes:

audio.wav — mixed mono mic + system audio
mic.wav — microphone only
system.wav — system audio only
audio_stereo.wav — stereo split, mic on left and system audio on right
metadata.json — meeting name, save time, duration, sample rate, and file list
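How the mixed and stereo files relate to the two source tracks can be sketched as follows. This is an illustration under assumptions, not hotmic's code: the function names are hypothetical, both tracks are assumed to be equal-length 16-bit samples at the same rate, and averaging is only one possible way to mix.

```python
import sys
import wave
from array import array


def mix_mono(mic, system):
    """Average two equal-length int16 tracks into one mono track."""
    return array("h", ((m + s) // 2 for m, s in zip(mic, system)))


def interleave_stereo(mic, system):
    """Interleave samples as L, R, L, R: mic on the left, system on the right."""
    out = array("h")
    for m, s in zip(mic, system):
        out.append(m)
        out.append(s)
    return out


def write_wav(target, samples, rate, channels):
    """Write int16 samples to a WAV file (path or binary file object)."""
    data = array("h", samples)
    if sys.byteorder == "big":
        data.byteswap()  # WAV stores samples little-endian
    with wave.open(target, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(2)  # 2 bytes = 16-bit
        w.setframerate(rate)
        w.writeframes(data.tobytes())
```

With these helpers, audio.wav would come from mix_mono(...) written with channels=1, and audio_stereo.wav from interleave_stereo(...) written with channels=2.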

Transcription & summarization

Transcription is on by default, gated by Silero VAD:

Live transcript: while listening, speech is detected and segmented into utterances, each transcribed as it finishes and appended to a per-session live_<timestamp>.txt in the output directory. Silence is never sent to whisper.
On save: every save is transcribed (.txt + .srt), unless VAD finds no speech in it — then transcription is skipped.
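The idea of segmenting speech into utterances can be sketched without any VAD library. The function below assumes some detector has already produced one boolean per audio frame. The name segment_utterances, the frame length, and the silence threshold are illustrative assumptions, not Silero VAD's API or hotmic's settings.

```python
def segment_utterances(flags, frame_ms=30, min_silence_ms=300):
    """Group per-frame speech flags into (start_ms, end_ms) utterances.

    An utterance closes once min_silence_ms of consecutive non-speech
    frames has been seen; leading and inter-utterance silence is dropped.
    """
    utterances = []
    start = None   # frame index where the current utterance began
    silence = 0    # consecutive non-speech frames inside an utterance
    for i, speech in enumerate(flags):
        if speech:
            if start is None:
                start = i
            silence = 0
        elif start is not None:
            silence += 1
            if silence * frame_ms >= min_silence_ms:
                end = i - silence + 1  # index just past the last speech frame
                utterances.append((start * frame_ms, end * frame_ms))
                start, silence = None, 0
    if start is not None:
        # Input ended mid-utterance: close it at the last speech frame.
        end = len(flags) - silence
        utterances.append((start * frame_ms, end * frame_ms))
    return utterances
```

Each returned range is what would be handed to the transcriber. If the list is empty, there is no speech and transcription can be skipped, matching the on-save behavior described above.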

# Default: live VAD-gated transcription + transcribe every save
hotmic listen --buffer 30

# Disable all transcription
hotmic listen --buffer 30 --no-transcribe

# Auto-transcribe with speaker diarization
hotmic listen --buffer 30 --diarize

# Auto-transcribe + diarize + generate meeting notes
hotmic listen --buffer 30 --diarize --summarize

# Transcribe an existing file
hotmic transcribe recording.wav
hotmic transcribe hotmic_20260429_103000/mic.wav
hotmic transcribe hotmic_20260429_103000/system.wav
hotmic transcribe recording.wav --diarize

# Summarize an existing transcript
hotmic summarize recording.txt

Transcription uses mlx-whisper (Apple Silicon optimized). Diarization runs locally via diarize, or is offloaded to a remote GPU service (see below). Summarization uses claude -p.

Remote diarization (offload to a GPU box)

Diarization is the most compute-heavy step. Instead of running it on the Mac, you can offload it to a self-hosted pyannote service — the diarization-service project (a separate repo, meant for a GPU machine). Point hotmic at it:

export HOTMIC_DIARIZE_URL=https://diarize.example.com
export HOTMIC_DIARIZE_TOKEN=<shared secret>
hotmic listen --diarize        # diarization now runs on the remote GPU

When set, hotmic downsamples each save to 16 kHz mono and POSTs it to the service (bearer-token auth), running the request concurrently with local transcription so recording and live transcription are never blocked. If the service is unreachable the transcript is still written, just without speaker labels — there is no silent local fallback. The service, its API, and full deployment/security details live in the separate diarization-service project (DEPLOY.md there).
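The concurrency and failure behavior described above can be sketched with a thread pool. Everything here is a stand-in: process_save, label, and the two callables are hypothetical names, the tuple shapes are assumptions, and the real request format belongs to the diarization-service project. The remote callable is assumed to enforce its own network timeout.

```python
from concurrent.futures import ThreadPoolExecutor


def label(segments, turns):
    """Attach to each transcript segment the speaker whose turn overlaps it most."""
    out = []
    for start, end, text in segments:
        speaker, best = None, 0.0
        for t_start, t_end, name in turns or []:
            overlap = min(end, t_end) - max(start, t_start)
            if overlap > best:
                best, speaker = overlap, name
        out.append((speaker, text))
    return out


def process_save(wav_path, transcribe, diarize_remote):
    """Run local transcription while the remote diarization request is in flight.

    transcribe(path)     -> [(start_s, end_s, text), ...]      (hypothetical)
    diarize_remote(path) -> [(start_s, end_s, speaker), ...]   (hypothetical)
    """
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(diarize_remote, wav_path)  # network call in background
        segments = transcribe(wav_path)                 # local work in the meantime
        try:
            turns = future.result()
        except Exception:
            turns = None  # service unreachable: keep the transcript, drop labels
    return label(segments, turns)
```

If the remote call raises, every segment comes back with a speaker of None, which mirrors the "transcript without speaker labels" outcome rather than falling back to local diarization.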

Bookmarks

Drop timestamp markers during recording, then save specific segments:

# From another terminal (or via hotkey):
hotmic mark meeting-start    # drop a named bookmark
hotmic mark meeting-end      # drop another
hotmic marks                 # list all marks (in listen terminal)
hotmic save --since-mark --name "Customer Call"     # save from last mark to now
hotmic save --between-marks --name "Design Review"  # save between last two marks

Interactive commands: mark [label], marks, save [min] --name "Meeting Name", save --since-mark, save --between-marks.
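One way to think about --since-mark and --between-marks is as a time range resolved against the list of marks, then cut out of the buffer. The sketch below assumes marks are stored as (label, timestamp_in_seconds) pairs in the order they were dropped. The function names and that storage format are illustrative, not taken from hotmic's source.

```python
def since_mark(marks, now):
    """Time range from the most recent mark to now."""
    if not marks:
        raise ValueError("no marks set")
    return marks[-1][1], now


def between_marks(marks):
    """Time range between the last two marks."""
    if len(marks) < 2:
        raise ValueError("need at least two marks")
    return marks[-2][1], marks[-1][1]


def slice_samples(buffer, buffer_end_time, rate, start, end):
    """Cut [start, end) seconds out of a buffer whose newest sample
    was captured at buffer_end_time. Ranges older than the buffer are clipped."""
    buffer_start_time = buffer_end_time - len(buffer) / rate
    i0 = max(0, round((start - buffer_start_time) * rate))
    i1 = min(len(buffer), round((end - buffer_start_time) * rate))
    return buffer[i0:i1]
```

Because the range is clipped to the buffer, a mark older than the rolling window would yield only the audio that is still retained.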

Growing the live buffer

Increase retention at runtime without restarting listen:

hotmic buffer 120

This preserves audio that is still in the current rolling buffer and allows future audio to fill the larger capacity. Audio already overwritten before the resize cannot be recovered. Shrinking is not supported while recording.
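The resize semantics can be illustrated with a bounded deque standing in for the audio buffer. This is a sketch only: grow is a hypothetical helper, and hotmic's real buffer is not necessarily a deque.

```python
from collections import deque


def grow(buffer, new_maxlen):
    """Return a larger bounded buffer that keeps everything still retained.

    Assumes `buffer` is a deque created with a maxlen.
    """
    if new_maxlen < buffer.maxlen:
        raise ValueError("shrinking is not supported")
    return deque(buffer, maxlen=new_maxlen)


small = deque(range(10), maxlen=5)   # holds 5..9; 0..4 were already overwritten
bigger = grow(small, 8)              # still 5..9, with room for 3 more
bigger.extend([10, 11, 12, 13])      # fills up, then the oldest (5) is dropped
```

The samples 0..4 were discarded before the resize and stay gone, which is the "cannot be recovered" case described above.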

skhd integration

cmd + shift - s : hotmic save 5
cmd + shift - a : hotmic save
cmd + shift - m : hotmic mark
cmd + shift - p : hotmic pause
cmd + shift - r : hotmic resume

Options

-b --buffer=<min>   Buffer size in minutes [default: 5]
-o --output=<dir>   Output directory [default: ./recordings]
-r --rate=<hz>      Sample rate in Hz [default: 44100]
--system-audio      Capture system audio via audiotee (macOS 14.2+)
--name=<name>       Meeting name to prefix the save directory
--no-transcribe     Disable transcription (live and on save); on by default
--diarize           Identify speakers (requires diarize package)
--summarize         Generate meeting notes (requires claude CLI)

Use --rate 16000 if you only care about voice; this cuts memory by about 2.75x (44100 / 16000).

Memory

RAM grows lazily as the buffer fills. Peak usage when the buffer is full:

Buffer	44100 Hz	16000 Hz
5 min	26 MB	9 MB
30 min	159 MB	58 MB
60 min	317 MB	115 MB
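The table values can be approximated with simple arithmetic. The sketch below assumes 16-bit (2-byte) mono samples and decimal megabytes. That sample format is an assumption chosen because it lands within about 1 MB of every figure in the table; it is not confirmed by the source.

```python
def buffer_bytes(minutes, rate, bytes_per_sample=2, channels=1):
    """Peak buffer size in bytes for a full buffer (assumes 16-bit mono)."""
    return int(minutes * 60 * rate * bytes_per_sample * channels)


for minutes in (5, 30, 60):
    high = buffer_bytes(minutes, 44100) / 1e6
    low = buffer_bytes(minutes, 16000) / 1e6
    print(f"{minutes:>2} min  {high:6.1f} MB  {low:6.1f} MB")

# Ratio between the two sample rates, independent of buffer length.
print(f"ratio: {44100 / 16000:.2f}x")
```

Under these assumptions the estimate scales linearly with both buffer length and sample rate, so other combinations can be computed the same way.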