docs: sync README with v0.1.2 (wake list, editing cmds, auto_target, console)

reconcile README with the shipped code and with CLAUDE.md: full 6-phrase wake list
(claudedo/claude do/claude due/hey claude/ok claude/okay claude) with the Whisper
rationale; space/backspace/erase in the grammar + flow; colored prefixed console
output description; fix the auto_target contradiction (default false = require
set/target, not auto-pick); drop the stale 'backgroundable'.

Signed-off-by: disqualifier <dev@disqualifier.me>
This commit is contained in:
disqualifier 2026-06-26 01:22:00 -04:00
parent d96dc3898f
commit 08bbe3ce58

View File

@ -11,7 +11,7 @@ hands-free while another window (a game) is focused.
It exists because Claude Code's native `/voice` is hardcoded-blocked in WSL (it It exists because Claude Code's native `/voice` is hardcoded-blocked in WSL (it
assumes WSL has no audio). Modern WSL2 + WSLg *does* have working mic input via assumes WSL has no audio). Modern WSL2 + WSLg *does* have working mic input via
PulseAudio/RDP. `claudedo` captures the mic itself, transcribes on-device, and drives PulseAudio/RDP. `claudedo` captures the mic itself, transcribes on-device, and drives
Claude Code over tmux — fully local, private, backgroundable. Claude Code over tmux — fully local and private. You run it in a terminal you watch.
## How it works ## How it works
@ -20,9 +20,11 @@ mic (WSLg/PulseAudio RDPSource)
-> sounddevice capture -> sounddevice capture
-> faster-whisper (local STT, on-device) -> faster-whisper (local STT, on-device)
-> wake gate: utterance must start with a wake phrase, else DISCARD locally -> wake gate: utterance must start with a wake phrase, else DISCARD locally
-> grammar match (yes/no/one..four/approve/deny/send/type/mode/set/target/cancel) -> grammar match (yes/no/one..four/approve/deny/send/type/space/backspace/erase/
-> resolve target session (~/.claude-active) mode/set/target/unset/list/cancel)
-> resolve target session (one-shot > sticky ~/.claude-active > auto/none)
-> tmux send-keys -t <session> "<keys>" -> tmux send-keys -t <session> "<keys>"
-> log the action to the watched terminal ([session]/[SYSTEM]/[VOICE], colored)
``` ```
**Privacy by construction.** STT runs on-device. In listen mode, any speech that **Privacy by construction.** STT runs on-device. In listen mode, any speech that
@ -62,11 +64,13 @@ claudedo test-audio
## Usage ## Usage
**Run it in a terminal you watch — that's the product.** You launch `claudedo **Run it in a terminal you watch — that's the product.** You launch `claudedo
start`, it does a quick mic check, then drops into a visible listen loop that prints start`, it does a quick mic check, then drops into a visible listen loop. Each
`heard → matched → sent` for every utterance. That terminal is your utterance prints a timestamped, colored line — `HH:MM:SS [claude-libs] heard "…" →
recognition/action console; you attach to the `claude-<name>` session in another pane typed 'fix'` (green for injected, red for drops, `[SYSTEM]`/`[VOICE]` for state and
to watch the keystrokes land. There is no backgrounding/daemon mode — the whole point recognition). That terminal is your recognition/action console; you attach to the
is the console you read. `claude-<name>` session in another pane to watch the keystrokes land. It runs in the
foreground by design — the console is the point — though `claudedo stop` can signal a
stray instance.
```bash ```bash
claudedo start # mic-check, then the visible listen loop (listen mode default) claudedo start # mic-check, then the visible listen loop (listen mode default)
@ -97,9 +101,12 @@ Switch at runtime by voice: "claudedo mode listen" / "claudedo mode ptt".
## Command grammar ## Command grammar
Wake phrases (listen mode), fuzzy-matched: **"claudedo"**, **"hey claude"**. Wake phrases (listen mode), fuzzy-matched. The default list is **"claudedo"**,
"claudedo" is a coined word, so the matcher is lenient (accepts "claude do", **"claude do"**, **"claude due"**, **"hey claude"**, **"ok claude"**, **"okay
"clauddo", "cloud do", …). In PTT mode the wake phrase is optional. claude"** — Whisper has no token for the coined word "claudedo" and renders it as
real words ("claude do"/"claude due"), so those spellings are listed explicitly.
Matching is lenient (case/space-insensitive). Add the spellings you actually see
(turn on `print_heard` to find them). In PTT mode the wake phrase is optional.
| Say | Does | | Say | Does |
|---|---| |---|---|
@ -121,8 +128,10 @@ Wake phrases (listen mode), fuzzy-matched: **"claudedo"**, **"hey claude"**.
Optional filler (`select` / `use` / `choose`) may precede any command and is ignored: Optional filler (`select` / `use` / `choose`) may precede any command and is ignored:
`select yes` and `use yes` behave like `yes`. (`select 1` is still the select command.) `select yes` and `use yes` behave like `yes`. (`select 1` is still the select command.)
When no sticky target is set, a bare command auto-targets the **only** running When no sticky target is set, a bare command does nothing and asks you to `set` one
`claude-*` session; if several are running it does nothing and asks you to `set` one. (the default). Set `auto_target = true` to instead auto-use the single running
`claude-*` session when there's exactly one; with several running it always does
nothing and asks you to `set` one.
Number words are normalized to digits before matching ("one"/"won" → 1). Number words are normalized to digits before matching ("one"/"won" → 1).
@ -135,9 +144,9 @@ A `target <name>` voice command is a **one-shot** that does NOT touch the sticky
default — it routes a single command and the next bare command reverts to sticky. default — it routes a single command and the next bare command reverts to sticky.
Resolution order (one place — `target.resolve()`): one-shot if present → Resolution order (one place — `target.resolve()`): one-shot if present →
sticky if set and the session exists → else the only running `claude-*` session → sticky if set and the session exists → else, only if `auto_target = true`, the single
else (zero or several) do nothing and say so. It never guesses, and never injects running `claude-*` session → else (default, or zero/several sessions) do nothing and
into a nonexistent session. say so. It never guesses, and never injects into a nonexistent session.
Every name maps to `claude-<name>` through one helper (`target.session_name()`), and Every name maps to `claude-<name>` through one helper (`target.session_name()`), and
the cc kit mirrors it exactly — so `cc libs` (shell) and `set libs` (voice) refer the cc kit mirrors it exactly — so `cc libs` (shell) and `set libs` (voice) refer