Compare commits
No commits in common. "main" and "v0.1.0" have entirely different histories.
1
.gitignore
vendored
@@ -1,5 +1,4 @@
 CLAUDE.md
-COMPACT.md

 __pycache__/
 *.pyc
188
README.md
@@ -11,7 +11,7 @@ hands-free while another window (a game) is focused.
 It exists because Claude Code's native `/voice` is hardcoded-blocked in WSL (it
 assumes WSL has no audio). Modern WSL2 + WSLg *does* have working mic input via
 PulseAudio/RDP. `claudedo` captures the mic itself, transcribes on-device, and drives
-Claude Code over tmux — fully local and private. You run it in a terminal you watch.
+Claude Code over tmux — fully local, private, backgroundable.

 ## How it works

@@ -20,11 +20,9 @@ mic (WSLg/PulseAudio RDPSource)
 -> sounddevice capture
 -> faster-whisper (local STT, on-device)
 -> wake gate: utterance must start with a wake phrase, else DISCARD locally
--> grammar match (yes/no/one..four/approve/deny/send/type/space/backspace/erase/
-mode/set/target/unset/list/context/reload/system/cancel)
--> resolve target session (one-shot > sticky ~/.claude-active > auto/none)
+-> grammar match (yes/no/one..four/approve/deny/send/type/mode/switch/cancel)
+-> resolve target session (~/.claude-active)
 -> tmux send-keys -t <session> "<keys>"
--> log the action to the watched terminal ([session]/[SYSTEM]/[VOICE], colored)
 ```

 **Privacy by construction.** STT runs on-device. In listen mode, any speech that
@@ -64,28 +62,20 @@ claudedo test-audio
 ## Usage

 **Run it in a terminal you watch — that's the product.** You launch `claudedo
-start` and it drops into a visible listen loop (pass `--check` to run a mic check
-first). Each utterance prints a timestamped, colored line — `HH:MM:SS [claude-libs]
-heard "…" →
-typed 'fix'` (green for injected, red for drops, `[SYSTEM]`/`[VOICE]` for state and
-recognition). That terminal is your recognition/action console; you attach to the
-`claude-<name>` session in another pane to watch the keystrokes land. It runs in the
-foreground by design — the console is the point — though `claudedo stop` can signal a
-stray instance.
+start`, it does a quick mic check, then drops into a visible listen loop that prints
+`heard → matched → sent` for every utterance. That terminal is your
+recognition/action console; you attach to the `claude-<name>` session in another pane
+to watch the keystrokes land. There is no backgrounding/daemon mode — the whole point
+is the console you read.

 ```bash
-claudedo start # the visible listen loop (listen mode default; no mic check)
-claudedo start --check # run a mic check before listening
+claudedo start # mic-check, then the visible listen loop (listen mode default)
 claudedo start --mode ptt # push-to-talk instead (desk-only — see Modes)
+claudedo start --skip-audio-check # skip the pre-listen mic check
 claudedo status # running? mode? target session?
 claudedo stop # stop a running daemon
-claudedo reload # reload config.toml + contexts.toml in a running daemon
-claudedo set <name> # set the sticky target -> claude-<name> (alias: switch)
-claudedo unset # clear the sticky target
-claudedo list # list running claude-* sessions
-claudedo cleanup # kill DETACHED claude-* sessions (never attached)
+claudedo switch <name> # retarget to claude-<name>
 claudedo test-audio # verify the mic capture path
-claudedo test-tone # play each earcon (verify the audio-OUT path)
 ```

 ### Modes
@@ -105,14 +95,9 @@ Switch at runtime by voice: "claudedo mode listen" / "claudedo mode ptt".

 ## Command grammar

-Wake phrases (listen mode), fuzzy-matched. The default list is **"claudedo"**,
-**"claude do"**, **"hey claude"**, **"ok claude"**, **"okay claude"** — Whisper has
-no token for the coined word "claudedo" and renders it as real words ("claude do"),
-so that spelling is listed explicitly. Matching is lenient (case/space-insensitive).
-Add the spellings you actually see (turn on `print_heard` to find them). In PTT mode
-the wake phrase is optional. When a command's wake phrase matched loosely (e.g. you
-said "okay clouds"), the heard line notes which phrase it assumed —
-`heard "okay clouds list" -> LIST (wake: okay claude)`.
+Wake phrases (listen mode), fuzzy-matched: **"claudedo"**, **"hey claude"**.
+"claudedo" is a coined word, so the matcher is lenient (accepts "claude do",
+"clauddo", "cloud do", …). In PTT mode the wake phrase is optional.

 | Say | Does |
 |---|---|
@@ -121,50 +106,22 @@ said "okay clouds"), the heard line notes which phrase it assumed —
 | `approve` / `deny` | allow / deny a permission prompt |
 | `send` / `enter` | submit (Enter) |
 | `type <phrase>` | insert literal text, **no** submit (read-before-send; say "send") |
-| `space [<n>]` (also `add [a] space`, `insert <n> spaces`) | insert n spaces (default 1) |
-| `backspace [<n>]` (alias `delete`) | delete n chars (default 1), capped at the last submit boundary |
-| `erase` (alias `clear`/`wipe`) | delete everything typed since the last submit/boundary |
-| `debug <text>` (alias `echo`) | just print what you said to the console (test wake/STT; injects nothing) |
 | `mode ptt` / `mode listen` | switch input mode |
-| `set <name>` (alias `sticky`/`switch`) | set the **sticky** target → `claude-<name>` (persists) |
-| `target <name> <command>` | **one-shot** override: run that command on `claude-<name>` for this utterance only; sticky default unchanged |
-| `unset` (alias `unsticky`) | clear the sticky target |
-| `list` | list running `claude-*` sessions to the daemon console |
-| `context <name> <instruction>` (alias `prepare`) | inject a `contexts.toml` blurb as a preamble + the dictated instruction, then **wait** (no submit — say "send") |
-| `reload` | re-read `config.toml` + `contexts.toml` live (no daemon restart, model stays loaded) |
-| `system status` | print mode / target / model / context count to the console (daemon-control; never injects) |
-| `system reload [config\|contexts]` | reload one or both config files |
-| `cleanup` (alias `detached`/`detach`, also `system cleanup`) | kill **detached** `claude-*` sessions only — never an attached one |
-| `commands` (alias `help`/`menu`) | print the voice-command menu to the console |
-| `customs` (alias `custom`) | list the loaded context names |
-| `version` | print the claudedo version to the console |
+| `switch <name>` / `target <name>` | retarget to `claude-<name>` |
 | `cancel` / `escape` | back out of a prompt |

-Optional filler (`select` / `use` / `choose`) may precede any command and is ignored:
-`select yes` and `use yes` behave like `yes`. (`select 1` is still the select command.)
-
-When no sticky target is set, a bare command does nothing and asks you to `set` one
-(the default). Set `auto_target = true` to instead auto-use the single running
-`claude-*` session when there's exactly one; with several running it always does
-nothing and asks you to `set` one.
-
 Number words are normalized to digits before matching ("one"/"won" → 1).

 ## Targeting

-`~/.claude-active` holds the **sticky** target session name (e.g.
-`claude-rethink-public`). The **cc kit** writes this file when you attach, and
-`claudedo set <name>` (alias `sticky`/`switch`) overwrites it; `unset` clears it.
-A `target <name>` voice command is a **one-shot** that does NOT touch the sticky
-default — it routes a single command and the next bare command reverts to sticky.
+`~/.claude-active` holds the target session name (e.g. `claude-rethink-public`). The
+**cc kit** writes this file when you attach, so the target is "the project you most
+recently attached to". `claudedo switch <name>` / `target <name>` overwrites it. If
+the file is missing or the session no longer exists, `claudedo` injects nothing and
+logs a warning (it never guesses a target).

-Resolution order (one place — `target.resolve()`): one-shot if present →
-sticky if set and the session exists → else, only if `auto_target = true`, the single
-running `claude-*` session → else (default, or zero/several sessions) do nothing and
-say so. It never guesses, and never injects into a nonexistent session.
-
 Every name maps to `claude-<name>` through one helper (`target.session_name()`), and
-the cc kit mirrors it exactly — so `cc libs` (shell) and `set libs` (voice) refer
+the cc kit mirrors it exactly — so `cc libs` (shell) and `target libs` (voice) refer
 to the same session `claude-libs`. The name is your **stable, speakable handle**:
 because the kit forces an explicit name (no basename guessing), you always know the
 exact word to say.
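An illustrative aside, not part of the diff: the resolution order that the `main` side of the README describes (one-shot, then sticky, then the optional single-session fallback, else nothing) can be sketched roughly as below. The function signatures, the `running` list argument, and the choice to return `None` when a one-shot names a session that is not running are assumptions made for illustration; the project's real `target.resolve()` and `target.session_name()` may look different.

```python
from typing import Optional, Sequence

PREFIX = "claude-"


def session_name(name: str) -> str:
    """Map a spoken/short name to its tmux session name (claude-<name>)."""
    return PREFIX + name


def resolve(
    one_shot: Optional[str],      # short name from a `target <name> ...` utterance
    sticky: Optional[str],        # full session name read from ~/.claude-active
    running: Sequence[str],       # tmux session names that currently exist
    auto_target: bool = False,    # mirrors the documented config default
) -> Optional[str]:
    """Return the session to inject into, or None to do nothing."""
    # 1. A one-shot override wins, but only if that session really exists
    #    (assumption: a missing one-shot target does not fall back to sticky).
    if one_shot is not None:
        candidate = session_name(one_shot)
        return candidate if candidate in running else None
    # 2. The sticky target, if set and still running.
    if sticky is not None and sticky in running:
        return sticky
    # 3. Optional convenience: exactly one claude-* session is running.
    claude_sessions = [s for s in running if s.startswith(PREFIX)]
    if auto_target and len(claude_sessions) == 1:
        return claude_sessions[0]
    # 4. Never guess.
    return None
```

Keeping every branch in one function mirrors the README's "one place" remark: the caller only has to handle `None` by logging that nothing was injected.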
@@ -177,74 +134,9 @@ cc <name> # attach/create claude-<name>; writes ~/.claude-active
 ccr <name> # re-attach an existing claude-<name> only
 ccl # list claude-* sessions
 cck <name> # kill claude-<name>
-ccclean # kill DETACHED claude-* sessions only (never attached) — safe cleanup
-cckl # kill ALL claude-* sessions (including attached)
+cckl # kill all claude-* sessions
 ```

-## Contexts (named reference blurbs)
-
-`contexts.toml` holds named reference snippets you can inject ahead of a dictated
-instruction with the **`context <name> <instruction>`** voice command (alias
-`prepare`). It lives next to `config.toml`
-(`$CLAUDEDO_CONTEXTS` → `~/.config/claudedo/contexts.toml` → `./contexts.toml`); a
-missing file just means no contexts (the feature is opt-in).
-
-```toml
-[contexts]
-webhooks = "discord webhooks — test: <url> (safe to spam), live: <url> (real, careful)"
-testing = "use the test/staging resources only, never touch prod"
-```
-
-Saying `context webhooks send a test message` injects the `webhooks` blurb as a
-preamble, then the dictated instruction, and **waits** — nothing is auto-submitted. You
-say `send` to submit (**read-before-send**; Claude's own permission prompt is the
-backstop for anything consequential). A bare `context webhooks` injects just the blurb.
-One context per command (no stacking yet); an unknown name announces and injects
-nothing.
-
-Names are **spoken and fuzzy-matched**, so keep them simple and distinct — they're
-looked up on a despaced/lowercased key, so `web hooks` / `web-hooks` / `webhooks` all
-resolve the same block. Assembly is config-gated: `behavior.context_multiline` (default
-`true`) puts the blurb and instruction on separate lines via a Shift+Enter soft newline;
-set it `false` to flatten onto one line with `context_separator` (default `" — "`) if
-Shift+Enter is unreliable in your terminal.
-
-Edit `contexts.toml`, then say **`reload`** (or run `claudedo reload`) — it re-reads
-`config.toml` and `contexts.toml` live without restarting the daemon or reloading the
-Whisper model. The **`system`** namespace gives daemon-control by voice without touching
-Claude: `system status` (mode / target / model / context count) and `system reload
-[config|contexts]`.
-
-## Earcons (audio feedback tones)
-
-Short confirmation tones play on key events so you get **eyes-free feedback** — "did it
-hear me?" — without watching the terminal. They're tones, not speech (not TTS): a bright
-blip when a command is accepted/injected, a low buzz when nothing matched, a rising chime
-on submit, and an optional blip on wake. Tones are short (<300ms) and quiet, and they're
-**additive** to the console feed — mute them and read at the desk, or hear them eyes-free.
-
-Verify the audio-OUT path (the reverse of `test-audio`, and the less-tested direction on
-WSLg) with:
-
-```bash
-claudedo test-tone # plays each tone through WSLg — the audio-out gate
-```
-
-Tones play through WSLg's PulseAudio sink, **paplay-first** (a separate process, so it
-doesn't contend with the sounddevice mic stream), falling back to in-process sounddevice,
-then `powershell.exe` on the Windows host. Playback is **fire-and-forget**: a dead speaker
-or a missing tone file logs once and is ignored — audio-out can never block or break a
-command (`claudedo yes` injects whether or not the speaker works).
-
-Configure under `[sound]`: `enabled` (master, default on), per-event `on_wake` (default
-**off** — a blip right before you speak can bleed into the command capture, and it's
-chatty), `on_accept` / `on_no_match` / `on_submit` (default on), and `volume` (0.0–1.0,
-best-effort — scaled for sounddevice, `--volume` for paplay, ignored by the PowerShell
-fallback). A `[sound.files]` table can point any event at your own `.wav`. The shipped
-tones live in the package (`claudedo/sounds/*.wav`); `claudedo/sounds/generate.py` is a
-synthetic-beep fallback that can regenerate a placeholder set (it does **not** reproduce
-the shipped tones — running it overwrites them with plain beeps).
-
 ## The confirmed Claude Code keymap

 The keystrokes in [`keys.py`](src/claudedo/keys.py) were confirmed **empirically**
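An illustrative aside, not part of the diff: the Contexts section removed above describes a lookup on a despaced, lowercased key and two assembly styles (multiline or flattened with a separator). A minimal sketch of that behavior follows. The function names are hypothetical, the lookup here is exact on the normalized key (the README also mentions fuzzy matching, which is omitted), and a plain `"\n"` stands in for the Shift+Enter soft newline the daemon would send through tmux.

```python
import re
from typing import Mapping, Optional

# The documented default for context_separator (a dash padded with spaces).
DEFAULT_SEPARATOR = " \u2014 "


def context_key(spoken: str) -> str:
    """Lowercase and strip spaces, hyphens and underscores for matching."""
    return re.sub(r"[\s\-_]+", "", spoken.lower())


def assemble(
    contexts: Mapping[str, str],
    spoken_name: str,
    instruction: str = "",
    multiline: bool = True,
    separator: str = DEFAULT_SEPARATOR,
) -> Optional[str]:
    """Build the text to inject, or None when the context name is unknown."""
    lookup = {context_key(name): blurb for name, blurb in contexts.items()}
    blurb = lookup.get(context_key(spoken_name))
    if blurb is None:
        return None              # unknown name: inject nothing
    if not instruction:
        return blurb             # bare `context <name>`: just the blurb
    joiner = "\n" if multiline else separator
    return blurb + joiner + instruction
```

Nothing here submits the text; in the described workflow the assembled string is typed into the prompt and a separate `send` submits it.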
@@ -263,37 +155,11 @@ If Claude Code changes its prompt UI, re-confirm against a live session and upda
 ## Config

 Everything tunable lives in [`config.toml`](config.toml): wake phrases, mode + PTT
-key, Whisper model/language/device, `[vad]` endpointing, and `[behavior]`
-(`type_autosend`, fuzzy thresholds, `filler_words`, `auto_target`, `print_heard`).
-The default model is **`small.en`** (the English-only small model — ~1s/command on a
-strong CPU, more accurate on English than multilingual `small` at the same speed);
-`medium`/`medium.en` are more accurate but ~3× slower (noticeable lag), `base.en` is
-snappier/less accurate, `large-v3` most accurate/slowest. Every `heard` line shows the
-STT latency as `(<ms>/<audio>s)` so you can see what a model change costs. VAD
-endpointing ends a capture after `[vad].silence_ms` (700) of trailing silence, capped
-at `max_seconds` (15). `claudedo -c <path> ...` points at a specific config; otherwise
-it searches
-`$CLAUDEDO_CONFIG`, `~/.config/claudedo/config.toml`, then `./config.toml`.
+key, Whisper model/language/device, audio segmentation thresholds, and
+`type_autosend = false`. The default model is `small`; bump to `medium` if the coined
+wake word is recognized poorly. `claudedo -c <path> ...` points at a specific config;
+otherwise it searches `$CLAUDEDO_CONFIG`, `~/.config/claudedo/config.toml`, then
+`./config.toml`.

-- **STT biasing.** The transcriber is seeded with an `initial_prompt` built from the
-configured wake phrases + command vocabulary (one source — `grammar.vocabulary()`),
-so Whisper is conditioned to expect "claudedo" and the command words.
-- **Split fuzzy thresholds.** `wake_fuzzy_threshold` (default `0.65`, lenient) vs
-`command_fuzzy_threshold` (default `0.8`, tight). The asymmetry is deliberate: a
-false *wake* is cheap (it wakes, finds no command, does nothing), but a false
-*command* fires the wrong action. Prefer expanding command synonyms over loosening
-the command threshold.
-- **`[vad]` endpointing.** Capture starts on speech and ends after `silence_ms`
-(default 700) of trailing silence — Alexa-style record-until-pause — capped at
-`max_seconds` (default 15). The pause both ends a command and separates it from
-following chatter (the chatter is a separate capture the wake gate discards).
-- **`auto_target`** (default `false`): with no sticky target and one session running,
-`false` does nothing and asks you to `set`; `true` auto-uses that session.
-- **`print_heard`** (default `false`, debug): prints non-wake transcripts so you can
-see how Whisper renders your wake word, then tune the wake list/threshold.
-- **`context_multiline`** (default `true`) / **`context_separator`** (default `" — "`):
-how the `context` command assembles the blurb and instruction — a Shift+Enter soft
-newline between them, or (when `false`) flattened onto one line with the separator.
-
 ## Requirements

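An illustrative aside, not part of the diff: the split thresholds described on the `main` side (a lenient `wake_fuzzy_threshold` of 0.65 and a tight `command_fuzzy_threshold` of 0.8) can be demonstrated with a small matcher. This sketch uses the standard library's `difflib.SequenceMatcher` as a stand-in; the source does not say which similarity measure the project uses, and `normalize` / `best_match` are hypothetical names.

```python
from difflib import SequenceMatcher
from typing import Iterable, Optional

WAKE_FUZZY_THRESHOLD = 0.65     # lenient: a false wake does nothing
COMMAND_FUZZY_THRESHOLD = 0.8   # tight: a false command fires the wrong action


def normalize(text: str) -> str:
    """Case- and space-insensitive key, as the config comments describe."""
    return "".join(text.lower().split())


def best_match(heard: str, candidates: Iterable[str], threshold: float) -> Optional[str]:
    """Return the closest candidate if its similarity reaches the threshold."""
    key = normalize(heard)
    best, best_score = None, 0.0
    for candidate in candidates:
        score = SequenceMatcher(None, key, normalize(candidate)).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else None


WAKE_PHRASES = ["claudedo", "claude do", "hey claude", "ok claude", "okay claude"]
COMMANDS = ["approve", "deny", "send"]

# A misheard wake phrase still wakes; a vague command is rejected.
wake = best_match("okay clouds", WAKE_PHRASES, WAKE_FUZZY_THRESHOLD)
command = best_match("done", COMMANDS, COMMAND_FUZZY_THRESHOLD)
```

The same score can pass one gate and fail the other: "okay clouds" against "ok claude" alone clears the wake threshold but would not clear the command threshold, which is the asymmetry the README argues for.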
90
config.toml
@@ -5,7 +5,7 @@
 # wake phrases for listen mode. fuzzy-matched: case/space-insensitive, lenient on
 # the coined word "claudedo" (whisper renders it inconsistently). number words are
 # normalized to digits before command matching.
-phrases = ["claudedo", "claude do", "hey claude", "ok claude", "okay claude"]
+phrases = ["claudedo", "hey claude"]

 [input]
 # "listen" (default): continuous capture; only acts on utterances that start with a
@@ -21,12 +21,10 @@ mode = "listen"
 ptt_key = "space"

 [stt]
-# faster-whisper model size. "small.en" is the default — the English-only small model
-# (~1s/command on a strong cpu, more accurate on english than multilingual "small" at
-# the same speed). "medium"/"medium.en" are more accurate but ~3x slower (noticeable
-# lag); "large-v3" is most accurate and slowest. drop to "base.en" for max snappiness
-# (less accurate). bump only if recognition is poor.
-model = "small.en"
+# faster-whisper model size. "small" is a good accuracy/latency balance for the
+# short command grammar (~sub-second per chunk on a strong cpu). if the coined wake
+# word "claudedo" is recognized poorly, bump to "medium" (slower per chunk).
+model = "small"
 language = "en"
 # mic device: "auto", or a sounddevice device index (integer) / substring of a
 # device name. run `claudedo test-audio` to list devices.
@@ -38,82 +36,18 @@ compute = "auto"
 # capture parameters. 16 kHz mono is what whisper expects.
 samplerate = 16000
 channels = 1
-# rms energy below this counts as silence (the VAD onset/endpoint floor).
+# listen-mode silence segmentation: an utterance ends after this many seconds below
+# the rms threshold. keeps latency low without streaming.
 silence_threshold = 0.012
+silence_duration = 0.8
 # ignore utterances shorter than this (clicks, coughs).
 min_utterance = 0.3
-
-[vad]
-# Alexa-style record-until-pause endpointing (listen mode). capture starts on speech
-# onset and ends after this much trailing silence — the natural end of an utterance.
-# a real pause both ends the command AND separates it from following chatter (the
-# chatter becomes a separate capture that the wake gate then discards).
-silence_ms = 700
-# hard cap so continuous noise can't record forever (also the ceiling for a long
-# dictated `type` phrase).
-max_seconds = 15.0
+# hard cap on a single utterance so a stuck stream can't grow unbounded.
+max_utterance = 15.0

 [behavior]
 # dictation never auto-submits: "type <phrase>" inserts literal text only; you say
 # "send" separately to submit (read-before-send).
 type_autosend = false
-# fuzzy match ratios (0..1). the asymmetry is deliberate: a false WAKE is cheap (it
-# wakes, finds no command, does nothing), so wake is lenient; a false COMMAND fires
-# the WRONG action, so commands stay tight. lower = more lenient = more matches.
-# prefer expanding command synonyms over loosening command_fuzzy_threshold.
-wake_fuzzy_threshold = 0.65
-command_fuzzy_threshold = 0.8
-# optional filler words that may precede a command and are ignored for matching:
-# "select yes" / "use yes" behave like "yes". (a filler word followed by a digit is
-# the select command, e.g. "select 1", and is not dropped.)
-filler_words = ["select", "use", "choose"]
-# when no sticky target is set and exactly ONE claude-* session is running:
-# false (default) -> require an explicit `set <name>` or one-shot `target <name>`;
-# a bare command does nothing and tells you to set one.
-# true -> auto-target that single session (convenience).
-auto_target = false
-# DEBUG ONLY — relaxes the privacy invariant. when true, the daemon console prints
-# the raw transcript of EVERY utterance, including non-wake speech it would otherwise
-# drop silently (shown as `heard (dropped): "<transcript>"`). use it to see exactly
-# how Whisper renders your wake word, then turn it OFF. default false: non-wake speech
-# is discarded without ever printing the transcript.
-print_heard = false
-
-# how the `context <name> <dictation>` command assembles the blurb + instruction.
-# true (default): blurb, a soft newline (Shift+Enter — needs the extended-keys tmux
-# settings install.sh appends), then the instruction. if Shift+Enter is at all flaky
-# in your terminal (it submits or does nothing), set false to flatten onto one line
-# with context_separator between blurb and instruction — the blank line is cosmetic,
-# not worth a submit risk. either way the assembled text is NEVER auto-submitted.
-context_multiline = true
-# separator inserted between blurb and instruction when context_multiline = false.
-context_separator = " — "
-# the `cleanup` / `detached` command kills DETACHED claude-* sessions only (never an
-# attached one — a misheard cleanup can't nuke the active session). default false:
-# kill immediately (it's detached-only, so it's safe). set true to announce the
-# detached set and wait for a following `confirm` before killing.
-cleanup_confirm = false
-
-[sound]
-# earcons — short confirmation tones on daemon events so you get eyes-free feedback
-# ("did it hear me?") without watching the terminal. tones are SHORT (<300ms) and quiet;
-# they play OUT through WSLg's PulseAudio sink (paplay-first, sounddevice fallback, then
-# powershell.exe). additive to the console feed — mute these and read at the desk, or
-# hear them eyes-free. a dead speaker never blocks/breaks a command (fire-and-forget).
-enabled = true
-# blip when a wake phrase is recognized. OFF by default: a blip right before you speak
-# the command can bleed into its capture, and it's chatty. turn on only if you want it.
-on_wake = false
-# positive blip when a command is recognized/injected.
-on_accept = true
-# distinct lower buzz when nothing matched or the target was missing (did nothing).
-on_no_match = true
-# rising chime when a send/submit is injected.
-on_submit = true
-# best-effort 0.0-1.0 (scaled for sounddevice, --volume for paplay; ignored by the
-# powershell fallback, which has no volume control).
-volume = 0.5
-# optional per-event overrides to swap in your own .wav files, e.g.:
-# [sound.files]
-# accept = "~/sounds/my_accept.wav"
-[sound.files]
+# fuzzy match ratio (0..1) required to accept a wake phrase / command token.
+match_threshold = 0.8
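An illustrative aside, not part of the diff: the record-until-pause endpointing described in the `[vad]` comments (start on speech onset, end after `silence_ms` of trailing silence, cap at `max_seconds`, drop captures shorter than `min_utterance`) can be sketched over pre-chunked frames. Everything here is a simplified assumption: frames are plain lists of float samples, `frame_ms` is a made-up parameter, and the real daemon reads from a sounddevice stream rather than an in-memory list.

```python
import math
from typing import Iterable, Iterator, List, Sequence

Frame = Sequence[float]


def rms(frame: Frame) -> float:
    """Root-mean-square energy of one frame (0.0 for an empty frame)."""
    return math.sqrt(sum(x * x for x in frame) / len(frame)) if frame else 0.0


def segment(
    frames: Iterable[Frame],
    frame_ms: int = 100,               # hypothetical frame length
    silence_threshold: float = 0.012,  # rms below this counts as silence
    silence_ms: int = 700,             # trailing silence that ends a capture
    max_seconds: float = 15.0,         # hard cap on one capture
    min_utterance: float = 0.3,        # seconds; shorter captures are dropped
) -> Iterator[List[Frame]]:
    """Yield one list of frames per utterance, trailing silence removed."""
    current: List[Frame] = []
    trailing = 0  # milliseconds of silence at the end of `current`

    def finish() -> Iterator[List[Frame]]:
        speech = current[: len(current) - trailing // frame_ms]
        if len(speech) * frame_ms >= min_utterance * 1000:
            yield speech

    for frame in frames:
        loud = rms(frame) >= silence_threshold
        if not current and not loud:
            continue  # still waiting for speech onset
        current.append(frame)
        trailing = 0 if loud else trailing + frame_ms
        too_long = len(current) * frame_ms >= max_seconds * 1000
        if trailing >= silence_ms or too_long:
            yield from finish()
            current, trailing = [], 0
    if current:  # flush whatever is left when the stream ends
        yield from finish()
```

Because a pause closes the capture, speech after the pause becomes its own utterance, which is what lets the wake gate discard follow-on chatter separately.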
@@ -1,18 +0,0 @@
-# claudedo contexts — named reference blurbs you can inject ahead of a dictated
-# instruction with the `context <name> <instruction>` voice command (alias `prepare`).
-#
-# the named blurb is injected as a preamble, then your dictated instruction, and the
-# daemon WAITS — nothing is auto-submitted. you say "send" to submit (read-before-send;
-# claude's own permission prompt is the backstop for anything consequential).
-#
-# names are SPOKEN and fuzzy-matched, so keep them simple, distinct, single words
-# (a-z, 0-9; spaces/hyphens/underscores are stripped for matching, so "web hooks",
-# "web-hooks" and "webhooks" all resolve the same block). values are free-form text.
-#
-# edit this file, then say "reload" (or run `claudedo reload`) — no daemon restart,
-# the whisper model is not reloaded.
-
-[contexts]
-webhooks = "discord webhooks — test: <url> (safe to spam), live: <url> (real, careful)"
-testing = "use the test/staging resources only, never touch prod"
-discord = "discord.py 2.x; bot token in .env as BOT_TOKEN; guild id 12345"
29
install.sh
@@ -57,14 +57,9 @@ say "verifying audio path"
 if pactl info >/dev/null 2>&1; then
 DEFAULT_SRC="$(pactl info | sed -n 's/^Default Source: //p')"
 echo " Default Source: ${DEFAULT_SRC:-<none>}"
-DEFAULT_SINK="$(pactl info | sed -n 's/^Default Sink: //p')"
-echo " Default Sink: ${DEFAULT_SINK:-<none>}"
 if ! pactl list sources short 2>/dev/null | grep -q RDPSource; then
 warn "RDPSource not listed by pactl — mic may not be bridged. check Windows mic permission."
 fi
-if ! pactl list sinks short 2>/dev/null | grep -q RDPSink; then
-warn "RDPSink not listed by pactl — earcon/TTS audio-OUT may not play. run 'claudedo test-tone' to check."
-fi
 else
 warn "pactl info failed — pulseaudio-utils installed but no server reachable yet."
 fi
@ -99,30 +94,6 @@ mkdir -p "$CONF_DIR"
|
|||||||
install -m 0644 "$REPO_DIR/shell/cc.sh" "$CONF_DIR/cc.sh"
|
install -m 0644 "$REPO_DIR/shell/cc.sh" "$CONF_DIR/cc.sh"
|
||||||
echo " wrote $CONF_DIR/cc.sh"
|
echo " wrote $CONF_DIR/cc.sh"
|
||||||
|
|
||||||
# install config.toml to the standard location so the daemon finds it from any dir.
|
|
||||||
# never clobber an edited user config: copy only if absent, else drop a .new to diff.
|
|
||||||
if [ ! -f "$CONF_DIR/config.toml" ]; then
|
|
||||||
install -m 0644 "$REPO_DIR/config.toml" "$CONF_DIR/config.toml"
|
|
||||||
echo " wrote $CONF_DIR/config.toml"
|
|
||||||
elif ! cmp -s "$REPO_DIR/config.toml" "$CONF_DIR/config.toml"; then
|
|
||||||
install -m 0644 "$REPO_DIR/config.toml" "$CONF_DIR/config.toml.new"
|
|
||||||
echo " kept your $CONF_DIR/config.toml; new default written to config.toml.new (diff to merge)"
|
|
||||||
else
|
|
||||||
echo " $CONF_DIR/config.toml already current"
|
|
||||||
fi
|
|
||||||
|
|
||||||
# install the contexts.toml template (named blurbs for the `context` voice command).
|
|
||||||
# same policy: copy only if absent, else drop a .new — never clobber edited contexts.
|
|
||||||
if [ ! -f "$CONF_DIR/contexts.toml" ]; then
|
|
||||||
install -m 0644 "$REPO_DIR/contexts.toml" "$CONF_DIR/contexts.toml"
|
|
||||||
echo " wrote $CONF_DIR/contexts.toml"
|
|
||||||
elif ! cmp -s "$REPO_DIR/contexts.toml" "$CONF_DIR/contexts.toml"; then
|
|
||||||
install -m 0644 "$REPO_DIR/contexts.toml" "$CONF_DIR/contexts.toml.new"
|
|
||||||
echo " kept your $CONF_DIR/contexts.toml; new default written to contexts.toml.new (diff to merge)"
|
|
||||||
else
|
|
||||||
echo " $CONF_DIR/contexts.toml already current"
|
|
||||||
fi
|
|
||||||
|
|
||||||
# wire EVERY rc that exists (the user may have both zsh and bash).
|
# wire EVERY rc that exists (the user may have both zsh and bash).
|
||||||
wired_any=0
|
wired_any=0
|
||||||
for RC in "$HOME/.zshrc" "$HOME/.bashrc"; do
|
for RC in "$HOME/.zshrc" "$HOME/.bashrc"; do
|
||||||
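The config-install policy added to install.sh in the hunk above (copy only if absent, otherwise drop a `.new` file beside the user's copy) can be sketched in Python for illustration. `install_default` is a hypothetical helper and is not part of the repository; the script itself uses `install -m 0644` and `cmp -s`, and this sketch does not set file modes.

```python
import filecmp
import shutil
import tempfile
from pathlib import Path


def install_default(src: Path, dest: Path) -> str:
    """Hypothetical helper mirroring the shell policy: never overwrite dest."""
    if not dest.is_file():
        shutil.copyfile(src, dest)
        return "wrote"
    if not filecmp.cmp(src, dest, shallow=False):
        # The user's copy differs: keep it and write the new default next to it.
        shutil.copyfile(src, dest.with_name(dest.name + ".new"))
        return "kept"
    return "current"


def demo() -> list:
    """Run the three cases in a throwaway directory and collect the outcomes."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "default.toml"
        dest = Path(tmp) / "config.toml"
        src.write_text("mode = 'listen'\n")
        results = [install_default(src, dest)]       # destination absent
        results.append(install_default(src, dest))   # identical copy present
        dest.write_text("mode = 'ptt'\n")            # simulate a user edit
        results.append(install_default(src, dest))   # edited copy present
        results.append(dest.read_text())
        results.append((Path(tmp) / "config.toml.new").is_file())
        return results
```

The user's edited file is never touched in the third case; only the sibling `.new` file changes, which is what makes re-running the installer safe.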
|
|||||||
@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
|
|||||||
|
|
||||||
[project]
|
[project]
|
||||||
name = "claudedo"
|
name = "claudedo"
|
||||||
version = "0.2.2"
|
version = "0.1.0"
|
||||||
description = "voice-control daemon for claude code (local STT -> tmux send-keys)"
|
description = "voice-control daemon for claude code (local STT -> tmux send-keys)"
|
||||||
readme = "README.md"
|
readme = "README.md"
|
||||||
requires-python = ">=3.10"
|
requires-python = ">=3.10"
|
||||||
@ -23,9 +23,6 @@ claudedo = "claudedo.__main__:main"
|
|||||||
[tool.setuptools]
|
[tool.setuptools]
|
||||||
package-dir = { "" = "src" }
|
package-dir = { "" = "src" }
|
||||||
|
|
||||||
[tool.setuptools.package-data]
|
|
||||||
"claudedo.sounds" = ["*.wav"]
|
|
||||||
|
|
||||||
[tool.setuptools.packages.find]
|
[tool.setuptools.packages.find]
|
||||||
where = ["src"]
|
where = ["src"]
|
||||||
|
|
||||||
|
|||||||
31
shell/cc.sh
@ -11,8 +11,7 @@
|
|||||||
# ccr <name> reattach only (error if it doesn't exist); writes ~/.claude-active
|
# ccr <name> reattach only (error if it doesn't exist); writes ~/.claude-active
|
||||||
# ccl list running claude- sessions
|
# ccl list running claude- sessions
|
||||||
# cck <name> kill claude-<name>
|
# cck <name> kill claude-<name>
|
||||||
# ccclean kill DETACHED claude- sessions only (never attached) — safe cleanup
|
# cckl kill ALL claude- sessions
|
||||||
# cckl kill ALL claude- sessions (including attached)
|
|
||||||
|
|
||||||
cc() {
|
cc() {
|
||||||
if [ -z "$1" ]; then
|
if [ -z "$1" ]; then
|
||||||
@ -61,34 +60,6 @@ cck() {
|
|||||||
fi
|
fi
|
||||||
}
|
}
|
||||||
|
|
||||||
ccclean() {
|
|
||||||
killed=""
|
|
||||||
kept=""
|
|
||||||
while read -r name attached; do
|
|
||||||
case "$name" in
|
|
||||||
claude-*) ;;
|
|
||||||
*) continue ;;
|
|
||||||
esac
|
|
||||||
if [ "$attached" = "0" ]; then
|
|
||||||
if tmux kill-session -t "$name" 2>/dev/null; then
|
|
||||||
killed="${killed:+$killed, }$name"
|
|
||||||
fi
|
|
||||||
else
|
|
||||||
kept="${kept:+$kept, }$name"
|
|
||||||
fi
|
|
||||||
done <<EOF
|
|
||||||
$(tmux list-sessions -F '#{session_name} #{session_attached}' 2>/dev/null)
|
|
||||||
EOF
|
|
||||||
if [ -z "$killed" ]; then
|
|
||||||
echo "nothing to clean (no detached sessions)"
|
|
||||||
else
|
|
||||||
n=$(printf '%s' "$killed" | awk -F', ' '{print NF}')
|
|
||||||
msg="killed $killed ($n detached)"
|
|
||||||
[ -n "$kept" ] && msg="$msg; kept $kept (attached)"
|
|
||||||
echo "$msg"
|
|
||||||
fi
|
|
||||||
}
|
|
||||||
|
|
||||||
cckl() {
|
cckl() {
|
||||||
tmux ls 2>/dev/null | grep '^claude-' | cut -d: -f1 | while read -r s; do
|
tmux ls 2>/dev/null | grep '^claude-' | cut -d: -f1 | while read -r s; do
|
||||||
tmux kill-session -t "$s" && echo "killed $s"
|
tmux kill-session -t "$s" && echo "killed $s"
|
||||||
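The selection rule in `ccclean` above (only `claude-` sessions, and only those with an attached-client count of `0`) can be sketched in Python. `partition_sessions` is a hypothetical name; the real `target.cleanup_detached()` is not shown in this diff, so this is an assumption about its shape rather than its implementation. The input is the text produced by `tmux list-sessions -F '#{session_name} #{session_attached}'`.

```python
def partition_sessions(listing: str, prefix: str = "claude-"):
    """Split a tmux listing into (detached, attached) session names.

    Hypothetical helper: only names starting with `prefix` are considered.
    """
    detached, attached = [], []
    for line in listing.splitlines():
        # The attached count is the last space-separated field on the line.
        name, _, count = line.rpartition(" ")
        if not name.startswith(prefix):
            continue
        if count == "0":
            detached.append(name)
        else:
            attached.append(name)
    return detached, attached


sample = "claude-api 0\nclaude-libs 1\nscratch 0\n"
to_kill, to_keep = partition_sessions(sample)
```

Only the names in `to_kill` would be passed to `tmux kill-session -t`; attached sessions are reported but left alone.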
|
|||||||
@ -1,3 +1,3 @@
|
|||||||
"""claudedo — voice-control daemon for claude code (local STT -> tmux send-keys)"""
|
"""claudedo — voice-control daemon for claude code (local STT -> tmux send-keys)"""
|
||||||
|
|
||||||
__version__ = "0.2.2"
|
__version__ = "0.1.0"
|
||||||
|
|||||||
@ -33,12 +33,12 @@ def cmd_start(args: argparse.Namespace) -> int:
|
|||||||
config = _load_or_die(args.config)
|
config = _load_or_die(args.config)
|
||||||
if args.mode:
|
if args.mode:
|
||||||
config.mode = args.mode
|
config.mode = args.mode
|
||||||
if args.check:
|
if not args.skip_audio_check:
|
||||||
print("checking mic before listening (speak briefly) ...")
|
print("checking mic before listening (speak briefly) ...")
|
||||||
peak = _probe_mic(config, seconds=2.0, verbose=False)
|
peak = _probe_mic(config, seconds=2.0, verbose=False)
|
||||||
if peak is None or peak < 0.02:
|
if peak is None or peak < 0.02:
|
||||||
print("mic check failed — no usable input.", file=sys.stderr)
|
print("mic check failed — no usable input.", file=sys.stderr)
|
||||||
print("run `claudedo test-audio` to debug, or `claudedo start` to skip the check",
|
print("run `claudedo test-audio` to debug; or `claudedo start --skip-audio-check`",
|
||||||
file=sys.stderr)
|
file=sys.stderr)
|
||||||
return 1
|
return 1
|
||||||
print(f"mic OK (peak {peak:.3f}).")
|
print(f"mic OK (peak {peak:.3f}).")
|
||||||
@ -97,56 +97,6 @@ def cmd_stop(_args: argparse.Namespace) -> int:
|
|||||||
return 1
|
return 1
|
||||||
|
|
||||||
|
|
||||||
def cmd_test_tone(args: argparse.Namespace) -> int:
|
|
||||||
config = _load_or_die(args.config)
|
|
||||||
from . import audio_out, sound
|
|
||||||
|
|
||||||
print("== claudedo test-tone ==")
|
|
||||||
if not audio_out.available():
|
|
||||||
print("no audio-out backend found (paplay / powershell.exe).", file=sys.stderr)
|
|
||||||
print("install pulseaudio-utils (run install.sh) for paplay.", file=sys.stderr)
|
|
||||||
return 1
|
|
||||||
earcons = sound.Earcons(config)
|
|
||||||
print(f"playing each tone via WSLg audio-out (volume {config.sound_volume}) — listen ...")
|
|
||||||
ok = True
|
|
||||||
for event in sound.event_names():
|
|
||||||
path = earcons.tone_path(event)
|
|
||||||
if path is None or not Path(path).is_file():
|
|
||||||
print(f" {event:9} MISSING ({path})")
|
|
||||||
ok = False
|
|
||||||
continue
|
|
||||||
print(f" {event:9} {path.name} ...", flush=True)
|
|
||||||
played = audio_out.play_blocking(path, volume=config.sound_volume)
|
|
||||||
if not played:
|
|
||||||
print(f" {event:9} FAILED to play", file=sys.stderr)
|
|
||||||
ok = False
|
|
||||||
if not ok:
|
|
||||||
print("some tones did not play — audio-out may be unavailable.", file=sys.stderr)
|
|
||||||
return 1
|
|
||||||
print("audio-out OK (all tones played).")
|
|
||||||
return 0
|
|
||||||
|
|
||||||
|
|
||||||
def cmd_cleanup(_args: argparse.Namespace) -> int:
|
|
||||||
killed, kept = target.cleanup_detached()
|
|
||||||
if not killed:
|
|
||||||
print("nothing to clean (no detached sessions)")
|
|
||||||
return 0
|
|
||||||
msg = f"killed {', '.join(killed)}"
|
|
||||||
if kept:
|
|
||||||
msg += f"; kept {', '.join(kept)} (attached)"
|
|
||||||
print(msg)
|
|
||||||
return 0
|
|
||||||
|
|
||||||
|
|
||||||
def cmd_reload(_args: argparse.Namespace) -> int:
|
|
||||||
if daemon.reload_running():
|
|
||||||
print("signalled claudedo to reload config + contexts")
|
|
||||||
return 0
|
|
||||||
print("claudedo is not running")
|
|
||||||
return 1
|
|
||||||
|
|
||||||
|
|
||||||
def cmd_status(_args: argparse.Namespace) -> int:
|
def cmd_status(_args: argparse.Namespace) -> int:
|
||||||
pid = daemon.read_pid()
|
pid = daemon.read_pid()
|
||||||
if pid is None:
|
if pid is None:
|
||||||
@ -235,26 +185,9 @@ def cmd_install(_args: argparse.Namespace) -> int:
|
|||||||
return subprocess.call(["bash", str(script)])
|
return subprocess.call(["bash", str(script)])
|
||||||
|
|
||||||
|
|
||||||
def cmd_set(args: argparse.Namespace) -> int:
|
def cmd_switch(args: argparse.Namespace) -> int:
|
||||||
session = target.set_target(args.name)
|
session = target.set_target(args.name)
|
||||||
print(f"sticky target -> {session}")
|
print(f"target -> {session}")
|
||||||
return 0
|
|
||||||
|
|
||||||
|
|
||||||
def cmd_unset(_args: argparse.Namespace) -> int:
|
|
||||||
target.unset_target()
|
|
||||||
print("sticky target cleared")
|
|
||||||
return 0
|
|
||||||
|
|
||||||
|
|
||||||
def cmd_list(_args: argparse.Namespace) -> int:
|
|
||||||
sessions = target.list_sessions()
|
|
||||||
if not sessions:
|
|
||||||
print("no claude sessions running")
|
|
||||||
return 1
|
|
||||||
active = target.read_active()
|
|
||||||
for s in sessions:
|
|
||||||
print(f"{'* ' if s == active else ' '}{s}")
|
|
||||||
return 0
|
return 0
|
||||||
|
|
||||||
|
|
||||||
@ -267,27 +200,18 @@ def build_parser() -> argparse.ArgumentParser:
|
|||||||
|
|
||||||
sp = sub.add_parser("start", help="run the daemon (foreground)")
|
sp = sub.add_parser("start", help="run the daemon (foreground)")
|
||||||
sp.add_argument("--mode", choices=("listen", "ptt"), help="override input mode")
|
sp.add_argument("--mode", choices=("listen", "ptt"), help="override input mode")
|
||||||
sp.add_argument("--check", action="store_true",
|
sp.add_argument("--skip-audio-check", action="store_true",
|
||||||
help="run a mic check before listening (off by default)")
|
help="skip the pre-listen mic check")
|
||||||
sp.set_defaults(func=cmd_start)
|
sp.set_defaults(func=cmd_start)
|
||||||
|
|
||||||
sub.add_parser("stop", help="stop a running daemon").set_defaults(func=cmd_stop)
|
sub.add_parser("stop", help="stop a running daemon").set_defaults(func=cmd_stop)
|
||||||
sub.add_parser("reload", help="reload config + contexts in a running daemon"
|
|
||||||
).set_defaults(func=cmd_reload)
|
|
||||||
sub.add_parser("status", help="show daemon status").set_defaults(func=cmd_status)
|
sub.add_parser("status", help="show daemon status").set_defaults(func=cmd_status)
|
||||||
sub.add_parser("test-audio", help="verify the mic capture path").set_defaults(func=cmd_test_audio)
|
sub.add_parser("test-audio", help="verify the mic capture path").set_defaults(func=cmd_test_audio)
|
||||||
sub.add_parser("test-tone", help="play each earcon (verify the audio-out path)"
|
|
||||||
).set_defaults(func=cmd_test_tone)
|
|
||||||
sub.add_parser("install", help="re-run the bootstrap (install.sh)").set_defaults(func=cmd_install)
|
sub.add_parser("install", help="re-run the bootstrap (install.sh)").set_defaults(func=cmd_install)
|
||||||
sub.add_parser("unset", help="clear the sticky target session").set_defaults(func=cmd_unset)
|
|
||||||
sub.add_parser("list", help="list running claude-* sessions").set_defaults(func=cmd_list)
|
|
||||||
sub.add_parser("cleanup", help="kill detached claude-* sessions (never attached)"
|
|
||||||
).set_defaults(func=cmd_cleanup)
|
|
||||||
|
|
||||||
for verb in ("set", "switch"):
|
sw = sub.add_parser("switch", help="set the active target session")
|
||||||
sp_set = sub.add_parser(verb, help="set the sticky target session")
|
sw.add_argument("name", help="project short-name (claude- prefix optional)")
|
||||||
sp_set.add_argument("name", help="project short-name (claude- prefix optional)")
|
sw.set_defaults(func=cmd_switch)
|
||||||
sp_set.set_defaults(func=cmd_set)
|
|
||||||
return p
|
return p
|
||||||
|
|
||||||
|
|
||||||
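The `for verb in ("set", "switch")` loop in the left column registers two subcommands that share one handler. A minimal self-contained sketch of that pattern follows; the handler body and the `demo` program name are placeholders, not the project's code.

```python
import argparse


def cmd_set(args: argparse.Namespace) -> str:
    # Placeholder handler: the real one calls target.set_target().
    return f"sticky target -> {args.name}"


def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(prog="demo")
    sub = p.add_subparsers(dest="command", required=True)
    # Register the same arguments and handler under both verbs.
    for verb in ("set", "switch"):
        sp = sub.add_parser(verb, help="set the sticky target session")
        sp.add_argument("name", help="project short-name")
        sp.set_defaults(func=cmd_set)
    return p


parser = build_parser()
via_set = parser.parse_args(["set", "libs"])
via_switch = parser.parse_args(["switch", "libs"])
```

Both namespaces dispatch to the same function, so the old `switch` spelling keeps working. argparse's `aliases=` keyword on `add_parser` is an alternative that lists the second verb in the first verb's help entry instead of giving it its own.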
|
|||||||
@ -1,158 +0,0 @@
|
|||||||
"""audio output — play short .wav files through the WSLg/PulseAudio sink (RDPSink).
|
|
||||||
|
|
||||||
the reverse direction of audio.py's mic capture, and the less-tested path on WSLg. a
|
|
||||||
three-tier player picks the first backend that works and remembers it:
|
|
||||||
|
|
||||||
1. paplay (pulseaudio-utils) — a SEPARATE process hitting PulseAudio directly. this
|
|
||||||
is the primary on purpose: the daemon captures with sounddevice (an open input
|
|
||||||
stream in listen mode), so keeping OUTPUT in a separate process avoids stacking
|
|
||||||
input+output in one lib on a bridge known to be duplex-flaky.
|
|
||||||
2. sounddevice sd.play() — in-process fallback if paplay is absent.
|
|
||||||
3. powershell.exe SoundPlayer — last resort via the Windows host (no volume control).
|
|
||||||
|
|
||||||
both earcons (sound.py) and future v0.3 TTS readback play through this module — keep it
|
|
||||||
generic (it plays a wav path, it knows nothing about events). playback is fire-and-
|
|
||||||
forget on a worker thread: a missing file or a dead speaker logs once and is swallowed,
|
|
||||||
never raised, so audio-out can NEVER block or break the inject path.
|
|
||||||
"""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
import logging
|
|
||||||
import shutil
|
|
||||||
import subprocess
|
|
||||||
import threading
|
|
||||||
import wave
|
|
||||||
from pathlib import Path
|
|
||||||
|
|
||||||
log = logging.getLogger(__name__)
|
|
||||||
|
|
||||||
_PAPLAY = "paplay"
|
|
||||||
_POWERSHELL = "powershell.exe"
|
|
||||||
|
|
||||||
_backend_lock = threading.Lock()
|
|
||||||
_chosen_backend: str | None = None
|
|
||||||
_warned = False
|
|
||||||
|
|
||||||
|
|
||||||
def _have(cmd: str) -> bool:
|
|
||||||
return shutil.which(cmd) is not None
|
|
||||||
|
|
||||||
|
|
||||||
def _clamp_volume(volume: float) -> float:
|
|
||||||
return max(0.0, min(1.0, float(volume)))
|
|
||||||
|
|
||||||
|
|
||||||
def _play_paplay(path: Path, volume: float) -> bool:
|
|
||||||
"""play via paplay; volume scaled through --volume (0-65536 linear)"""
|
|
||||||
vol = int(_clamp_volume(volume) * 65536)
|
|
||||||
proc = subprocess.run(
|
|
||||||
[_PAPLAY, f"--volume={vol}", str(path)],
|
|
||||||
stdout=subprocess.DEVNULL, stderr=subprocess.PIPE,
|
|
||||||
)
|
|
||||||
if proc.returncode != 0:
|
|
||||||
log.debug("paplay failed: %s", proc.stderr.decode("utf-8", "replace").strip())
|
|
||||||
return False
|
|
||||||
return True
|
|
||||||
|
|
||||||
|
|
||||||
def _play_sounddevice(path: Path, volume: float) -> bool:
|
|
||||||
"""play via sounddevice (in-process fallback); volume scales the samples"""
|
|
||||||
try:
|
|
||||||
import numpy as np
|
|
||||||
import sounddevice as sd
|
|
||||||
except Exception as exc:
|
|
||||||
log.debug("sounddevice unavailable: %s", exc)
|
|
||||||
return False
|
|
||||||
try:
|
|
||||||
with wave.open(str(path), "rb") as wf:
|
|
||||||
sr = wf.getframerate()
|
|
||||||
frames = wf.readframes(wf.getnframes())
|
|
||||||
data = np.frombuffer(frames, dtype="<i2").astype(np.float32) / 32768.0
|
|
||||||
data = data * _clamp_volume(volume)
|
|
||||||
sd.play(data, sr)
|
|
||||||
sd.wait()
|
|
||||||
return True
|
|
||||||
except Exception as exc:
|
|
||||||
log.debug("sounddevice playback failed: %s", exc)
|
|
||||||
return False
|
|
||||||
|
|
||||||
|
|
||||||
def _play_powershell(path: Path, _volume: float) -> bool:
|
|
||||||
"""play via the Windows host (last resort). SoundPlayer has no volume control,
|
|
||||||
so volume is ignored on this backend (documented best-effort)."""
|
|
||||||
if not _have(_POWERSHELL):
|
|
||||||
return False
|
|
||||||
try:
|
|
||||||
win = subprocess.run(["wslpath", "-w", str(path)], stdout=subprocess.PIPE,
|
|
||||||
stderr=subprocess.DEVNULL)
|
|
||||||
winpath = win.stdout.decode("utf-8", "replace").strip() if win.returncode == 0 else str(path)
|
|
||||||
script = f"(New-Object Media.SoundPlayer '{winpath}').PlaySync()"
|
|
||||||
proc = subprocess.run([_POWERSHELL, "-NoProfile", "-Command", script],
|
|
||||||
stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
|
|
||||||
return proc.returncode == 0
|
|
||||||
except Exception as exc:
|
|
||||||
log.debug("powershell playback failed: %s", exc)
|
|
||||||
return False
|
|
||||||
|
|
||||||
|
|
||||||
_BACKENDS = {
|
|
||||||
"paplay": _play_paplay,
|
|
||||||
"sounddevice": _play_sounddevice,
|
|
||||||
"powershell": _play_powershell,
|
|
||||||
}
|
|
||||||
_ORDER = ("paplay", "sounddevice", "powershell")
|
|
||||||
|
|
||||||
|
|
||||||
def _play_sync(path: Path, volume: float) -> bool:
|
|
||||||
"""play a wav synchronously, choosing/remembering a working backend. returns
|
|
||||||
whether playback succeeded; never raises."""
|
|
||||||
global _chosen_backend, _warned
|
|
||||||
if not path.is_file():
|
|
||||||
log.debug("tone file missing: %s", path)
|
|
||||||
return False
|
|
||||||
|
|
||||||
with _backend_lock:
|
|
||||||
order = (_chosen_backend,) + _ORDER if _chosen_backend else _ORDER
|
|
||||||
tried = []
|
|
||||||
for name in order:
|
|
||||||
if name in tried:
|
|
||||||
continue
|
|
||||||
tried.append(name)
|
|
||||||
if name == "paplay" and not _have(_PAPLAY):
|
|
||||||
continue
|
|
||||||
if _BACKENDS[name](path, volume):
|
|
||||||
with _backend_lock:
|
|
||||||
_chosen_backend = name
|
|
||||||
return True
|
|
||||||
|
|
||||||
with _backend_lock:
|
|
||||||
if not _warned:
|
|
||||||
_warned = True
|
|
||||||
log.warning("audio-out unavailable (tried %s) — continuing silently; "
|
|
||||||
"tones disabled for this run", ", ".join(tried))
|
|
||||||
return False
|
|
||||||
|
|
||||||
|
|
||||||
def play(path: str | Path, volume: float = 1.0, blocking: bool = False) -> None:
|
|
||||||
"""play a wav file. fire-and-forget by default (a worker thread), so a slow or
|
|
||||||
dead speaker never delays the caller. set blocking=True only for test-tone, where
|
|
||||||
we want to play tones in sequence and report the result.
|
|
||||||
|
|
||||||
failures are swallowed (logged once) — audio-out must never break a command.
|
|
||||||
"""
|
|
||||||
p = Path(path)
|
|
||||||
if blocking:
|
|
||||||
_play_sync(p, volume)
|
|
||||||
return
|
|
||||||
threading.Thread(target=_play_sync, args=(p, volume), daemon=True).start()
|
|
||||||
|
|
||||||
|
|
||||||
def play_blocking(path: str | Path, volume: float = 1.0) -> bool:
|
|
||||||
"""synchronous play that returns success — for test-tone's audio-out gate"""
|
|
||||||
return _play_sync(Path(path), volume)
|
|
||||||
|
|
||||||
|
|
||||||
def available() -> bool:
|
|
||||||
"""true if any audio-out backend is present (best-effort, paplay/powershell)"""
|
|
||||||
return _have(_PAPLAY) or _have(_POWERSHELL)
|
|
||||||
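The "first backend that works, remembered for next time" behaviour described in the module docstring above can be shown in isolation. This is a sketch with stand-in backends, not the module itself: `FallbackPlayer` and its `attempts` list are hypothetical names added for illustration.

```python
import threading
from typing import Callable, Dict, List, Optional


class FallbackPlayer:
    """Try backends in preference order; remember the first that succeeds."""

    def __init__(self, backends: Dict[str, Callable[[str], bool]]) -> None:
        self._backends = backends          # insertion order = preference order
        self._chosen: Optional[str] = None
        self._lock = threading.Lock()
        self.attempts: List[str] = []      # recorded only for the demo below

    def play(self, path: str) -> bool:
        with self._lock:
            chosen = self._chosen
        # A remembered backend goes first; the rest keep their original order.
        order = ([chosen] if chosen else []) + [
            name for name in self._backends if name != chosen
        ]
        for name in order:
            self.attempts.append(name)
            try:
                ok = self._backends[name](path)
            except Exception:
                ok = False                 # a failing backend must never raise
            if ok:
                with self._lock:
                    self._chosen = name
                return True
        return False


player = FallbackPlayer({
    "primary": lambda path: False,   # stand-in for a missing tool
    "fallback": lambda path: True,
    "last_resort": lambda path: True,
})
first = player.play("tone.wav")
second = player.play("tone.wav")
```

After the first call the failing backend is no longer tried first, so a broken primary costs one failed attempt per run rather than one per tone.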
@ -17,10 +17,7 @@ except ModuleNotFoundError:
|
|||||||
log = logging.getLogger(__name__)
|
log = logging.getLogger(__name__)
|
||||||
|
|
||||||
_VALID_MODES = ("listen", "ptt")
|
_VALID_MODES = ("listen", "ptt")
|
||||||
_VALID_MODELS = (
|
_VALID_MODELS = ("tiny", "base", "small", "medium", "large-v2", "large-v3")
|
||||||
"tiny", "base", "small", "medium", "large-v1", "large-v2", "large-v3",
|
|
||||||
"tiny.en", "base.en", "small.en", "medium.en",
|
|
||||||
)
|
|
||||||
|
|
||||||
DEFAULT_CONFIG_PATHS = (
|
DEFAULT_CONFIG_PATHS = (
|
||||||
Path(os.environ.get("CLAUDEDO_CONFIG", "")) if os.environ.get("CLAUDEDO_CONFIG") else None,
|
Path(os.environ.get("CLAUDEDO_CONFIG", "")) if os.environ.get("CLAUDEDO_CONFIG") else None,
|
||||||
@ -47,25 +44,11 @@ class Config:
|
|||||||
samplerate: int
|
samplerate: int
|
||||||
channels: int
|
channels: int
|
||||||
silence_threshold: float
|
silence_threshold: float
|
||||||
vad_silence_ms: int
|
silence_duration: float
|
||||||
vad_max_seconds: float
|
|
||||||
min_utterance: float
|
min_utterance: float
|
||||||
|
max_utterance: float
|
||||||
type_autosend: bool
|
type_autosend: bool
|
||||||
wake_fuzzy_threshold: float
|
match_threshold: float
|
||||||
command_fuzzy_threshold: float
|
|
||||||
filler_words: tuple[str, ...]
|
|
||||||
auto_target: bool
|
|
||||||
print_heard: bool
|
|
||||||
context_multiline: bool
|
|
||||||
context_separator: str
|
|
||||||
cleanup_confirm: bool
|
|
||||||
sound_enabled: bool
|
|
||||||
sound_on_wake: bool
|
|
||||||
sound_on_accept: bool
|
|
||||||
sound_on_no_match: bool
|
|
||||||
sound_on_submit: bool
|
|
||||||
sound_volume: float
|
|
||||||
sound_files: dict[str, str]
|
|
||||||
source_path: Path | None = field(default=None)
|
source_path: Path | None = field(default=None)
|
||||||
|
|
||||||
|
|
||||||
@ -112,7 +95,7 @@ def load_config(explicit: str | os.PathLike | None = None) -> Config:
|
|||||||
if mode not in _VALID_MODES:
|
if mode not in _VALID_MODES:
|
||||||
raise ConfigError(f"[input].mode must be one of {_VALID_MODES}, got {mode!r}")
|
raise ConfigError(f"[input].mode must be one of {_VALID_MODES}, got {mode!r}")
|
||||||
|
|
||||||
model = _require(raw, "stt", "model", (str,), "small.en")
|
model = _require(raw, "stt", "model", (str,), "small")
|
||||||
if model not in _VALID_MODELS:
|
if model not in _VALID_MODELS:
|
||||||
log.warning("unknown stt model %r — passing through to faster-whisper", model)
|
log.warning("unknown stt model %r — passing through to faster-whisper", model)
|
||||||
|
|
||||||
@ -127,37 +110,15 @@ def load_config(explicit: str | os.PathLike | None = None) -> Config:
|
|||||||
samplerate=int(_require(raw, "audio", "samplerate", (int,), 16000)),
|
samplerate=int(_require(raw, "audio", "samplerate", (int,), 16000)),
|
||||||
channels=int(_require(raw, "audio", "channels", (int,), 1)),
|
channels=int(_require(raw, "audio", "channels", (int,), 1)),
|
||||||
silence_threshold=float(_require(raw, "audio", "silence_threshold", (int, float), 0.012)),
|
silence_threshold=float(_require(raw, "audio", "silence_threshold", (int, float), 0.012)),
|
||||||
vad_silence_ms=int(_require(raw, "vad", "silence_ms", (int,), 700)),
|
silence_duration=float(_require(raw, "audio", "silence_duration", (int, float), 0.8)),
|
||||||
vad_max_seconds=float(_require(raw, "vad", "max_seconds", (int, float), 15.0)),
|
|
||||||
min_utterance=float(_require(raw, "audio", "min_utterance", (int, float), 0.3)),
|
min_utterance=float(_require(raw, "audio", "min_utterance", (int, float), 0.3)),
|
||||||
|
max_utterance=float(_require(raw, "audio", "max_utterance", (int, float), 15.0)),
|
||||||
type_autosend=bool(_require(raw, "behavior", "type_autosend", (bool,), False)),
|
type_autosend=bool(_require(raw, "behavior", "type_autosend", (bool,), False)),
|
||||||
wake_fuzzy_threshold=float(_require(raw, "behavior", "wake_fuzzy_threshold", (int, float), 0.65)),
|
match_threshold=float(_require(raw, "behavior", "match_threshold", (int, float), 0.8)),
|
||||||
command_fuzzy_threshold=float(_require(raw, "behavior", "command_fuzzy_threshold",
|
|
||||||
(int, float), 0.8)),
|
|
||||||
filler_words=tuple(_require(raw, "behavior", "filler_words", (list,),
|
|
||||||
["select", "use", "choose"])),
|
|
||||||
auto_target=bool(_require(raw, "behavior", "auto_target", (bool,), False)),
|
|
||||||
print_heard=bool(_require(raw, "behavior", "print_heard", (bool,), False)),
|
|
||||||
context_multiline=bool(_require(raw, "behavior", "context_multiline", (bool,), True)),
|
|
||||||
context_separator=str(_require(raw, "behavior", "context_separator", (str,), " — ")),
|
|
||||||
cleanup_confirm=bool(_require(raw, "behavior", "cleanup_confirm", (bool,), False)),
|
|
||||||
sound_enabled=bool(_require(raw, "sound", "enabled", (bool,), True)),
|
|
||||||
sound_on_wake=bool(_require(raw, "sound", "on_wake", (bool,), False)),
|
|
||||||
sound_on_accept=bool(_require(raw, "sound", "on_accept", (bool,), True)),
|
|
||||||
sound_on_no_match=bool(_require(raw, "sound", "on_no_match", (bool,), True)),
|
|
||||||
sound_on_submit=bool(_require(raw, "sound", "on_submit", (bool,), True)),
|
|
||||||
sound_volume=float(_require(raw, "sound", "volume", (int, float), 0.5)),
|
|
||||||
sound_files=dict(_require(raw, "sound", "files", (dict,), {})),
|
|
||||||
source_path=path,
|
source_path=path,
|
||||||
)
|
)
|
||||||
for label, val in (("wake_fuzzy_threshold", cfg.wake_fuzzy_threshold),
|
if not 0.0 < cfg.match_threshold <= 1.0:
|
||||||
("command_fuzzy_threshold", cfg.command_fuzzy_threshold)):
|
raise ConfigError("[behavior].match_threshold must be in (0, 1]")
|
||||||
if not 0.0 < val <= 1.0:
|
|
||||||
raise ConfigError(f"[behavior].{label} must be in (0, 1]")
|
|
||||||
if not 0.0 <= cfg.sound_volume <= 1.0:
|
|
||||||
raise ConfigError("[sound].volume must be in [0, 1]")
|
|
||||||
if cfg.vad_silence_ms <= 0 or cfg.vad_max_seconds <= 0:
|
|
||||||
raise ConfigError("[vad].silence_ms and max_seconds must be positive")
|
|
||||||
if cfg.samplerate <= 0 or cfg.channels <= 0:
|
if cfg.samplerate <= 0 or cfg.channels <= 0:
|
||||||
raise ConfigError("[audio].samplerate and channels must be positive")
|
raise ConfigError("[audio].samplerate and channels must be positive")
|
||||||
return cfg
|
return cfg
|
||||||
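The calls above follow the shape `_require(raw, section, key, types, default)`, but the helper's body is outside this diff. The sketch below is one plausible reading, under the assumption that a missing key falls back to the default and a wrongly typed value is an error; `require_value` and `check_unit_interval` are hypothetical names.

```python
class ConfigError(Exception):
    """Raised on an invalid configuration value (stand-in for the real class)."""


def require_value(raw: dict, section: str, key: str, types: tuple, default):
    """Return raw[section][key] if present and correctly typed, else default."""
    table = raw.get(section, {})
    if key not in table:
        return default
    value = table[key]
    # bool is a subclass of int in Python, so reject it unless asked for.
    if isinstance(value, bool) and bool not in types:
        raise ConfigError(f"[{section}].{key} must not be a boolean")
    if not isinstance(value, tuple(types)):
        raise ConfigError(f"[{section}].{key} has the wrong type")
    return value


def check_unit_interval(label: str, value: float) -> None:
    """Same range rule as the threshold loop above: 0 < value <= 1."""
    if not 0.0 < value <= 1.0:
        raise ConfigError(f"[behavior].{label} must be in (0, 1]")


raw = {"vad": {"silence_ms": 500}, "sound": {"volume": True}}
silence_ms = require_value(raw, "vad", "silence_ms", (int,), 700)
max_seconds = require_value(raw, "vad", "max_seconds", (int, float), 15.0)
```

The explicit boolean check matters for keys such as `[sound].volume`: without it, `volume = true` would pass an `(int, float)` check and be read as `1.0`.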
|
|||||||
@ -1,65 +0,0 @@
|
|||||||
"""colored, prefixed console output for the daemon's recognition/action feed.
|
|
||||||
|
|
||||||
every line is ``HH:MM:SS [PREFIX] message``. prefixes group the source: a session
|
|
||||||
name (e.g. ``[claude-libs]``) for anything injected into a tmux session, ``[SYSTEM]``
|
|
||||||
for daemon-control/state lines, and ``[VOICE]`` for STT/recognition lines. color is
|
|
||||||
opt-in via tty detection (or forced): green for successful injections, red for
|
|
||||||
drops/errors, dim for routine. falls back to plain text when stdout is not a tty.
|
|
||||||
"""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
import sys
|
|
||||||
import time
|
|
||||||
|
|
||||||
RESET = "\033[0m"
|
|
||||||
_COLORS = {
|
|
||||||
"green": "\033[32m",
|
|
||||||
"red": "\033[31m",
|
|
||||||
"yellow": "\033[33m",
|
|
||||||
"cyan": "\033[36m",
|
|
||||||
"blue": "\033[34m",
|
|
||||||
"brightblue": "\033[94m",
|
|
||||||
"magenta": "\033[35m",
|
|
||||||
"dim": "\033[2m",
|
|
||||||
"bold": "\033[1m",
|
|
||||||
}
|
|
||||||
|
|
||||||
SYSTEM = "SYSTEM"
|
|
||||||
VOICE = "VOICE"
|
|
||||||
HELP = "HELP"
|
|
||||||
|
|
||||||
|
|
||||||
class Console:
|
|
||||||
"""formats and prints daemon log lines with timestamp, prefix, and color"""
|
|
||||||
|
|
||||||
def __init__(self, color: bool | None = None, stream=None, clock=None) -> None:
|
|
||||||
self.stream = stream if stream is not None else sys.stdout
|
|
||||||
self._clock = clock or time.localtime
|
|
||||||
if color is None:
|
|
||||||
color = hasattr(self.stream, "isatty") and self.stream.isatty()
|
|
||||||
self.color = bool(color)
|
|
||||||
|
|
||||||
def _stamp(self) -> str:
|
|
||||||
t = self._clock()
|
|
||||||
return f"{t.tm_hour:02d}:{t.tm_min:02d}:{t.tm_sec:02d}"
|
|
||||||
|
|
||||||
def _paint(self, text: str, color: str | None) -> str:
|
|
||||||
if not self.color or not color or color not in _COLORS:
|
|
||||||
return text
|
|
||||||
return f"{_COLORS[color]}{text}{RESET}"
|
|
||||||
|
|
||||||
def paint(self, text: str, color: str | None) -> str:
|
|
||||||
"""public colorizer for pre-coloring a fragment of a message (e.g. a command
|
|
||||||
word) before passing it to emit() with color=None"""
|
|
||||||
return self._paint(text, color)
|
|
||||||
|
|
||||||
def emit(self, prefix: str, message: str, color: str | None = None) -> None:
|
|
||||||
"""print one line: ``HH:MM:SS [prefix] message`` (message optionally colored)"""
|
|
||||||
line = f"{self._stamp()} {self._paint(f'[{prefix}]', 'dim')} {self._paint(message, color)}"
|
|
||||||
print(line, file=self.stream, flush=True)
|
|
||||||
|
|
||||||
def line(self, message: str, color: str | None = None) -> None:
|
|
||||||
"""print a bare continuation line (no timestamp/prefix) — for multi-row blocks
|
|
||||||
like the help menu, indented under a preceding header"""
|
|
||||||
print(self._paint(message, color), file=self.stream, flush=True)
|
|
||||||
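The line layout that console.py documents, `HH:MM:SS [PREFIX] message` with a dim prefix and an optionally colored message, can be reproduced as a pure function. `format_line` is a hypothetical standalone equivalent written for illustration; it takes the timestamp as an argument so the output is deterministic.

```python
import time

RESET = "\033[0m"
DIM = "\033[2m"
GREEN = "\033[32m"


def format_line(prefix, message, color_code=None, use_color=False, now=None):
    """Build one console line; ANSI codes are added only when use_color is set."""
    t = now if now is not None else time.localtime()
    stamp = f"{t.tm_hour:02d}:{t.tm_min:02d}:{t.tm_sec:02d}"
    tag = f"[{prefix}]"
    if use_color:
        tag = f"{DIM}{tag}{RESET}"
        if color_code:
            message = f"{color_code}{message}{RESET}"
    return f"{stamp} {tag} {message}"


# A fixed timestamp: 09:05:07 on an arbitrary date.
fixed = time.struct_time((2024, 1, 2, 9, 5, 7, 1, 2, 0))
plain = format_line("SYSTEM", "ready", now=fixed)
colored = format_line("claude-libs", "yes", GREEN, use_color=True, now=fixed)
```

With `use_color` off the line contains no escape sequences, which matches the plain-text fallback the docstring describes for output that is not a tty.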
@ -1,108 +0,0 @@
|
|||||||
"""load named context blocks from contexts.toml into a typed lookup.
|
|
||||||
|
|
||||||
contexts are user-edited reference blurbs (claude.md-style snippets) keyed by simple
|
|
||||||
spoken names. the ``context``/``prepare`` voice command injects a named blurb ahead of
|
|
||||||
a dictated instruction (read-before-send: never auto-submitted). mirrors config.py's
|
|
||||||
load/validate pattern; a missing file is an empty set, not an error.
|
|
||||||
"""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
import logging
|
|
||||||
import os
|
|
||||||
import re
|
|
||||||
from dataclasses import dataclass, field
|
|
||||||
from pathlib import Path
|
|
||||||
|
|
||||||
try:
|
|
||||||
import tomllib as _toml
|
|
||||||
except ModuleNotFoundError:
|
|
||||||
import tomli as _toml
|
|
||||||
|
|
||||||
log = logging.getLogger(__name__)
|
|
||||||
|
|
||||||
_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9 _-]*$")
|
|
||||||
|
|
||||||
DEFAULT_CONTEXTS_PATHS = (
|
|
||||||
Path(os.environ.get("CLAUDEDO_CONTEXTS", "")) if os.environ.get("CLAUDEDO_CONTEXTS") else None,
|
|
||||||
Path.home() / ".config" / "claudedo" / "contexts.toml",
|
|
||||||
Path.cwd() / "contexts.toml",
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
class ContextsError(Exception):
|
|
||||||
"""raised on an unparseable or invalid contexts.toml"""
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass
|
|
||||||
class Contexts:
|
|
||||||
"""validated named context blocks (name -> blurb), normalized for spoken lookup"""
|
|
||||||
|
|
||||||
blocks: dict[str, str] = field(default_factory=dict)
|
|
||||||
source_path: Path | None = field(default=None)
|
|
||||||
|
|
||||||
def __len__(self) -> int:
|
|
||||||
return len(self.blocks)
|
|
||||||
|
|
||||||
def names(self) -> list[str]:
|
|
||||||
"""the context names, sorted (for status / listing)"""
|
|
||||||
return sorted(self.blocks)
|
|
||||||
|
|
||||||
def get(self, name: str) -> str | None:
|
|
||||||
"""look up a blurb by its normalized (lowercased, despaced) name, or None.
|
|
||||||
|
|
||||||
names are matched on a lowercase, space/underscore/hyphen-stripped key so a
|
|
||||||
spoken "web hooks" resolves the configured ``webhooks``/``web-hooks`` block.
|
|
||||||
"""
|
|
||||||
return self.blocks.get(_key(name))
|
|
||||||
|
|
||||||
|
|
||||||
def _key(name: str) -> str:
|
|
||||||
return re.sub(r"[ _-]+", "", name.strip().lower())
|
|
||||||
|
|
||||||
|
|
||||||
def find_contexts_path(explicit: str | os.PathLike | None = None) -> Path | None:
|
|
||||||
"""resolve the contexts.toml path, or None if no file exists (not an error)"""
|
|
||||||
candidates: list[Path] = []
|
|
||||||
if explicit:
|
|
||||||
candidates.append(Path(explicit))
|
|
||||||
candidates.extend(p for p in DEFAULT_CONTEXTS_PATHS if p)
|
|
||||||
for path in candidates:
|
|
||||||
if path.is_file():
|
|
||||||
return path
|
|
||||||
return None
|
|
||||||
|
|
||||||
|
|
||||||
def load_contexts(explicit: str | os.PathLike | None = None) -> Contexts:
|
|
||||||
"""load contexts.toml from the first existing default path (or an explicit one).
|
|
||||||
|
|
||||||
a missing file yields an empty Contexts (the feature is opt-in). names must be
|
|
||||||
simple words (matchable) and values must be non-empty strings; a bad entry raises
|
|
||||||
ContextsError so the user sees a clear message rather than a silent drop.
|
|
||||||
"""
|
|
||||||
path = find_contexts_path(explicit)
|
|
||||||
if path is None:
|
|
||||||
return Contexts(blocks={}, source_path=None)
|
|
||||||
|
|
||||||
try:
|
|
||||||
with open(path, "rb") as fh:
|
|
||||||
raw = _toml.load(fh)
|
|
||||||
except _toml.TOMLDecodeError as exc:
|
|
||||||
raise ContextsError(f"could not parse {path}: {exc}") from exc
|
|
||||||
|
|
||||||
table = raw.get("contexts", {})
|
|
||||||
if not isinstance(table, dict):
|
|
||||||
raise ContextsError("[contexts] must be a table of name = \"blurb\" entries")
|
|
||||||
|
|
||||||
blocks: dict[str, str] = {}
|
|
||||||
for name, value in table.items():
|
|
||||||
if not isinstance(name, str) or not _NAME_RE.match(name.lower()):
|
|
||||||
raise ContextsError(f"context name {name!r} must be simple words (a-z, 0-9, space/-/_)")
|
|
||||||
if not isinstance(value, str) or not value.strip():
|
|
||||||
raise ContextsError(f"context {name!r} must be a non-empty string")
|
|
||||||
key = _key(name)
|
|
||||||
if key in blocks:
|
|
||||||
raise ContextsError(f"context {name!r} collides with another name on the spoken key {key!r}")
|
|
||||||
blocks[key] = value.strip()
|
|
||||||
|
|
||||||
return Contexts(blocks=blocks, source_path=path)
|
|
||||||
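A short usage sketch of the spoken-key normalization in contexts.py above. `spoken_key` reuses the regex from `_key`; `build_blocks` is a simplified, hypothetical stand-in for the loop in `load_contexts` and raises a plain `ValueError` instead of `ContextsError`.

```python
import re


def spoken_key(name: str) -> str:
    """Lowercase the name and strip spaces, underscores and hyphens."""
    return re.sub(r"[ _-]+", "", name.strip().lower())


def build_blocks(table: dict) -> dict:
    """Map normalized keys to blurbs, refusing names that collide when spoken."""
    blocks = {}
    for name, value in table.items():
        key = spoken_key(name)
        if key in blocks:
            raise ValueError(f"context {name!r} collides on spoken key {key!r}")
        blocks[key] = value.strip()
    return blocks


blocks = build_blocks({"web-hooks": "  retry policy notes  ", "api": "v2 only"})
heard = blocks.get(spoken_key("Web Hooks"))
```

Because lookups and stored names pass through the same function, a transcription of "web hooks", "webhooks" or "web-hooks" resolves to the same block, and two configured names that would sound identical are rejected at load time.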
@ -16,11 +16,8 @@ import sys
|
|||||||
import time
|
import time
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
|
|
||||||
from . import __version__, audio, grammar, inject, keys, target
|
from . import audio, grammar, inject, target
|
||||||
from .config import Config, ConfigError, load_config
|
from .config import Config
|
||||||
from .console import HELP, SYSTEM, VOICE, Console
|
|
||||||
from .contexts import Contexts, ContextsError, load_contexts
|
|
||||||
from .sound import Earcons
|
|
||||||
from .stt import Transcriber
|
from .stt import Transcriber
|
||||||
|
|
||||||
log = logging.getLogger(__name__)
|
log = logging.getLogger(__name__)
|
||||||
@ -78,16 +75,6 @@ def stop_running() -> bool:
|
|||||||
return True
|
return True
|
||||||
|
|
||||||
|
|
||||||
def reload_running() -> bool:
|
|
||||||
"""signal a running daemon (SIGHUP) to reload config + contexts. returns whether
|
|
||||||
one was found. no-op on platforms without SIGHUP."""
|
|
||||||
pid = read_pid()
|
|
||||||
if pid is None or not hasattr(signal, "SIGHUP"):
|
|
||||||
return False
|
|
||||||
os.kill(pid, signal.SIGHUP)
|
|
||||||
return True
|
|
||||||
|
|
||||||
|
|
||||||
class _PTTKey:
|
class _PTTKey:
|
||||||
"""desk-only push-to-talk: 'held' while the configured key is down in the
|
"""desk-only push-to-talk: 'held' while the configured key is down in the
|
||||||
daemon's own terminal. there is deliberately NO global hotkey — a system-wide
|
daemon's own terminal. there is deliberately NO global hotkey — a system-wide
|
||||||
@ -124,36 +111,18 @@ class Daemon:
|
|||||||
self.config = config
|
self.config = config
|
||||||
self.mode = config.mode
|
self.mode = config.mode
|
||||||
self._stop = False
|
self._stop = False
|
||||||
self._reload_pending = False
|
|
||||||
self._cleanup_pending = False
|
|
||||||
self._transcriber: Transcriber | None = None
|
self._transcriber: Transcriber | None = None
|
||||||
self._device: int | None = None
|
self._device: int | None = None
|
||||||
self._ptt = _PTTKey()
|
self._ptt = _PTTKey()
|
||||||
self._pending: dict[str, int] = {}
|
|
||||||
self._console = Console()
|
|
||||||
self._contexts = Contexts()
|
|
||||||
self._earcons = Earcons(config)
|
|
||||||
self._last_stt_ms = 0.0
|
|
||||||
self._last_audio_s = 0.0
|
|
||||||
|
|
||||||
def _install_signals(self) -> None:
|
def _install_signals(self) -> None:
|
||||||
signal.signal(signal.SIGTERM, self._on_signal)
|
signal.signal(signal.SIGTERM, self._on_signal)
|
||||||
signal.signal(signal.SIGINT, self._on_signal)
|
signal.signal(signal.SIGINT, self._on_signal)
|
||||||
if hasattr(signal, "SIGHUP"):
|
|
||||||
signal.signal(signal.SIGHUP, self._on_reload_signal)
|
|
||||||
|
|
||||||
def _on_signal(self, _signum, _frame) -> None:
|
def _on_signal(self, _signum, _frame) -> None:
|
||||||
log.info("stop requested")
|
log.info("stop requested")
|
||||||
self._stop = True
|
self._stop = True
|
||||||
|
|
||||||
def _on_reload_signal(self, _signum, _frame) -> None:
|
|
||||||
"""SIGHUP from `claudedo reload` -> reload both config files on the next tick.
|
|
||||||
|
|
||||||
the actual reload runs in the loop (not the handler) so it never races a
|
|
||||||
capture/transcribe; the handler only sets the flag.
|
|
||||||
"""
|
|
||||||
self._reload_pending = True
|
|
||||||
|
|
||||||
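The `_on_reload_signal` docstring above describes a common pattern: the signal handler only sets a flag, and the main loop performs the reload at a safe point. A minimal sketch of that pattern follows, assuming a hypothetical `ReloadFlag` class; the `reloads` counter stands in for the real reload work, which is not reproduced here.

```python
import signal


class ReloadFlag:
    """Defers reload work from a signal handler to the main loop (illustrative)."""

    def __init__(self) -> None:
        self.pending = False
        self.reloads = 0

    def on_signal(self, _signum, _frame) -> None:
        # The handler does the minimum: set a flag and return.
        self.pending = True

    def install(self) -> None:
        # SIGHUP does not exist on every platform, so guard the registration.
        if hasattr(signal, "SIGHUP"):
            signal.signal(signal.SIGHUP, self.on_signal)

    def tick(self) -> bool:
        """Call at the top of each loop iteration; True if a reload ran."""
        if not self.pending:
            return False
        self.pending = False
        self.reloads += 1  # placeholder for the real reload
        return True
```

Because `tick()` runs between iterations, the reload never interleaves with a capture or transcription already in progress.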
def stopped(self) -> bool:
|
def stopped(self) -> bool:
|
||||||
return self._stop
|
return self._stop
|
||||||
|
|
||||||
@ -164,23 +133,12 @@ class Daemon:
|
|||||||
model=cfg.stt_model, language=cfg.stt_language,
|
model=cfg.stt_model, language=cfg.stt_language,
|
||||||
device=cfg.stt_compute if cfg.stt_compute in ("cpu", "cuda") else "auto",
|
device=cfg.stt_compute if cfg.stt_compute in ("cpu", "cuda") else "auto",
|
||||||
compute_type="auto",
|
compute_type="auto",
|
||||||
initial_prompt=grammar.initial_prompt(cfg.wake_phrases),
|
|
||||||
)
|
)
|
||||||
self._load_contexts()
|
|
||||||
if audio.warm_up(cfg.samplerate, cfg.channels, self._device):
|
if audio.warm_up(cfg.samplerate, cfg.channels, self._device):
|
||||||
log.info("mic warmed up (source live)")
|
log.info("mic warmed up (source live)")
|
||||||
else:
|
else:
|
||||||
log.warning("mic warm-up saw only silence — check mic permission / RDPSource")
|
log.warning("mic warm-up saw only silence — check mic permission / RDPSource")
|
||||||
|
|
||||||
def _load_contexts(self) -> None:
|
|
||||||
"""(re)load contexts.toml, leaving the loaded model untouched. a parse error is
|
|
||||||
logged and leaves the previous set in place rather than crashing the loop."""
|
|
||||||
try:
|
|
||||||
self._contexts = load_contexts()
|
|
||||||
except ContextsError as exc:
|
|
||||||
log.warning("contexts.toml invalid, keeping previous set: %s", exc)
|
|
||||||
self._console.emit(SYSTEM, f"contexts.toml error (kept previous): {exc}", "red")
|
|
||||||
|
|
||||||
def _capture(self):
|
def _capture(self):
|
||||||
cfg = self.config
|
cfg = self.config
|
||||||
if self.mode == "ptt":
|
if self.mode == "ptt":
|
||||||
@ -190,289 +148,60 @@ class Daemon:
|
|||||||
return audio.record_while(
|
return audio.record_while(
|
||||||
cfg.samplerate, cfg.channels, self._device,
|
cfg.samplerate, cfg.channels, self._device,
|
||||||
held=lambda: not self._ptt.wait_press(self.stopped),
|
held=lambda: not self._ptt.wait_press(self.stopped),
|
||||||
max_utterance=cfg.vad_max_seconds, min_utterance=cfg.min_utterance,
|
max_utterance=cfg.max_utterance, min_utterance=cfg.min_utterance,
|
||||||
)
|
)
|
||||||
return audio.record_until_silence(
|
return audio.record_until_silence(
|
||||||
cfg.samplerate, cfg.channels, self._device,
|
cfg.samplerate, cfg.channels, self._device,
|
||||||
silence_threshold=cfg.silence_threshold, silence_duration=cfg.vad_silence_ms / 1000.0,
|
silence_threshold=cfg.silence_threshold, silence_duration=cfg.silence_duration,
|
||||||
min_utterance=cfg.min_utterance, max_utterance=cfg.vad_max_seconds,
|
min_utterance=cfg.min_utterance, max_utterance=cfg.max_utterance,
|
||||||
stop=self.stopped,
|
stop=self.stopped,
|
||||||
)
|
)
|
||||||
|
|
||||||
def _handle(self, transcript: str) -> None:
|
def _handle(self, transcript: str) -> None:
|
||||||
cfg = self.config
|
cfg = self.config
|
||||||
require_wake = self.mode == "listen"
|
require_wake = self.mode == "listen"
|
||||||
parsed = grammar.parse(transcript, cfg.wake_phrases, cfg.wake_fuzzy_threshold,
|
action = grammar.parse(transcript, cfg.wake_phrases, cfg.match_threshold, require_wake)
|
||||||
cfg.command_fuzzy_threshold, require_wake, filler=cfg.filler_words)
|
if action is None:
|
||||||
if parsed is None or parsed.action is None:
|
self._emit(f'heard: "{transcript}" -> no command matched')
|
||||||
if parsed is not None:
|
|
||||||
self._earcons.play("wake")
|
|
||||||
self._console.emit(VOICE, f'heard "{transcript}" -> no command matched {self._timing()}',
|
|
||||||
"yellow")
|
|
||||||
self._earcons.play("no_match")
|
|
||||||
return
|
return
|
||||||
action = parsed.action
|
|
||||||
self._earcons.play("wake")
|
|
||||||
|
|
||||||
# a command was recognized — echo what we heard (green) before acting. note the
|
|
||||||
# matched wake phrase (magenta) when the transcript didn't literally contain it
|
|
||||||
# (so a loose match like "okay clouds" -> "okay claude" is visible).
|
|
||||||
head = self._console.paint(f'heard "{transcript}" -> {self._describe(action)}', "green")
|
|
||||||
note = ""
|
|
||||||
if parsed.wake and parsed.wake.replace(" ", "") not in transcript.lower().replace(" ", ""):
|
|
||||||
note = (self._console.paint(" (wake: ", "green")
|
|
||||||
+ self._console.paint(parsed.wake, "magenta")
|
|
||||||
+ self._console.paint(")", "green"))
|
|
||||||
tail = self._console.paint(f" {self._timing()}", "green")
|
|
||||||
self._console.emit(VOICE, f"{head}{note}{tail}")
|
|
||||||
|
|
||||||
def blue(s):
|
|
||||||
return self._console.paint(s, "brightblue")
|
|
||||||
if action.name == "mode":
|
if action.name == "mode":
|
||||||
new_mode = str(action.arg)
|
new_mode = str(action.arg)
|
||||||
if new_mode != self.mode:
|
if new_mode != self.mode:
|
||||||
self.mode = new_mode
|
self.mode = new_mode
|
||||||
self._console.emit(SYSTEM, f"{blue('mode')} -> {new_mode}")
|
self._emit(f"mode -> {new_mode}")
|
||||||
self._refresh_state()
|
self._refresh_state()
|
||||||
return
|
return
|
||||||
if action.name == "set":
|
if action.name == "switch":
|
||||||
session = target.set_target(str(action.arg))
|
session = target.set_target(str(action.arg))
|
||||||
self._pending.pop(session, None)
|
self._emit(f"target -> {session}")
|
||||||
self._console.emit(SYSTEM, f"{blue('set sticky')} -> {session}")
|
|
||||||
self._refresh_state()
|
self._refresh_state()
|
||||||
return
|
return
|
||||||
if action.name == "unset":
|
|
||||||
target.unset_target()
|
|
||||||
self._console.emit(SYSTEM, f"{blue('unset')} (cleared)")
|
|
||||||
self._refresh_state()
|
|
||||||
return
|
|
||||||
if action.name == "list":
|
|
||||||
sessions = target.list_sessions()
|
|
||||||
self._console.emit(SYSTEM, f"{blue('list')} -> "
|
|
||||||
+ (", ".join(sessions) if sessions else "(none running)"))
|
|
||||||
return
|
|
||||||
if action.name == "commands":
|
|
||||||
self._console.emit(HELP, "voice commands:")
|
|
||||||
for usage, desc in grammar.command_menu():
|
|
||||||
self._console.line(f" {self._console.paint(f'{usage:<26}', 'brightblue')} {desc}")
|
|
||||||
return
|
|
||||||
if action.name == "customs":
|
|
||||||
names = self._contexts.names()
|
|
||||||
listed = ", ".join(names) if names else "(none — edit contexts.toml)"
|
|
||||||
self._console.emit(SYSTEM, f"contexts: {listed}")
|
|
||||||
return
|
|
||||||
if action.name == "version":
|
|
||||||
self._console.emit(SYSTEM, f"claudedo {__version__}")
|
|
||||||
return
|
|
||||||
if action.name == "debug":
|
|
||||||
self._console.emit(VOICE, f'debug: "{action.arg}"', "yellow")
|
|
||||||
return
|
|
||||||
if action.name == "reload":
|
|
||||||
self._do_reload(str(action.arg))
|
|
||||||
return
|
|
||||||
if action.name == "system":
|
|
||||||
self._do_system(action.arg)
|
|
||||||
return
|
|
||||||
if action.name == "context":
|
|
||||||
name = str(action.arg[0])
|
|
||||||
if self._contexts.get(name) is None:
|
|
||||||
self._console.emit(VOICE, f"no context named '{name}' -> did nothing", "red")
|
|
||||||
self._earcons.play("no_match")
|
|
||||||
return
|
|
||||||
|
|
||||||
session, reason = target.resolve(parsed.one_shot, auto_target=cfg.auto_target)
|
session = target.resolve_target()
|
||||||
if session is None:
|
if session is None:
|
||||||
self._console.emit(VOICE, f'heard "{transcript}" -> {reason} -> '
|
self._emit(f'heard: "{transcript}" -> matched: {self._describe(action)} '
|
||||||
f'{self._describe(action)} did nothing', "red")
|
f'-> ERROR no target session (did nothing)')
|
||||||
self._earcons.play("no_match")
|
|
||||||
return
|
return
|
||||||
if action.name == "context":
|
self._emit(f'heard: "{transcript}" -> matched: {self._describe(action)} -> target {session}')
|
||||||
self._inject_context(session, action)
|
if action.name == "type" and not cfg.type_autosend:
|
||||||
|
inject.send_literal(session, str(action.arg))
|
||||||
|
self._emit(f"injected: literal {str(action.arg)!r} -> {session}")
|
||||||
return
|
return
|
||||||
self._inject(session, action)
|
|
||||||
|
|
||||||
def _inject(self, session: str, action) -> None:
|
|
||||||
"""run a resolved command against `session`, tracking the uncommitted-input
|
|
||||||
buffer so backspace/erase delete only back to the last submit boundary.
|
|
||||||
|
|
||||||
the 'heard ...' echo is already printed by _handle and the [session] prefix
|
|
||||||
names the target, so these lines just report the keystrokes injected. the
|
|
||||||
earcon fires here (a real injection): submit chimes the submit tone, every
|
|
||||||
other injected command the accept tone.
|
|
||||||
"""
|
|
||||||
name = action.name
|
|
||||||
self._earcons.play("submit" if name == "submit" else "accept")
|
|
||||||
|
|
||||||
if name == "type":
|
|
||||||
text = str(action.arg)
|
|
||||||
inject.send_literal(session, text)
|
|
||||||
self._pending[session] = self._pending.get(session, 0) + len(text)
|
|
||||||
if self.config.type_autosend:
|
|
||||||
inject.send_named(session, keys.SUBMIT)
|
|
||||||
self._pending[session] = 0
|
|
||||||
self._console.emit(session, f"typed {text!r}"
|
|
||||||
+ (" + send" if self.config.type_autosend else ""), "green")
|
|
||||||
return
|
|
||||||
if name == "space":
|
|
||||||
n = int(action.arg)
|
|
||||||
inject.perform(session, action)
|
inject.perform(session, action)
|
||||||
self._pending[session] = self._pending.get(session, 0) + n
|
self._emit(f"injected: {self._describe(action)} -> {session}")
|
||||||
self._console.emit(session, f"space x{n}", "green")
|
|
||||||
return
|
|
||||||
if name == "backspace":
|
|
||||||
n = int(action.arg)
|
|
||||||
if n:
|
|
||||||
inject.perform(session, action)
|
|
||||||
self._pending[session] = max(0, self._pending.get(session, 0) - n)
|
|
||||||
self._console.emit(session, f"backspace x{n}", "green")
|
|
||||||
return
|
|
||||||
if name == "erase":
|
|
||||||
n = self._pending.get(session, 0)
|
|
||||||
if n:
|
|
||||||
inject.perform(session, grammar.Action("erase", n))
|
|
||||||
self._pending[session] = 0
|
|
||||||
self._console.emit(session, f"erase x{n} (to last boundary)", "green")
|
|
||||||
return
|
|
||||||
|
|
||||||
inject.perform(session, action)
|
|
||||||
if name == "submit":
|
|
||||||
self._pending[session] = 0
|
|
||||||
self._console.emit(session, f"injected {self._describe(action)}", "green")
|
|
||||||
|
|
||||||
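The per-session counter that `_inject` maintains in `self._pending` can be modelled in isolation. This is a sketch under stated assumptions: `PendingInput` is a hypothetical helper, and it models only the bookkeeping (no tmux keystrokes are sent).

```python
class PendingInput:
    """Counts uncommitted characters per session (hypothetical helper)."""

    def __init__(self) -> None:
        self._pending: dict[str, int] = {}

    def typed(self, session: str, n: int) -> None:
        # Typed text and inserted spaces both grow the editable buffer.
        self._pending[session] = self._pending.get(session, 0) + n

    def backspace(self, session: str, n: int) -> None:
        # Never go below zero: the last submit is the deletion boundary.
        self._pending[session] = max(0, self._pending.get(session, 0) - n)

    def erase(self, session: str) -> int:
        """Return how many characters to delete, then reset the counter."""
        n = self._pending.get(session, 0)
        self._pending[session] = 0
        return n

    def submit(self, session: str) -> None:
        # A submit commits the input, so nothing remains to erase.
        self._pending[session] = 0

    def count(self, session: str) -> int:
        return self._pending.get(session, 0)
```

Keeping the count per session means a one-shot command sent to another session does not disturb the erase boundary of the sticky one.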
def _inject_context(self, session: str, action) -> None:
|
|
||||||
"""inject a named context blurb ahead of the dictated instruction, then WAIT.
|
|
||||||
|
|
||||||
read-before-send: never auto-submits — the user says ``send`` separately, and
|
|
||||||
claude's own permission prompt is the backstop for anything consequential.
|
|
||||||
routes through inject.send_literal (the same path as ``type``) and tracks the
|
|
||||||
uncommitted-input buffer so backspace/erase still bound to the last boundary.
|
|
||||||
|
|
||||||
assembly (config behavior.context_multiline): true -> blurb, a soft Shift+Enter
|
|
||||||
newline, then the instruction; false -> blurb + context_separator + instruction
|
|
||||||
flattened onto one line. a bare ``context <name>`` (no dictation) injects just
|
|
||||||
the blurb. the soft newline does not count toward the editable-char buffer.
|
|
||||||
"""
|
|
||||||
cfg = self.config
|
|
||||||
name, dictation = str(action.arg[0]), str(action.arg[1])
|
|
||||||
blurb = self._contexts.get(name) or ""
|
|
||||||
|
|
||||||
self._earcons.play("accept")
|
|
||||||
inject.send_literal(session, blurb)
|
|
||||||
chars = len(blurb)
|
|
||||||
if dictation:
|
|
||||||
if cfg.context_multiline:
|
|
||||||
inject.send_named(session, keys.NEWLINE)
|
|
||||||
else:
|
|
||||||
inject.send_literal(session, cfg.context_separator)
|
|
||||||
chars += len(cfg.context_separator)
|
|
||||||
inject.send_literal(session, dictation)
|
|
||||||
chars += len(dictation)
|
|
||||||
self._pending[session] = self._pending.get(session, 0) + chars
|
|
||||||
|
|
||||||
shape = "blurb" if not dictation else "blurb + dictation"
|
|
||||||
self._console.emit(session, f"context '{name}' -> {shape} (waiting for send)", "green")
|
|
||||||
|
|
||||||
def _do_reload(self, scope: str) -> None:
|
|
||||||
"""re-read config.toml and/or contexts.toml live without reinitializing the
|
|
||||||
loaded whisper model (the slow part). scope: all|config|contexts."""
|
|
||||||
did = []
|
|
||||||
if scope in ("all", "config"):
|
|
||||||
try:
|
|
||||||
new_cfg = load_config()
|
|
||||||
self._apply_config(new_cfg)
|
|
||||||
did.append("config")
|
|
||||||
except ConfigError as exc:
|
|
||||||
self._console.emit(SYSTEM, f"config reload failed (kept previous): {exc}", "red")
|
|
||||||
if scope in ("all", "contexts"):
|
|
||||||
self._load_contexts()
|
|
||||||
did.append("contexts")
|
|
||||||
what = " + ".join(did) if did else "nothing"
|
|
||||||
blue = self._console.paint("reloaded", "brightblue")
|
|
||||||
self._console.emit(SYSTEM, f"{blue} {what} ({len(self._contexts)} contexts)")
|
|
||||||
|
|
||||||
def _apply_config(self, new_cfg: Config) -> None:
|
|
||||||
"""swap in a reloaded config, preserving the runtime mode the user may have
|
|
||||||
toggled by voice and leaving the already-loaded transcriber untouched."""
|
|
||||||
new_cfg.mode = self.mode
|
|
||||||
self.config = new_cfg
|
|
||||||
self._earcons.update(new_cfg)
|
|
||||||
|
|
||||||
def _do_system(self, arg) -> None:
|
|
||||||
"""daemon-control namespace (never injects to claude): status / reload."""
|
|
||||||
if isinstance(arg, tuple) and arg and arg[0] == "reload":
|
|
||||||
self._do_reload(str(arg[1]))
|
|
||||||
return
|
|
||||||
if isinstance(arg, tuple) and arg and arg[0] == "unknown":
|
|
||||||
self._console.emit(SYSTEM, f"unknown system command '{arg[1]}'", "red")
|
|
||||||
return
|
|
||||||
if arg == "status":
|
|
||||||
cfg = self.config
|
|
||||||
sticky = target.read_active() or "(none)"
|
|
||||||
blue = self._console.paint("status", "brightblue")
|
|
||||||
self._console.emit(SYSTEM, f"{blue}: mode {self.mode}, sticky {sticky}, "
|
|
||||||
f"model {cfg.stt_model}, {len(self._contexts)} contexts")
|
|
||||||
return
|
|
||||||
if arg == "cleanup":
|
|
||||||
self._do_cleanup()
|
|
||||||
return
|
|
||||||
if arg == "confirm":
|
|
||||||
blue = self._console.paint("cleanup", "brightblue")
|
|
||||||
if self._cleanup_pending:
|
|
||||||
self._run_cleanup(blue)
|
|
||||||
else:
|
|
||||||
self._console.emit(SYSTEM, f"{blue}: nothing pending to confirm")
|
|
||||||
return
|
|
||||||
self._console.emit(SYSTEM, f"unknown system command {arg!r}", "red")
|
|
||||||
|
|
||||||
def _do_cleanup(self) -> None:
|
|
||||||
"""kill detached claude-* sessions (never attached), report killed + kept.
|
|
||||||
|
|
||||||
detached-only is the safety model: a misheard voice cleanup cannot nuke the
|
|
||||||
active (attached) session. with behavior.cleanup_confirm the daemon announces
|
|
||||||
the detached set and waits for a following ``confirm`` instead of killing now.
|
|
||||||
"""
|
|
||||||
blue = self._console.paint("cleanup", "brightblue")
|
|
||||||
if self.config.cleanup_confirm:
|
|
||||||
pending = [n for n, attached in target._claude_sessions() if not attached]
|
|
||||||
if not pending:
|
|
||||||
self._console.emit(SYSTEM, f"{blue}: nothing to clean (no detached sessions)")
|
|
||||||
return
|
|
||||||
self._cleanup_pending = True
|
|
||||||
self._console.emit(SYSTEM, f"{blue}: would kill {', '.join(sorted(pending))} "
|
|
||||||
f"— say 'confirm' to proceed")
|
|
||||||
return
|
|
||||||
self._run_cleanup(blue)
|
|
||||||
|
|
||||||
def _run_cleanup(self, blue: str) -> None:
|
|
||||||
killed, kept = target.cleanup_detached()
|
|
||||||
self._cleanup_pending = False
|
|
||||||
if not killed:
|
|
||||||
self._console.emit(SYSTEM, f"{blue}: nothing to clean (no detached sessions)")
|
|
||||||
return
|
|
||||||
msg = f"{blue}: killed {', '.join(killed)}"
|
|
||||||
if kept:
|
|
||||||
msg += f"; kept {', '.join(kept)} (attached)"
|
|
||||||
self._console.emit(SYSTEM, msg)
|
|
||||||
|
|
||||||
def _timing(self) -> str:
|
|
||||||
"""compact STT latency suffix for heard lines (transcribe ms on audio secs)"""
|
|
||||||
return f"({self._last_stt_ms:.0f}ms/{self._last_audio_s:.1f}s)"
|
|
||||||
|
|
||||||
@staticmethod
|
@staticmethod
|
||||||
def _describe(action) -> str:
|
def _describe(action) -> str:
|
||||||
if action.name == "context":
|
|
||||||
name, dictation = action.arg
|
|
||||||
tail = " + dictation" if dictation else ""
|
|
||||||
return f"CONTEXT('{name}'{tail})"
|
|
||||||
if action.name == "system":
|
|
||||||
arg = action.arg
|
|
||||||
if isinstance(arg, tuple):
|
|
||||||
return f"SYSTEM({arg[0]} {arg[1]})"
|
|
||||||
return f"SYSTEM({arg})"
|
|
||||||
if action.arg is None:
|
if action.arg is None:
|
||||||
return action.name.upper()
|
return action.name.upper()
|
||||||
return f"{action.name.upper()}({action.arg})"
|
return f"{action.name.upper()}({action.arg})"
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def _emit(line: str) -> None:
|
||||||
|
"""print a recognition/action line to the watched terminal"""
|
||||||
|
print(line, flush=True)
|
||||||
|
|
||||||
def _has_wake(self, transcript: str) -> bool:
|
def _has_wake(self, transcript: str) -> bool:
|
||||||
"""true if the utterance starts with a wake phrase (listen-mode gate).
|
"""true if the utterance starts with a wake phrase (listen-mode gate).
|
||||||
|
|
||||||
@ -480,18 +209,20 @@ class Daemon:
|
|||||||
invariant: non-command speech is discarded, never recorded.
|
invariant: non-command speech is discarded, never recorded.
|
||||||
"""
|
"""
|
||||||
cfg = self.config
|
cfg = self.config
|
||||||
return grammar.strip_wake(transcript, cfg.wake_phrases,
|
return grammar.strip_wake(transcript, cfg.wake_phrases, cfg.match_threshold, True) is not None
|
||||||
cfg.wake_fuzzy_threshold, True) is not None
|
|
||||||
|
|
||||||
def _print_startup(self) -> None:
|
def _print_startup(self) -> None:
|
||||||
cfg = self.config
|
cfg = self.config
|
||||||
dev = cfg.stt_device if cfg.stt_device != "auto" else "default"
|
dev = cfg.stt_device if cfg.stt_device != "auto" else "default"
|
||||||
target_now = target.read_active() or "(none — run cc / set <name>)"
|
target_now = target.read_active() or "(none — run cc to attach)"
|
||||||
self._console.emit(SYSTEM, f"claudedo {self.mode} mode — Ctrl-C to stop", "bold")
|
self._emit("── claudedo ─────────────────────────────────")
|
||||||
self._console.emit(SYSTEM, f"model {cfg.stt_model} ({cfg.stt_language}) · mic {dev} · "
|
self._emit(f" model: {cfg.stt_model} ({cfg.stt_language})")
|
||||||
f"target {target_now} · {len(self._contexts)} contexts")
|
self._emit(f" mic: {dev}")
|
||||||
wakes = ", ".join(self._console.paint(p, "magenta") for p in cfg.wake_phrases)
|
self._emit(f" mode: {self.mode}")
|
||||||
self._console.emit(SYSTEM, f"wake: {wakes}")
|
self._emit(f" target: {target_now}")
|
||||||
|
self._emit(f" wake: {', '.join(cfg.wake_phrases)}")
|
||||||
|
self._emit(" Ctrl-C to stop")
|
||||||
|
self._emit("─────────────────────────────────────────────")
|
||||||
|
|
||||||
def _refresh_state(self) -> None:
|
def _refresh_state(self) -> None:
|
||||||
write_state(os.getpid(), self.mode, target.read_active())
|
write_state(os.getpid(), self.mode, target.read_active())
|
||||||
@ -506,25 +237,16 @@ class Daemon:
|
|||||||
self._refresh_state()
|
self._refresh_state()
|
||||||
self._print_startup()
|
self._print_startup()
|
||||||
while not self._stop:
|
while not self._stop:
|
||||||
if self._reload_pending:
|
|
||||||
self._reload_pending = False
|
|
||||||
self._do_reload("all")
|
|
||||||
audio_chunk = self._capture()
|
audio_chunk = self._capture()
|
||||||
if self._stop:
|
if self._stop:
|
||||||
break
|
break
|
||||||
if audio_chunk is None:
|
if audio_chunk is None:
|
||||||
continue
|
continue
|
||||||
t0 = time.monotonic()
|
|
||||||
transcript = self._transcriber.transcribe(audio_chunk, self.config.samplerate)
|
transcript = self._transcriber.transcribe(audio_chunk, self.config.samplerate)
|
||||||
self._last_stt_ms = (time.monotonic() - t0) * 1000.0
|
|
||||||
self._last_audio_s = audio_chunk.size / self.config.samplerate
|
|
||||||
if not transcript:
|
if not transcript:
|
||||||
continue
|
continue
|
||||||
if self.mode == "listen" and not self._has_wake(transcript):
|
if self.mode == "listen" and not self._has_wake(transcript):
|
||||||
if self.config.print_heard:
|
self._emit("dropped: non-wake speech (not recorded)")
|
||||||
self._console.emit(VOICE, f'heard (dropped) "{transcript}" {self._timing()}', "red")
|
|
||||||
else:
|
|
||||||
self._console.emit(VOICE, "dropped: non-wake speech (not recorded)", "dim")
|
|
||||||
continue
|
continue
|
||||||
self._handle(transcript)
|
self._handle(transcript)
|
||||||
finally:
|
finally:
|
||||||
|
|||||||
@ -27,86 +27,20 @@ _NUMBER_WORDS = {
|
|||||||
|
|
||||||
_INDEX_WORDS = {"1": 1, "2": 2, "3": 3, "4": 4}
|
_INDEX_WORDS = {"1": 1, "2": 2, "3": 3, "4": 4}
|
||||||
|
|
||||||
_COUNT_WORDS = {
|
|
||||||
"five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
|
|
||||||
"eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14, "fifteen": 15,
|
|
||||||
"sixteen": 16, "seventeen": 17, "eighteen": 18, "nineteen": 19, "twenty": 20,
|
|
||||||
}
|
|
||||||
|
|
||||||
_YES_VERBS = ("yes", "yeah", "yep", "yup")
|
|
||||||
_NO_VERBS = ("no", "nope", "nah")
|
|
||||||
_APPROVE_VERBS = ("approve", "allow")
|
|
||||||
_DENY_VERBS = ("deny", "reject")
|
|
||||||
_SUBMIT_VERBS = ("send", "enter", "submit")
|
|
||||||
_CANCEL_VERBS = ("cancel", "escape")
|
|
||||||
_TYPE_VERBS = ("type", "dictate", "write")
|
|
||||||
_BACKSPACE_VERBS = ("backspace", "delete")
|
|
||||||
_SPACE_VERBS = ("space", "spacebar")
|
|
||||||
_ADD_VERBS = ("add", "insert")
|
|
||||||
_ERASE_VERBS = ("erase", "clear", "wipe")
|
|
||||||
_DEBUG_VERBS = ("debug", "echo")
|
|
||||||
_MODE_VERBS = ("mode",)
|
|
||||||
_STICKY_VERBS = ("set", "sticky", "switch")
|
|
||||||
_ONESHOT_VERBS = ("target",)
|
|
||||||
_UNSET_VERBS = ("unset", "unsticky")
|
|
||||||
_LIST_VERBS = ("list", "sessions")
|
|
||||||
_COMMANDS_VERBS = ("commands", "help", "menu")
|
|
||||||
_CUSTOMS_VERBS = ("customs", "custom")
|
|
||||||
_VERSION_VERBS = ("version",)
|
|
||||||
_SELECT_VERBS = ("select", "option", "choose", "number")
|
|
||||||
_CONTEXT_VERBS = ("context", "prepare")
|
|
||||||
_RELOAD_VERBS = ("reload",)
|
|
||||||
_SYSTEM_VERBS = ("system",)
|
|
||||||
_RELOAD_SCOPES = ("config", "contexts")
|
|
||||||
_CLEANUP_VERBS = ("detached", "detach", "cleanup")
|
|
||||||
_CONFIRM_VERBS = ("confirm",)
|
|
||||||
|
|
||||||
# every command/synonym word, for biasing the STT toward the vocabulary we expect.
|
|
||||||
_COMMAND_WORDS = (
|
|
||||||
_YES_VERBS + _NO_VERBS + _APPROVE_VERBS + _DENY_VERBS + _SUBMIT_VERBS
|
|
||||||
+ _CANCEL_VERBS + _TYPE_VERBS + _BACKSPACE_VERBS + _SPACE_VERBS + _ADD_VERBS
|
|
||||||
+ _ERASE_VERBS + _DEBUG_VERBS + _MODE_VERBS + _STICKY_VERBS + _ONESHOT_VERBS + _UNSET_VERBS
|
|
||||||
+ _LIST_VERBS + _COMMANDS_VERBS + _CUSTOMS_VERBS + _VERSION_VERBS
|
|
||||||
+ _CONTEXT_VERBS + _RELOAD_VERBS + _SYSTEM_VERBS + _RELOAD_SCOPES + _CLEANUP_VERBS
|
|
||||||
+ _CONFIRM_VERBS + _SELECT_VERBS + ("ptt", "listen")
|
|
||||||
+ ("one", "two", "three", "four")
|
|
||||||
)
|
|
||||||
DEFAULT_FILLER = ("select", "use", "choose")
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
@dataclass(frozen=True)
|
||||||
class Action:
|
class Action:
|
||||||
"""a matched command: a name plus an optional argument.
|
"""a matched command: a name plus an optional argument.
|
||||||
|
|
||||||
names: yes, no, select, approve, deny, submit, type, space, backspace, erase,
|
names: yes, no, select, approve, deny, submit, type, mode, switch, cancel.
|
||||||
cancel, mode, set, unset, list, context, reload, system. arg carries the select
|
arg carries the select index (int), the literal text for ``type``, the mode for
|
||||||
index (int), the literal text for ``type``, the count for ``space``/``backspace``
|
``mode``, or the session short-name for ``switch``.
|
||||||
(int), the mode for ``mode``, the session short-name for ``set``, a
|
|
||||||
``(name, dictation)`` tuple for ``context``, the scope string for ``reload``
|
|
||||||
(``"all"``/``"config"``/``"contexts"``), or the system control for ``system``
|
|
||||||
(``"status"`` or a ``("reload", scope)`` tuple).
|
|
||||||
"""
|
"""
|
||||||
|
|
||||||
name: str
|
name: str
|
||||||
arg: object = None
|
arg: object = None
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
|
||||||
class ParsedCommand:
|
|
||||||
"""a fully parsed utterance: an optional one-shot target plus the command action.
|
|
||||||
|
|
||||||
one_shot is the session short-name from a leading ``target <name>`` (this command
|
|
||||||
only; does not change the sticky default), or None. action is the command to run,
|
|
||||||
or None if nothing matched after the wake phrase / one-shot / filler. wake is the
|
|
||||||
configured wake phrase that matched (e.g. "okay claude" for a heard "okay clouds"),
|
|
||||||
or None.
|
|
||||||
"""
|
|
||||||
|
|
||||||
one_shot: str | None
|
|
||||||
action: Action | None
|
|
||||||
wake: str | None = None
|
|
||||||
|
|
||||||
|
|
||||||
def normalize(text: str) -> str:
|
def normalize(text: str) -> str:
|
||||||
"""lowercase, strip punctuation, collapse whitespace, map number words to digits"""
|
"""lowercase, strip punctuation, collapse whitespace, map number words to digits"""
|
||||||
text = text.lower().strip()
|
text = text.lower().strip()
|
||||||
@ -118,57 +52,6 @@ def normalize(text: str) -> str:
|
|||||||
return " ".join(tokens)
|
return " ".join(tokens)
|
||||||
|
|
||||||
|
|
||||||
def vocabulary(wake_phrases: list[str]) -> list[str]:
|
|
||||||
"""the wake + command vocabulary, deduped in first-seen order.
|
|
||||||
|
|
||||||
single source for biasing the STT: the same wake phrases the matcher uses plus
|
|
||||||
every command/synonym word in _COMMAND_WORDS. no separate hardcoded copy.
|
|
||||||
"""
|
|
||||||
seen: dict[str, None] = {}
|
|
||||||
for word in list(wake_phrases) + list(_COMMAND_WORDS):
|
|
||||||
key = word.strip()
|
|
||||||
if key and key not in seen:
|
|
||||||
seen[key] = None
|
|
||||||
return list(seen)
|
|
||||||
|
|
||||||
|
|
||||||
def initial_prompt(wake_phrases: list[str]) -> str:
|
|
||||||
"""a comma-joined vocabulary string to pass faster-whisper as initial_prompt,
|
|
||||||
conditioning transcription toward the words we expect (esp. the coined wake)"""
|
|
||||||
return ", ".join(vocabulary(wake_phrases))
|
|
||||||
|
|
||||||
|
|
||||||
def command_menu() -> list[tuple[str, str]]:
|
|
||||||
"""the voice command menu as (usage, description) rows, for the `commands` cmd.
|
|
||||||
|
|
||||||
a small curated list keyed off the verb groups — the speakable command surface,
|
|
||||||
NOT the cc shell kit.
|
|
||||||
"""
|
|
||||||
return [
|
|
||||||
("yes / no", "answer a yes/no prompt"),
|
|
||||||
("one..four", "pick numbered option 1-4"),
|
|
||||||
("approve / deny", "allow / deny a permission prompt"),
|
|
||||||
("send", "submit (Enter)"),
|
|
||||||
("cancel", "back out (Escape)"),
|
|
||||||
("type <text>", "insert literal text (no submit)"),
|
|
||||||
("space [n] / add a space", "insert n spaces"),
|
|
||||||
("backspace [n]", "delete n chars (to last submit)"),
|
|
||||||
("erase", "wipe the current input"),
|
|
||||||
("debug <text>", "echo to console (no inject)"),
|
|
||||||
("set <name>", "sticky target -> claude-<name>"),
|
|
||||||
("target <name> <cmd>", "one-shot to another session"),
|
|
||||||
("unset / list", "clear sticky / list sessions"),
|
|
||||||
("mode ptt|listen", "switch input mode"),
|
|
||||||
("context <name> <text>", "inject a contexts.toml blurb + dictation (no submit)"),
|
|
||||||
("reload", "re-read config.toml + contexts.toml live"),
|
|
||||||
("system status", "print mode/target/model/contexts to the console"),
|
|
||||||
("system reload [config|contexts]", "reload one or both config files"),
|
|
||||||
("cleanup / detached", "kill detached claude-* sessions (never attached)"),
|
|
||||||
("commands / customs", "this menu / list loaded contexts"),
|
|
||||||
("version", "print the claudedo version"),
|
|
||||||
]
|
|
||||||
|
|
||||||
|
|
||||||
def _ratio(a: str, b: str) -> float:
|
def _ratio(a: str, b: str) -> float:
|
||||||
return SequenceMatcher(None, a, b).ratio()
|
return SequenceMatcher(None, a, b).ratio()
|
||||||
|
|
||||||
@ -179,15 +62,13 @@ def _wake_variants(phrase: str) -> set[str]:
|
|||||||
return {norm, norm.replace(" ", "")}
|
return {norm, norm.replace(" ", "")}
|
||||||
|
|
||||||
|
|
||||||
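The wake-stripping function that follows matches a despaced prefix of the utterance against the wake phrase, then slices the remainder on a word boundary. A minimal sketch of that idea is below. It is not the function's actual body, which is only partly visible in this diff: `strip_wake_sketch`, its single-phrase signature, and the bound on how many leading words are tried are all assumptions. Only `difflib.SequenceMatcher` is taken from the source.

```python
from difflib import SequenceMatcher


def strip_wake_sketch(words: list[str], phrase: str, threshold: float):
    """Return the words after a fuzzy wake match, or None if nothing matched."""
    target = phrase.replace(" ", "")
    best_score = 0.0
    best_take = None
    # Try the first 1..N words as the wake prefix; one extra word allows for
    # the recognizer splitting a coined word in two (assumed bound).
    max_take = min(len(words), len(phrase.split()) + 1)
    for take in range(1, max_take + 1):
        candidate = "".join(words[:take])  # despaced prefix
        score = SequenceMatcher(None, candidate, target).ratio()
        if score >= threshold and score > best_score:
            best_score, best_take = score, take
    if best_take is None:
        return None
    # Slice on a word boundary so the command is never cut mid-word.
    return words[best_take:]
```

Comparing despaced strings tolerates inconsistent splitting of the wake word, while slicing by word count keeps the command text intact.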
def strip_wake_match(transcript: str, wake_phrases: list[str], threshold: float,
|
def strip_wake(transcript: str, wake_phrases: list[str], threshold: float,
|
||||||
require_wake: bool) -> tuple[str | None, str | None]:
|
require_wake: bool) -> str | None:
|
||||||
"""return (command remainder, matched wake phrase).
|
"""return the command remainder after the wake phrase.
|
||||||
|
|
||||||
if ``require_wake`` (listen mode) and no wake phrase is found at the start, the
|
if ``require_wake`` (listen mode) and no wake phrase is found at the start,
|
||||||
remainder is None so the daemon discards the utterance. if not required (ptt
|
return None so the daemon discards the utterance. if not required (ptt mode),
|
||||||
mode), a leading wake phrase is stripped when present but its absence is fine.
|
a leading wake phrase is stripped when present but its absence is fine.
|
||||||
the matched phrase is the configured wake phrase that best matched (e.g. "okay
|
|
||||||
claude" for a heard "okay clouds"), or None when none matched.
|
|
||||||
|
|
||||||
matches leniently on a despaced prefix (whisper splits/joins the coined word
|
matches leniently on a despaced prefix (whisper splits/joins the coined word
|
||||||
inconsistently) but always slices the remainder on a WORD boundary of the
|
inconsistently) but always slices the remainder on a WORD boundary of the
|
||||||
@ -195,11 +76,10 @@ def strip_wake_match(transcript: str, wake_phrases: list[str], threshold: float,
|
|||||||
"""
|
"""
|
||||||
norm = normalize(transcript)
|
norm = normalize(transcript)
|
||||||
if not norm:
|
if not norm:
|
||||||
return (None, None) if require_wake else ("", None)
|
return None if require_wake else ""
|
||||||
words = norm.split(" ")
|
words = norm.split(" ")
|
||||||
|
|
||||||
best_remainder: str | None = None
|
best_remainder: str | None = None
|
||||||
best_phrase: str | None = None
|
|
||||||
best_score = 0.0
|
best_score = 0.0
|
||||||
for phrase in wake_phrases:
|
for phrase in wake_phrases:
|
||||||
variants = _wake_variants(phrase)
|
variants = _wake_variants(phrase)
|
||||||
@ -213,82 +93,18 @@ def strip_wake_match(transcript: str, wake_phrases: list[str], threshold: float,
|
|||||||
if score >= threshold and score > best_score:
|
if score >= threshold and score > best_score:
|
||||||
best_score = score
|
best_score = score
|
||||||
best_remainder = " ".join(words[take:]).strip()
|
best_remainder = " ".join(words[take:]).strip()
|
||||||
best_phrase = phrase
|
|
||||||
|
|
||||||
if best_remainder is not None:
|
if best_remainder is not None:
|
||||||
return best_remainder, best_phrase
|
return best_remainder
|
||||||
return (None, None) if require_wake else (norm, None)
|
return None if require_wake else norm
|
||||||
|
|
||||||
|
|
||||||
def strip_wake(transcript: str, wake_phrases: list[str], threshold: float,
|
|
||||||
require_wake: bool) -> str | None:
|
|
||||||
"""return the command remainder after the wake phrase (None if no wake in listen
|
|
||||||
mode). thin wrapper over strip_wake_match for callers that don't need the phrase"""
|
|
||||||
return strip_wake_match(transcript, wake_phrases, threshold, require_wake)[0]
|
|
||||||
|
|
||||||
|
|
||||||
def _fuzzy_in(token: str, options: tuple[str, ...], threshold: float) -> bool:
|
def _fuzzy_in(token: str, options: tuple[str, ...], threshold: float) -> bool:
|
||||||
return any(_ratio(token, opt) >= threshold for opt in options)
|
return any(_ratio(token, opt) >= threshold for opt in options)
|
||||||
|
|
||||||
|
|
||||||
def _leading_count(rest: list[str], default: int = 1) -> int:
|
|
||||||
"""read a count from the first token (digit or number word), else the default.
|
|
||||||
|
|
||||||
'backspace 3' -> 3, 'backspace ten' -> 10 (normalize maps small words to digits;
|
|
||||||
larger words come from _COUNT_WORDS), 'backspace' -> default.
|
|
||||||
"""
|
|
||||||
if not rest:
|
|
||||||
return default
|
|
||||||
tok = rest[0]
|
|
||||||
if tok.isdigit():
|
|
||||||
return max(0, int(tok))
|
|
||||||
if tok in _COUNT_WORDS:
|
|
||||||
return _COUNT_WORDS[tok]
|
|
||||||
return default
|
|
||||||
|
|
||||||
|
|
||||||
def _match_reload(rest: list[str], threshold: float, bare_default: str) -> Action:
|
|
||||||
"""map the tokens after a ``reload`` verb to a reload Action.
|
|
||||||
|
|
||||||
bare reload -> ``bare_default`` (both callers pass "all"; ``system reload`` then
|
|
||||||
wraps the result as a ``("reload", scope)`` tuple). a trailing ``config``/
|
|
||||||
``contexts`` narrows the scope; an unrecognized scope falls back to the default.
|
|
||||||
"""
|
|
||||||
scope = bare_default
|
|
||||||
if rest and _fuzzy_in(rest[0], ("config", "configuration"), threshold):
|
|
||||||
scope = "config"
|
|
||||||
elif rest and _fuzzy_in(rest[0], ("contexts", "context"), threshold):
|
|
||||||
scope = "contexts"
|
|
||||||
return Action("reload", scope)
|
|
||||||
|
|
||||||
|
|
||||||
def _match_system(rest: list[str], threshold: float) -> Action:
|
|
||||||
"""map the tokens after the reserved ``system`` word to a daemon-control Action.
|
|
||||||
|
|
||||||
the ``system`` namespace never injects into claude. v0.2.0 scope: ``status`` and
|
|
||||||
``reload [config|contexts]`` (plus ``cleanup``). unknown controls return a ``system`` Action with an
|
|
||||||
``("unknown", word)`` arg so the daemon can report it rather than silently drop.
|
|
||||||
"""
|
|
||||||
if not rest:
|
|
||||||
return Action("system", "status")
|
|
||||||
head = rest[0]
|
|
||||||
if _fuzzy_in(head, _RELOAD_VERBS, threshold):
|
|
||||||
inner = _match_reload(rest[1:], threshold, bare_default="all")
|
|
||||||
return Action("system", ("reload", inner.arg))
|
|
||||||
if _fuzzy_in(head, ("status", "state"), threshold):
|
|
||||||
return Action("system", "status")
|
|
||||||
if _fuzzy_in(head, _CLEANUP_VERBS, threshold):
|
|
||||||
return Action("system", "cleanup")
|
|
||||||
return Action("system", ("unknown", head))
|
|
||||||
|
|
||||||
|
|
||||||
def match_command(remainder: str, threshold: float) -> Action | None:
|
def match_command(remainder: str, threshold: float) -> Action | None:
|
||||||
"""map a normalized command remainder to an Action, or None if unrecognized.
|
"""map a normalized command remainder to an Action, or None if unrecognized"""
|
||||||
|
|
||||||
expects the one-shot target and any leading filler to have been stripped already
|
|
||||||
(see parse). a leading ``select``/``option``/etc. is only treated as the select
|
|
||||||
command when followed by a digit; otherwise it is filler handled upstream.
|
|
||||||
"""
|
|
||||||
remainder = remainder.strip()
|
remainder = remainder.strip()
|
||||||
if not remainder:
|
if not remainder:
|
||||||
return None
|
return None
|
||||||
@ -296,116 +112,48 @@ def match_command(remainder: str, threshold: float) -> Action | None:
|
|||||||
head = tokens[0]
|
head = tokens[0]
|
||||||
rest = tokens[1:]
|
rest = tokens[1:]
|
||||||
|
|
||||||
if _fuzzy_in(head, _SYSTEM_VERBS, threshold):
|
|
||||||
return _match_system(rest, threshold)
|
|
||||||
if _fuzzy_in(head, _CLEANUP_VERBS, threshold):
|
|
||||||
return Action("system", "cleanup")
|
|
||||||
if _fuzzy_in(head, _CONFIRM_VERBS, threshold):
|
|
||||||
return Action("system", "confirm")
|
|
||||||
if _fuzzy_in(head, _RELOAD_VERBS, threshold):
|
|
||||||
return _match_reload(rest, threshold, bare_default="all")
|
|
||||||
if _fuzzy_in(head, _CONTEXT_VERBS, threshold) and rest:
|
|
||||||
name = rest[0]
|
|
||||||
dictation = " ".join(rest[1:]).strip()
|
|
||||||
return Action("context", (name, dictation))
|
|
||||||
|
|
||||||
if head in _INDEX_WORDS:
|
if head in _INDEX_WORDS:
|
||||||
return Action("select", _INDEX_WORDS[head])
|
return Action("select", _INDEX_WORDS[head])
|
||||||
|
|
||||||
if _fuzzy_in(head, _YES_VERBS, threshold):
|
if _fuzzy_in(head, ("yes", "yeah", "yep", "yup"), threshold):
|
||||||
return Action("yes")
|
return Action("yes")
|
||||||
if _fuzzy_in(head, _NO_VERBS, threshold):
|
if _fuzzy_in(head, ("no", "nope", "nah"), threshold):
|
||||||
return Action("no")
|
return Action("no")
|
||||||
if _fuzzy_in(head, _APPROVE_VERBS, threshold):
|
if _fuzzy_in(head, ("approve", "allow"), threshold):
|
||||||
return Action("approve")
|
return Action("approve")
|
||||||
if _fuzzy_in(head, _DENY_VERBS, threshold):
|
if _fuzzy_in(head, ("deny", "reject"), threshold):
|
||||||
return Action("deny")
|
return Action("deny")
|
||||||
if _fuzzy_in(head, _SUBMIT_VERBS, threshold):
|
if _fuzzy_in(head, ("send", "enter", "submit"), threshold):
|
||||||
return Action("submit")
|
return Action("submit")
|
||||||
if _fuzzy_in(head, _CANCEL_VERBS, threshold):
|
if _fuzzy_in(head, ("cancel", "escape", "stop"), threshold):
|
||||||
return Action("cancel")
|
return Action("cancel")
|
||||||
|
|
||||||
if _fuzzy_in(head, _SELECT_VERBS, threshold) and rest and rest[0] in _INDEX_WORDS:
|
if _fuzzy_in(head, ("select", "option", "choose", "number"), threshold) and rest:
|
||||||
|
if rest[0] in _INDEX_WORDS:
|
||||||
return Action("select", _INDEX_WORDS[rest[0]])
|
return Action("select", _INDEX_WORDS[rest[0]])
|
||||||
|
|
||||||
if _fuzzy_in(head, _TYPE_VERBS, threshold):
|
if _fuzzy_in(head, ("type", "dictate", "write"), threshold):
|
||||||
text = " ".join(rest).strip()
|
text = " ".join(rest).strip()
|
||||||
return Action("type", text) if text else None
|
return Action("type", text) if text else None
|
||||||
|
|
||||||
if _fuzzy_in(head, _BACKSPACE_VERBS, threshold):
|
if _fuzzy_in(head, ("mode",), threshold) and rest:
|
||||||
return Action("backspace", _leading_count(rest, default=1))
|
|
||||||
if _fuzzy_in(head, _SPACE_VERBS, threshold):
|
|
||||||
return Action("space", _leading_count(rest, default=1))
|
|
||||||
if _fuzzy_in(head, _ADD_VERBS, threshold) and rest:
|
|
||||||
tail = [t for t in rest if t not in ("a", "an")]
|
|
||||||
if any(_fuzzy_in(t, ("space", "spaces"), threshold) for t in tail):
|
|
||||||
count = next((int(t) for t in tail if t.isdigit()),
|
|
||||||
next((_COUNT_WORDS[t] for t in tail if t in _COUNT_WORDS), 1))
|
|
||||||
return Action("space", count)
|
|
||||||
if _fuzzy_in(head, _ERASE_VERBS, threshold):
|
|
||||||
return Action("erase")
|
|
||||||
if _fuzzy_in(head, _DEBUG_VERBS, threshold):
|
|
||||||
return Action("debug", " ".join(rest).strip())
|
|
||||||
|
|
||||||
if _fuzzy_in(head, _MODE_VERBS, threshold) and rest:
|
|
||||||
if _fuzzy_in(rest[0], ("ptt",), threshold) or "push" in rest[0]:
|
if _fuzzy_in(rest[0], ("ptt",), threshold) or "push" in rest[0]:
|
||||||
return Action("mode", "ptt")
|
return Action("mode", "ptt")
|
||||||
if _fuzzy_in(rest[0], ("listen",), threshold):
|
if _fuzzy_in(rest[0], ("listen",), threshold):
|
||||||
return Action("mode", "listen")
|
return Action("mode", "listen")
|
||||||
return None
|
return None
|
||||||
|
|
||||||
if _fuzzy_in(head, _STICKY_VERBS, threshold) and rest:
|
if _fuzzy_in(head, ("switch", "target"), threshold) and rest:
|
||||||
name = "".join(rest)
|
name = "".join(rest)
|
||||||
return Action("set", name) if name else None
|
return Action("switch", name) if name else None
|
||||||
if _fuzzy_in(head, _UNSET_VERBS, threshold) and not rest:
|
|
||||||
return Action("unset")
|
|
||||||
if _fuzzy_in(head, _CUSTOMS_VERBS, threshold):
|
|
||||||
return Action("customs")
|
|
||||||
if _fuzzy_in(head, _COMMANDS_VERBS, threshold):
|
|
||||||
return Action("commands")
|
|
||||||
if _fuzzy_in(head, _LIST_VERBS, threshold):
|
|
||||||
return Action("list")
|
|
||||||
if _fuzzy_in(head, _VERSION_VERBS, threshold):
|
|
||||||
return Action("version")
|
|
||||||
|
|
||||||
return None
|
return None
|
||||||
|
|
||||||
|
|
||||||
def _strip_filler(tokens: list[str], filler: tuple[str, ...], threshold: float) -> list[str]:
|
def parse(transcript: str, wake_phrases: list[str], threshold: float,
|
||||||
"""drop leading optional filler words (e.g. select/use/choose) before a command.
|
require_wake: bool) -> Action | None:
|
||||||
|
"""full parse: wake gate then command match. None means discard"""
|
||||||
a filler word that is followed by a digit is NOT dropped — that is the select
|
remainder = strip_wake(transcript, wake_phrases, threshold, require_wake)
|
||||||
command (``select 1``), handled by match_command.
|
|
||||||
"""
|
|
||||||
while tokens and _fuzzy_in(tokens[0], filler, threshold):
|
|
||||||
if len(tokens) > 1 and tokens[1] in _INDEX_WORDS:
|
|
||||||
break
|
|
||||||
tokens = tokens[1:]
|
|
||||||
return tokens
|
|
||||||
|
|
||||||
|
|
||||||
def parse(transcript: str, wake_phrases: list[str], wake_threshold: float,
|
|
||||||
command_threshold: float, require_wake: bool,
|
|
||||||
filler: tuple[str, ...] = DEFAULT_FILLER) -> ParsedCommand | None:
|
|
||||||
"""full parse: wake gate -> optional one-shot target -> filler -> command.
|
|
||||||
|
|
||||||
wake_threshold gates the wake phrase (lenient — a false wake is cheap, it just
|
|
||||||
finds no command); command_threshold gates the command words (stricter — a false
|
|
||||||
command fires the wrong action). returns a ParsedCommand (one_shot, action), or
|
|
||||||
None if the wake gate dropped the utterance (listen mode, no wake phrase). a
|
|
||||||
ParsedCommand with action=None means a wake phrase was present but no command
|
|
||||||
matched.
|
|
||||||
"""
|
|
||||||
remainder, wake = strip_wake_match(transcript, wake_phrases, wake_threshold, require_wake)
|
|
||||||
if remainder is None:
|
if remainder is None:
|
||||||
return None
|
return None
|
||||||
|
return match_command(remainder, threshold)
|
||||||
tokens = remainder.split(" ") if remainder else []
|
|
||||||
one_shot: str | None = None
|
|
||||||
if tokens and _fuzzy_in(tokens[0], _ONESHOT_VERBS, command_threshold) and len(tokens) >= 2:
|
|
||||||
one_shot = tokens[1]
|
|
||||||
tokens = tokens[2:]
|
|
||||||
|
|
||||||
tokens = _strip_filler(tokens, filler, command_threshold)
|
|
||||||
action = match_command(" ".join(tokens), command_threshold)
|
|
||||||
return ParsedCommand(one_shot=one_shot, action=action, wake=wake)
|
|
||||||
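The wake gate in this file matches leniently on a despaced prefix but always slices the remainder on a word boundary. The following is a minimal sketch of that idea, not the project's implementation: `normalize`, `MAX_WAKE_WORDS`, and the name `strip_wake_sketch` are hypothetical stand-ins, and the scoring uses `difflib.SequenceMatcher` as `_ratio` does.

```python
from __future__ import annotations

from difflib import SequenceMatcher

MAX_WAKE_WORDS = 4  # assumption: how many leading words may belong to the wake phrase


def normalize(text: str) -> str:
    """hypothetical stand-in: lowercase, punctuation to spaces, collapse whitespace"""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in text.lower())
    return " ".join(kept.split())


def strip_wake_sketch(transcript: str, wake_phrases: list[str], threshold: float,
                      require_wake: bool) -> tuple[str | None, str | None]:
    """return (remainder, matched phrase); remainder is None when the gate drops it"""
    norm = normalize(transcript)
    if not norm:
        return (None, None) if require_wake else ("", None)
    words = norm.split(" ")

    best_remainder = None
    best_phrase = None
    best_score = 0.0
    for phrase in wake_phrases:
        # compare despaced strings so "okayclaude" and "okay claude" score alike
        target = normalize(phrase).replace(" ", "")
        for take in range(1, min(len(words), MAX_WAKE_WORDS) + 1):
            heard = "".join(words[:take])
            score = SequenceMatcher(None, heard, target).ratio()
            if score >= threshold and score > best_score:
                best_score = score
                # slice on the word boundary, never inside a word
                best_remainder = " ".join(words[take:])
                best_phrase = phrase

    if best_remainder is not None:
        return best_remainder, best_phrase
    return (None, None) if require_wake else (norm, None)
```

With `require_wake=True` (listen mode) an utterance without a wake phrase comes back as `(None, None)` and is discarded; with `require_wake=False` (ptt mode) the whole normalized transcript is passed through as the remainder.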
|
|||||||
@ -45,18 +45,11 @@ class OutputHandler(ABC):
|
|||||||
def send_literal(self, session: str, text: str) -> None:
|
def send_literal(self, session: str, text: str) -> None:
|
||||||
"""emit literal text into the input box without submitting (``type``)"""
|
"""emit literal text into the input box without submitting (``type``)"""
|
||||||
|
|
||||||
def send_repeat(self, session: str, token: str, count: int) -> None:
|
|
||||||
"""emit a named key `count` times (e.g. BSpace x n). default impl loops."""
|
|
||||||
if count <= 0:
|
|
||||||
return
|
|
||||||
self.send_named(session, [token] * count)
|
|
||||||
|
|
||||||
def perform(self, session: str, action) -> bool:
|
def perform(self, session: str, action) -> bool:
|
||||||
"""resolve a grammar.Action to keystrokes and emit them. returns acted?.
|
"""resolve a grammar.Action to keystrokes and emit them. returns acted?.
|
||||||
|
|
||||||
``switch``/``set``/``mode`` etc. are handled by the daemon (they change daemon
|
``switch`` and ``mode`` are handled by the daemon (they change daemon state,
|
||||||
state, not the claude session), so they are ignored here. ``erase`` arrives
|
not the claude session), so they are ignored here.
|
||||||
with action.arg already set to the count the daemon wants backspaced.
|
|
||||||
"""
|
"""
|
||||||
name = action.name
|
name = action.name
|
||||||
if name == "yes":
|
if name == "yes":
|
||||||
@ -79,10 +72,6 @@ class OutputHandler(ABC):
|
|||||||
self.send_named(session, seq)
|
self.send_named(session, seq)
|
||||||
elif name == "type":
|
elif name == "type":
|
||||||
self.send_literal(session, str(action.arg))
|
self.send_literal(session, str(action.arg))
|
||||||
elif name == "space":
|
|
||||||
self.send_literal(session, " " * int(action.arg))
|
|
||||||
elif name in ("backspace", "erase"):
|
|
||||||
self.send_repeat(session, keys.BACKSPACE[0], int(action.arg))
|
|
||||||
else:
|
else:
|
||||||
return False
|
return False
|
||||||
return True
|
return True
|
||||||
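The default `send_repeat` simply expands a count into a list of identical key names and hands it to `send_named`. A small sketch of that behavior follows, assuming a cut-down `OutputHandler` and a hypothetical `RecordingHandler` that records the `tmux send-keys` argument list instead of running it; the `send_named` signature is inferred from the call sites in the diff.

```python
from abc import ABC, abstractmethod


class OutputHandler(ABC):
    """cut-down stand-in for the handler in the diff: only the two methods needed here"""

    @abstractmethod
    def send_named(self, session: str, tokens: list[str]) -> None:
        """emit a sequence of named keys"""

    def send_repeat(self, session: str, token: str, count: int) -> None:
        # same default as the diff: a non-positive count is a no-op
        if count <= 0:
            return
        self.send_named(session, [token] * count)


class RecordingHandler(OutputHandler):
    """hypothetical test double: records the tmux argv instead of executing it"""

    def __init__(self) -> None:
        self.calls: list[list[str]] = []

    def send_named(self, session: str, tokens: list[str]) -> None:
        self.calls.append(["tmux", "send-keys", "-t", session, *tokens])


handler = RecordingHandler()
handler.send_repeat("claude-libs", "BSpace", 3)  # e.g. `backspace 3`
handler.send_repeat("claude-libs", "BSpace", 0)  # no-op, nothing recorded
print(handler.calls)
```

Keeping the loop in the base class means a concrete handler only has to implement `send_named`; a backend with a cheaper native repeat could override `send_repeat` instead.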
|
|||||||
@ -37,19 +37,6 @@ DENY = ["3"]
|
|||||||
SUBMIT = ["Enter"]
|
SUBMIT = ["Enter"]
|
||||||
CANCEL = ["Escape"]
|
CANCEL = ["Escape"]
|
||||||
|
|
||||||
# NEWLINE is a soft newline inside the input box that does NOT submit — Shift+Enter,
|
|
||||||
# which tmux names ``S-Enter`` (requires the extended-keys / xterm extkeys tmux
|
|
||||||
# settings install.sh appends). used to separate a context blurb from the dictated
|
|
||||||
# instruction in multiline assembly; if it proves flaky the daemon flattens to one
|
|
||||||
# line with a separator instead (behavior.context_multiline = false).
|
|
||||||
NEWLINE = ["S-Enter"]
|
|
||||||
|
|
||||||
# BACKSPACE deletes one char left; SPACE inserts one literal space. both are emitted
|
|
||||||
# repeatedly for `backspace <n>` / `space <n>` and for `erase` (n = the daemon's
|
|
||||||
# tracked uncommitted-input count). BSpace is tmux's name for the backspace key.
|
|
||||||
BACKSPACE = ["BSpace"]
|
|
||||||
SPACE = [" "]
|
|
||||||
|
|
||||||
SELECT_BY_INDEX = {
|
SELECT_BY_INDEX = {
|
||||||
1: SELECT_1,
|
1: SELECT_1,
|
||||||
2: SELECT_2,
|
2: SELECT_2,
|
||||||
|
|||||||
@ -1,91 +0,0 @@
|
|||||||
"""earcons — short confirmation tones on daemon events, the eyes-free feedback layer.
|
|
||||||
|
|
||||||
the single place that maps an event name to its tone file and the per-event enable
|
|
||||||
flag. additive to the console feed (it does not replace the printed lines): at the desk
|
|
||||||
mute tones and read; eyes-free, hear them. playback goes through audio_out (paplay-first,
|
|
||||||
fire-and-forget) so a dead speaker never blocks or breaks a command.
|
|
||||||
|
|
||||||
events:
|
|
||||||
wake — a wake phrase was recognized (off by default — a blip right before you
|
|
||||||
speak the command can bleed into its capture; keep it off unless wanted)
|
|
||||||
accept — a command was recognized/injected
|
|
||||||
no_match — nothing matched, or the target was missing (did nothing)
|
|
||||||
submit — a send/submit was injected
|
|
||||||
|
|
||||||
tone files live in the packaged sounds/ dir; a per-event config override may point at a
|
|
||||||
user file instead. a missing file is swallowed by audio_out (logged once), never raised.
|
|
||||||
"""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
import logging
|
|
||||||
from pathlib import Path
|
|
||||||
|
|
||||||
from . import audio_out
|
|
||||||
from .config import Config
|
|
||||||
|
|
||||||
log = logging.getLogger(__name__)
|
|
||||||
|
|
||||||
_SOUNDS_DIR = Path(__file__).resolve().parent / "sounds"
|
|
||||||
|
|
||||||
_EVENT_FILES = {
|
|
||||||
"wake": "wake.wav",
|
|
||||||
"accept": "accepted.wav",
|
|
||||||
"no_match": "no_match.wav",
|
|
||||||
"submit": "sent.wav",
|
|
||||||
}
|
|
||||||
|
|
||||||
_EVENT_FLAGS = {
|
|
||||||
"wake": "on_wake",
|
|
||||||
"accept": "on_accept",
|
|
||||||
"no_match": "on_no_match",
|
|
||||||
"submit": "on_submit",
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
class Earcons:
|
|
||||||
"""resolves daemon events to tones and plays them per the [sound] config"""
|
|
||||||
|
|
||||||
def __init__(self, config: Config) -> None:
|
|
||||||
self._apply(config)
|
|
||||||
|
|
||||||
def update(self, config: Config) -> None:
|
|
||||||
"""re-read the [sound] config after a live reload"""
|
|
||||||
self._apply(config)
|
|
||||||
|
|
||||||
def _apply(self, config: Config) -> None:
|
|
||||||
self.enabled = config.sound_enabled
|
|
||||||
self.volume = config.sound_volume
|
|
||||||
self._flags = {
|
|
||||||
"wake": config.sound_on_wake,
|
|
||||||
"accept": config.sound_on_accept,
|
|
||||||
"no_match": config.sound_on_no_match,
|
|
||||||
"submit": config.sound_on_submit,
|
|
||||||
}
|
|
||||||
self._overrides = dict(config.sound_files)
|
|
||||||
|
|
||||||
def _resolve(self, event: str) -> Path | None:
|
|
||||||
override = self._overrides.get(event) or self._overrides.get(_EVENT_FLAGS.get(event, ""))
|
|
||||||
if override:
|
|
||||||
return Path(override).expanduser()
|
|
||||||
name = _EVENT_FILES.get(event)
|
|
||||||
return _SOUNDS_DIR / name if name else None
|
|
||||||
|
|
||||||
def play(self, event: str) -> None:
|
|
||||||
"""play the tone for an event if enabled (master + per-event). fire-and-forget;
|
|
||||||
unknown/disabled events and missing files are silently no-ops."""
|
|
||||||
if not self.enabled or not self._flags.get(event, False):
|
|
||||||
return
|
|
||||||
path = self._resolve(event)
|
|
||||||
if path is None:
|
|
||||||
return
|
|
||||||
audio_out.play(path, volume=self.volume, blocking=False)
|
|
||||||
|
|
||||||
def tone_path(self, event: str) -> Path | None:
|
|
||||||
"""the resolved tone path for an event (for test-tone), ignoring enable flags"""
|
|
||||||
return self._resolve(event)
|
|
||||||
|
|
||||||
|
|
||||||
def event_names() -> list[str]:
|
|
||||||
"""the earcon event names in a stable order (for test-tone iteration)"""
|
|
||||||
return ["wake", "accept", "no_match", "submit"]
|
|
||||||
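The earcon lookup has two independent decisions: whether to play at all (master switch plus per-event flag) and which file to play (user override first, packaged tone second). A minimal sketch of that split, assuming a hypothetical `SoundConfig` dataclass in place of the real `Config` and a placeholder `SOUNDS_DIR`:

```python
from __future__ import annotations

from dataclasses import dataclass, field
from pathlib import Path

SOUNDS_DIR = Path("sounds")  # placeholder for the packaged sounds/ directory
PACKAGED = {"wake": "wake.wav", "accept": "accepted.wav",
            "no_match": "no_match.wav", "submit": "sent.wav"}


@dataclass
class SoundConfig:
    """hypothetical stand-in for the [sound] part of the real Config"""
    enabled: bool = True
    flags: dict[str, bool] = field(default_factory=dict)
    overrides: dict[str, str] = field(default_factory=dict)


def should_play(cfg: SoundConfig, event: str) -> bool:
    # master switch AND per-event flag; unknown events default to off
    return cfg.enabled and cfg.flags.get(event, False)


def resolve(cfg: SoundConfig, event: str) -> Path | None:
    # a user override wins; otherwise fall back to the packaged tone, if any
    override = cfg.overrides.get(event)
    if override:
        return Path(override).expanduser()
    name = PACKAGED.get(event)
    return SOUNDS_DIR / name if name else None
```

Separating the two checks is what lets a test-tone command resolve a path while ignoring the enable flags.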
@ -1 +0,0 @@
|
|||||||
"""earcon tone assets (committed .wav files) + their generator (generate.py)"""
|
|
||||||
Binary file not shown.
@ -1,75 +0,0 @@
|
|||||||
"""synthetic-beep FALLBACK generator for the earcon .wav tones.
|
|
||||||
|
|
||||||
WARNING: the shipped tones in this directory are now CUSTOM CURATED recordings
|
|
||||||
(edge-trimmed + loudness-normalized to ~-16 dB RMS with a -1 dBTP ceiling), NOT this
|
|
||||||
script's output. running this script OVERWRITES those real tones with plain synthetic
|
|
||||||
beeps — only do so if you deliberately want to fall back to generated placeholders. it
|
|
||||||
is kept as a bootstrap fallback so the package can always self-generate a tone set (the
|
|
||||||
"a missing tone must never break a command" guarantee), not as the source of the
|
|
||||||
committed wavs.
|
|
||||||
|
|
||||||
run ``python -m claudedo.sounds.generate`` (or ``python generate.py`` from this dir) to
|
|
||||||
write placeholder beeps. each is a short, quiet, fade-enveloped sine at a
|
|
||||||
distinct pitch so the four events are ear-distinguishable:
|
|
||||||
|
|
||||||
wake — soft single mid blip (off by default; least intrusive)
|
|
||||||
accepted — bright single high note (heard you, sent it)
|
|
||||||
no_match — low two-note falling buzz (heard you, but nothing matched / error)
|
|
||||||
sent — two-note rising chime (submitted to claude)
|
|
||||||
|
|
||||||
kept SHORT (<300ms) and quiet (amplitude 0.4) — confirmations, not alarms.
|
|
||||||
"""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
import struct
|
|
||||||
import wave
|
|
||||||
from pathlib import Path
|
|
||||||
|
|
||||||
SAMPLE_RATE = 44100
|
|
||||||
AMPLITUDE = 0.4
|
|
||||||
|
|
||||||
HERE = Path(__file__).resolve().parent
|
|
||||||
|
|
||||||
|
|
||||||
def _tone(freq: float, dur: float) -> list[float]:
|
|
||||||
import math
|
|
||||||
|
|
||||||
n = int(SAMPLE_RATE * dur)
|
|
||||||
fade = max(1, int(SAMPLE_RATE * 0.01))
|
|
||||||
out = []
|
|
||||||
for i in range(n):
|
|
||||||
env = min(1.0, i / fade, (n - i) / fade)
|
|
||||||
out.append(math.sin(2.0 * math.pi * freq * (i / SAMPLE_RATE)) * env * AMPLITUDE)
|
|
||||||
return out
|
|
||||||
|
|
||||||
|
|
||||||
def _silence(dur: float) -> list[float]:
|
|
||||||
return [0.0] * int(SAMPLE_RATE * dur)
|
|
||||||
|
|
||||||
|
|
||||||
def _write(name: str, samples: list[float]) -> Path:
|
|
||||||
path = HERE / name
|
|
||||||
with wave.open(str(path), "wb") as wf:
|
|
||||||
wf.setnchannels(1)
|
|
||||||
wf.setsampwidth(2)
|
|
||||||
wf.setframerate(SAMPLE_RATE)
|
|
||||||
clipped = (max(-1.0, min(1.0, s)) for s in samples)
|
|
||||||
wf.writeframes(b"".join(struct.pack("<h", int(s * 32767)) for s in clipped))
|
|
||||||
return path
|
|
||||||
|
|
||||||
|
|
||||||
def generate() -> list[Path]:
|
|
||||||
"""(re)write all earcon wavs; return the written paths"""
|
|
||||||
tones = {
|
|
||||||
"wake.wav": _tone(660.0, 0.12),
|
|
||||||
"accepted.wav": _tone(988.0, 0.14),
|
|
||||||
"no_match.wav": _tone(330.0, 0.10) + _silence(0.03) + _tone(247.0, 0.12),
|
|
||||||
"sent.wav": _tone(784.0, 0.10) + _silence(0.02) + _tone(1175.0, 0.12),
|
|
||||||
}
|
|
||||||
return [_write(name, samples) for name, samples in tones.items()]
|
|
||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
|
||||||
for p in generate():
|
|
||||||
print(f"wrote {p}")
|
|
||||||
Binary file not shown.
Binary file not shown.
Binary file not shown.
@ -6,106 +6,32 @@ short in-memory chunk; nothing is written to disk or sent anywhere.
|
|||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import contextlib
|
|
||||||
import logging
|
import logging
|
||||||
import os
|
|
||||||
import re
|
|
||||||
import sys
|
|
||||||
|
|
||||||
import numpy as np
|
import numpy as np
|
||||||
|
|
||||||
log = logging.getLogger(__name__)
|
log = logging.getLogger(__name__)
|
||||||
|
|
||||||
_NOISE = re.compile(r"GPU device discovery failed|device_discovery\.cc|DiscoverDevicesForPlatform")
|
|
||||||
|
|
||||||
|
|
||||||
def _quiet_backends() -> None:
|
|
||||||
"""quiet onnxruntime/ctranslate2 chatter and the faster_whisper INFO log.
|
|
||||||
|
|
||||||
faster-whisper's VAD loads an onnx model whose device discovery prints a noisy
|
|
||||||
'GPU device discovery failed' warning on headless/WSL hosts with no GPU sysfs.
|
|
||||||
the env var + logger severity stop most onnx logging; the warning itself is
|
|
||||||
emitted at C++ init and is filtered out of stderr by _filter_stderr().
|
|
||||||
"""
|
|
||||||
os.environ.setdefault("ORT_LOGGING_LEVEL", "3")
|
|
||||||
os.environ.setdefault("OMP_NUM_THREADS", "4")
|
|
||||||
logging.getLogger("faster_whisper").setLevel(logging.WARNING)
|
|
||||||
try:
|
|
||||||
import onnxruntime
|
|
||||||
onnxruntime.set_default_logger_severity(3)
|
|
||||||
except Exception:
|
|
||||||
pass
|
|
||||||
|
|
||||||
|
|
||||||
@contextlib.contextmanager
|
|
||||||
def _filter_stderr():
|
|
||||||
"""drop onnxruntime's GPU-discovery warning lines from stderr for this block.
|
|
||||||
|
|
||||||
a pipe temporarily replaces fd 2; a pump thread forwards every line to the real
|
|
||||||
stderr EXCEPT the known GPU-discovery noise, so real errors still surface. the
|
|
||||||
original fd is always restored on exit.
|
|
||||||
"""
|
|
||||||
import threading
|
|
||||||
|
|
||||||
try:
|
|
||||||
stderr_fd = sys.stderr.fileno()
|
|
||||||
except (AttributeError, OSError):
|
|
||||||
yield
|
|
||||||
return
|
|
||||||
|
|
||||||
saved_fd = os.dup(stderr_fd)
|
|
||||||
read_fd, write_fd = os.pipe()
|
|
||||||
os.dup2(write_fd, stderr_fd)
|
|
||||||
os.close(write_fd)
|
|
||||||
|
|
||||||
def pump():
|
|
||||||
with os.fdopen(read_fd, "rb") as reader, os.fdopen(saved_fd, "wb", closefd=False) as out:
|
|
||||||
for line in reader:
|
|
||||||
if not _NOISE.search(line.decode("utf-8", "replace")):
|
|
||||||
out.write(line)
|
|
||||||
out.flush()
|
|
||||||
|
|
||||||
thread = threading.Thread(target=pump, daemon=True)
|
|
||||||
thread.start()
|
|
||||||
try:
|
|
||||||
yield
|
|
||||||
finally:
|
|
||||||
import time
|
|
||||||
|
|
||||||
time.sleep(0.05)
|
|
||||||
os.dup2(saved_fd, stderr_fd)
|
|
||||||
os.close(saved_fd)
|
|
||||||
thread.join(timeout=1.0)
|
|
||||||
|
|
||||||
|
|
||||||
class Transcriber:
|
class Transcriber:
|
||||||
"""a loaded faster-whisper model that transcribes float32 mono audio chunks"""
|
"""a loaded faster-whisper model that transcribes float32 mono audio chunks"""
|
||||||
|
|
||||||
def __init__(self, model: str = "small", language: str = "en", device: str = "auto",
|
def __init__(self, model: str = "small", language: str = "en", device: str = "auto",
|
||||||
compute_type: str = "auto", initial_prompt: str | None = None) -> None:
|
compute_type: str = "auto") -> None:
|
||||||
self.language = language
|
self.language = language
|
||||||
self.initial_prompt = initial_prompt
|
|
||||||
self._model = self._load(model, device, compute_type)
|
self._model = self._load(model, device, compute_type)
|
||||||
self._warm()
|
|
||||||
|
|
||||||
@staticmethod
|
@staticmethod
|
||||||
def _load(model: str, device: str, compute_type: str):
|
def _load(model: str, device: str, compute_type: str):
|
||||||
|
from faster_whisper import WhisperModel
|
||||||
|
|
||||||
if device == "auto":
|
if device == "auto":
|
||||||
device = "cpu"
|
device = "cpu"
|
||||||
if compute_type == "auto":
|
if compute_type == "auto":
|
||||||
compute_type = "int8" if device == "cpu" else "float16"
|
compute_type = "int8" if device == "cpu" else "float16"
|
||||||
log.info("loading faster-whisper model=%s device=%s compute=%s", model, device, compute_type)
|
log.info("loading faster-whisper model=%s device=%s compute=%s", model, device, compute_type)
|
||||||
with _filter_stderr():
|
|
||||||
_quiet_backends()
|
|
||||||
from faster_whisper import WhisperModel
|
|
||||||
return WhisperModel(model, device=device, compute_type=compute_type)
|
return WhisperModel(model, device=device, compute_type=compute_type)
|
||||||
|
|
||||||
def _warm(self) -> None:
|
|
||||||
"""run one throwaway transcribe so the VAD onnx session inits now, under the
|
|
||||||
stderr filter — the GPU-discovery warning fires here once, not in the loop"""
|
|
||||||
with _filter_stderr():
|
|
||||||
list(self._model.transcribe(np.zeros(1600, dtype=np.float32), vad_filter=True)[0])
|
|
||||||
|
|
||||||
def transcribe(self, audio: np.ndarray, samplerate: int = 16000) -> str:
|
def transcribe(self, audio: np.ndarray, samplerate: int = 16000) -> str:
|
||||||
"""transcribe a mono float32 numpy array to a stripped text string.
|
"""transcribe a mono float32 numpy array to a stripped text string.
|
||||||
|
|
||||||
@ -121,7 +47,6 @@ class Transcriber:
|
|||||||
beam_size=1,
|
beam_size=1,
|
||||||
vad_filter=True,
|
vad_filter=True,
|
||||||
condition_on_previous_text=False,
|
condition_on_previous_text=False,
|
||||||
initial_prompt=self.initial_prompt,
|
|
||||||
)
|
)
|
||||||
text = " ".join(seg.text for seg in segments).strip()
|
text = " ".join(seg.text for seg in segments).strip()
|
||||||
return text
|
return text
|
||||||
|
|||||||
@ -18,9 +18,8 @@ def session_name(name: str) -> str:
|
|||||||
|
|
||||||
single source of truth for the name->session mapping. the shell cc kit
|
single source of truth for the name->session mapping. the shell cc kit
|
||||||
(~/.config/claudedo/cc.sh) mirrors this exactly, so ``cc libs`` and the voice
|
(~/.config/claudedo/cc.sh) mirrors this exactly, so ``cc libs`` and the voice
|
||||||
commands ``set libs`` (sticky) / ``target libs`` (one-shot) all resolve to
|
commands ``switch libs`` / ``target libs`` all resolve to ``claude-libs``. an
|
||||||
``claude-libs``. an already-prefixed name is returned unchanged so callers can
|
already-prefixed name is returned unchanged so callers can pass either form.
|
||||||
pass either form.
|
|
||||||
"""
|
"""
|
||||||
name = name.strip()
|
name = name.strip()
|
||||||
return name if name.startswith(SESSION_PREFIX) else f"{SESSION_PREFIX}{name}"
|
return name if name.startswith(SESSION_PREFIX) else f"{SESSION_PREFIX}{name}"
|
||||||
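The name-to-session mapping is idempotent: a short name gains the prefix, and an already-prefixed name passes through unchanged. A short sketch of that property, assuming `SESSION_PREFIX` is `"claude-"` (inferred from the `claude-libs` example in the docstring, not shown in this hunk):

```python
SESSION_PREFIX = "claude-"  # assumed value, inferred from the ``claude-libs`` example


def session_name(name: str) -> str:
    """map a project short-name to its tmux session name; prefixed names pass through"""
    name = name.strip()
    return name if name.startswith(SESSION_PREFIX) else f"{SESSION_PREFIX}{name}"


for spoken in ("libs", " claude-libs ", "claude-libs"):
    resolved = session_name(spoken)
    # applying the mapping twice gives the same result as applying it once
    print(repr(spoken), "->", resolved, session_name(resolved) == resolved)
```

Because the mapping is idempotent, callers can pass either the short name or the full session name without checking which form they hold.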
@ -39,28 +38,18 @@ def read_active() -> str | None:
|
|||||||
|
|
||||||
|
|
||||||
def write_active(name: str) -> None:
|
def write_active(name: str) -> None:
|
||||||
"""overwrite ~/.claude-active with a session name (the sticky default)"""
|
"""overwrite ~/.claude-active with a session name (used by ``switch``)"""
|
||||||
ACTIVE_FILE.write_text(name + "\n", encoding="utf-8")
|
ACTIVE_FILE.write_text(name + "\n", encoding="utf-8")
|
||||||
|
|
||||||
|
|
||||||
def set_target(name: str) -> str:
|
def set_target(name: str) -> str:
|
||||||
"""map a project short-name via session_name() and persist it as the sticky
|
"""map a project short-name via session_name() and persist it. returns the
|
||||||
default. returns the resolved session name."""
|
resolved session name."""
|
||||||
session = session_name(name)
|
session = session_name(name)
|
||||||
write_active(session)
|
write_active(session)
|
||||||
return session
|
return session
|
||||||
|
|
||||||
|
|
||||||
def unset_target() -> None:
|
|
||||||
"""clear the sticky default (empty/remove ~/.claude-active)"""
|
|
||||||
try:
|
|
||||||
ACTIVE_FILE.unlink()
|
|
||||||
except FileNotFoundError:
|
|
||||||
pass
|
|
||||||
except OSError as exc:
|
|
||||||
log.warning("could not clear %s: %s", ACTIVE_FILE, exc)
|
|
||||||
|
|
||||||
|
|
||||||
def session_exists(name: str) -> bool:
|
def session_exists(name: str) -> bool:
|
||||||
"""true if a tmux session with this name currently exists"""
|
"""true if a tmux session with this name currently exists"""
|
||||||
if not name:
|
if not name:
|
||||||
@ -73,91 +62,24 @@ def session_exists(name: str) -> bool:
|
|||||||
return result.returncode == 0
|
return result.returncode == 0
|
||||||
|
|
||||||
|
|
||||||
def list_sessions() -> list[str]:
|
def resolve_target() -> str | None:
|
||||||
"""return the names of all running claude-* tmux sessions (sorted)"""
|
"""return the active session name only if it exists; else log and return None.
|
||||||
return sorted(name for name, _attached in _claude_sessions())
|
|
||||||
|
|
||||||
|
never guesses a target: on a missing/empty ~/.claude-active or a stale session
|
||||||
|
name, this logs a clear warning and returns None so the caller injects nothing.
|
||||||
|
|
||||||
def _claude_sessions() -> list[tuple[str, bool]]:
|
TODO: most-recently-active targeting (preferred over attached). today the target
|
||||||
"""the single tmux query for claude-* sessions: (name, attached) pairs.
|
is the project most recently ATTACHED to (the cc kit writes ~/.claude-active on
|
||||||
|
attach); upgrade to the session claude most recently asked a question in, via
|
||||||
one source of truth for session enumeration — list_sessions() and the detached
|
tmux session_activity timestamps (list-sessions -F '#{session_name}
|
||||||
cleanup both build on this. attached is True when at least one client is attached
|
#{session_activity}', pick the highest-activity claude-* session) or by scraping
|
||||||
(tmux #{session_attached} > 0). returns [] if tmux isn't reachable.
|
panes (capture-pane) for a waiting-prompt UI.
|
||||||
"""
|
"""
|
||||||
result = subprocess.run(
|
name = read_active()
|
||||||
["tmux", "list-sessions", "-F", "#{session_name} #{session_attached}"],
|
if not name:
|
||||||
stdout=subprocess.PIPE, stderr=subprocess.DEVNULL,
|
log.warning("no active session set (%s missing/empty) — run `cc` to attach", ACTIVE_FILE)
|
||||||
)
|
return None
|
||||||
if result.returncode != 0:
|
if not session_exists(name):
|
||||||
return []
|
log.warning("target session %r no longer exists — skipping injection", name)
|
||||||
out: list[tuple[str, bool]] = []
|
return None
|
||||||
for line in result.stdout.decode("utf-8", "replace").splitlines():
|
return name
|
||||||
parts = line.rsplit(" ", 1)
|
|
||||||
if len(parts) != 2:
|
|
||||||
continue
|
|
||||||
name, attached = parts
|
|
||||||
if name.startswith(SESSION_PREFIX):
|
|
||||||
out.append((name, attached.strip() != "0"))
|
|
||||||
return out
|
|
||||||
|
|
||||||
|
|
||||||
def cleanup_detached() -> tuple[list[str], list[str]]:
|
|
||||||
"""kill every DETACHED claude-* session, never an attached one. returns the
|
|
||||||
(killed, kept_attached) name lists (both sorted) for reporting.
|
|
||||||
|
|
||||||
detached-only is the safety model: a misheard voice ``cleanup`` cannot nuke the
|
|
||||||
active session, which is attached. the kill-including-attached path stays the shell
|
|
||||||
``cckl`` (deliberate, typed).
|
|
||||||
"""
|
|
||||||
killed: list[str] = []
|
|
||||||
kept: list[str] = []
|
|
||||||
for name, attached in _claude_sessions():
|
|
||||||
if attached:
|
|
||||||
kept.append(name)
|
|
||||||
continue
|
|
||||||
result = subprocess.run(
|
|
||||||
["tmux", "kill-session", "-t", name],
|
|
||||||
stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
|
|
||||||
)
|
|
||||||
if result.returncode == 0:
|
|
||||||
killed.append(name)
|
|
||||||
return sorted(killed), sorted(kept)
|
|
||||||
|
|
||||||
|
|
||||||
def resolve(one_shot: str | None = None, auto_target: bool = False) -> tuple[str | None, str]:
|
|
||||||
"""resolve the destination session and a short reason describing the choice.
|
|
||||||
|
|
||||||
single source of truth for targeting, used by both the voice and CLI paths.
|
|
||||||
returns (session_or_None, reason). a None session means inject nothing; the
|
|
||||||
reason explains why (for the daemon console / CLI message). resolution order:
|
|
||||||
|
|
||||||
1. one-shot present -> claude-<name> for THIS command only; never falls through
|
|
||||||
to a different session if it doesn't exist (explicit beats convenience).
|
|
||||||
2. sticky set + exists -> use it.
|
|
||||||
3. nothing sticky, exactly one claude-* session:
|
|
||||||
auto_target=True -> auto-use it;
|
|
||||||
auto_target=False -> require an explicit set/target, do nothing.
|
|
||||||
4. nothing sticky, multiple sessions -> ambiguous, do nothing.
|
|
||||||
5. nothing sticky, zero sessions -> do nothing.
|
|
||||||
"""
|
|
||||||
if one_shot is not None:
|
|
||||||
session = session_name(one_shot)
|
|
||||||
if session_exists(session):
|
|
||||||
return session, f"one-shot {session}"
|
|
||||||
return None, f"one-shot {session} does not exist (did nothing)"
|
|
||||||
|
|
||||||
sticky = read_active()
|
|
||||||
if sticky:
|
|
||||||
if session_exists(sticky):
|
|
||||||
return sticky, f"sticky {sticky}"
|
|
||||||
return None, f"sticky {sticky} no longer exists (set one)"
|
|
||||||
|
|
||||||
sessions = list_sessions()
|
|
||||||
if len(sessions) == 1:
|
|
||||||
if auto_target:
|
|
||||||
return sessions[0], f"auto-target {sessions[0]} (only session)"
|
|
||||||
return None, f"no target set ({sessions[0]} running — set one)"
|
|
||||||
if len(sessions) > 1:
|
|
||||||
return None, f"no target set, {len(sessions)} sessions (set one)"
|
|
||||||
return None, "no claude sessions"
|
|
||||||
|
|||||||
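The resolution order documented in `resolve()` on the `main` side (one-shot, then sticky, then auto/none) can be exercised without a running tmux server. The following is a minimal sketch, not the project's code: `parse_sessions` and `resolve_pure` are hypothetical names introduced here, `SESSION_PREFIX = "claude-"` is an assumption inferred from the `claude-libs` example, and the list of running sessions is passed in rather than queried from tmux. Unlike `session_exists()`, this sketch only knows about `claude-*` sessions.

```python
from __future__ import annotations

# assumption: the prefix is inferred from the ``claude-libs`` example in the diff
SESSION_PREFIX = "claude-"


def session_name(name: str) -> str:
    """map a short name to a session name; already-prefixed names pass through"""
    name = name.strip()
    return name if name.startswith(SESSION_PREFIX) else f"{SESSION_PREFIX}{name}"


def parse_sessions(tmux_output: str) -> list[tuple[str, bool]]:
    """parse ``#{session_name} #{session_attached}`` lines into (name, attached).

    hypothetical helper: the same parsing as _claude_sessions(), minus the
    subprocess call, so it can be fed a canned string.
    """
    out: list[tuple[str, bool]] = []
    for line in tmux_output.splitlines():
        # split from the right so a session name containing spaces survives
        parts = line.rsplit(" ", 1)
        if len(parts) != 2:
            continue
        name, attached = parts
        if name.startswith(SESSION_PREFIX):
            out.append((name, attached.strip() != "0"))
    return out


def resolve_pure(
    running: list[str],
    sticky: str | None = None,
    one_shot: str | None = None,
    auto_target: bool = False,
) -> tuple[str | None, str]:
    """hypothetical pure variant of resolve(): same order, state injected"""
    sessions = sorted(running)

    # 1. one-shot: explicit beats convenience, never falls through
    if one_shot is not None:
        session = session_name(one_shot)
        if session in sessions:
            return session, f"one-shot {session}"
        return None, f"one-shot {session} does not exist (did nothing)"

    # 2. sticky default, only if it still exists
    if sticky:
        if sticky in sessions:
            return sticky, f"sticky {sticky}"
        return None, f"sticky {sticky} no longer exists (set one)"

    # 3. exactly one session: auto-use it only when auto_target is on
    if len(sessions) == 1:
        if auto_target:
            return sessions[0], f"auto-target {sessions[0]} (only session)"
        return None, f"no target set ({sessions[0]} running — set one)"

    # 4. several sessions: ambiguous, inject nothing
    if len(sessions) > 1:
        return None, f"no target set, {len(sessions)} sessions (set one)"

    # 5. nothing running
    return None, "no claude sessions"


if __name__ == "__main__":
    canned = "claude-libs 1\nclaude-api 0\nscratch 0\n"
    running = [name for name, _attached in parse_sessions(canned)]
    print(resolve_pure(running, sticky="claude-api", one_shot="libs"))
    print(resolve_pure(running))
```

Passing the session list in as an argument keeps the decision logic separate from the tmux query, which is one way to unit-test the "never guess a target" rule; whether the project splits it this way is not stated in the diff.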