Compare commits


No commits in common. "1a593b95fa5b1d0fb530be9b491e1d33f844b321" and "f177b46a4b57e7c3a7a6bb2567d31733901418d5" have entirely different histories.

20 changed files with 14 additions and 862 deletions

View File

@ -21,7 +21,7 @@ mic (WSLg/PulseAudio RDPSource)
-> faster-whisper (local STT, on-device)
-> wake gate: utterance must start with a wake phrase, else DISCARD locally
-> grammar match (yes/no/one..four/approve/deny/send/type/space/backspace/erase/
mode/set/target/unset/list/context/reload/system/cancel)
mode/set/target/unset/list/cancel)
-> resolve target session (one-shot > sticky ~/.claude-active > auto/none)
-> tmux send-keys -t <session> "<keys>"
-> log the action to the watched terminal ([session]/[SYSTEM]/[VOICE], colored)
@ -79,12 +79,10 @@ claudedo start --check # run a mic check before listening
claudedo start --mode ptt # push-to-talk instead (desk-only — see Modes)
claudedo status # running? mode? target session?
claudedo stop # stop a running daemon
claudedo reload # reload config.toml + contexts.toml in a running daemon
claudedo set <name> # set the sticky target -> claude-<name> (alias: switch)
claudedo unset # clear the sticky target
claudedo list # list running claude-* sessions
claudedo test-audio # verify the mic capture path
claudedo test-tone # play each earcon (verify the audio-OUT path)
```
### Modes
@ -129,12 +127,8 @@ said "okay clouds"), the heard line notes which phrase it assumed —
| `target <name> <command>` | **one-shot** override: run that command on `claude-<name>` for this utterance only; sticky default unchanged |
| `unset` (alias `unsticky`) | clear the sticky target |
| `list` | list running `claude-*` sessions to the daemon console |
| `context <name> <instruction>` (alias `prepare`) | inject a `contexts.toml` blurb as a preamble + the dictated instruction, then **wait** (no submit — say "send") |
| `reload` | re-read `config.toml` + `contexts.toml` live (no daemon restart, model stays loaded) |
| `system status` | print mode / target / model / context count to the console (daemon-control; never injects) |
| `system reload [config\|contexts]` | reload one or both config files |
| `commands` (alias `help`/`menu`) | print the voice-command menu to the console |
| `customs` (alias `custom`) | list the loaded context names |
| `customs` (alias `custom`) | custom commands — arriving in v0.2.0 (stub for now) |
| `version` | print the claudedo version to the console |
| `cancel` / `escape` | back out of a prompt |
@ -178,70 +172,6 @@ cck <name> # kill claude-<name>
cckl # kill all claude-* sessions
```
## Contexts (named reference blurbs)
`contexts.toml` holds named reference snippets you can inject ahead of a dictated
instruction with the **`context <name> <instruction>`** voice command (alias
`prepare`). It lives next to `config.toml`
(`$CLAUDEDO_CONTEXTS` → `~/.config/claudedo/contexts.toml` → `./contexts.toml`); a
missing file just means no contexts (the feature is opt-in).
```toml
[contexts]
webhooks = "discord webhooks — test: <url> (safe to spam), live: <url> (real, careful)"
testing = "use the test/staging resources only, never touch prod"
```
Saying `context webhooks send a test message` injects the `webhooks` blurb as a
preamble, then the dictated instruction, and **waits** — nothing is auto-submitted. You
say `send` to submit (**read-before-send**; Claude's own permission prompt is the
backstop for anything consequential). A bare `context webhooks` injects just the blurb.
One context per command (no stacking yet); an unknown name is reported on the console
and nothing is injected.
Names are **spoken and fuzzy-matched**, so keep them simple and distinct — they're
looked up on a lowercased key with spaces, hyphens, and underscores stripped, so
`web hooks` / `web-hooks` / `webhooks` all resolve to the same block. Assembly is
config-gated: `behavior.context_multiline` (default
`true`) puts the blurb and instruction on separate lines via a Shift+Enter soft newline;
set it `false` to flatten onto one line with `context_separator` (default `" — "`) if
Shift+Enter is unreliable in your terminal.
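The spoken-name lookup described above can be sketched as follows. This is a minimal illustration, assuming the key is built by lowercasing the name and stripping spaces, underscores, and hyphens; `spoken_key` and `lookup` are hypothetical names chosen for this sketch, not a documented API.

```python
from __future__ import annotations

import re


def spoken_key(name: str) -> str:
    """Normalize a spoken or configured name to its lookup key."""
    # lowercase, trim, then drop every run of spaces, underscores, and hyphens
    return re.sub(r"[ _-]+", "", name.strip().lower())


def lookup(blocks: dict[str, str], spoken: str) -> str | None:
    """Return the blurb for a spoken name, or None when nothing matches."""
    # blocks is assumed to be keyed by spoken_key(name) already
    return blocks.get(spoken_key(spoken))


blocks = {spoken_key("web-hooks"): "use the test webhook only"}
print(lookup(blocks, "Web Hooks"))  # resolves to the same block as "webhooks"
print(lookup(blocks, "deploy"))     # unknown name: nothing to inject
```

Because every variant collapses to one key, two configured names that differ only in spacing or hyphenation would collide, which is a good reason to keep names short and distinct.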
Edit `contexts.toml`, then say **`reload`** (or run `claudedo reload`) — it re-reads
`config.toml` and `contexts.toml` live without restarting the daemon or reloading the
Whisper model. The **`system`** namespace gives daemon-control by voice without touching
Claude: `system status` (mode / target / model / context count) and `system reload
[config|contexts]`.
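One way to picture the live reload: the CLI signals the running daemon with SIGHUP, the handler only records the request, and the main loop performs the actual re-read between utterances. The sketch below assumes that flag-based design; `ReloadFlag` and its method names are hypothetical and exist only for illustration.

```python
import signal


class ReloadFlag:
    """Record a reload request in the signal handler; act on it in the main loop."""

    def __init__(self) -> None:
        self.pending = False

    def install(self) -> bool:
        """Register the SIGHUP handler; returns False where SIGHUP does not exist."""
        if not hasattr(signal, "SIGHUP"):
            return False
        signal.signal(signal.SIGHUP, self.on_sighup)
        return True

    def on_sighup(self, _signum, _frame) -> None:
        # keep the handler trivial: no file I/O, no parsing, just set the flag
        self.pending = True

    def consume(self) -> bool:
        """Return True once per request, clearing the flag."""
        was_pending, self.pending = self.pending, False
        return was_pending
```

A loop would call `flag.consume()` once per tick and re-read the config files only when it returns `True`, so a reload never interrupts a capture or transcription that is already in progress.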
## Earcons (audio feedback tones)
Short confirmation tones play on key events so you get **eyes-free feedback** — "did it
hear me?" — without watching the terminal. They're tones, not speech (not TTS): a bright
blip when a command is accepted/injected, a low buzz when nothing matched, a rising chime
on submit, and an optional blip on wake. Tones are short (<300ms) and quiet, and they're
**additive** to the console feed — mute them and read at the desk, or hear them eyes-free.
Verify the audio-OUT path (the reverse of `test-audio`, and the less-tested direction on
WSLg) with:
```bash
claudedo test-tone # plays each tone through WSLg — the audio-out gate
```
Tones play through WSLg's PulseAudio sink, **paplay-first** (a separate process, so it
doesn't contend with the sounddevice mic stream), falling back to in-process sounddevice,
then `powershell.exe` on the Windows host. Playback is **fire-and-forget**: a dead speaker
or a missing tone file logs once and is ignored — audio-out can never block or break a
command (`claudedo yes` injects whether or not the speaker works).
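The fallback chain and the fire-and-forget behavior can be sketched roughly like this. It is an illustration under assumptions, not the module's code: each backend is modeled as a callable that takes a wav path and returns whether it played, and `play_first_working` / `play_async` are hypothetical names.

```python
import threading
from typing import Callable, Iterable

Backend = Callable[[str], bool]  # takes a wav path, returns True if it played


def play_first_working(path: str, backends: Iterable[Backend]) -> bool:
    """Try each backend in order; never raise, just report success."""
    for backend in backends:
        try:
            if backend(path):
                return True
        except Exception:
            # a broken backend must not break the caller; try the next one
            continue
    return False


def play_async(path: str, backends: list[Backend]) -> threading.Thread:
    """Fire-and-forget: run playback on a daemon thread and return at once."""
    worker = threading.Thread(
        target=play_first_working, args=(path, backends), daemon=True
    )
    worker.start()
    return worker
```

The caller never waits on the thread and never sees an exception, which is what lets a command inject even when no backend can produce sound.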
Configure under `[sound]`: `enabled` (master, default on), per-event `on_wake` (default
**off** — a blip right before you speak can bleed into the command capture, and it's
chatty), `on_accept` / `on_no_match` / `on_submit` (default on), and `volume` (0.0-1.0,
best-effort — scaled for sounddevice, `--volume` for paplay, ignored by the PowerShell
fallback). A `[sound.files]` table can point any event at your own `.wav`. The shipped
tones live in the package (`claudedo/sounds/*.wav`); `claudedo/sounds/generate.py` is a
synthetic-beep fallback that can regenerate a placeholder set (it does **not** reproduce
the shipped tones — running it overwrites them with plain beeps).
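To make the `[sound]` options concrete, here is a small sketch of how the table could be merged over its defaults and how a 0.0-1.0 volume maps onto paplay's linear `--volume` scale (0-65536). The helper names are hypothetical, the 0.5 default volume is taken from the shipped `config.toml`, and the rule that a tone needs both the master switch and its per-event switch is an assumption of this sketch.

```python
DEFAULTS = {
    "enabled": True,
    "on_wake": False,
    "on_accept": True,
    "on_no_match": True,
    "on_submit": True,
    "volume": 0.5,
}


def sound_settings(raw: dict) -> dict:
    """Merge a parsed [sound] table over the documented defaults."""
    known = {k: v for k, v in raw.items() if k in DEFAULTS}
    settings = {**DEFAULTS, **known}
    volume = float(settings["volume"])
    if not 0.0 <= volume <= 1.0:
        raise ValueError("[sound].volume must be in [0, 1]")
    settings["volume"] = volume
    return settings


def should_play(settings: dict, event: str) -> bool:
    """Assumed rule: master switch AND the per-event switch must both be on."""
    return bool(settings["enabled"] and settings.get(f"on_{event}", False))


def paplay_volume(volume: float) -> int:
    """Map 0.0-1.0 onto paplay's linear --volume scale (0-65536)."""
    return int(max(0.0, min(1.0, volume)) * 65536)
```

With the defaults, `should_play(sound_settings({}), "wake")` is false while the accept, no-match, and submit tones are enabled.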
## The confirmed Claude Code keymap
The keystrokes in [`keys.py`](src/claudedo/keys.py) were confirmed **empirically**
@ -288,9 +218,6 @@ it searches
`false` does nothing and asks you to `set`; `true` auto-uses that session.
- **`print_heard`** (default `false`, debug): prints non-wake transcripts so you can
see how Whisper renders your wake word, then tune the wake list/threshold.
- **`context_multiline`** (default `true`) / **`context_separator`** (default `" — "`):
how the `context` command assembles the blurb and instruction — a Shift+Enter soft
newline between them, or (when `false`) flattened onto one line with the separator.
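The two assembly modes can be sketched as a pure function that returns the injection steps in order. This is a hedged illustration: `assemble_context`, the `("literal", ...)` / `("key", ...)` step tuples, and the `"NEWLINE"` key name are assumptions made for the example, not the daemon's actual interface.

```python
def assemble_context(blurb, dictation, multiline=True, separator=" \u2014 "):
    """Return the ordered injection steps for a context command.

    Each step is ("literal", text) or ("key", name). No submit step is ever
    produced: the user still has to say "send".
    """
    steps = [("literal", blurb)]
    if dictation:
        if multiline:
            # soft newline (Shift+Enter) between blurb and instruction
            steps.append(("key", "NEWLINE"))
        else:
            # flatten onto one line with the configured separator
            steps.append(("literal", separator))
        steps.append(("literal", dictation))
    return steps


print(assemble_context("use staging only", "run the tests"))
print(assemble_context("use staging only", "run the tests", multiline=False))
print(assemble_context("use staging only", ""))  # bare context: blurb only
```

Keeping assembly separate from the keystroke sender makes the multiline and flattened modes easy to compare without a live tmux session.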
## Requirements

View File

@ -78,37 +78,3 @@ auto_target = false
# how Whisper renders your wake word, then turn it OFF. default false: non-wake speech
# is discarded without ever printing the transcript.
print_heard = false
# how the `context <name> <dictation>` command assembles the blurb + instruction.
# true (default): blurb, a soft newline (Shift+Enter — needs the extended-keys tmux
# settings install.sh appends), then the instruction. if Shift+Enter is at all flaky
# in your terminal (it submits or does nothing), set false to flatten onto one line
# with context_separator between blurb and instruction — the blank line is cosmetic,
# not worth a submit risk. either way the assembled text is NEVER auto-submitted.
context_multiline = true
# separator inserted between blurb and instruction when context_multiline = false.
context_separator = " — "
[sound]
# earcons — short confirmation tones on daemon events so you get eyes-free feedback
# ("did it hear me?") without watching the terminal. tones are SHORT (<300ms) and quiet;
# they play OUT through WSLg's PulseAudio sink (paplay-first, sounddevice fallback, then
# powershell.exe). additive to the console feed — mute these and read at the desk, or
# hear them eyes-free. a dead speaker never blocks/breaks a command (fire-and-forget).
enabled = true
# blip when a wake phrase is recognized. OFF by default: a blip right before you speak
# the command can bleed into its capture, and it's chatty. turn on only if you want it.
on_wake = false
# positive blip when a command is recognized/injected.
on_accept = true
# distinct lower buzz when nothing matched or the target was missing (did nothing).
on_no_match = true
# rising chime when a send/submit is injected.
on_submit = true
# best-effort 0.0-1.0 (scaled for sounddevice, --volume for paplay; ignored by the
# powershell fallback, which has no volume control).
volume = 0.5
# optional per-event overrides to swap in your own .wav files, e.g.:
# [sound.files]
# accept = "~/sounds/my_accept.wav"
[sound.files]

View File

@ -1,18 +0,0 @@
# claudedo contexts — named reference blurbs you can inject ahead of a dictated
# instruction with the `context <name> <instruction>` voice command (alias `prepare`).
#
# the named blurb is injected as a preamble, then your dictated instruction, and the
# daemon WAITS — nothing is auto-submitted. you say "send" to submit (read-before-send;
# claude's own permission prompt is the backstop for anything consequential).
#
# names are SPOKEN and fuzzy-matched, so keep them simple, distinct, single words
# (a-z, 0-9; spaces/hyphens/underscores are stripped for matching, so "web hooks",
# "web-hooks" and "webhooks" all resolve to the same block). values are free-form text.
#
# edit this file, then say "reload" (or run `claudedo reload`) — no daemon restart,
# the whisper model is not reloaded.
[contexts]
webhooks = "discord webhooks — test: <url> (safe to spam), live: <url> (real, careful)"
testing = "use the test/staging resources only, never touch prod"
discord = "discord.py 2.x; bot token in .env as BOT_TOKEN; guild id 12345"

View File

@ -57,14 +57,9 @@ say "verifying audio path"
if pactl info >/dev/null 2>&1; then
DEFAULT_SRC="$(pactl info | sed -n 's/^Default Source: //p')"
echo " Default Source: ${DEFAULT_SRC:-<none>}"
DEFAULT_SINK="$(pactl info | sed -n 's/^Default Sink: //p')"
echo " Default Sink: ${DEFAULT_SINK:-<none>}"
if ! pactl list sources short 2>/dev/null | grep -q RDPSource; then
warn "RDPSource not listed by pactl — mic may not be bridged. check Windows mic permission."
fi
if ! pactl list sinks short 2>/dev/null | grep -q RDPSink; then
warn "RDPSink not listed by pactl — earcon/TTS audio-OUT may not play. run 'claudedo test-tone' to check."
fi
else
warn "pactl info failed — pulseaudio-utils installed but no server reachable yet."
fi
@ -111,18 +106,6 @@ else
echo " $CONF_DIR/config.toml already current"
fi
# install the contexts.toml template (named blurbs for the `context` voice command).
# same policy: copy only if absent, else drop a .new — never clobber edited contexts.
if [ ! -f "$CONF_DIR/contexts.toml" ]; then
install -m 0644 "$REPO_DIR/contexts.toml" "$CONF_DIR/contexts.toml"
echo " wrote $CONF_DIR/contexts.toml"
elif ! cmp -s "$REPO_DIR/contexts.toml" "$CONF_DIR/contexts.toml"; then
install -m 0644 "$REPO_DIR/contexts.toml" "$CONF_DIR/contexts.toml.new"
echo " kept your $CONF_DIR/contexts.toml; new default written to contexts.toml.new (diff to merge)"
else
echo " $CONF_DIR/contexts.toml already current"
fi
# wire EVERY rc that exists (the user may have both zsh and bash).
wired_any=0
for RC in "$HOME/.zshrc" "$HOME/.bashrc"; do

View File

@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "claudedo"
version = "0.2.1"
version = "0.1.4"
description = "voice-control daemon for claude code (local STT -> tmux send-keys)"
readme = "README.md"
requires-python = ">=3.10"
@ -23,9 +23,6 @@ claudedo = "claudedo.__main__:main"
[tool.setuptools]
package-dir = { "" = "src" }
[tool.setuptools.package-data]
"claudedo.sounds" = ["*.wav"]
[tool.setuptools.packages.find]
where = ["src"]

View File

@ -1,3 +1,3 @@
"""claudedo — voice-control daemon for claude code (local STT -> tmux send-keys)"""
__version__ = "0.2.1"
__version__ = "0.1.4"

View File

@ -97,44 +97,6 @@ def cmd_stop(_args: argparse.Namespace) -> int:
return 1
def cmd_test_tone(args: argparse.Namespace) -> int:
config = _load_or_die(args.config)
from . import audio_out, sound
print("== claudedo test-tone ==")
if not audio_out.available():
print("no audio-out backend found (paplay / powershell.exe).", file=sys.stderr)
print("install pulseaudio-utils (run install.sh) for paplay.", file=sys.stderr)
return 1
earcons = sound.Earcons(config)
print(f"playing each tone via WSLg audio-out (volume {config.sound_volume}) — listen ...")
ok = True
for event in sound.event_names():
path = earcons.tone_path(event)
if path is None or not Path(path).is_file():
print(f" {event:9} MISSING ({path})")
ok = False
continue
print(f" {event:9} {path.name} ...", flush=True)
played = audio_out.play_blocking(path, volume=config.sound_volume)
if not played:
print(f" {event:9} FAILED to play", file=sys.stderr)
ok = False
if not ok:
print("some tones did not play — audio-out may be unavailable.", file=sys.stderr)
return 1
print("audio-out OK (all tones played).")
return 0
def cmd_reload(_args: argparse.Namespace) -> int:
if daemon.reload_running():
print("signalled claudedo to reload config + contexts")
return 0
print("claudedo is not running")
return 1
def cmd_status(_args: argparse.Namespace) -> int:
pid = daemon.read_pid()
if pid is None:
@ -260,12 +222,8 @@ def build_parser() -> argparse.ArgumentParser:
sp.set_defaults(func=cmd_start)
sub.add_parser("stop", help="stop a running daemon").set_defaults(func=cmd_stop)
sub.add_parser("reload", help="reload config + contexts in a running daemon"
).set_defaults(func=cmd_reload)
sub.add_parser("status", help="show daemon status").set_defaults(func=cmd_status)
sub.add_parser("test-audio", help="verify the mic capture path").set_defaults(func=cmd_test_audio)
sub.add_parser("test-tone", help="play each earcon (verify the audio-out path)"
).set_defaults(func=cmd_test_tone)
sub.add_parser("install", help="re-run the bootstrap (install.sh)").set_defaults(func=cmd_install)
sub.add_parser("unset", help="clear the sticky target session").set_defaults(func=cmd_unset)
sub.add_parser("list", help="list running claude-* sessions").set_defaults(func=cmd_list)

View File

@ -1,158 +0,0 @@
"""audio output — play short .wav files through the WSLg/PulseAudio sink (RDPSink).
the reverse direction of audio.py's mic capture, and the less-tested path on WSLg. a
three-tier player picks the first backend that works and remembers it:
1. paplay (pulseaudio-utils): a SEPARATE process hitting PulseAudio directly. this
is the primary on purpose: the daemon captures with sounddevice (an open input
stream in listen mode), so keeping OUTPUT in a separate process avoids stacking
input+output in one lib on a bridge known to be duplex-flaky.
2. sounddevice sd.play(): in-process fallback if paplay is absent.
3. powershell.exe SoundPlayer: last resort via the Windows host (no volume control).
both earcons (sound.py) and future v0.3 TTS readback play through this module, so keep
it generic (it plays a wav path, it knows nothing about events). playback is
fire-and-forget on a worker thread: a missing file or a dead speaker logs once and is
swallowed, never raised, so audio-out can NEVER block or break the inject path.
"""
from __future__ import annotations
import logging
import shutil
import subprocess
import threading
import wave
from pathlib import Path
log = logging.getLogger(__name__)
_PAPLAY = "paplay"
_POWERSHELL = "powershell.exe"
_backend_lock = threading.Lock()
_chosen_backend: str | None = None
_warned = False
def _have(cmd: str) -> bool:
return shutil.which(cmd) is not None
def _clamp_volume(volume: float) -> float:
return max(0.0, min(1.0, float(volume)))
def _play_paplay(path: Path, volume: float) -> bool:
"""play via paplay; volume scaled through --volume (0-65536 linear)"""
vol = int(_clamp_volume(volume) * 65536)
proc = subprocess.run(
[_PAPLAY, f"--volume={vol}", str(path)],
stdout=subprocess.DEVNULL, stderr=subprocess.PIPE,
)
if proc.returncode != 0:
log.debug("paplay failed: %s", proc.stderr.decode("utf-8", "replace").strip())
return False
return True
def _play_sounddevice(path: Path, volume: float) -> bool:
"""play via sounddevice (in-process fallback); volume scales the samples"""
try:
import numpy as np
import sounddevice as sd
except Exception as exc:
log.debug("sounddevice unavailable: %s", exc)
return False
try:
with wave.open(str(path), "rb") as wf:
sr = wf.getframerate()
frames = wf.readframes(wf.getnframes())
data = np.frombuffer(frames, dtype="<i2").astype(np.float32) / 32768.0
data = data * _clamp_volume(volume)
sd.play(data, sr)
sd.wait()
return True
except Exception as exc:
log.debug("sounddevice playback failed: %s", exc)
return False
def _play_powershell(path: Path, _volume: float) -> bool:
"""play via the Windows host (last resort). SoundPlayer has no volume control,
so volume is ignored on this backend (documented best-effort)."""
if not _have(_POWERSHELL):
return False
try:
win = subprocess.run(["wslpath", "-w", str(path)], stdout=subprocess.PIPE,
stderr=subprocess.DEVNULL)
winpath = win.stdout.decode("utf-8", "replace").strip() if win.returncode == 0 else str(path)
script = f"(New-Object Media.SoundPlayer '{winpath}').PlaySync()"
proc = subprocess.run([_POWERSHELL, "-NoProfile", "-Command", script],
stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
return proc.returncode == 0
except Exception as exc:
log.debug("powershell playback failed: %s", exc)
return False
_BACKENDS = {
"paplay": _play_paplay,
"sounddevice": _play_sounddevice,
"powershell": _play_powershell,
}
_ORDER = ("paplay", "sounddevice", "powershell")
def _play_sync(path: Path, volume: float) -> bool:
"""play a wav synchronously, choosing/remembering a working backend. returns
whether playback succeeded; never raises."""
global _chosen_backend, _warned
if not path.is_file():
log.debug("tone file missing: %s", path)
return False
with _backend_lock:
order = (_chosen_backend,) + _ORDER if _chosen_backend else _ORDER
tried = []
for name in order:
if name in tried:
continue
tried.append(name)
if name == "paplay" and not _have(_PAPLAY):
continue
if _BACKENDS[name](path, volume):
with _backend_lock:
_chosen_backend = name
return True
with _backend_lock:
if not _warned:
_warned = True
log.warning("audio-out unavailable (tried %s) — continuing silently; "
"tones disabled for this run", ", ".join(tried))
return False
def play(path: str | Path, volume: float = 1.0, blocking: bool = False) -> None:
"""play a wav file. fire-and-forget by default (a worker thread), so a slow or
dead speaker never delays the caller. set blocking=True only for test-tone, where
we want to play tones in sequence and report the result.
failures are swallowed (logged once): audio-out must never break a command.
"""
p = Path(path)
if blocking:
_play_sync(p, volume)
return
threading.Thread(target=_play_sync, args=(p, volume), daemon=True).start()
def play_blocking(path: str | Path, volume: float = 1.0) -> bool:
"""synchronous play that returns success — for test-tone's audio-out gate"""
return _play_sync(Path(path), volume)
def available() -> bool:
"""true if any audio-out backend is present (best-effort, paplay/powershell)"""
return _have(_PAPLAY) or _have(_POWERSHELL)

View File

@ -56,15 +56,6 @@ class Config:
filler_words: tuple[str, ...]
auto_target: bool
print_heard: bool
context_multiline: bool
context_separator: str
sound_enabled: bool
sound_on_wake: bool
sound_on_accept: bool
sound_on_no_match: bool
sound_on_submit: bool
sound_volume: float
sound_files: dict[str, str]
source_path: Path | None = field(default=None)
@ -137,23 +128,12 @@ def load_config(explicit: str | os.PathLike | None = None) -> Config:
["select", "use", "choose"])),
auto_target=bool(_require(raw, "behavior", "auto_target", (bool,), False)),
print_heard=bool(_require(raw, "behavior", "print_heard", (bool,), False)),
context_multiline=bool(_require(raw, "behavior", "context_multiline", (bool,), True)),
context_separator=str(_require(raw, "behavior", "context_separator", (str,), "")),
sound_enabled=bool(_require(raw, "sound", "enabled", (bool,), True)),
sound_on_wake=bool(_require(raw, "sound", "on_wake", (bool,), False)),
sound_on_accept=bool(_require(raw, "sound", "on_accept", (bool,), True)),
sound_on_no_match=bool(_require(raw, "sound", "on_no_match", (bool,), True)),
sound_on_submit=bool(_require(raw, "sound", "on_submit", (bool,), True)),
sound_volume=float(_require(raw, "sound", "volume", (int, float), 0.5)),
sound_files=dict(_require(raw, "sound", "files", (dict,), {})),
source_path=path,
)
for label, val in (("wake_fuzzy_threshold", cfg.wake_fuzzy_threshold),
("command_fuzzy_threshold", cfg.command_fuzzy_threshold)):
if not 0.0 < val <= 1.0:
raise ConfigError(f"[behavior].{label} must be in (0, 1]")
if not 0.0 <= cfg.sound_volume <= 1.0:
raise ConfigError("[sound].volume must be in [0, 1]")
if cfg.vad_silence_ms <= 0 or cfg.vad_max_seconds <= 0:
raise ConfigError("[vad].silence_ms and max_seconds must be positive")
if cfg.samplerate <= 0 or cfg.channels <= 0:

View File

@ -1,108 +0,0 @@
"""load named context blocks from contexts.toml into a typed lookup.
contexts are user-edited reference blurbs (claude.md-style snippets) keyed by simple
spoken names. the ``context``/``prepare`` voice command injects a named blurb ahead of
a dictated instruction (read-before-send: never auto-submitted). mirrors config.py's
load/validate pattern; a missing file is an empty set, not an error.
"""
from __future__ import annotations
import logging
import os
import re
from dataclasses import dataclass, field
from pathlib import Path
try:
import tomllib as _toml
except ModuleNotFoundError:
import tomli as _toml
log = logging.getLogger(__name__)
_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9 _-]*$")
DEFAULT_CONTEXTS_PATHS = (
Path(os.environ.get("CLAUDEDO_CONTEXTS", "")) if os.environ.get("CLAUDEDO_CONTEXTS") else None,
Path.home() / ".config" / "claudedo" / "contexts.toml",
Path.cwd() / "contexts.toml",
)
class ContextsError(Exception):
"""raised on an unparseable or invalid contexts.toml"""
@dataclass
class Contexts:
"""validated named context blocks (name -> blurb), normalized for spoken lookup"""
blocks: dict[str, str] = field(default_factory=dict)
source_path: Path | None = field(default=None)
def __len__(self) -> int:
return len(self.blocks)
def names(self) -> list[str]:
"""the context names, sorted (for status / listing)"""
return sorted(self.blocks)
def get(self, name: str) -> str | None:
"""look up a blurb by its normalized (lowercased, despaced) name, or None.
names are matched on a lowercase, space/underscore/hyphen-stripped key so a
spoken "web hooks" resolves the configured ``webhooks``/``web-hooks`` block.
"""
return self.blocks.get(_key(name))
def _key(name: str) -> str:
return re.sub(r"[ _-]+", "", name.strip().lower())
def find_contexts_path(explicit: str | os.PathLike | None = None) -> Path | None:
"""resolve the contexts.toml path, or None if no file exists (not an error)"""
candidates: list[Path] = []
if explicit:
candidates.append(Path(explicit))
candidates.extend(p for p in DEFAULT_CONTEXTS_PATHS if p)
for path in candidates:
if path.is_file():
return path
return None
def load_contexts(explicit: str | os.PathLike | None = None) -> Contexts:
"""load contexts.toml from the first existing default path (or an explicit one).
a missing file yields an empty Contexts (the feature is opt-in). names must be
simple words (matchable) and values must be non-empty strings; a bad entry raises
ContextsError so the user sees a clear message rather than a silent drop.
"""
path = find_contexts_path(explicit)
if path is None:
return Contexts(blocks={}, source_path=None)
try:
with open(path, "rb") as fh:
raw = _toml.load(fh)
except _toml.TOMLDecodeError as exc:
raise ContextsError(f"could not parse {path}: {exc}") from exc
table = raw.get("contexts", {})
if not isinstance(table, dict):
raise ContextsError("[contexts] must be a table of name = \"blurb\" entries")
blocks: dict[str, str] = {}
for name, value in table.items():
if not isinstance(name, str) or not _NAME_RE.match(name.lower()):
raise ContextsError(f"context name {name!r} must be simple words (a-z, 0-9, space/-/_)")
if not isinstance(value, str) or not value.strip():
raise ContextsError(f"context {name!r} must be a non-empty string")
key = _key(name)
if key in blocks:
raise ContextsError(f"context {name!r} collides with another name on the spoken key {key!r}")
blocks[key] = value.strip()
return Contexts(blocks=blocks, source_path=path)

View File

@ -16,11 +16,9 @@ import sys
import time
from pathlib import Path
from . import __version__, audio, grammar, inject, keys, target
from .config import Config, ConfigError, load_config
from . import __version__, audio, grammar, inject, target
from .config import Config
from .console import HELP, SYSTEM, VOICE, Console
from .contexts import Contexts, ContextsError, load_contexts
from .sound import Earcons
from .stt import Transcriber
log = logging.getLogger(__name__)
@ -78,16 +76,6 @@ def stop_running() -> bool:
return True
def reload_running() -> bool:
"""signal a running daemon (SIGHUP) to reload config + contexts. returns whether
one was found. no-op on platforms without SIGHUP."""
pid = read_pid()
if pid is None or not hasattr(signal, "SIGHUP"):
return False
os.kill(pid, signal.SIGHUP)
return True
class _PTTKey:
"""desk-only push-to-talk: 'held' while the configured key is down in the
daemon's own terminal. there is deliberately NO global hotkey — a system-wide
@ -124,35 +112,22 @@ class Daemon:
self.config = config
self.mode = config.mode
self._stop = False
self._reload_pending = False
self._transcriber: Transcriber | None = None
self._device: int | None = None
self._ptt = _PTTKey()
self._pending: dict[str, int] = {}
self._console = Console()
self._contexts = Contexts()
self._earcons = Earcons(config)
self._last_stt_ms = 0.0
self._last_audio_s = 0.0
def _install_signals(self) -> None:
signal.signal(signal.SIGTERM, self._on_signal)
signal.signal(signal.SIGINT, self._on_signal)
if hasattr(signal, "SIGHUP"):
signal.signal(signal.SIGHUP, self._on_reload_signal)
def _on_signal(self, _signum, _frame) -> None:
log.info("stop requested")
self._stop = True
def _on_reload_signal(self, _signum, _frame) -> None:
"""SIGHUP from `claudedo reload` -> reload both config files on the next tick.
the actual reload runs in the loop (not the handler) so it never races a
capture/transcribe; the handler only sets the flag.
"""
self._reload_pending = True
def stopped(self) -> bool:
return self._stop
@ -165,21 +140,11 @@ class Daemon:
compute_type="auto",
initial_prompt=grammar.initial_prompt(cfg.wake_phrases),
)
self._load_contexts()
if audio.warm_up(cfg.samplerate, cfg.channels, self._device):
log.info("mic warmed up (source live)")
else:
log.warning("mic warm-up saw only silence — check mic permission / RDPSource")
def _load_contexts(self) -> None:
"""(re)load contexts.toml, leaving the loaded model untouched. a parse error is
logged and leaves the previous set in place rather than crashing the loop."""
try:
self._contexts = load_contexts()
except ContextsError as exc:
log.warning("contexts.toml invalid, keeping previous set: %s", exc)
self._console.emit(SYSTEM, f"contexts.toml error (kept previous): {exc}", "red")
def _capture(self):
cfg = self.config
if self.mode == "ptt":
@ -204,14 +169,10 @@ class Daemon:
parsed = grammar.parse(transcript, cfg.wake_phrases, cfg.wake_fuzzy_threshold,
cfg.command_fuzzy_threshold, require_wake, filler=cfg.filler_words)
if parsed is None or parsed.action is None:
if parsed is not None:
self._earcons.play("wake")
self._console.emit(VOICE, f'heard "{transcript}" -> no command matched {self._timing()}',
"yellow")
self._earcons.play("no_match")
return
action = parsed.action
self._earcons.play("wake")
# a command was recognized — echo what we heard (green) before acting. note the
# matched wake phrase (magenta) when the transcript didn't literally contain it
@ -256,9 +217,7 @@ class Daemon:
self._console.line(f" {self._console.paint(f'{usage:<26}', 'brightblue')} {desc}")
return
if action.name == "customs":
names = self._contexts.names()
listed = ", ".join(names) if names else "(none — edit contexts.toml)"
self._console.emit(SYSTEM, f"contexts: {listed}")
self._console.emit(SYSTEM, "custom commands arrive in v0.2.0 (contexts.toml)")
return
if action.name == "version":
self._console.emit(SYSTEM, f"claudedo {__version__}")
@ -266,27 +225,11 @@ class Daemon:
if action.name == "debug":
self._console.emit(VOICE, f'debug: "{action.arg}"', "yellow")
return
if action.name == "reload":
self._do_reload(str(action.arg))
return
if action.name == "system":
self._do_system(action.arg)
return
if action.name == "context":
name = str(action.arg[0])
if self._contexts.get(name) is None:
self._console.emit(VOICE, f"no context named '{name}' -> did nothing", "red")
self._earcons.play("no_match")
return
session, reason = target.resolve(parsed.one_shot, auto_target=cfg.auto_target)
if session is None:
self._console.emit(VOICE, f'heard "{transcript}" -> {reason} -> '
f'{self._describe(action)} did nothing', "red")
self._earcons.play("no_match")
return
if action.name == "context":
self._inject_context(session, action)
return
self._inject(session, action)
@ -295,19 +238,16 @@ class Daemon:
buffer so backspace/erase delete only back to the last submit boundary.
the 'heard ...' echo is already printed by _handle and the [session] prefix
names the target, so these lines just report the keystrokes injected. the
earcon fires here (a real injection): submit chimes the submit tone, every
other injected command the accept tone.
names the target, so these lines just report the keystrokes injected.
"""
name = action.name
self._earcons.play("submit" if name == "submit" else "accept")
if name == "type":
text = str(action.arg)
inject.send_literal(session, text)
self._pending[session] = self._pending.get(session, 0) + len(text)
if self.config.type_autosend:
inject.send_named(session, keys.SUBMIT)
inject.send_named(session, inject.keys.SUBMIT)
self._pending[session] = 0
self._console.emit(session, f"typed {text!r}"
+ (" + send" if self.config.type_autosend else ""), "green")
@ -338,96 +278,12 @@ class Daemon:
self._pending[session] = 0
self._console.emit(session, f"injected {self._describe(action)}", "green")
def _inject_context(self, session: str, action) -> None:
"""inject a named context blurb ahead of the dictated instruction, then WAIT.
read-before-send: never auto-submits; the user says ``send`` separately, and
claude's own permission prompt is the backstop for anything consequential.
routes through inject.send_literal (the same path as ``type``) and tracks the
uncommitted-input buffer so backspace/erase still bound to the last boundary.
assembly (config behavior.context_multiline): true -> blurb, a soft Shift+Enter
newline, then the instruction; false -> blurb + context_separator + instruction
flattened onto one line. a bare ``context <name>`` (no dictation) injects just
the blurb. the soft newline does not count toward the editable-char buffer.
"""
cfg = self.config
name, dictation = str(action.arg[0]), str(action.arg[1])
blurb = self._contexts.get(name) or ""
self._earcons.play("accept")
inject.send_literal(session, blurb)
chars = len(blurb)
if dictation:
if cfg.context_multiline:
inject.send_named(session, keys.NEWLINE)
else:
inject.send_literal(session, cfg.context_separator)
chars += len(cfg.context_separator)
inject.send_literal(session, dictation)
chars += len(dictation)
self._pending[session] = self._pending.get(session, 0) + chars
shape = "blurb" if not dictation else "blurb + dictation"
self._console.emit(session, f"context '{name}' -> {shape} (waiting for send)", "green")
def _do_reload(self, scope: str) -> None:
"""re-read config.toml and/or contexts.toml live without reinitializing the
loaded whisper model (the slow part). scope: all|config|contexts."""
did = []
if scope in ("all", "config"):
try:
new_cfg = load_config()
self._apply_config(new_cfg)
did.append("config")
except ConfigError as exc:
self._console.emit(SYSTEM, f"config reload failed (kept previous): {exc}", "red")
if scope in ("all", "contexts"):
self._load_contexts()
did.append("contexts")
what = " + ".join(did) if did else "nothing"
blue = self._console.paint("reloaded", "brightblue")
self._console.emit(SYSTEM, f"{blue} {what} ({len(self._contexts)} contexts)")
def _apply_config(self, new_cfg: Config) -> None:
"""swap in a reloaded config, preserving the runtime mode the user may have
toggled by voice and leaving the already-loaded transcriber untouched."""
new_cfg.mode = self.mode
self.config = new_cfg
self._earcons.update(new_cfg)
def _do_system(self, arg) -> None:
"""daemon-control namespace (never injects to claude): status / reload."""
if isinstance(arg, tuple) and arg and arg[0] == "reload":
self._do_reload(str(arg[1]))
return
if isinstance(arg, tuple) and arg and arg[0] == "unknown":
self._console.emit(SYSTEM, f"unknown system command '{arg[1]}'", "red")
return
if arg == "status":
cfg = self.config
sticky = target.read_active() or "(none)"
blue = self._console.paint("status", "brightblue")
self._console.emit(SYSTEM, f"{blue}: mode {self.mode}, sticky {sticky}, "
f"model {cfg.stt_model}, {len(self._contexts)} contexts")
return
self._console.emit(SYSTEM, f"unknown system command {arg!r}", "red")
def _timing(self) -> str:
"""compact STT latency suffix for heard lines (transcribe ms on audio secs)"""
return f"({self._last_stt_ms:.0f}ms/{self._last_audio_s:.1f}s)"
@staticmethod
def _describe(action) -> str:
if action.name == "context":
name, dictation = action.arg
tail = " + dictation" if dictation else ""
return f"CONTEXT('{name}'{tail})"
if action.name == "system":
arg = action.arg
if isinstance(arg, tuple):
return f"SYSTEM({arg[0]} {arg[1]})"
return f"SYSTEM({arg})"
if action.arg is None:
return action.name.upper()
return f"{action.name.upper()}({action.arg})"
@ -448,7 +304,7 @@ class Daemon:
target_now = target.read_active() or "(none — run cc / set <name>)"
self._console.emit(SYSTEM, f"claudedo {self.mode} mode — Ctrl-C to stop", "bold")
self._console.emit(SYSTEM, f"model {cfg.stt_model} ({cfg.stt_language}) · mic {dev} · "
f"target {target_now} · {len(self._contexts)} contexts")
f"target {target_now}")
wakes = ", ".join(self._console.paint(p, "magenta") for p in cfg.wake_phrases)
self._console.emit(SYSTEM, f"wake: {wakes}")
@ -465,9 +321,6 @@ class Daemon:
self._refresh_state()
self._print_startup()
while not self._stop:
if self._reload_pending:
self._reload_pending = False
self._do_reload("all")
audio_chunk = self._capture()
if self._stop:
break
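
Reviewer note: the two assembly shapes described in the removed `_inject_context` docstring can be sketched in isolation. This is a minimal illustration, not the daemon's code: `assemble_context`, the `("literal", ...)` / `("key", ...)` event tuples, and the default separator are hypothetical stand-ins for the `inject.send_literal` / `inject.send_named` calls and the `context_separator` setting.

```python
def assemble_context(blurb, dictation, multiline, separator=" "):
    """Return (events, editable_chars) for a context injection.

    events is the ordered list of sends the daemon would perform;
    editable_chars is what would be added to the uncommitted-input buffer.
    Hypothetical helper: names and the default separator are assumptions.
    """
    events = [("literal", blurb)]
    chars = len(blurb)
    if dictation:
        if multiline:
            # soft newline (Shift+Enter): sent as a key, not counted as editable
            events.append(("key", "S-Enter"))
        else:
            # flattened form: the separator is literal text, so it is counted
            events.append(("literal", separator))
            chars += len(separator)
        events.append(("literal", dictation))
        chars += len(dictation)
    return events, chars


multi_events, multi_chars = assemble_context("review this", "fix the bug", True)
flat_events, flat_chars = assemble_context("review this", "fix the bug", False, " | ")
bare_events, bare_chars = assemble_context("review this", "", True)
```

The count differs between the two shapes only by the separator length, which is why `erase` would delete the separator in the flattened form but leave the soft newline alone in the multiline form.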


@ -54,10 +54,6 @@ _COMMANDS_VERBS = ("commands", "help", "menu")
_CUSTOMS_VERBS = ("customs", "custom")
_VERSION_VERBS = ("version",)
_SELECT_VERBS = ("select", "option", "choose", "number")
_CONTEXT_VERBS = ("context", "prepare")
_RELOAD_VERBS = ("reload",)
_SYSTEM_VERBS = ("system",)
_RELOAD_SCOPES = ("config", "contexts")
# every command/synonym word, for biasing the STT toward the vocabulary we expect.
_COMMAND_WORDS = (
@ -65,7 +61,6 @@ _COMMAND_WORDS = (
+ _CANCEL_VERBS + _TYPE_VERBS + _BACKSPACE_VERBS + _SPACE_VERBS + _ADD_VERBS
+ _ERASE_VERBS + _DEBUG_VERBS + _MODE_VERBS + _STICKY_VERBS + _ONESHOT_VERBS + _UNSET_VERBS
+ _LIST_VERBS + _COMMANDS_VERBS + _CUSTOMS_VERBS + _VERSION_VERBS
+ _CONTEXT_VERBS + _RELOAD_VERBS + _SYSTEM_VERBS + _RELOAD_SCOPES
+ _SELECT_VERBS + ("ptt", "listen")
+ ("one", "two", "three", "four")
)
@ -77,12 +72,9 @@ class Action:
"""a matched command: a name plus an optional argument.
names: yes, no, select, approve, deny, submit, type, space, backspace, erase,
cancel, mode, set, unset, list, context, reload, system. arg carries the select
index (int), the literal text for ``type``, the count for ``space``/``backspace``
(int), the mode for ``mode``, the session short-name for ``set``, a
``(name, dictation)`` tuple for ``context``, the scope string for ``reload``
(``"all"``/``"config"``/``"contexts"``), or the system control for ``system``
(``"status"`` or a ``("reload", scope)`` tuple).
cancel, mode, set, unset, list. arg carries the select index (int), the literal
text for ``type``, the count for ``space``/``backspace`` (int), the mode for
``mode``, or the session short-name for ``set``.
"""
name: str
@ -157,11 +149,7 @@ def command_menu() -> list[tuple[str, str]]:
("target <name> <cmd>", "one-shot to another session"),
("unset / list", "clear sticky / list sessions"),
("mode ptt|listen", "switch input mode"),
("context <name> <text>", "inject a contexts.toml blurb + dictation (no submit)"),
("reload", "re-read config.toml + contexts.toml live"),
("system status", "print mode/target/model/contexts to the console"),
("system reload [config|contexts]", "reload one or both config files"),
("commands / customs", "this menu / list loaded contexts"),
("commands / customs", "this menu / custom commands (v0.2.0)"),
("version", "print the claudedo version"),
]
@ -244,39 +232,6 @@ def _leading_count(rest: list[str], default: int = 1) -> int:
return default
def _match_reload(rest: list[str], threshold: float, bare_default: str) -> Action | None:
"""map the tokens after a ``reload`` verb to a reload Action.
bare reload -> the caller's default scope (``bare_default``, normally "all"); for
``system reload``, ``_match_system`` wraps the result in a ``("reload", scope)``
tuple. a trailing ``config``/``contexts`` narrows the scope; an unrecognized scope
falls back to the default.
"""
scope = bare_default
if rest and _fuzzy_in(rest[0], ("config", "configuration"), threshold):
scope = "config"
elif rest and _fuzzy_in(rest[0], ("contexts", "context"), threshold):
scope = "contexts"
return Action("reload", scope)
def _match_system(rest: list[str], threshold: float) -> Action | None:
"""map the tokens after the reserved ``system`` word to a daemon-control Action.
the ``system`` namespace never injects into claude. v0.2.0 scope: ``status`` and
``reload [config|contexts]``. unknown controls return a ``system`` Action with an
``("unknown", word)`` arg so the daemon can report it rather than silently drop.
"""
if not rest:
return Action("system", "status")
head = rest[0]
if _fuzzy_in(head, _RELOAD_VERBS, threshold):
inner = _match_reload(rest[1:], threshold, bare_default="all")
return Action("system", ("reload", inner.arg))
if _fuzzy_in(head, ("status", "state"), threshold):
return Action("system", "status")
return Action("system", ("unknown", head))
def match_command(remainder: str, threshold: float) -> Action | None:
"""map a normalized command remainder to an Action, or None if unrecognized.
@ -291,15 +246,6 @@ def match_command(remainder: str, threshold: float) -> Action | None:
head = tokens[0]
rest = tokens[1:]
if _fuzzy_in(head, _SYSTEM_VERBS, threshold):
return _match_system(rest, threshold)
if _fuzzy_in(head, _RELOAD_VERBS, threshold):
return _match_reload(rest, threshold, bare_default="all")
if _fuzzy_in(head, _CONTEXT_VERBS, threshold) and rest:
name = rest[0]
dictation = " ".join(rest[1:]).strip()
return Action("context", (name, dictation))
if head in _INDEX_WORDS:
return Action("select", _INDEX_WORDS[head])
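
Reviewer note: the scope narrowing in the removed `_match_reload` can be exercised on its own. The sketch below is an approximation under stated assumptions: `_fuzzy_in` is not shown in this diff, so `fuzzy_in` here is a hypothetical stand-in built on `difflib.SequenceMatcher`, and the 0.8 threshold is an arbitrary illustrative value.

```python
from difflib import SequenceMatcher


def fuzzy_in(word, options, threshold):
    """Hypothetical stand-in for _fuzzy_in: similarity ratio against each option."""
    return any(SequenceMatcher(None, word, opt).ratio() >= threshold for opt in options)


def reload_scope(rest, threshold=0.8, default="all"):
    """Map the tokens after a reload verb to a scope string."""
    if rest and fuzzy_in(rest[0], ("config", "configuration"), threshold):
        return "config"
    if rest and fuzzy_in(rest[0], ("contexts", "context"), threshold):
        return "contexts"
    # bare reload, or an unrecognized scope word, keeps the default
    return default


scopes = [reload_scope(tokens) for tokens in ([], ["config"], ["contexts"], ["banana"])]
```

Falling back to the default on an unrecognized word means a misheard scope reloads everything rather than nothing, which is the safer failure for a voice command.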


@ -37,13 +37,6 @@ DENY = ["3"]
SUBMIT = ["Enter"]
CANCEL = ["Escape"]
# NEWLINE is a soft newline inside the input box that does NOT submit — Shift+Enter,
# which tmux names ``S-Enter`` (requires the extended-keys / xterm extkeys tmux
# settings install.sh appends). used to separate a context blurb from the dictated
# instruction in multiline assembly; if it proves flaky the daemon flattens to one
# line with a separator instead (behavior.context_multiline = false).
NEWLINE = ["S-Enter"]
# BACKSPACE deletes one char left; SPACE inserts one literal space. both are emitted
# repeatedly for `backspace <n>` / `space <n>` and for `erase` (n = the daemon's
# tracked uncommitted-input count). BSpace is tmux's name for the backspace key.


@ -1,91 +0,0 @@
"""earcons — short confirmation tones on daemon events, the eyes-free feedback layer.
the single place that maps an event name to its tone file and the per-event enable
flag. additive to the console feed (it does not replace the printed lines): at the desk
mute tones and read; eyes-free, hear them. playback goes through audio_out (paplay-first,
fire-and-forget) so a dead speaker never blocks or breaks a command.
events:
wake a wake phrase was recognized (off by default: a blip right before you
speak the command can bleed into its capture; keep it off unless wanted)
accept a command was recognized/injected
no_match nothing matched, or the target was missing (did nothing)
submit a send/submit was injected
tone files live in the packaged sounds/ dir; a per-event config override may point at a
user file instead. a missing file is swallowed by audio_out (logged once), never raised.
"""
from __future__ import annotations
import logging
from pathlib import Path
from . import audio_out
from .config import Config
log = logging.getLogger(__name__)
_SOUNDS_DIR = Path(__file__).resolve().parent / "sounds"
_EVENT_FILES = {
"wake": "wake.wav",
"accept": "accepted.wav",
"no_match": "no_match.wav",
"submit": "sent.wav",
}
_EVENT_FLAGS = {
"wake": "on_wake",
"accept": "on_accept",
"no_match": "on_no_match",
"submit": "on_submit",
}
class Earcons:
"""resolves daemon events to tones and plays them per the [sound] config"""
def __init__(self, config: Config) -> None:
self._apply(config)
def update(self, config: Config) -> None:
"""re-read the [sound] config after a live reload"""
self._apply(config)
def _apply(self, config: Config) -> None:
self.enabled = config.sound_enabled
self.volume = config.sound_volume
self._flags = {
"wake": config.sound_on_wake,
"accept": config.sound_on_accept,
"no_match": config.sound_on_no_match,
"submit": config.sound_on_submit,
}
self._overrides = dict(config.sound_files)
def _resolve(self, event: str) -> Path | None:
# .get() on _EVENT_FLAGS so an unknown event resolves to None instead of raising KeyError
override = self._overrides.get(event) or self._overrides.get(_EVENT_FLAGS.get(event, ""))
if override:
return Path(override).expanduser()
name = _EVENT_FILES.get(event)
return _SOUNDS_DIR / name if name else None
def play(self, event: str) -> None:
"""play the tone for an event if enabled (master + per-event). fire-and-forget;
unknown/disabled events and missing files are silently no-ops."""
if not self.enabled or not self._flags.get(event, False):
return
path = self._resolve(event)
if path is None:
return
audio_out.play(path, volume=self.volume, blocking=False)
def tone_path(self, event: str) -> Path | None:
"""the resolved tone path for an event (for test-tone), ignoring enable flags"""
return self._resolve(event)
def event_names() -> list[str]:
"""the earcon event names in a stable order (for test-tone iteration)"""
return ["wake", "accept", "no_match", "submit"]
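
Reviewer note: the two decisions the removed `Earcons` class makes (whether to play, and which file) can be sketched without the daemon. This is a simplified illustration: `SOUNDS_DIR` is a made-up path, `resolve` and `should_play` are hypothetical free functions, and the override lookup is reduced to the event-name key only.

```python
from pathlib import Path

SOUNDS_DIR = Path("/opt/claudedo/sounds")  # hypothetical location, for illustration only

EVENT_FILES = {
    "wake": "wake.wav",
    "accept": "accepted.wav",
    "no_match": "no_match.wav",
    "submit": "sent.wav",
}


def resolve(event, overrides):
    """A user override wins; otherwise the packaged tone; None for unknown events."""
    override = overrides.get(event)
    if override:
        return Path(override).expanduser()
    name = EVENT_FILES.get(event)
    return SOUNDS_DIR / name if name else None


def should_play(event, enabled, flags):
    """Both the master switch and the per-event flag must be on."""
    return bool(enabled and flags.get(event, False))


flags = {"wake": False, "accept": True, "no_match": True, "submit": True}
default_path = resolve("accept", {})
custom_path = resolve("accept", {"accept": "/tmp/ding.wav"})
```

Keeping resolution separate from the enable check is what lets a test-tone command play a tone even when that event is muted.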


@ -1 +0,0 @@
"""earcon tone assets (committed .wav files) + their generator (generate.py)"""

Binary file not shown.


@ -1,75 +0,0 @@
"""synthetic-beep FALLBACK generator for the earcon .wav tones.
WARNING: the shipped tones in this directory are now CUSTOM CURATED recordings
(edge-trimmed + loudness-normalized to ~-16 dB RMS with a -1 dBTP ceiling), NOT this
script's output. running this script OVERWRITES those real tones with plain synthetic
beeps; only do so if you deliberately want to fall back to generated placeholders. it
is kept as a bootstrap fallback so the package can always self-generate a tone set (the
"a missing tone must never break a command" guarantee), not as the source of the
committed wavs.
run ``python -m claudedo.sounds.generate`` (or ``python generate.py`` from this dir) to
write placeholder beeps. each is a short, quiet, fade-enveloped sine tone at a
distinct pitch so the four events are ear-distinguishable:
wake soft single mid blip (off by default; least intrusive)
accepted bright single high note (heard you, sent it)
no_match low two-note falling buzz (heard you, but nothing matched / error)
sent two-note rising chime (submitted to claude)
kept SHORT (<300ms) and quiet (amplitude 0.4): confirmations, not alarms.
"""
from __future__ import annotations
import struct
import wave
from pathlib import Path
SAMPLE_RATE = 44100
AMPLITUDE = 0.4
HERE = Path(__file__).resolve().parent
def _tone(freq: float, dur: float) -> list[float]:
import math
n = int(SAMPLE_RATE * dur)
fade = max(1, int(SAMPLE_RATE * 0.01))
out = []
for i in range(n):
env = min(1.0, i / fade, (n - i) / fade)
out.append(math.sin(2.0 * math.pi * freq * (i / SAMPLE_RATE)) * env * AMPLITUDE)
return out
def _silence(dur: float) -> list[float]:
return [0.0] * int(SAMPLE_RATE * dur)
def _write(name: str, samples: list[float]) -> Path:
path = HERE / name
with wave.open(str(path), "wb") as wf:
wf.setnchannels(1)
wf.setsampwidth(2)
wf.setframerate(SAMPLE_RATE)
clipped = (max(-1.0, min(1.0, s)) for s in samples)
wf.writeframes(b"".join(struct.pack("<h", int(s * 32767)) for s in clipped))
return path
def generate() -> list[Path]:
"""(re)write all earcon wavs; return the written paths"""
tones = {
"wake.wav": _tone(660.0, 0.12),
"accepted.wav": _tone(988.0, 0.14),
"no_match.wav": _tone(330.0, 0.10) + _silence(0.03) + _tone(247.0, 0.12),
"sent.wav": _tone(784.0, 0.10) + _silence(0.02) + _tone(1175.0, 0.12),
}
return [_write(name, samples) for name, samples in tones.items()]
if __name__ == "__main__":
for p in generate():
print(f"wrote {p}")
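
Reviewer note: the generator's docstring contrasts its synthetic beeps with curated tones normalized to roughly -16 dB RMS. A quick way to see where a generated beep sits is to measure its RMS level in dB relative to full scale. The sketch below re-implements the tone shape from `_tone` as a standalone function; `tone` and `rms_dbfs` are hypothetical names, and the measurement convention (RMS relative to a full-scale amplitude of 1.0) is an assumption about how the normalization target was defined.

```python
import math

SAMPLE_RATE = 44100


def tone(freq, dur, amplitude=0.4, fade_s=0.01):
    """Sine tone with a linear fade-in and fade-out, mirroring the generator's shape."""
    n = int(SAMPLE_RATE * dur)
    fade = max(1, int(SAMPLE_RATE * fade_s))
    return [
        math.sin(2.0 * math.pi * freq * i / SAMPLE_RATE)
        * min(1.0, i / fade, (n - i) / fade)
        * amplitude
        for i in range(n)
    ]


def rms_dbfs(samples):
    """RMS level in dB relative to full scale (1.0); -inf for silence or no samples."""
    if not samples:
        return float("-inf")
    mean_sq = sum(s * s for s in samples) / len(samples)
    if mean_sq == 0.0:
        return float("-inf")
    return 10.0 * math.log10(mean_sq)


samples = tone(988.0, 0.14)
level = rms_dbfs(samples)
print(f"{len(samples)} samples, {level:.1f} dBFS RMS")
```

Under this convention a 0.4-amplitude sine measures several dB above a -16 dB target, so the placeholders would play noticeably louder than the curated set unless `AMPLITUDE` were lowered.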

Binary file not shown.
