feat: terminal-run only — drop systemd/autostart, start does mic-check + visible loop

terminal-run is the product, so remove all backgrounding: delete the claudedo.service unit and autostart.sh, and strip the systemd step and the autostart source-line from install.sh (the rc block now sources cc.sh only).

claudedo start now runs a mic check first (warm-up + brief capture; aborts with guidance if silent; --skip-audio-check to bypass), then drops into a visible listen loop that prints the recognition/action log: a startup banner, then heard -> matched -> target / injected per utterance, target/mode state changes, and (listen mode) non-wake speech dropped WITHOUT the transcript, per the privacy invariant.

Signed-off-by: disqualifier <dev@disqualifier.me>
fix: prime mic to skip RDPSource resume gap
2026-06-25 19:30:36 -04:00 · 2026-06-25 19:09:08 -04:00 · 2026-06-25 18:42:34 -04:00 · 2026-06-25 18:42:26 -04:00 · 2026-06-25 18:42:22 -04:00 · 2026-06-25 18:42:17 -04:00
12 changed files with 1261 additions and 103 deletions
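The gate that `claudedo start` applies before entering the listen loop can be sketched on its own. This is a minimal illustration, not the code in this change: `mic_gate` and `SILENCE_PEAK` are hypothetical names, and the only value taken from the diff is the 0.02 peak threshold.

```python
from typing import Optional, Tuple

# Mirrors the 0.02 literal used by the mic check in this change (name is hypothetical).
SILENCE_PEAK = 0.02


def mic_gate(peak: Optional[float], skip: bool = False) -> Tuple[bool, str]:
    """Decide whether start may enter the listen loop.

    peak is None on a hard capture failure, otherwise the largest absolute
    sample value seen during the brief capture.
    """
    if skip:
        # corresponds to --skip-audio-check: no capture is judged at all
        return True, "mic check skipped"
    if peak is None or peak < SILENCE_PEAK:
        return False, "mic check failed: no usable input"
    return True, f"mic OK (peak {peak:.3f})"
```

A peak exactly at the threshold passes, because the comparison is strictly less-than.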
--- a/README.md
+++ b/README.md
@@ -61,37 +61,23 @@ claudedo test-audio
 ## Usage
 **Run it in a terminal you watch — that's the product.** You launch `claudedo
 start`, it does a quick mic check, then drops into a visible listen loop that prints
 `heard → matched → sent` for every utterance. That terminal is your
 recognition/action console; you attach to the `claude-<name>` session in another pane
 to watch the keystrokes land. There is no backgrounding/daemon mode — the whole point
 is the console you read.
 ```bash
-claudedo start            # run the daemon (foreground; listen mode by default)
+claudedo start            # mic-check, then the visible listen loop (listen mode default)
 claudedo start --mode ptt # push-to-talk instead (desk-only — see Modes)
 claudedo start --skip-audio-check  # skip the pre-listen mic check
 claudedo status           # running? mode? target session?
 claudedo stop             # stop a running daemon
 claudedo switch <name>    # retarget to claude-<name>
 claudedo test-audio       # verify the mic capture path
 ```
-Background it in its own tmux session:
-```bash
-tmux new-session -d -s claudedo 'claudedo start'
-```
-### Autostart
-WSL has no real boot, so autostart is rc-based and **opt-in**. `install.sh` ships
-`~/.config/claudedo/autostart.sh`, which starts the daemon in a `claudedo-daemon`
-tmux session once per WSL session — but only when `CLAUDEDO_AUTOSTART=1` is set.
-Enable it by uncommenting the `export CLAUDEDO_AUTOSTART=1` line in the cc-kit marker
-block of your rc; disable it by re-commenting (or deleting the file). Watch its logs
-with `tmux attach -t claudedo-daemon`.
-If your WSL runs systemd (`systemd=true` in `/etc/wsl.conf`), `install.sh` also
-installs an optional user unit — enable it instead with:
-```bash
-systemctl --user enable --now claudedo
-```
 ### Modes
 - **listen (default)** — continuous capture; only acts on utterances that **start
--- a/install.sh
+++ b/install.sh
@@ -0,0 +1,156 @@
 #!/usr/bin/env bash
 # claudedo bootstrap — does the system setup pip can't. idempotent: re-running is
 # safe and won't duplicate the shell-rc cc kit. run from the repo root.
 set -euo pipefail
 REPO_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
 ASOUNDRC="$HOME/.asoundrc"
 MARKER_BEGIN="# >>> claudedo cc kit >>>"
 MARKER_END="# <<< claudedo cc kit <<<"
 say() { printf '\n\033[1;36m==> %s\033[0m\n' "$*"; }
 warn() { printf '\033[1;33m!! %s\033[0m\n' "$*" >&2; }
 die() { printf '\033[1;31mxx %s\033[0m\n' "$*" >&2; exit 1; }
 # 1. windows-side checks (cannot automate — check and instruct) -----------------
 say "checking WSLg audio bridge"
 if [ ! -e /mnt/wslg/PulseServer ]; then
    die "WSLg PulseServer missing (/mnt/wslg/PulseServer). claudedo needs WSLg audio.
    update WSL ('wsl --update' in Windows) or install WSL from the Microsoft Store,
    then restart WSL ('wsl --shutdown') and re-run this script."
 fi
 echo "  /mnt/wslg/PulseServer present"
 cat <<'EOF'
  MANUAL WINDOWS STEP (this script cannot do it for you):
    Windows Settings -> Privacy & security -> Microphone ->
      enable "Let desktop apps access your microphone".
    Without this, the mic is silent inside WSL. Do it now if you haven't.
 EOF
 # 2. WSL audio deps (apt) -------------------------------------------------------
 say "installing WSL audio dependencies (apt)"
 sudo apt-get update
 sudo apt-get install -y libportaudio2 libasound2t64 libasound2-plugins \
                        alsa-utils pulseaudio-utils
 # 3. ALSA -> Pulse routing ------------------------------------------------------
 say "configuring ALSA -> Pulse routing (~/.asoundrc)"
 if [ -f "$ASOUNDRC" ] && grep -q "type pulse" "$ASOUNDRC"; then
    echo "  ~/.asoundrc already routes to pulse"
 else
    {
        echo "pcm.!default { type pulse }"
        echo "ctl.!default { type pulse }"
    } >> "$ASOUNDRC"
    echo "  wrote pulse default to ~/.asoundrc"
 fi
 if [ -z "${PULSE_SERVER:-}" ] && [ -e /mnt/wslg/PulseServer ]; then
    export PULSE_SERVER="unix:/mnt/wslg/PulseServer"
    echo "  exported PULSE_SERVER=$PULSE_SERVER (WSLg usually sets this already)"
 fi
 # 4. verify audio (fail loudly with guidance) -----------------------------------
 say "verifying audio path"
 if pactl info >/dev/null 2>&1; then
    DEFAULT_SRC="$(pactl info | sed -n 's/^Default Source: //p')"
    echo "  Default Source: ${DEFAULT_SRC:-<none>}"
    if ! pactl list sources short 2>/dev/null | grep -q RDPSource; then
        warn "RDPSource not listed by pactl — mic may not be bridged. check Windows mic permission."
    fi
 else
    warn "pactl info failed — pulseaudio-utils installed but no server reachable yet."
 fi
 TESTWAV="/tmp/claudedo_test.wav"
 if arecord -D default -f S16_LE -c 1 -r 16000 -d 2 "$TESTWAV" >/dev/null 2>&1 && [ -s "$TESTWAV" ]; then
    echo "  arecord captured 2s -> $TESTWAV ($(stat -c%s "$TESTWAV") bytes)"
 else
    warn "arecord could not capture. fix-chain: apt deps above + ~/.asoundrc + Windows mic permission.
    debug anytime with: claudedo test-audio"
 fi
 # 5. python install + model prime -----------------------------------------------
 say "installing the claudedo python package"
 PIP="${PIP:-pip3}"
 "$PIP" install -e "$REPO_DIR"
 say "priming the faster-whisper model (so first run isn't slow)"
 MODEL="$(sed -n 's/^model *= *"\(.*\)".*/\1/p' "$REPO_DIR/config.toml" | head -1)"
 MODEL="${MODEL:-small}"
 python3 - "$MODEL" <<'PY' || warn "model prime failed — first run will download it"
 import sys
 from faster_whisper import WhisperModel
 WhisperModel(sys.argv[1], device="cpu", compute_type="int8")
 print("  primed faster-whisper model:", sys.argv[1])
 PY
 # 6. cc kit as a sourced file + rc wiring (idempotent) --------------------------
 say "installing the cc kit (~/.config/claudedo/cc.sh)"
 CONF_DIR="$HOME/.config/claudedo"
 mkdir -p "$CONF_DIR"
 install -m 0644 "$REPO_DIR/shell/cc.sh" "$CONF_DIR/cc.sh"
 echo "  wrote $CONF_DIR/cc.sh"
 # wire EVERY rc that exists (the user may have both zsh and bash).
 wired_any=0
 for RC in "$HOME/.zshrc" "$HOME/.bashrc"; do
    [ -f "$RC" ] || continue
    wired_any=1
    if grep -qF "$MARKER_BEGIN" "$RC"; then
        echo "  cc kit marker already in $RC (not duplicating)"
        continue
    fi
    cp "$RC" "$RC.claudedo.bak"
    echo "  backed up $RC -> $RC.claudedo.bak"
    cat >> "$RC" <<'CCKIT'
 # >>> claudedo cc kit >>>
 [ -f ~/.config/claudedo/cc.sh ] && source ~/.config/claudedo/cc.sh
 # <<< claudedo cc kit <<<
 CCKIT
    echo "  wired source-line block into $RC (open a new shell or 'source $RC')"
 done
 [ "$wired_any" = 1 ] || warn "no ~/.zshrc or ~/.bashrc found — add the marker block from README.md manually."
 # warn about any OLD loose cc defs outside our markers (do not auto-delete).
 for RC in "$HOME/.zshrc" "$HOME/.bashrc"; do
    [ -f "$RC" ] || continue
    loose="$(grep -nE '^[[:space:]]*(cc|ccr|ccl|cck|cckl|_cc_name)[[:space:]]*\(\)' "$RC" \
             | grep -v 'claudedo' || true)"
    if [ -n "$loose" ]; then
        warn "old cc-function defs found in $RC (outside the claudedo markers):"
        echo "$loose" | sed 's/^/    /'
        echo "    review and remove them by hand — the new sourced kit overrides them, but"
        echo "    they are dead code. a backup is at $RC.claudedo.bak"
    fi
 done
 # 7. tmux settings for reliable send-keys (idempotent ~/.tmux.conf append) -------
 say "configuring tmux for reliable send-keys (~/.tmux.conf)"
 TMUX_CONF="$HOME/.tmux.conf"
 TMUX_MARKER="# >>> claudedo tmux >>>"
 touch "$TMUX_CONF"
 if grep -qF "$TMUX_MARKER" "$TMUX_CONF"; then
    echo "  claudedo tmux block already present (not duplicating)"
 else
    cat >> "$TMUX_CONF" <<'TMUXCONF'
 # >>> claudedo tmux >>>
 # settings for reliable keystroke injection + notifications (do not edit inside the
 # markers; re-run install.sh to refresh). escape-time 0 stops injected Escape from
 # being misread; allow-passthrough + extended-keys let notifications and modified
 # keys (Shift+Enter) reach the claude pane; the larger history-limit keeps scrollback.
 set -g escape-time 0
 set -g history-limit 50000
 set -g allow-passthrough on
 set -s extended-keys on
 set -as terminal-features 'xterm*:extkeys'
 # <<< claudedo tmux <<<
 TMUXCONF
    echo "  appended claudedo tmux settings to $TMUX_CONF (reload: tmux source-file ~/.tmux.conf)"
 fi
 say "done. next: 'claudedo test-audio' then 'claudedo start'"
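The marker-block pattern that steps 6 and 7 of install.sh use to stay idempotent can be shown in isolation. The sketch below is a Python rendering of the idea only: `ensure_block` is a hypothetical helper, and unlike the script it does not back up the file or skip missing rc files.

```python
from pathlib import Path


def ensure_block(rc: Path, begin: str, end: str, body: str) -> bool:
    """Append a marker-delimited block to rc unless the begin marker is present.

    Returns True if the file was modified, False if the block already existed.
    """
    text = rc.read_text(encoding="utf-8") if rc.exists() else ""
    if begin in text:
        return False  # already wired: re-running must not duplicate the block
    # keep the block on its own lines even if the file lacks a trailing newline
    sep = "" if text == "" or text.endswith("\n") else "\n"
    rc.write_text(f"{text}{sep}{begin}\n{body}\n{end}\n", encoding="utf-8")
    return True
```

Checking for the begin marker rather than the body means the body can change between releases without the old block being duplicated; refreshing an existing block would need a separate replace step.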
--- a/shell/cc.sh
+++ b/shell/cc.sh
@@ -0,0 +1,67 @@
 # claudedo cc kit — claude-code-in-tmux session helpers.
 # POSIX sh; sources cleanly under bash and zsh. side-effect-free on source
 # (function definitions only — nothing runs at source time).
 #
 # every command REQUIRES an explicit project name. the session is always
 # "claude-<name>", a stable speakable handle: "cc libs" -> claude-libs, which the
 # voice daemon targets with "claudedo target libs" / "switch libs". the name->session
 # mapping here MUST match target.py's session_name() in the daemon.
 #
 #   cc <name>    start or reattach to claude-<name>; writes ~/.claude-active
 #   ccr <name>   reattach only (error if it doesn't exist); writes ~/.claude-active
 #   ccl          list running claude- sessions
 #   cck <name>   kill claude-<name>
 #   cckl         kill ALL claude- sessions
 cc() {
    if [ -z "$1" ]; then
        echo "usage: cc <project-name>" >&2
        return 1
    fi
    session="claude-$1"
    echo "$session" > "$HOME/.claude-active"
    if tmux has-session -t "$session" 2>/dev/null; then
        tmux attach -t "$session"
    else
        tmux new-session -s "$session" "claude"
    fi
 }
 ccr() {
    if [ -z "$1" ]; then
        echo "usage: ccr <project-name>" >&2
        return 1
    fi
    session="claude-$1"
    if tmux has-session -t "$session" 2>/dev/null; then
        echo "$session" > "$HOME/.claude-active"
        tmux attach -t "$session"
    else
        echo "no session '$session' — run 'cc $1' to start one" >&2
        return 1
    fi
 }
 ccl() {
    tmux ls 2>/dev/null | grep '^claude-' || echo "no claude sessions running"
 }
 cck() {
    if [ -z "$1" ]; then
        echo "usage: cck <project-name>" >&2
        return 1
    fi
    session="claude-$1"
    if tmux kill-session -t "$session" 2>/dev/null; then
        echo "killed $session"
    else
        echo "no session '$session'" >&2
        return 1
    fi
 }
 cckl() {
    tmux ls 2>/dev/null | grep '^claude-' | cut -d: -f1 | while read -r s; do
        tmux kill-session -t "$s" && echo "killed $s"
    done
 }
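The header comment in cc.sh says the name-to-session mapping must match `session_name()` in target.py, which is not part of the lines shown here. A hypothetical sketch of what such a mapping could look like, assuming the "claude- prefix optional" behaviour described in the `switch` help text:

```python
PREFIX = "claude-"


def session_name(name: str) -> str:
    """Map a project short-name to its tmux session name (illustrative only)."""
    name = name.strip()
    if not name:
        raise ValueError("project name required")
    # accept both "libs" and "claude-libs" so the prefix is never doubled
    return name if name.startswith(PREFIX) else PREFIX + name
```

Note that the shell helpers above always prepend `claude-`, so `cc claude-libs` would create `claude-claude-libs`; pass the bare name on the shell side.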
--- a/src/claudedo/__init__.py
+++ b/src/claudedo/__init__.py
@@ -1,3 +1,3 @@
-"""claudedo — voice-control daemon for claude code (local STT -> tmux send-keys)."""
+"""claudedo — voice-control daemon for claude code (local STT -> tmux send-keys)"""
 __version__ = "0.1.0"
--- a/src/claudedo/__main__.py
+++ b/src/claudedo/__main__.py
@@ -0,0 +1,226 @@
 """claudedo CLI: start | stop | status | switch | test-audio | install"""
 from __future__ import annotations
 import argparse
 import logging
 import subprocess
 import sys
 import wave
 from pathlib import Path
 from . import __version__, daemon, target
 from .config import Config, ConfigError, load_config
 def _setup_logging(verbose: bool) -> None:
    logging.basicConfig(
        level=logging.DEBUG if verbose else logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        datefmt="%H:%M:%S",
    )
 def _load_or_die(path: str | None) -> Config:
    try:
        return load_config(path)
    except ConfigError as exc:
        print(f"config error: {exc}", file=sys.stderr)
        raise SystemExit(2)
 def cmd_start(args: argparse.Namespace) -> int:
    config = _load_or_die(args.config)
    if args.mode:
        config.mode = args.mode
    if not args.skip_audio_check:
        print("checking mic before listening (speak briefly) ...")
        peak = _probe_mic(config, seconds=2.0, verbose=False)
        if peak is None or peak < 0.02:
            print("mic check failed — no usable input.", file=sys.stderr)
            print("run `claudedo test-audio` to debug; or `claudedo start --skip-audio-check`",
                  file=sys.stderr)
            return 1
        print(f"mic OK (peak {peak:.3f}).")
    try:
        daemon.run_daemon(config)
    except RuntimeError as exc:
        print(str(exc), file=sys.stderr)
        return 1
    return 0
 def _probe_mic(config: Config, seconds: float, verbose: bool):
    """warm up the mic then capture for `seconds`; return peak amplitude or None.
    None signals a hard capture failure (no PortAudio / device error) with guidance
    already printed; a float (possibly ~0) is a successful capture whose level the
    caller judges. shared by `start`'s precheck and `test-audio`.
    """
    from . import audio as audio_mod
    try:
        device = audio_mod.resolve_device(config.stt_device)
        if verbose:
            print("priming mic (RDPSource resumes from suspend) ...")
        audio_mod.warm_up(config.samplerate, config.channels, device)
        if verbose:
            print(f"capturing {seconds:.0f}s from "
                  f"device={device if device is not None else 'default'} — speak now ...")
        chunk = audio_mod.record_while(
            config.samplerate, config.channels, device,
            held=_timed_hold(seconds), max_utterance=seconds + 1.0, min_utterance=0.0,
        )
    except Exception as exc:
        print(f"audio capture FAILED: {exc}", file=sys.stderr)
        print("fix-chain: install.sh apt deps + ~/.asoundrc pulse shim + Windows mic permission",
              file=sys.stderr)
        return None
    if chunk is None or chunk.size == 0:
        print("captured no audio — check mic permission + RDPSource", file=sys.stderr)
        return None
    peak = float(abs(chunk).max())
    if verbose:
        out = Path("/tmp/claudedo_test.wav")
        _write_wav(out, chunk, config.samplerate)
        print(f"captured {chunk.size / config.samplerate:.1f}s, peak amplitude {peak:.3f} -> {out}")
    return peak
 def cmd_stop(_args: argparse.Namespace) -> int:
    if daemon.stop_running():
        print("sent stop signal to claudedo")
        return 0
    print("claudedo is not running")
    return 1
 def cmd_status(_args: argparse.Namespace) -> int:
    pid = daemon.read_pid()
    if pid is None:
        print("claudedo: not running")
        return 1
    state = daemon.read_state() or {}
    print(f"claudedo: running (pid {pid})")
    print(f"  mode:   {state.get('mode', '?')}")
    print(f"  target: {state.get('target') or '(none — run cc to attach)'}")
    return 0
 def _check_audio_tools() -> None:
    for tool in ("pactl", "arecord"):
        path = subprocess.run(["which", tool], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        mark = "ok" if path.returncode == 0 else "MISSING (run install.sh)"
        print(f"  {tool}: {mark}")
 def cmd_test_audio(args: argparse.Namespace) -> int:
    config = _load_or_die(args.config)
    print("== claudedo test-audio ==")
    print("WSLg PulseServer:", "present" if Path("/mnt/wslg/PulseServer").exists() else "MISSING")
    _check_audio_tools()
    try:
        pactl = subprocess.run(["pactl", "info"], stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
        if pactl.returncode == 0:
            for line in pactl.stdout.decode("utf-8", "replace").splitlines():
                if line.startswith("Default Source"):
                    print(" ", line.strip())
    except FileNotFoundError:
        pass
    from . import audio as audio_mod
    print("\nsounddevice input devices:")
    try:
        for idx, dev in enumerate(audio_mod.list_devices()):
            if dev.get("max_input_channels", 0) > 0:
                print(f"  [{idx}] {dev['name']} ({dev['max_input_channels']}ch)")
    except Exception as exc:
        print(f"  could not list devices: {exc}", file=sys.stderr)
    peak = _probe_mic(config, seconds=3.0, verbose=True)
    if peak is None:
        return 1
    if peak < 0.02:
        print("WARNING: near-silent capture — is the mic muted / permission denied?")
        print("fix-chain: Windows mic permission for desktop apps + a non-Krisp default input;")
        print("           if still silent, `wsl --shutdown` then reopen to re-attach RDPSource.")
        return 1
    print("mic OK.")
    return 0
 def _timed_hold(seconds: float):
    import time
    end = [None]
    def held() -> bool:
        now = time.monotonic()
        if end[0] is None:
            end[0] = now + seconds
        return now < end[0]
    return held
 def _write_wav(path: Path, chunk, samplerate: int) -> None:
    import numpy as np
    pcm = (np.clip(chunk, -1.0, 1.0) * 32767).astype("<i2")
    with wave.open(str(path), "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(samplerate)
        wf.writeframes(pcm.tobytes())
 def cmd_install(_args: argparse.Namespace) -> int:
    script = Path(__file__).resolve().parents[2] / "install.sh"
    if not script.is_file():
        print(f"install.sh not found at {script}", file=sys.stderr)
        return 1
    return subprocess.call(["bash", str(script)])
 def cmd_switch(args: argparse.Namespace) -> int:
    session = target.set_target(args.name)
    print(f"target -> {session}")
    return 0
 def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(prog="claudedo", description="voice control for claude code")
    p.add_argument("--version", action="version", version=f"claudedo {__version__}")
    p.add_argument("-v", "--verbose", action="store_true", help="debug logging")
    p.add_argument("-c", "--config", help="path to config.toml")
    sub = p.add_subparsers(dest="command", required=True)
    sp = sub.add_parser("start", help="run the daemon (foreground)")
    sp.add_argument("--mode", choices=("listen", "ptt"), help="override input mode")
    sp.add_argument("--skip-audio-check", action="store_true",
                    help="skip the pre-listen mic check")
    sp.set_defaults(func=cmd_start)
    sub.add_parser("stop", help="stop a running daemon").set_defaults(func=cmd_stop)
    sub.add_parser("status", help="show daemon status").set_defaults(func=cmd_status)
    sub.add_parser("test-audio", help="verify the mic capture path").set_defaults(func=cmd_test_audio)
    sub.add_parser("install", help="re-run the bootstrap (install.sh)").set_defaults(func=cmd_install)
    sw = sub.add_parser("switch", help="set the active target session")
    sw.add_argument("name", help="project short-name (claude- prefix optional)")
    sw.set_defaults(func=cmd_switch)
    return p
 def main(argv: list[str] | None = None) -> int:
    parser = build_parser()
    args = parser.parse_args(argv)
    _setup_logging(getattr(args, "verbose", False))
    return args.func(args)
 if __name__ == "__main__":
    sys.exit(main())
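The timed predicate that the mic probe hands to `record_while` is easier to reason about with an injectable clock. This is a sketch of the same idea, not the code above: `timed_hold` and its `clock` parameter are hypothetical, added here so the behaviour can be checked without sleeping.

```python
import time
from typing import Callable, Optional


def timed_hold(seconds: float,
               clock: Callable[[], float] = time.monotonic) -> Callable[[], bool]:
    """Return a predicate that is true for `seconds` after its first call."""
    deadline: Optional[float] = None

    def held() -> bool:
        nonlocal deadline
        now = clock()
        if deadline is None:
            # the window opens on the first poll, not when the predicate is built,
            # so time spent opening the audio stream does not eat into the capture
            deadline = now + seconds
        return now < deadline

    return held
```

With a fake clock that returns 0, 1, 2 and 3 on successive calls, a two-second hold is true for the first two polls and false afterwards.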
--- a/src/claudedo/audio.py
+++ b/src/claudedo/audio.py
@@ -0,0 +1,179 @@
 """mic capture via sounddevice — the WSL-hard part.
 device selection resolves config's stt.device ("auto" | index | name substring) to
 a concrete sounddevice input device. two capture paths:
  - record_until_silence(): listen mode — stream until trailing silence segments the
    utterance (no streaming STT; chunk-on-silence is enough for commands).
  - record_while(predicate): ptt mode — capture while predicate() is true (key held).
 the WSLg/PulseAudio path is verified separately by `claudedo test-audio`; if capture
 fails here the fix-chain is the apt deps + ~/.asoundrc + Windows mic permission.
 """
 from __future__ import annotations
 import logging
 import queue
 import time
 from typing import Callable
 import numpy as np
 log = logging.getLogger(__name__)
 class AudioError(Exception):
    """raised when no usable input device is found or capture fails"""
 def list_devices() -> list[dict]:
    """return sounddevice's device table (for test-audio / debugging)"""
    import sounddevice as sd
    return list(sd.query_devices())
 def resolve_device(spec: str) -> int | None:
    """resolve a device spec to a sounddevice input index, or None for default.
    spec: "auto" -> default input; a digit string -> that index; otherwise a
    case-insensitive substring of a device name with input channels.
    """
    import sounddevice as sd
    if spec in ("", "auto", "default"):
        return None
    if spec.isdigit():
        return int(spec)
    spec_low = spec.lower()
    for idx, dev in enumerate(sd.query_devices()):
        if dev.get("max_input_channels", 0) > 0 and spec_low in dev["name"].lower():
            return idx
    raise AudioError(f"no input device matching {spec!r}")
 def _rms(block: np.ndarray) -> float:
    if block.size == 0:
        return 0.0
    return float(np.sqrt(np.mean(np.square(block, dtype=np.float64))))
 def warm_up(samplerate: int, channels: int, device: int | None,
            timeout: float = 3.0) -> bool:
    """open a short stream and read until the source produces audio.
    WSLg's RDPSource suspends when idle and emits ~1-2s of silence while it resumes
    on the next read. priming here means the first real capture isn't lost to that
    warm-up gap. returns whether any non-silent block arrived before timeout (still
    safe to proceed either way — a truly silent mic just returns False).
    """
    import sounddevice as sd
    block_dur = 0.05
    blocksize = int(samplerate * block_dur)
    deadline = time.monotonic() + timeout
    with sd.InputStream(samplerate=samplerate, channels=channels, device=device,
                        dtype="float32", blocksize=blocksize) as stream:
        while time.monotonic() < deadline:
            block, _overflowed = stream.read(blocksize)
            mono = block.reshape(-1) if channels == 1 else block.mean(axis=1)
            if _rms(mono) > 0.0:
                return True
    return False
 def record_until_silence(samplerate: int, channels: int, device: int | None,
                         silence_threshold: float, silence_duration: float,
                         min_utterance: float, max_utterance: float,
                         stop: Callable[[], bool] | None = None) -> np.ndarray | None:
    """capture one utterance, ending after trailing silence. returns mono float32.
    blocks until speech is detected and then trailing silence segments it, or until
    stop() returns true (clean shutdown). returns None if stopped before any speech
    or if the captured utterance is shorter than min_utterance.
    """
    import sounddevice as sd
    block_dur = 0.05
    blocksize = int(samplerate * block_dur)
    q: "queue.Queue[np.ndarray]" = queue.Queue()
    def _cb(indata, _frames, _time, status):
        if status:
            log.debug("audio status: %s", status)
        q.put(indata.copy())
    collected: list[np.ndarray] = []
    speaking = False
    silence_run = 0.0
    started_at = time.monotonic()
    with sd.InputStream(samplerate=samplerate, channels=channels, device=device,
                        dtype="float32", blocksize=blocksize, callback=_cb):
        while True:
            if stop is not None and stop():
                break
            try:
                block = q.get(timeout=0.2)
            except queue.Empty:
                if not speaking and time.monotonic() - started_at > 600:
                    started_at = time.monotonic()
                continue
            mono = block.reshape(-1) if channels == 1 else block.mean(axis=1)
            level = _rms(mono)
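            if level >= silence_threshold:
                if not speaking:
                    # time the max_utterance cap from speech onset, not stream open,
                    # so a long idle wait doesn't truncate the first utterance
                    started_at = time.monotonic()
                speaking = True
                silence_run = 0.0
                collected.append(mono)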
            elif speaking:
                silence_run += block_dur
                collected.append(mono)
                if silence_run >= silence_duration:
                    break
            if speaking and (time.monotonic() - started_at) > max_utterance:
                log.debug("utterance hit max_utterance cap")
                break
    if not collected:
        return None
    audio = np.concatenate(collected).astype(np.float32)
    if audio.size / samplerate < min_utterance:
        return None
    return audio
 def record_while(samplerate: int, channels: int, device: int | None,
                 held: Callable[[], bool], max_utterance: float,
                 min_utterance: float) -> np.ndarray | None:
    """capture while held() is true (push-to-talk). returns mono float32 or None"""
    import sounddevice as sd
    block_dur = 0.05
    blocksize = int(samplerate * block_dur)
    q: "queue.Queue[np.ndarray]" = queue.Queue()
    def _cb(indata, _frames, _time, status):
        if status:
            log.debug("audio status: %s", status)
        q.put(indata.copy())
    collected: list[np.ndarray] = []
    started_at = time.monotonic()
    with sd.InputStream(samplerate=samplerate, channels=channels, device=device,
                        dtype="float32", blocksize=blocksize, callback=_cb):
        while held():
            try:
                block = q.get(timeout=0.1)
            except queue.Empty:
                continue
            mono = block.reshape(-1) if channels == 1 else block.mean(axis=1)
            collected.append(mono)
            if (time.monotonic() - started_at) > max_utterance:
                break
    if not collected:
        return None
    audio = np.concatenate(collected).astype(np.float32)
    if audio.size / samplerate < min_utterance:
        return None
    return audio
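The chunk-on-silence segmentation in `record_until_silence` is a small state machine, and it can be exercised without a microphone by feeding it per-block levels. A minimal sketch under stated assumptions: `segment` is a hypothetical function, and trailing silence is counted in blocks rather than seconds.

```python
from typing import Iterable, List, Optional


def segment(levels: Iterable[float], threshold: float,
            silence_blocks: int) -> Optional[List[int]]:
    """Return the indices of blocks kept for one utterance, or None if no speech.

    Leading silence is skipped. Once speech starts, every block is kept until
    `silence_blocks` consecutive quiet blocks end the utterance.
    """
    kept: List[int] = []
    speaking = False
    quiet_run = 0
    for i, level in enumerate(levels):
        if level >= threshold:
            speaking = True
            quiet_run = 0  # any loud block resets the trailing-silence counter
            kept.append(i)
        elif speaking:
            quiet_run += 1
            kept.append(i)  # short pauses inside the utterance are retained
            if quiet_run >= silence_blocks:
                break
    return kept or None
```

For the levels `[0.0, 0.0, 0.5, 0.4, 0.0, 0.6, 0.0, 0.0, 0.9]` with a threshold of 0.1 and two silence blocks, the single quiet block at index 4 does not end the utterance, while the pair at indices 6 and 7 does.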
--- a/src/claudedo/config.py
+++ b/src/claudedo/config.py
@@ -1,4 +1,4 @@
-"""load and validate config.toml into a typed Config object with clear errors."""
+"""load and validate config.toml into a typed Config object with clear errors"""
 from __future__ import annotations
@@ -10,7 +10,7 @@ from pathlib import Path
 try:
    import tomllib as _toml
    _TOML_BINARY = True
-except ModuleNotFoundError:  # python < 3.11
+except ModuleNotFoundError:
    import tomli as _toml
    _TOML_BINARY = True
@@ -27,12 +27,12 @@ DEFAULT_CONFIG_PATHS = (
 class ConfigError(Exception):
-    """raised on a missing or invalid configuration value."""
+    """raised on a missing or invalid configuration value"""
 @dataclass
 class Config:
-    """validated claudedo configuration."""
+    """validated claudedo configuration"""
    wake_phrases: list[str]
    mode: str
@@ -53,7 +53,7 @@ class Config:
 def find_config_path(explicit: str | os.PathLike | None = None) -> Path:
-    """resolve the config file path, raising ConfigError if none is found."""
+    """resolve the config file path, raising ConfigError if none is found"""
    candidates: list[Path] = []
    if explicit:
        candidates.append(Path(explicit))
@@ -79,7 +79,7 @@ def _require(table: dict, section: str, key: str, types: tuple, default=None):
 def load_config(explicit: str | os.PathLike | None = None) -> Config:
-    """load config.toml from the first existing default path (or an explicit one)."""
+    """load config.toml from the first existing default path (or an explicit one)"""
    path = find_config_path(explicit)
    try:
        with open(path, "rb") as fh:
--- a/src/claudedo/daemon.py
+++ b/src/claudedo/daemon.py
@@ -0,0 +1,262 @@
 """the capture -> stt -> match -> inject loop.
 privacy invariant: in listen mode, any utterance that does not start with a wake
 phrase is discarded the instant grammar.parse() returns None — the transcript text
 is dropped and never stored or transmitted. nothing about non-command speech is
 persisted.
 """
 from __future__ import annotations
 import json
 import logging
 import os
 import signal
 import sys
 import time
 from pathlib import Path
 from . import audio, grammar, inject, target
 from .config import Config
 from .stt import Transcriber
 log = logging.getLogger(__name__)
 STATE_DIR = Path(os.environ.get("XDG_CACHE_HOME", str(Path.home() / ".cache"))) / "claudedo"
 PIDFILE = STATE_DIR / "claudedo.pid"
 STATEFILE = STATE_DIR / "state.json"
 def _ensure_state_dir() -> None:
    STATE_DIR.mkdir(parents=True, exist_ok=True)
 def write_state(pid: int, mode: str, target_session: str | None) -> None:
    """write the running daemon's status for `claudedo status` to read"""
    _ensure_state_dir()
    STATEFILE.write_text(json.dumps({
        "pid": pid,
        "mode": mode,
        "target": target_session,
        "since": time.time(),
    }), encoding="utf-8")
 def read_state() -> dict | None:
    """read the daemon status file, or None if absent/unreadable"""
    try:
        return json.loads(STATEFILE.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError, OSError):
        return None
 def read_pid() -> int | None:
    """return the pid of a running daemon, or None (also clears stale pidfiles)"""
    try:
        pid = int(PIDFILE.read_text(encoding="utf-8").strip())
    except (FileNotFoundError, ValueError, OSError):
        return None
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        PIDFILE.unlink(missing_ok=True)
        return None
    except PermissionError:
        return pid
    return pid
 def stop_running() -> bool:
    """signal a running daemon to stop. returns whether one was found"""
    pid = read_pid()
    if pid is None:
        return False
    os.kill(pid, signal.SIGTERM)
    return True
 class _PTTKey:
    """desk-only push-to-talk: 'held' while the configured key is down in the
    daemon's own terminal. there is deliberately NO global hotkey — a system-wide
    keyboard hook is the keylogger/cheat silhouette claudedo refuses to install. for
    hands-free-while-gaming use listen mode (voice trigger over the mic bridge).
    implementation reads stdin in raw mode: press the key to start capture, press it
    again (or Enter) to stop. (terminals don't deliver key-up events, so true
    hold-to-talk isn't possible from a tty — this is press-toggle, documented.)
    """
    def __init__(self) -> None:
        self._tty = sys.stdin.isatty()
    def wait_press(self, stop) -> bool:
        import select
        if not self._tty:
            log.warning("ptt mode needs a tty; falling back to a 3s timed capture")
            time.sleep(0.2)
            return not stop()
        while not stop():
            r, _, _ = select.select([sys.stdin], [], [], 0.2)
            if r:
                sys.stdin.read(1)
                return True
        return False
 class Daemon:
    """owns the capture/transcribe/inject loop and runtime mode switching"""
    def __init__(self, config: Config) -> None:
        self.config = config
        self.mode = config.mode
        self._stop = False
        self._transcriber: Transcriber | None = None
        self._device: int | None = None
        self._ptt = _PTTKey()
    def _install_signals(self) -> None:
        signal.signal(signal.SIGTERM, self._on_signal)
        signal.signal(signal.SIGINT, self._on_signal)
    def _on_signal(self, _signum, _frame) -> None:
        log.info("stop requested")
        self._stop = True
    def stopped(self) -> bool:
        return self._stop
    def _load(self) -> None:
        cfg = self.config
        self._device = audio.resolve_device(cfg.stt_device)
        self._transcriber = Transcriber(
            model=cfg.stt_model, language=cfg.stt_language,
            device=cfg.stt_compute if cfg.stt_compute in ("cpu", "cuda") else "auto",
            compute_type="auto",
        )
        if audio.warm_up(cfg.samplerate, cfg.channels, self._device):
            log.info("mic warmed up (source live)")
        else:
            log.warning("mic warm-up saw only silence — check mic permission / RDPSource")
    def _capture(self):
        cfg = self.config
        if self.mode == "ptt":
            print("[ptt] press the capture key in this terminal, speak, then press again to stop")
            if not self._ptt.wait_press(self.stopped):
                return None
            return audio.record_while(
                cfg.samplerate, cfg.channels, self._device,
                held=lambda: not self._ptt.wait_press(self.stopped),
                max_utterance=cfg.max_utterance, min_utterance=cfg.min_utterance,
            )
        return audio.record_until_silence(
            cfg.samplerate, cfg.channels, self._device,
            silence_threshold=cfg.silence_threshold, silence_duration=cfg.silence_duration,
            min_utterance=cfg.min_utterance, max_utterance=cfg.max_utterance,
            stop=self.stopped,
        )
    def _handle(self, transcript: str) -> None:
        cfg = self.config
        require_wake = self.mode == "listen"
        action = grammar.parse(transcript, cfg.wake_phrases, cfg.match_threshold, require_wake)
        if action is None:
            self._emit(f'heard: "{transcript}" -> no command matched')
            return
        if action.name == "mode":
            new_mode = str(action.arg)
            if new_mode != self.mode:
                self.mode = new_mode
                self._emit(f"mode -> {new_mode}")
                self._refresh_state()
            return
        if action.name == "switch":
            session = target.set_target(str(action.arg))
            self._emit(f"target -> {session}")
            self._refresh_state()
            return
        session = target.resolve_target()
        if session is None:
            self._emit(f'heard: "{transcript}" -> matched: {self._describe(action)} '
                       f'-> ERROR no target session (did nothing)')
            return
        self._emit(f'heard: "{transcript}" -> matched: {self._describe(action)} -> target {session}')
        if action.name == "type" and not cfg.type_autosend:
            inject.send_literal(session, str(action.arg))
            self._emit(f"injected: literal {str(action.arg)!r} -> {session}")
            return
        inject.perform(session, action)
        self._emit(f"injected: {self._describe(action)} -> {session}")
    @staticmethod
    def _describe(action) -> str:
        if action.arg is None:
            return action.name.upper()
        return f"{action.name.upper()}({action.arg})"
    @staticmethod
    def _emit(line: str) -> None:
        """print a recognition/action line to the watched terminal"""
        print(line, flush=True)
    def _has_wake(self, transcript: str) -> bool:
        """true if the utterance starts with a wake phrase (listen-mode gate).
        non-wake speech is dropped without ever printing the transcript — the privacy
        invariant: non-command speech is discarded, never recorded.
        """
        cfg = self.config
        return grammar.strip_wake(transcript, cfg.wake_phrases, cfg.match_threshold, True) is not None
    def _print_startup(self) -> None:
        cfg = self.config
        dev = cfg.stt_device if cfg.stt_device != "auto" else "default"
        target_now = target.read_active() or "(none — run cc to attach)"
        self._emit("── claudedo ─────────────────────────────────")
        self._emit(f"  model:   {cfg.stt_model} ({cfg.stt_language})")
        self._emit(f"  mic:     {dev}")
        self._emit(f"  mode:    {self.mode}")
        self._emit(f"  target:  {target_now}")
        self._emit(f"  wake:    {', '.join(cfg.wake_phrases)}")
        self._emit("  Ctrl-C to stop")
        self._emit("─────────────────────────────────────────────")
    def _refresh_state(self) -> None:
        write_state(os.getpid(), self.mode, target.read_active())
    def run(self) -> None:
        """run the daemon loop until a stop signal arrives"""
        _ensure_state_dir()
        PIDFILE.write_text(str(os.getpid()), encoding="utf-8")
        self._install_signals()
        try:
            self._load()
            self._refresh_state()
            self._print_startup()
            while not self._stop:
                audio_chunk = self._capture()
                if self._stop:
                    break
                if audio_chunk is None:
                    continue
                transcript = self._transcriber.transcribe(audio_chunk, self.config.samplerate)
                if not transcript:
                    continue
                if self.mode == "listen" and not self._has_wake(transcript):
                    self._emit("dropped: non-wake speech (not recorded)")
                    continue
                self._handle(transcript)
        finally:
            PIDFILE.unlink(missing_ok=True)
            STATEFILE.unlink(missing_ok=True)
            log.info("claudedo stopped")
 def run_daemon(config: Config) -> None:
    """entry point used by the CLI ``start`` command"""
    if read_pid() is not None:
        raise RuntimeError("claudedo is already running (see `claudedo status`)")
    Daemon(config).run()
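`run_daemon` refuses to start when `read_pid()` reports a live daemon. Only the tail of `read_pid` is visible in this diff, so the following is a sketch of the usual POSIX liveness probe it appears to rely on (signal 0), not the project's actual code. `pid_alive` is a hypothetical name.

```python
import os


def pid_alive(pid: int) -> bool:
    """Return True if a process with this pid exists (POSIX only)."""
    try:
        # Signal 0 runs the existence/permission check without delivering a signal.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process: a pidfile naming it would be stale
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


if __name__ == "__main__":
    print(pid_alive(os.getpid()))
```

Treating `PermissionError` as "alive" matches the `except PermissionError: return pid` branch visible at the top of this hunk.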
--- a/src/claudedo/grammar.py
+++ b/src/claudedo/grammar.py
@ -0,0 +1,159 @@
 """wake-phrase gate + command grammar matching (fuzzy, data-driven).
 the matcher is lenient by design: whisper renders the coined word "claudedo"
 inconsistently, so wake-phrase detection normalizes case, strips spaces/punctuation,
 and accepts close variants. number words are normalized to digits before matching.
 flow: transcript -> strip_wake() returns the command remainder (or None if no wake
 phrase in listen mode) -> match_command() maps the remainder to an Action.
 """
 from __future__ import annotations
 import re
 from dataclasses import dataclass
 from difflib import SequenceMatcher
 _PUNCT = re.compile(r"[^a-z0-9 ]+")
 _WS = re.compile(r"\s+")
 _NUMBER_WORDS = {
    "zero": "0", "oh": "0",
    "one": "1", "won": "1",
    "two": "2", "to": "2", "too": "2",
    "three": "3", "tree": "3",
    "four": "4", "for": "4", "fore": "4",
 }
 _INDEX_WORDS = {"1": 1, "2": 2, "3": 3, "4": 4}
@dataclass(frozen=True)
 class Action:
    """a matched command: a name plus an optional argument.
    names: yes, no, select, approve, deny, submit, type, mode, switch, cancel.
    arg carries the select index (int), the literal text for ``type``, the mode for
    ``mode``, or the session short-name for ``switch``.
    """
    name: str
    arg: object = None
 def normalize(text: str) -> str:
    """lowercase, strip punctuation, collapse whitespace, map number words to digits"""
    text = text.lower().strip()
    text = _PUNCT.sub(" ", text)
    text = _WS.sub(" ", text).strip()
    if not text:
        return ""
    tokens = [_NUMBER_WORDS.get(tok, tok) for tok in text.split(" ")]
    return " ".join(tokens)
 def _ratio(a: str, b: str) -> float:
    return SequenceMatcher(None, a, b).ratio()
 def _wake_variants(phrase: str) -> set[str]:
    """spaced and despaced forms of a wake phrase for lenient matching"""
    norm = normalize(phrase)
    return {norm, norm.replace(" ", "")}
 def strip_wake(transcript: str, wake_phrases: list[str], threshold: float,
               require_wake: bool) -> str | None:
    """return the command remainder after the wake phrase.
    if ``require_wake`` (listen mode) and no wake phrase is found at the start,
    return None so the daemon discards the utterance. if not required (ptt mode),
    a leading wake phrase is stripped when present but its absence is fine.
    matches leniently on a despaced prefix (whisper splits/joins the coined word
    inconsistently) but always slices the remainder on a WORD boundary of the
    spaced, normalized transcript — so the command portion keeps its spaces.
    """
    norm = normalize(transcript)
    if not norm:
        return None if require_wake else ""
    words = norm.split(" ")
    best_remainder: str | None = None
    best_score = 0.0
    for phrase in wake_phrases:
        variants = _wake_variants(phrase)
        max_words = phrase.count(" ") + 2
        for take in range(1, min(max_words, len(words)) + 1):
            head_despaced = "".join(words[:take])
            for variant in variants:
                if not variant:
                    continue
                score = _ratio(head_despaced, variant)
                if score >= threshold and score > best_score:
                    best_score = score
                    best_remainder = " ".join(words[take:]).strip()
    if best_remainder is not None:
        return best_remainder
    return None if require_wake else norm
 def _fuzzy_in(token: str, options: tuple[str, ...], threshold: float) -> bool:
    return any(_ratio(token, opt) >= threshold for opt in options)
 def match_command(remainder: str, threshold: float) -> Action | None:
    """map a normalized command remainder to an Action, or None if unrecognized"""
    remainder = remainder.strip()
    if not remainder:
        return None
    tokens = remainder.split(" ")
    head = tokens[0]
    rest = tokens[1:]
    if head in _INDEX_WORDS:
        return Action("select", _INDEX_WORDS[head])
    if _fuzzy_in(head, ("yes", "yeah", "yep", "yup"), threshold):
        return Action("yes")
    if _fuzzy_in(head, ("no", "nope", "nah"), threshold):
        return Action("no")
    if _fuzzy_in(head, ("approve", "allow"), threshold):
        return Action("approve")
    if _fuzzy_in(head, ("deny", "reject"), threshold):
        return Action("deny")
    if _fuzzy_in(head, ("send", "enter", "submit"), threshold):
        return Action("submit")
    if _fuzzy_in(head, ("cancel", "escape", "stop"), threshold):
        return Action("cancel")
    if _fuzzy_in(head, ("select", "option", "choose", "number"), threshold) and rest:
        if rest[0] in _INDEX_WORDS:
            return Action("select", _INDEX_WORDS[rest[0]])
    if _fuzzy_in(head, ("type", "dictate", "write"), threshold):
        text = " ".join(rest).strip()
        return Action("type", text) if text else None
    if _fuzzy_in(head, ("mode",), threshold) and rest:
        if _fuzzy_in(rest[0], ("ptt",), threshold) or "push" in rest[0]:
            return Action("mode", "ptt")
        if _fuzzy_in(rest[0], ("listen",), threshold):
            return Action("mode", "listen")
        return None
    if _fuzzy_in(head, ("switch", "target"), threshold) and rest:
        name = "".join(rest)
        return Action("switch", name) if name else None
    return None
 def parse(transcript: str, wake_phrases: list[str], threshold: float,
          require_wake: bool) -> Action | None:
    """full parse: wake gate then command match. None means discard"""
    remainder = strip_wake(transcript, wake_phrases, threshold, require_wake)
    if remainder is None:
        return None
    return match_command(remainder, threshold)
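The despaced-prefix wake match described in the `strip_wake` docstring can be tried in isolation. This is a reduced sketch under stated assumptions: a single wake phrase, a 0.8 threshold (the default of `cfg.match_threshold` is not shown in this diff), and a trimmed number-word table. It is not a drop-in replacement for the module above.

```python
import re
from difflib import SequenceMatcher

# Trimmed copy of the number-word table, for illustration only.
NUMBER_WORDS = {"one": "1", "two": "2", "to": "2", "three": "3", "four": "4", "for": "4"}


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, map number words to digits."""
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(NUMBER_WORDS.get(tok, tok) for tok in text.split())


def strip_wake(transcript: str, wake: str = "claudedo", threshold: float = 0.8):
    """Return the command remainder after the wake phrase, or None if absent."""
    words = normalize(transcript).split()
    target = normalize(wake).replace(" ", "")
    best, best_score = None, 0.0
    # Try the first 1..N words joined without spaces against the despaced wake phrase.
    for take in range(1, min(wake.count(" ") + 2, len(words)) + 1):
        score = SequenceMatcher(None, "".join(words[:take]), target).ratio()
        if score >= threshold and score > best_score:
            # Slice on a word boundary so the command keeps its spaces.
            best_score, best = score, " ".join(words[take:])
    return best


if __name__ == "__main__":
    print(strip_wake("Claude do, option two"))
    print(strip_wake("what's for dinner"))
```

With these inputs the first call yields `option 2` (the two-word head `claudedo` scores 1.0 and beats the one-word head `claude`), and the second yields `None`, which is the case the daemon drops without printing the transcript.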
--- a/src/claudedo/inject.py
+++ b/src/claudedo/inject.py
@ -1,15 +1,24 @@
-"""inject keystrokes into a tmux session via ``tmux send-keys``.
+"""output handlers: resolve a grammar.Action to keystrokes and emit them.
-this is the ONLY mechanism by which claudedo affects claude code — PTY injection,
+the production handler (TmuxOutputHandler) injects via ``tmux send-keys`` — the ONLY
-never OS-level keyboard input. it works regardless of which window is focused and
+mechanism by which claudedo affects claude code. PTY injection, never OS-level
-never touches Windows input or a game/anticheat's view (it is text into a linux
+keyboard input: it works regardless of which window is focused and never touches
-pseudo-terminal). do not replace this with OS keystroke injection.
+Windows input or a game/anticheat's view (it is text into a linux pseudo-terminal).
 do not replace this with OS keystroke injection. this is also why claudedo is a
 standalone daemon and not an MCP server — MCP tools can only return content to claude,
 not inject into its input stream.
 StdoutOutputHandler prints what WOULD be injected instead of touching tmux, so the
 grammar + keymap can be exercised end-to-end without a live claude session — the
 deterministic test path. both implement the same OutputHandler seam and are
 interchangeable.
 """
 from __future__ import annotations
 import logging
 import subprocess
 from abc import ABC, abstractmethod
 from . import keys, target
@ -17,10 +26,62 @@ log = logging.getLogger(__name__)
 class InjectError(Exception):
-    """raised when a tmux send-keys call fails."""
+    """raised when a tmux send-keys call fails"""
-def _send_keys(session: str, args: list[str], literal: bool) -> None:
+class OutputHandler(ABC):
    """abstract sink for resolved keystrokes.
    concretes implement send_named (a sequence of named tmux keys) and send_literal
    (literal text, no submit). perform() maps a grammar.Action onto these and is shared
    by all handlers.
    """
    @abstractmethod
    def send_named(self, session: str, key_tokens: list[str]) -> None:
        """emit a sequence of named keys (e.g. ['1'] or ['Down', 'Enter'])"""
    @abstractmethod
    def send_literal(self, session: str, text: str) -> None:
        """emit literal text into the input box without submitting (``type``)"""
    def perform(self, session: str, action) -> bool:
        """resolve a grammar.Action to keystrokes and emit them. returns acted?.
        ``switch`` and ``mode`` are handled by the daemon (they change daemon state,
        not the claude session), so they are ignored here.
        """
        name = action.name
        if name == "yes":
            self.send_named(session, keys.YES)
        elif name == "no":
            self.send_named(session, keys.NO)
        elif name == "approve":
            self.send_named(session, keys.APPROVE)
        elif name == "deny":
            self.send_named(session, keys.DENY)
        elif name == "submit":
            self.send_named(session, keys.SUBMIT)
        elif name == "cancel":
            self.send_named(session, keys.CANCEL)
        elif name == "select":
            seq = keys.SELECT_BY_INDEX.get(int(action.arg))
            if seq is None:
                log.warning("no keymap for select index %r", action.arg)
                return False
            self.send_named(session, seq)
        elif name == "type":
            self.send_literal(session, str(action.arg))
        else:
            return False
        return True
 class TmuxOutputHandler(OutputHandler):
    """production handler — injects keystrokes into a tmux session via send-keys"""
    @staticmethod
    def _send_keys(session: str, args: list[str], literal: bool) -> None:
        cmd = ["tmux", "send-keys", "-t", session]
        if literal:
            cmd.append("-l")
@ -30,55 +91,68 @@ def _send_keys(session: str, args: list[str], literal: bool) -> None:
            err = result.stderr.decode("utf-8", "replace").strip()
            raise InjectError(f"tmux send-keys failed: {err}")
-
+    def send_named(self, session: str, key_tokens: list[str]) -> None:
-def send_named(session: str, key_tokens: list[str]) -> None:
-    """send a sequence of named tmux keys (e.g. ['1'] or ['Down', 'Enter'])."""
        if not target.session_exists(session):
            log.warning("refusing to inject — session %r does not exist", session)
            return
        for token in key_tokens:
-        _send_keys(session, [token], literal=False)
+            self._send_keys(session, [token], literal=False)
        log.info("injected keys %s -> %s", key_tokens, session)
-
+    def send_literal(self, session: str, text: str) -> None:
-def send_literal(session: str, text: str) -> None:
-    """insert literal text into the input box without submitting (``type``)."""
        if not text:
            return
        if not target.session_exists(session):
            log.warning("refusing to inject — session %r does not exist", session)
            return
-    _send_keys(session, [text], literal=True)
+        self._send_keys(session, [text], literal=True)
        log.info("injected literal text (%d chars) -> %s", len(text), session)
-def perform(session: str, action) -> bool:
-    """resolve a grammar.Action to keystrokes and inject them. returns acted?.
-    ``switch`` and ``mode`` are handled by the daemon (they change daemon state, not
-    the claude session), so they are ignored here.
+class StdoutOutputHandler(OutputHandler):
+    """test handler — prints what would be injected instead of touching tmux.
+    no session existence check (there is no real session); lets grammar + keymap be
+    exercised end-to-end without a live claude session. records the last emission on
+    ``self.last`` for assertions.
     """
-    name = action.name
-    if name == "yes":
-        send_named(session, keys.YES)
-    elif name == "no":
-        send_named(session, keys.NO)
-    elif name == "approve":
-        send_named(session, keys.APPROVE)
-    elif name == "deny":
-        send_named(session, keys.DENY)
-    elif name == "submit":
-        send_named(session, keys.SUBMIT)
-    elif name == "cancel":
-        send_named(session, keys.CANCEL)
-    elif name == "select":
-        seq = keys.SELECT_BY_INDEX.get(int(action.arg))
-        if seq is None:
-            log.warning("no keymap for select index %r", action.arg)
-            return False
-        send_named(session, seq)
-    elif name == "type":
-        send_literal(session, str(action.arg))
-    else:
-        return False
-    return True
+
+    def __init__(self, stream=None) -> None:
+        import sys
+
+        self.stream = stream if stream is not None else sys.stdout
+        self.last: tuple[str, object] | None = None
+
+    def send_named(self, session: str, key_tokens: list[str]) -> None:
+        self.last = ("named", list(key_tokens))
+        print(f"[stdout] keys {key_tokens} -> {session}", file=self.stream)
+
+    def send_literal(self, session: str, text: str) -> None:
+        if not text:
+            return
+        self.last = ("literal", text)
+        print(f"[stdout] literal {text!r} -> {session}", file=self.stream)
+
+
+_default_handler: OutputHandler = TmuxOutputHandler()
+
+
+def set_default_handler(handler: OutputHandler) -> None:
+    """swap the module-level handler the daemon drives (tmux in prod, stdout in tests)"""
+    global _default_handler
+    _default_handler = handler
 def send_named(session: str, key_tokens: list[str]) -> None:
    """module-level shim delegating to the default handler"""
    _default_handler.send_named(session, key_tokens)
 def send_literal(session: str, text: str) -> None:
    """module-level shim delegating to the default handler"""
    _default_handler.send_literal(session, text)
 def perform(session: str, action) -> bool:
    """module-level shim delegating to the default handler"""
    return _default_handler.perform(session, action)
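The handler seam above lets a test drive `perform()` without tmux. Below is a self-contained sketch of that pattern. It does not import claudedo: `Action`, `KEYMAP`, and `RecordingHandler` are stand-ins, and the key sequences in `KEYMAP` are placeholders, since the real values live in the `keys` module, which this diff does not show.

```python
import io
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    arg: object = None


# Placeholder keymap (assumption): the real sequences are defined in claudedo's keys module.
KEYMAP = {"yes": ["1"], "no": ["Escape"]}


class OutputHandler(ABC):
    @abstractmethod
    def send_named(self, session: str, key_tokens: list) -> None: ...

    @abstractmethod
    def send_literal(self, session: str, text: str) -> None: ...

    def perform(self, session: str, action: Action) -> bool:
        """Shared dispatch: map an Action onto the two abstract sinks."""
        if action.name == "type":
            self.send_literal(session, str(action.arg))
            return True
        seq = KEYMAP.get(action.name)
        if seq is None:
            return False  # e.g. mode/switch, which belong to the daemon
        self.send_named(session, seq)
        return True


class RecordingHandler(OutputHandler):
    """Prints what would be injected and remembers the last emission."""

    def __init__(self, stream) -> None:
        self.stream = stream
        self.last = None

    def send_named(self, session: str, key_tokens: list) -> None:
        self.last = ("named", list(key_tokens))
        print(f"keys {key_tokens} -> {session}", file=self.stream)

    def send_literal(self, session: str, text: str) -> None:
        self.last = ("literal", text)
        print(f"literal {text!r} -> {session}", file=self.stream)


buf = io.StringIO()
handler = RecordingHandler(buf)
acted = handler.perform("claude-demo", Action("yes"))
```

In the real module the equivalent wiring would presumably be `set_default_handler(StdoutOutputHandler())` before exercising the daemon's `_handle` path, so the module-level shims route to the stdout handler.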
--- a/src/claudedo/stt.py
+++ b/src/claudedo/stt.py
@ -0,0 +1,52 @@
 """faster-whisper wrapper: load a model once, transcribe audio chunks locally.
 privacy invariant: transcription runs entirely on-device. audio handed here is a
 short in-memory chunk; nothing is written to disk or sent anywhere.
 """
 from __future__ import annotations
 import logging
 import numpy as np
 log = logging.getLogger(__name__)
 class Transcriber:
    """a loaded faster-whisper model that transcribes float32 mono audio chunks"""
    def __init__(self, model: str = "small", language: str = "en", device: str = "auto",
                 compute_type: str = "auto") -> None:
        self.language = language
        self._model = self._load(model, device, compute_type)
    @staticmethod
    def _load(model: str, device: str, compute_type: str):
        from faster_whisper import WhisperModel
        if device == "auto":
            device = "cpu"
        if compute_type == "auto":
            compute_type = "int8" if device == "cpu" else "float16"
        log.info("loading faster-whisper model=%s device=%s compute=%s", model, device, compute_type)
        return WhisperModel(model, device=device, compute_type=compute_type)
    def transcribe(self, audio: np.ndarray, samplerate: int = 16000) -> str:
        """transcribe a mono float32 numpy array to a stripped text string.
        the audio must be 16 kHz mono float32 in [-1, 1]; resample upstream if not.
        """
        if audio.dtype != np.float32:
            audio = audio.astype(np.float32)
        if audio.ndim > 1:
            audio = audio.reshape(-1)
        segments, _info = self._model.transcribe(
            audio,
            language=self.language,
            beam_size=1,
            vad_filter=True,
            condition_on_previous_text=False,
        )
        text = " ".join(seg.text for seg in segments).strip()
        return text
--- a/src/claudedo/target.py
+++ b/src/claudedo/target.py
@ -1,4 +1,4 @@
-"""resolve the active claude code tmux session from ~/.claude-active."""
+"""resolve the active claude code tmux session from ~/.claude-active"""
 from __future__ import annotations
@ -26,7 +26,7 @@ def session_name(name: str) -> str:
 def read_active() -> str | None:
-    """return the target session name from ~/.claude-active, or None if unset."""
+    """return the target session name from ~/.claude-active, or None if unset"""
    try:
        name = ACTIVE_FILE.read_text(encoding="utf-8").strip()
    except FileNotFoundError:
@ -38,7 +38,7 @@ def read_active() -> str | None:
 def write_active(name: str) -> None:
-    """overwrite ~/.claude-active with a session name (used by ``switch``)."""
+    """overwrite ~/.claude-active with a session name (used by ``switch``)"""
    ACTIVE_FILE.write_text(name + "\n", encoding="utf-8")
@ -51,7 +51,7 @@ def set_target(name: str) -> str:
 def session_exists(name: str) -> bool:
-    """true if a tmux session with this name currently exists."""
+    """true if a tmux session with this name currently exists"""
    if not name:
        return False
    result = subprocess.run(
@ -67,6 +67,13 @@ def resolve_target() -> str | None:
    never guesses a target: on a missing/empty ~/.claude-active or a stale session
    name, this logs a clear warning and returns None so the caller injects nothing.
+    TODO: most-recently-active targeting (preferred over attached). today the target
+    is the project most recently ATTACHED to (the cc kit writes ~/.claude-active on
+    attach); upgrade to the session claude most recently asked a question in, via
+    tmux session_activity timestamps (list-sessions -F '#{session_name}
+    #{session_activity}', pick the highest-activity claude-* session) or by scraping
+    panes (capture-pane) for a waiting-prompt UI.
    """
    name = read_active()
    if not name:
@ -76,13 +83,3 @@ def resolve_target() -> str | None:
        log.warning("target session %r no longer exists — skipping injection", name)
        return None
    return name
-# TODO: most-recently-active targeting (preferred over attached). today the target
-# is "the project most recently ATTACHED to" (the cc kit writes ~/.claude-active on
-# attach). upgrade to "the session claude most recently asked a question / produced
-# output in" via tmux session_activity timestamps:
-#     tmux list-sessions -F '#{session_name} #{session_activity}'
-# pick the highest-activity claude-* session; or scrape panes
-# (tmux capture-pane -p -t <s>) for a waiting-prompt UI and target the session whose
-# pane currently shows one.
Author	SHA1	Message	Date
disqualifier	17db65858e	feat: terminal-run only — drop systemd/autostart, start does mic-check + visible loop terminal-run is the product, so remove all backgrounding: delete the claudedo.service unit and autostart.sh, strip the systemd step and the autostart source-line from install.sh (rc block now sources cc.sh only). claudedo start now runs a mic check first (warm-up + brief capture, aborts with guidance if silent; --skip-audio-check to bypass) then drops into a visible listen loop printing the recognition/action log: a startup banner, then heard -> matched -> target / injected per utterance, target/mode state changes, and (listen mode) non-wake speech dropped WITHOUT the transcript per the privacy invariant. Signed-off-by: disqualifier <dev@disqualifier.me>	2026-06-25 19:30:36 -04:00
disqualifier	eb587692e1	fix: prime mic to skip RDPSource resume gap WSLg's RDPSource suspends when idle and emits ~1-2s of silence while it resumes on the first read, so a short timed capture (test-audio) or the first utterance after daemon start could be lost. add audio.warm_up() that opens a stream and reads until a non-silent block arrives (or times out); call it at daemon startup and before test-audio's capture. test-audio now primes then captures 3s. Signed-off-by: disqualifier <dev@disqualifier.me>	2026-06-25 19:09:08 -04:00
disqualifier	84c74603e5	feat: output-handler seam with tmux and stdout handlers extract an OutputHandler abstract base; TmuxOutputHandler is production (send-keys, PTY-only), StdoutOutputHandler prints what would be injected so grammar+keymap run end-to-end without a live claude session (the deterministic test path). module-level shims default to tmux so the daemon is unchanged. Signed-off-by: disqualifier <dev@disqualifier.me>	2026-06-25 18:42:34 -04:00
disqualifier	d43004e4b9	feat: tmux send-keys settings in install.sh bootstrap append escape-time 0, large history-limit, allow-passthrough, and extended-keys to ~/.tmux.conf under an idempotent marker block (no clobber). required for reliable keystroke injection and for notifications/modified-keys to reach the claude pane. Signed-off-by: disqualifier <dev@disqualifier.me>	2026-06-25 18:42:26 -04:00
disqualifier	66b08d290c	docs: lead how-to-run with the terminal-run model state terminal-run as the product (the claudedo start terminal is the recognition/action console) and frame backgrounding/autostart/systemd as optional extras, not the default. Signed-off-by: disqualifier <dev@disqualifier.me>	2026-06-25 18:42:22 -04:00
disqualifier	7f4a6f6699	style: drop inline comments, trim docstring periods remove inline comments (CLAUDE.md: docstrings only), strip trailing periods from single-line docstrings, and fix a PulseArmy->PulseAudio typo. no behavior change. Signed-off-by: disqualifier <dev@disqualifier.me>	2026-06-25 18:42:17 -04:00
disqualifier	bf516143b5	install: shell cc kit, opt-in autostart, bootstrap cc kit as a sourced ~/.config/claudedo/cc.sh (bash+zsh, forced explicit names). opt-in rc autostart guarded by CLAUDEDO_AUTOSTART + an optional systemd user unit. install.sh is idempotent: WSL audio deps, ~/.asoundrc pulse shim, audio verify, model prime, and source-line rc wiring with backups. Signed-off-by: disqualifier <dev@disqualifier.me>	2026-06-25 17:55:30 -04:00
disqualifier	7780a8d47c	daemon: capture->stt->match->inject loop and CLI daemon.py runs the loop with pidfile/state, runtime mode switching, and the privacy invariant: in listen mode any non-wake utterance is dropped the instant grammar.parse() returns None. __main__.py exposes start|stop|status|test-audio|install|switch. Signed-off-by: disqualifier <dev@disqualifier.me>	2026-06-25 17:55:25 -04:00
disqualifier	947b30c22e	grammar: fuzzy wake gate and command matching word-boundary wake stripping that's lenient on the coined word 'claudedo' (despaced-prefix match) without swallowing the command's spaces. data-driven phrase->action map; number words normalized to digits; 'target' aliases 'switch'. Signed-off-by: disqualifier <dev@disqualifier.me>	2026-06-25 17:55:21 -04:00
disqualifier	da7c39c4f2	audio: local STT and mic capture stt.py wraps faster-whisper for fully on-device transcription. audio.py captures via sounddevice with two paths: silence-segmented for listen mode and held-key for ptt. resolves the input device from config (auto/index/name). Signed-off-by: disqualifier <dev@disqualifier.me>	2026-06-25 17:55:17 -04:00
@ -1,3 +1,3 @@
-"""claudedo — voice-control daemon for claude code (local STT -> tmux send-keys)."""
+"""claudedo — voice-control daemon for claude code (local STT -> tmux send-keys)"""
 __version__ = "0.1.0"