Compare commits

11 Commits
v0.1.0 ... main

Author SHA1 Message Date
b00f122b74 chore: ignore .claude/ dir (CLAUDE.md now lives under .claude/)
Signed-off-by: disqualifier <dev@disqualifier.me>
2026-06-29 21:55:13 -04:00
3da833f2fc fix: revert OTP-in-logs (spent on arrival, not a secret); F1 ContentTypeError, F5 callable None-guard
revert the M-1 log change — a single-use OTP is consumed on arrival, not a live secret,
so log the code value again. keep the oauth error-body truncate.

F1: oauth token fetch uses resp.json(content_type=None) so a 200 with text/plain doesn't
raise ContentTypeError and discard a valid token. F5: as_predicate coalesces None for the
callable branch like the string/regex branches. drop a redundant digits.isdigit().

Signed-off-by: disqualifier <dev@disqualifier.me>
2026-06-29 21:34:35 -04:00
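The F5 change in the commit above can be illustrated with a minimal sketch. The body of `as_predicate` is reconstructed from the hunk shown later in this comparison; the `spec is None` branch at the top is an assumption, because that part of the function is not visible in the diff.

```python
import re
from typing import Callable, Optional


def as_predicate(spec) -> Callable[[Optional[str]], bool]:
    # assumption: a missing spec matches everything (branch not shown in the diff)
    if spec is None:
        return lambda value: True
    if isinstance(spec, re.Pattern):
        return lambda value: bool(spec.search(value or ""))
    if callable(spec):
        # coalesce None so the caller's callable always receives a str
        return lambda value: bool(spec(value or ""))
    needle = str(spec).lower()
    return lambda value: needle in (value or "").lower()


# a callable that would raise AttributeError if it were handed None directly
ends_with_example = as_predicate(lambda s: s.endswith("@example.com"))
print(ends_with_example(None))               # False, no exception
print(ends_with_example("otp@example.com"))  # True
```

Before the fix the callable was returned unwrapped, so a message without the relevant header could pass `None` straight into caller code.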
f940641a5a fix: never log the OTP code value (secret-in-logs); correct false test claim (v0.1.5)
M-1: retrieve.py logged the live single-use code at INFO ('found code %s', 'code %s
skipped too old'), shipping the secret to any aggregation/retention sink the host wires
(our /srv/logs -> loki/grafana path). drop the code value from both lines — log that a
code was found/retrieved and where, never the value. also truncate the oauth token-endpoint
error body to 200 chars so a token response can't be dumped whole.

aiomail-F3: CLAUDE.md claimed an '8-case tested' suite that does not exist in the repo;
corrected to describe the manual throwaway-venv exercise + the real flake8 check.

verified by execution: code retrieved, value absent from logs; control confirms the old
line carried it.

Signed-off-by: disqualifier <dev@disqualifier.me>
2026-06-29 20:46:51 -04:00
0cf23805dd docs: pin install line to release, note unpinned-latest option
Signed-off-by: disqualifier <dev@disqualifier.me>
2026-06-29 18:13:31 -04:00
75e6550311 docs: show unpinned install line; note tag-pinning for reproducibility
Signed-off-by: disqualifier <dev@disqualifier.me>
2026-06-29 18:07:16 -04:00
a44bf11be6 fix: dead is_throttled, orphan connect-task, server-defined folder delimiter (v0.1.4)
- remove is_throttled(): read a non-existent .resp -> always False (dead) (L2)
- cancel/await aioimaplib's fire-and-forget create_connection task on a failed connect
  so a refused host doesn't log 'Task exception was never retrieved' per retry (L3)
- get_folders() parses the server-announced LIST delimiter instead of hardcoding '/',
  so '.'/NIL-delimited servers (Gmail/Dovecot) return correct names (L4)
- mark the dead aioimaplib-2.0.x tuple branch + the non-aioimaplib authenticate
  fallback as cross-version escape hatches (nits).

Signed-off-by: disqualifier <dev@disqualifier.me>
2026-06-29 17:57:37 -04:00
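The delimiter fix in the commit above can be checked with a small standalone sketch. The regex and the fallback are copied from the `_folder_name` hunk later in this comparison; the sample LIST lines and the `folder_name_old` helper name are illustrative assumptions, not captured server output.

```python
import re

# (flags) "<delim>" <name>, where delim may be any quoted string or NIL
_LIST_RE = re.compile(rb'^\([^)]*\)\s+(?:"[^"]*"|NIL)\s+(.+)$')


def folder_name(raw: bytes) -> str:
    """delimiter-agnostic parse, as in the new code"""
    match = _LIST_RE.match(raw.strip())
    name = match.group(1).decode() if match else raw.decode().rsplit(" ", 1)[-1]
    return name.strip().strip('"')


def folder_name_old(raw: bytes) -> str:
    """previous behavior: assumes the delimiter is always "/" """
    return raw.decode().split(' "/" ')[-1].strip('"')


samples = [
    b'(\\HasNoChildren) "/" "INBOX/Sent"',
    b'(\\HasNoChildren) "." INBOX.Sent',
    b'(\\Noselect) NIL "Shared"',
]
for line in samples:
    print(repr(folder_name(line)), "| old:", repr(folder_name_old(line)))
```

Only the first sample survives the old split; for the other two the old code returns the whole reply line instead of the folder name.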
e349638700 fix: fetch() selects message body by structure, not length (v0.1.3)
select the literal payload by isinstance bytearray instead of len>20. aioimaplib
stores the message body as the only bytearray in the response; every other line
(including the '<id> FETCH (...' header) is plain bytes. the length heuristic
matched the header line first for any 2+ digit message id or BODY[]/UID fetch,
returning a blank Message and silently breaking OTP retrieval on real mailboxes.

Signed-off-by: disqualifier <dev@disqualifier.me>
2026-06-29 17:09:08 -04:00
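A minimal sketch of the selection change described in the commit above. The `response` list is a hand-built stand-in whose shape follows the commit message (plain `bytes` lines plus one `bytearray` payload); it is an assumption for illustration, not a captured aioimaplib response.

```python
import email
from email.message import Message
from typing import Optional

# simulated FETCH response: header line, literal payload, closing lines
response = [
    b"12 FETCH (UID 12 BODY[] {53}",
    bytearray(b"From: otp@example.com\r\nSubject: Your code\r\n\r\n123456\r\n"),
    b")",
    b"FETCH completed",
]


def pick_by_length(data) -> Optional[Message]:
    """old heuristic: first bytes-like item longer than 20 bytes"""
    for item in data:
        if isinstance(item, (bytes, bytearray)) and len(item) > 20:
            return email.message_from_bytes(bytes(item))
    return None


def pick_by_structure(data) -> Optional[Message]:
    """new rule: the payload is the only bytearray in the response"""
    for item in data:
        if isinstance(item, bytearray):
            return email.message_from_bytes(bytes(item))
    return None


old = pick_by_length(response)
new = pick_by_structure(response)
print(old["Subject"])  # None: the FETCH header line was parsed as the message
print(new["Subject"])  # Your code
```

The header line here is 28 bytes, so the length check accepts it before the payload is ever reached, which is the blank-Message failure the commit describes.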
a4abe354eb fix: add match_field=from|to to restore recipient-primary OTP matching
the clean lib matched senders by From only; the original imap_tool.py matched primarily by TO (the per-user alias the code was sent to) with a HEADER FROM forwarded fallback. added match_field="from"|"to" to retrieve_otp: "from" (default) is byte-identical to current behavior, "to" searches TO primary and accepts a forwarded From match, restoring the alias flow. server query + client-side predicate both honor it. bump to v0.1.2.

Signed-off-by: disqualifier <dev@disqualifier.me>
2026-06-29 03:25:18 -04:00
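The two halves of the `match_field` change (server query and client-side predicate) can be sketched as follows. `server_query` mirrors the `_server_query` hunk later in this comparison; `header_matches` is a hypothetical helper name that wraps the inline logic from `retrieve_otp` so it can be exercised on its own.

```python
import re
from typing import Callable, List, Optional


def server_query(sender, subject, match_field: str = "from") -> str:
    """only plain strings narrow the server-side search; anything else -> ALL"""
    parts: List[str] = []
    if isinstance(sender, str):
        parts.append(f'TO "{sender}"' if match_field == "to" else f'FROM "{sender}"')
    if isinstance(subject, str):
        parts.append(f'SUBJECT "{subject}"')
    return f"({' '.join(parts)})" if parts else "ALL"


def header_matches(
    sender_ok: Callable[[Optional[str]], bool],
    from_hdr: str,
    to_hdr: str,
    match_field: str = "from",
) -> bool:
    """hypothetical wrapper: "to" checks To first, then accepts a From match"""
    if match_field == "to":
        return sender_ok(to_hdr) or sender_ok(from_hdr)
    return sender_ok(from_hdr)


def alias_ok(value: Optional[str]) -> bool:
    return "alias@example.com" in (value or "")


print(server_query("alias@example.com", "code", "to"))
print(server_query(re.compile("x"), None))
print(header_matches(alias_ok, "noreply@svc.test", "alias@example.com", "to"))
```

With the default `match_field="from"` the same message would be rejected, since only the From header is consulted.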
6ac8957583 fix: pass XOAUTH2 token as str, not bytes (was corrupting the Bearer value)
aioimaplib's mail.xoauth2(user, token) builds the SASL string by f-string interpolating the token, so a bytes token injects the b'...' repr into auth=Bearer and breaks every XOAUTH2 login. dropped the .encode() (token is already str via _resolve_token/_as_str). corrected the inline comment and the CLAUDE.md note that both wrongly claimed bytes was required — that false note propagated the bug through prior review passes.

Signed-off-by: disqualifier <dev@disqualifier.me>
2026-06-28 18:45:25 -04:00
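The corruption described in the commit above is plain Python f-string behavior and can be reproduced without any IMAP library. `sasl_bearer` below is a hypothetical stand-in for the string that the commit says aioimaplib builds internally; only the f-string interpolation is being demonstrated.

```python
def sasl_bearer(user: str, token) -> str:
    """stand-in for an f-string-built XOAUTH2 payload (illustrative only)"""
    return f"user={user}\x01auth=Bearer {token}\x01\x01"


def as_str(token) -> str:
    """same coercion as _as_str in the diff: decode bytes, pass str through"""
    return token.decode() if isinstance(token, bytes) else token


broken = sasl_bearer("me@example.com", "abc123".encode())
fixed = sasl_bearer("me@example.com", as_str(b"abc123"))

print(repr(broken))  # Bearer value carries the b'...' repr
print(repr(fixed))   # Bearer value is the bare token
```

Formatting a `bytes` object into an f-string uses its `repr`, so the server receives `Bearer b'abc123'` rather than `Bearer abc123`.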
7934688595 fix: clean up the IMAP object on a failed connect; module-level asyncio import
on a failed connect attempt the IMAP4 object was dropped without logout(), leaking the socket aioimaplib held; now it is logged out (best-effort) before nulling. also moved the asyncio import in oauth.py from inside the retry loop to module top.

Signed-off-by: disqualifier <dev@disqualifier.me>
2026-06-28 17:18:28 -04:00
ba7ae48a87 fix: bytes-token XOAUTH2 + tolerant SEARCH parse (v0.1.1)
- a token provider (or static token) returning bytes crashed token.encode() in the
  XOAUTH2 path. coerce to str at the source (_resolve_token via _as_str) so both
  the .encode() and SASL-fallback entrypoints get a str.
- client.search() did int(x) on every SEARCH token unguarded; a malformed/non-
  numeric token aborted the whole search. skip non-numeric tokens (log at debug)
  instead of crashing.

verified by execution: bytes static + async-provider tokens authenticate without
crashing (both xoauth2 and SASL-fallback paths); guarded search skips garbage.

Signed-off-by: disqualifier <dev@disqualifier.me>
2026-06-27 21:51:27 -04:00
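The guarded SEARCH parse from the commit above, as a standalone sketch. The loop body follows the `client.search()` hunk later in this comparison; the function name `parse_search_ids` and the sample input are assumptions made for illustration.

```python
import logging
from typing import List

log = logging.getLogger(__name__)


def parse_search_ids(line: bytes) -> List[int]:
    """convert a SEARCH result line to ids, skipping anything non-numeric"""
    ids: List[int] = []
    for token in line.split():
        try:
            ids.append(int(token))
        except (TypeError, ValueError):
            # a malformed token is logged and skipped instead of aborting
            log.debug("skipping non-numeric search token: %r", token)
    # newest first, duplicates removed
    return sorted(set(ids), reverse=True)


print(parse_search_ids(b"1 2 x 10 2"))  # [10, 2, 1]
```

The earlier list comprehension would have raised `ValueError` on `b"x"` and returned nothing for the whole search.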
9 changed files with 121 additions and 41 deletions

.gitignore vendored

@ -1,5 +1,5 @@
# claude
CLAUDE.md
.claude/
# python
__pycache__/

@ -11,21 +11,23 @@ This reads codes from email; it does not generate them (that is `pyotp`'s job).
`requirements.txt`:
```
aiomail @ git+ssh://git@git.rethinkstudios.io/rethink-public/aiomail.git@v0.1.0
aiomail @ git+ssh://git@git.rethinkstudios.io/rethink-public/aiomail.git@v0.1.5
# OAuth token providers (Microsoft / Google) need the extra:
aiomail[oauth] @ git+ssh://git@git.rethinkstudios.io/rethink-public/aiomail.git@v0.1.0
aiomail[oauth] @ git+ssh://git@git.rethinkstudios.io/rethink-public/aiomail.git@v0.1.5
```
Direct:
```bash
pip install "aiomail @ git+ssh://git@git.rethinkstudios.io/rethink-public/aiomail.git@v0.1.0"
pip install "aiomail[oauth] @ git+ssh://git@git.rethinkstudios.io/rethink-public/aiomail.git@v0.1.0"
pip install "aiomail @ git+ssh://git@git.rethinkstudios.io/rethink-public/aiomail.git@v0.1.5"
pip install "aiomail[oauth] @ git+ssh://git@git.rethinkstudios.io/rethink-public/aiomail.git@v0.1.5"
```
Requires `aioimaplib` and `beautifulsoup4` (pulled transitively). The `oauth`
extra adds `aiohttp` for the refresh-token providers.
Drop the `@v0.1.5` suffix from the line above to install the latest unpinned.
## Password auth
```python

@ -4,7 +4,7 @@ build-backend = "hatchling.build"
[project]
name = "aiomail"
version = "0.1.0"
version = "0.1.5"
description = "async IMAP one-time-code retrieval with password/OAuth2 auth and dynamic matching"
requires-python = ">=3.10"
dependencies = [

@ -29,4 +29,4 @@ __all__ = [
"DEFAULT_FOLDERS",
]
__version__ = "0.1.0"
__version__ = "0.1.5"

@ -38,6 +38,15 @@ class PasswordAuth:
raise RuntimeError(f"login failed: {result} {data}")
def _as_str(token) -> str:
"""coerce a token to str (a provider may hand back bytes)
both XOAUTH2 entrypoints downstream need a str (one .encode()s it, the SASL
builder interpolates it), so normalize here rather than crashing on bytes.
"""
return token.decode() if isinstance(token, bytes) else token
def _sasl_xoauth2(user: str, token: str) -> str:
"""build the base64 XOAUTH2 SASL initial-response string"""
raw = f"user={user}\x01auth=Bearer {token}\x01\x01".encode()
@ -71,18 +80,23 @@ class OAuth2Auth:
token = await result if hasattr(result, "__await__") else result
if not token:
raise RuntimeError("token provider returned an empty token")
return token
return self._token # type: ignore[return-value]
return _as_str(token)
return _as_str(self._token) # type: ignore[arg-type]
async def authenticate(self, mail) -> None:
token = await self._resolve_token()
# aioimaplib exposes mail.xoauth2(user, token: bytes) — note the token must
# be bytes, not str. older/other clients that lack it but expose a generic
# authenticate() are driven via the SASL string from _sasl_xoauth2.
# aioimaplib's mail.xoauth2(user, token) builds the SASL string by f-string
# interpolating the token, so token MUST be str — passing bytes interpolates
# the b'...' repr and corrupts the Bearer value. _resolve_token already
# returns str (via _as_str). clients lacking .xoauth2 are driven via the
# SASL callback from _sasl_xoauth2.
xoauth2 = getattr(mail, "xoauth2", None)
if xoauth2 is not None:
result, data = await xoauth2(self.user, token.encode())
result, data = await xoauth2(self.user, token)
elif hasattr(mail, "authenticate"):
# escape hatch for a non-aioimaplib client: the shipped aioimaplib IMAP4
# always has .xoauth2 and never .authenticate, so this branch never runs
# for it; the SASL-callback signature here is untested against any driver
result, data = await mail.authenticate(
"XOAUTH2", lambda _: _sasl_xoauth2(self.user, token)
)

@ -9,6 +9,7 @@ import asyncio
import email
import email.message
import logging
import re
from typing import List, Optional
from aioimaplib import IMAP4, IMAP4_SSL
@ -17,6 +18,22 @@ from .auth import Auth
log = logging.getLogger(__name__)
# IMAP LIST reply: (flags) "<delim>" <name> — delim is server-defined (often "/" or
# "." or NIL); capture the trailing name regardless, quoted or bare
_LIST_RE = re.compile(rb'^\([^)]*\)\s+(?:"[^"]*"|NIL)\s+(.+)$')
def _folder_name(raw: bytes) -> str:
"""extract the folder name from a LIST reply line, delimiter-agnostic
parses the real reply form `(flags) "<delim>" <name>` so any server hierarchy
delimiter works (not just "/"); falls back to the last quoted/space token if the
line doesn't match the canonical shape.
"""
match = _LIST_RE.match(raw.strip())
name = match.group(1).decode() if match else raw.decode().rsplit(" ", 1)[-1]
return name.strip().strip('"')
class IMAPClient:
"""connection-managing IMAP client driven by an injected auth mechanism
@ -67,10 +84,34 @@ class IMAPClient:
return True
except Exception as exc:
log.warning("connect attempt %d/%d failed: %s", attempt + 1, self.max_retries, exc)
self._mail = None
if self._mail is not None:
await self._discard_mail(self._mail)
self._mail = None
await asyncio.sleep(2 * (attempt + 1))
return False
@staticmethod
async def _discard_mail(mail) -> None:
"""tear down a half-built IMAP4 without leaking its connect task
aioimaplib's IMAP4 schedules `create_connection` as a fire-and-forget task it
never retrieves; on a refused connection that task raises and asyncio logs a
noisy "Task exception was never retrieved" traceback. cancel/await it here (and
retrieve its exception) before discarding, so a failed connect stays quiet.
"""
task = getattr(mail, "_client_task", None)
if task is not None and not task.done():
task.cancel()
if task is not None:
try:
await task
except (asyncio.CancelledError, Exception):
pass
try:
await mail.logout()
except Exception as teardown:
log.debug("logout error ignored during failed connect: %s", teardown)
async def close(self) -> None:
"""log out and drop the connection, swallowing teardown errors"""
if self._mail is not None:
@ -90,14 +131,6 @@ class IMAPClient:
except Exception:
return await self.connect()
def is_throttled(self) -> bool:
"""best-effort detection of a provider throttling response"""
return bool(
self._mail is not None
and getattr(self._mail, "resp", None)
and "THROTTLED" in str(self._mail.resp)
)
async def get_folders(self) -> List[str]:
"""list mailbox folder names"""
if not await self.ensure_connection():
@ -110,7 +143,7 @@ class IMAPClient:
folders: List[str] = []
for folder in folder_list or []:
try:
folders.append(folder.decode().split(' "/" ')[-1].strip('"'))
folders.append(_folder_name(folder))
except Exception:
continue
return folders
@ -138,7 +171,14 @@ class IMAPClient:
return []
if result != "OK" or not data or not data[0]:
return []
ids = [int(x) for x in data[0].split()]
ids = []
for token in data[0].split():
try:
ids.append(int(token))
except (TypeError, ValueError):
# tolerate a malformed/non-numeric token in the SEARCH response
# instead of crashing the whole search
log.debug("skipping non-numeric search token: %r", token)
return sorted(set(ids), reverse=True)
async def fetch(self, email_id: int, *, icloud: bool = False) -> Optional[email.message.Message]:
@ -157,8 +197,14 @@ class IMAPClient:
if result != "OK" or not data:
return None
for item in data:
if isinstance(item, (bytes, bytearray)) and len(item) > 20:
# aioimaplib stores the literal message payload as the only bytearray in
# the response; every other line (including the `<id> FETCH (...` header)
# is plain bytes. select by structure, not length — a length heuristic
# mismatches the header line for any 2+ digit id or a BODY[]/UID fetch.
if isinstance(item, bytearray):
return email.message_from_bytes(bytes(item))
# cross-version fallback: aioimaplib 2.0.x never yields tuples here, but an
# imaplib-style (header, payload) tuple is handled if a future/alt driver does
if isinstance(item, tuple) and len(item) > 1:
return email.message_from_bytes(item[1])
return None

@ -76,7 +76,7 @@ def _scan(text: str, patterns: list[Pattern], lengths: set[int]) -> Optional[str
return m.group(1) if m.groups() else m.group(0)
for token in re.split(r"\s+", text):
digits = "".join(c for c in token if c.isdigit())
if digits and len(digits) in lengths and digits.isdigit():
if digits and len(digits) in lengths:
return digits
return None
@ -121,6 +121,8 @@ def as_predicate(spec: MatchSpec) -> Callable[[Optional[str]], bool]:
if isinstance(spec, re.Pattern):
return lambda value: bool(spec.search(value or ""))
if callable(spec):
return spec
# coalesce None like the string/regex branches so the documented Optional[str]
# predicate contract holds even if a caller's callable assumes a real string
return lambda value: bool(spec(value or ""))
needle = str(spec).lower()
return lambda value: needle in (value or "").lower()

@ -6,6 +6,7 @@ without aiohttp raises a clear error only when a provider is instantiated.
credentials (client_id, refresh_token) are always supplied by the caller.
"""
import asyncio
import logging
import time
from typing import Optional, Sequence
@ -75,18 +76,21 @@ class _RefreshTokenProvider:
async with aiohttp.ClientSession(timeout=timeout) as session:
async with session.post(endpoint, data=data) as resp:
if resp.status == 200:
token = (await resp.json()).get("access_token")
# content_type=None: some token endpoints return a 200 with
# text/plain or text/javascript; default json() would raise
# ContentTypeError and discard a valid token body
token = (await resp.json(content_type=None)).get("access_token")
if token:
self._failures = 0
return token
else:
body = await resp.text()
# log a truncated error body only — a token-endpoint
# response can carry sensitive material; never dump it whole
body = (await resp.text())[:200]
log.warning("token endpoint %s -> %s: %s", endpoint, resp.status, body)
except Exception as exc:
log.warning("token request to %s failed: %s", endpoint, exc)
if attempt < self.max_retries - 1:
import asyncio
await asyncio.sleep(2 ** attempt)
self._failures += 1

@ -20,16 +20,18 @@ log = logging.getLogger(__name__)
DEFAULT_FOLDERS: Sequence[str] = ("INBOX", "Junk", "Spam", "Archive", "All Mail")
def _server_query(sender: MatchSpec, subject: MatchSpec) -> str:
def _server_query(sender: MatchSpec, subject: MatchSpec, match_field: str = "from") -> str:
"""build a narrowing IMAP query from plain-string specs only
only plain strings translate to server-side FROM/SUBJECT filters; regex and
callable specs fall back to ALL and are filtered client-side, so dynamic
matching always works even when the server cannot express it.
only plain strings translate to server-side filters; regex and callable specs
fall back to ALL and are filtered client-side, so dynamic matching always works
even when the server cannot express it. `match_field` selects which header the
`sender` spec searches: "from" filters by the sender address (default), "to"
filters by the recipient address (the per-user alias the code was sent to).
"""
parts: List[str] = []
if isinstance(sender, str):
parts.append(f'FROM "{sender}"')
parts.append(f'TO "{sender}"' if match_field == "to" else f'FROM "{sender}"')
if isinstance(subject, str):
parts.append(f'SUBJECT "{subject}"')
return f"({' '.join(parts)})" if parts else "ALL"
@ -52,6 +54,7 @@ async def retrieve_otp(
*,
sender: MatchSpec = None,
subject: MatchSpec = None,
match_field: str = "from",
folders: Optional[Iterable[str]] = None,
patterns: Sequence[Union[str, Pattern]] = DEFAULT_PATTERNS,
lengths: Iterable[int] = DEFAULT_LENGTHS,
@ -64,14 +67,18 @@ async def retrieve_otp(
) -> Optional[str]:
"""return the newest OTP matching the filters, or None
sender/subject accept a substring, a compiled regex, or a callable. folders,
patterns, code lengths, max age and retry behavior are all tunable. set
`max_age=None` to disable the freshness check.
sender/subject accept a substring, a compiled regex, or a callable. `match_field`
selects which header the `sender` spec is matched against: "from" (default)
matches the sender address; "to" matches the recipient address (the per-user
alias the code was sent to) and additionally accepts a forwarded match on the
From header, so a forwarded code still resolves. folders, patterns, code lengths,
max age and retry behavior are all tunable. set `max_age=None` to disable the
freshness check.
"""
folders = list(folders) if folders is not None else list(DEFAULT_FOLDERS)
sender_ok = as_predicate(sender)
subject_ok = as_predicate(subject)
query = _server_query(sender, subject)
query = _server_query(sender, subject, match_field)
for attempt in range(retries + 1):
for folder in folders:
@ -91,7 +98,12 @@ async def retrieve_otp(
from_hdr = message.get("From", "")
subj_hdr = message.get("Subject", "")
if not sender_ok(from_hdr) or not subject_ok(subj_hdr):
if match_field == "to":
to_hdr = message.get("To", "")
matched = sender_ok(to_hdr) or sender_ok(from_hdr)
else:
matched = sender_ok(from_hdr)
if not matched or not subject_ok(subj_hdr):
continue
code = extract_code(message, patterns=patterns, lengths=lengths)