Prevent silent corruption of non-ASCII outbound SMS bodies #6

Description

@Justinabox

Summary

SMSService.send() silently corrupts non-ASCII SMS text before handing it to the modem. The service configures SMS text mode and the GSM character set, but then encodes the payload with Python's ascii codec and errors="replace". Every non-ASCII character becomes ? without warning, including characters the GSM alphabet supports (such as é) as well as characters that require UCS2.

For SMS automation, silent body mutation is dangerous: MFA text, names, carrier messages, and international content can be sent incorrectly while the API reports success.

Evidence

Affected code:

  • callstack/sms/service.py:85-92 initializes SMS text mode and AT+CSCS="GSM".
  • callstack/sms/service.py:112-117 sends the body as:
f"{body}\x1A".encode("ascii", errors="replace")

Minimal reproduction of the exact encoding behavior run during scouting:

PYTHONPATH=. python3 - <<'PY'
body = 'Café Ω 中'
payload = f'{body}\x1A'.encode('ascii', errors='replace')
print(payload)
print(payload.decode('ascii'))
PY

Observed output:

b'Caf? ? ?\x1a'
Caf? ? ?�

The replacement happens before the modem sees the data, so after a successful +CMGS the service would still persist and report the original body while the recipient receives a mutated message.

Repository health checks from this scout run:

git diff --check
# exit 0

PYTHONPATH=. uv run --no-project --with pytest --with pytest-asyncio --with pytest-aiohttp --with pyserial-asyncio --with aiosqlite pytest tests/ -q
# 288 passed in 3.94s

Duplicate check:

gh search issues --repo Justinabox/Callstack "SMS Unicode non-ASCII ascii replace UCS2" --state open
# []

Expected behavior

SMS sending should either preserve supported message content or fail explicitly:

  • GSM 03.38 characters (including extension table characters) should be encoded correctly.
  • UCS2-required text should either switch to a supported UCS2/PDU send path or raise a clear error that the current send mode cannot represent it.
  • The persisted SMS.body should match what was actually sent, or the service should store both requested and encoded/sent body explicitly.
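
The last two bullets could be sketched roughly as follows. This is an illustration only: OutboundSMSRecord and ucs2_hex are hypothetical names, not part of the repository, and the hex form assumes the modem accepts UCS2 text as hexadecimal code units under AT+CSCS="UCS2" (other settings such as AT+CSMP may also be needed and are not covered here).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutboundSMSRecord:
    """Hypothetical record that keeps the requested and transmitted body apart."""

    requested_body: str   # what the caller asked to send
    sent_payload: bytes   # body bytes handed to the modem (no Ctrl-Z terminator)
    encoding: str         # Python codec name used to produce sent_payload

    @property
    def lossless(self) -> bool:
        # True only if decoding the transmitted bytes gives back the request.
        return self.sent_payload.decode(self.encoding) == self.requested_body


def ucs2_hex(body: str) -> str:
    """Hex-encode the body as big-endian 16-bit code units.

    Characters outside the Basic Multilingual Plane become surrogate pairs
    (UTF-16), which strict UCS2 does not define.
    """
    return body.encode("utf-16-be").hex().upper()


# The current behaviour, recorded honestly: the payload differs from the request.
lossy = OutboundSMSRecord("Café", "Café".encode("ascii", errors="replace"), "ascii")
# A UCS2-style payload round-trips, so the record can claim the body was sent.
exact = OutboundSMSRecord("中", "中".encode("utf-16-be"), "utf-16-be")
```

With a record like this, status="sent" could be reported alongside lossless=False instead of implying that the original body went out unchanged.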

Actual behavior

Any character outside ASCII is replaced with ? silently before send_data(). The method can then return status="sent" and save the original body even though the transmitted bytes are different.

Suggested fix direction

  • Add regression tests for at least one GSM 03.38 non-ASCII character (é) and one UCS2-required string.
  • Replace ASCII replacement with an explicit SMS encoding layer:
    • GSM 7-bit text mode where safe, including extension table handling; or
    • PDU/UCS2 mode for non-GSM text; or
    • explicit SMSSendError for unsupported characters until UCS2/PDU sending is implemented.
  • Avoid errors="replace" for outbound SMS unless the caller explicitly opts into lossy transliteration.
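
A minimal sketch of such an encoding layer is shown below, assuming the modem interprets each byte as a GSM 7-bit code under AT+CSCS="GSM". The function name encode_gsm_body, the text_mode flag, and the local SMSSendError class are assumptions for illustration; the project's real exception and send path may differ. One caveat to verify against the target modem: in text mode 0x1A (Ctrl-Z) submits the message and 0x1B (ESC) cancels it, so Ξ and the extension-table characters may only be sendable through PDU mode or another character set. Note also that Ω from the reproduction string is part of the GSM 03.38 default alphabet, so only 中 in that example requires UCS2.

```python
class SMSSendError(Exception):
    """Stand-in for the project's SMSSendError; the real class may differ."""


# GSM 03.38 default alphabet; a character's index is its 7-bit code.
# Index 0x1B is the escape to the extension table, kept as a placeholder.
GSM_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
GSM_CODE = {ch: code for code, ch in enumerate(GSM_BASIC) if code != 0x1B}

# Extension table: each character is sent as 0x1B followed by this code.
GSM_EXT = {
    "\f": 0x0A, "^": 0x14, "{": 0x28, "}": 0x29, "\\": 0x2F,
    "[": 0x3C, "~": 0x3D, "]": 0x3E, "|": 0x40, "€": 0x65,
}

# Byte values that act as controls in text mode (Ctrl-Z submits, ESC cancels).
TEXT_MODE_CONTROL = {0x1A, 0x1B}


def encode_gsm_body(body: str, text_mode: bool = True) -> bytes:
    """Return one byte per GSM septet, or raise SMSSendError.

    Nothing is ever replaced: a character that cannot be represented
    stops the send instead of turning into "?".
    """
    out = bytearray()
    for ch in body:
        if ch in GSM_CODE:
            codes = [GSM_CODE[ch]]
        elif ch in GSM_EXT:
            codes = [0x1B, GSM_EXT[ch]]
        else:
            raise SMSSendError(
                f"{ch!r} is not in the GSM 03.38 alphabet; UCS2/PDU is required"
            )
        if text_mode and TEXT_MODE_CONTROL.intersection(codes):
            raise SMSSendError(f"{ch!r} collides with a text-mode control byte")
        out.extend(codes)
    return bytes(out)
```

Under these assumptions the payload would be built as encode_gsm_body(body) + b"\x1a", so the terminator is appended only after the body has been validated.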

Acceptance criteria

  • Sending Café no longer silently transmits Caf?.
  • UCS2-required text either sends correctly or fails before contacting the modem with a clear exception.
  • Tests assert the raw bytes passed to send_data() or the raised exception for unsupported content.
  • Stored message metadata cannot claim the original body was sent if the transport payload differed.
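
The shape of such regression tests might look like the sketch below. It is self-contained for illustration: RecordingModem and send_body are hypothetical stand-ins (a real test would exercise SMSService.send() with the project's own fixtures), and send_body implements only the strict-ASCII stopgap, in which Café fails loudly rather than being sent as Caf?.

```python
import asyncio


class SMSSendError(Exception):
    """Stand-in for the project's SMSSendError."""


class RecordingModem:
    """Hypothetical test double that records every send_data() call."""

    def __init__(self) -> None:
        self.sent: list[bytes] = []

    async def send_data(self, data: bytes) -> None:
        self.sent.append(data)


async def send_body(modem: RecordingModem, body: str) -> None:
    """Stand-in for a fixed send path: strict ASCII, no replacement."""
    try:
        payload = f"{body}\x1A".encode("ascii")  # errors="strict" is the default
    except UnicodeEncodeError as exc:
        raise SMSSendError(f"cannot encode SMS body: {exc}") from exc
    await modem.send_data(payload)


def test_ascii_body_reaches_modem_unchanged() -> None:
    modem = RecordingModem()
    asyncio.run(send_body(modem, "Cafe"))
    assert modem.sent == [b"Cafe\x1a"]  # raw bytes, including Ctrl-Z


def test_non_ascii_body_fails_before_modem() -> None:
    for body in ("Café", "中"):
        modem = RecordingModem()
        try:
            asyncio.run(send_body(modem, body))
        except SMSSendError:
            pass
        else:
            raise AssertionError(f"{body!r} was sent without an error")
        assert modem.sent == []  # nothing reached the transport
```

Once a GSM or UCS2 encoder exists, the second test would change to assert the exact encoded bytes for Café instead of an exception.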

Verification gates

git diff --check
PYTHONPATH=. uv run --no-project --with pytest --with pytest-asyncio --with pytest-aiohttp --with pyserial-asyncio --with aiosqlite pytest tests/ -q
