2026-06-07

Weekly open source: a pure-Rust ICU, and rsurl reaches curl parity

Last week the pure-Rust, no-C stack went public — crypto, compression, SSH, a curl. This week it grew a new pillar: internationalization. intl is a from-scratch analog of ICU, no_std and pure Rust, with the official Unicode conformance suites passing 100%. Two no_std date crates landed beside it. And the stack consolidated: rsurl reached functional curl parity, and puressh moved onto the current purecrypto.

intl: a pure-Rust ICU

intl — "pure-Rust, #![no_std] internationalization primitives, a pure-Rust analog of ICU" — is the headline. It targets Unicode 17.0.0, and the core algorithms ship with their official conformance suites passing 100%: Normalization, Collation, the Grapheme/Word/Sentence and Line breaking, and Bidi. The design is unusual in a nice way: the Unicode Character Database is compiled directly into Rust match dispatch by an offline generator, so every property lookup is a const fn that allocates nothing and needs no runtime init — tables as code, not parsed at startup. It's no_std throughout and usable with no allocator at all, with feature-selectable codepoint ranges for embedded / kernel / WASM.
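The "tables as code" idea can be pictured with a small sketch. This is not intl's generated output or its API: the Script enum, the script_of function, and the handful of ranges below are hypothetical stand-ins written by hand, assuming the generator emits something of roughly this shape.

```rust
/// A tiny, hand-written stand-in for a generated property table.
/// Hypothetical names; only a few ranges are covered, and anything
/// outside them is reported as `Unknown` by this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Script {
    Latin,
    Greek,
    Cyrillic,
    Unknown,
}

/// Property lookup as a `const fn`: the "table" is a match on code point
/// ranges, so there is nothing to parse or initialize at runtime and
/// nothing to allocate.
pub const fn script_of(c: char) -> Script {
    match c as u32 {
        0x0041..=0x005A | 0x0061..=0x007A => Script::Latin,
        0x0391..=0x03A1 | 0x03A3..=0x03A9 | 0x03B1..=0x03C9 => Script::Greek,
        0x0410..=0x044F => Script::Cyrillic,
        _ => Script::Unknown,
    }
}

/// Because the lookup is const, it can also be evaluated at compile time.
/// U+03A9 is GREEK CAPITAL LETTER OMEGA.
pub const SCRIPT_OF_OMEGA: Script = script_of('\u{03A9}');
```

The compiler is free to lower a match like this to comparisons or a jump table, which is what makes the approach workable without an allocator.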

The week took it from v0.1.3 to v0.3.1, broadened first and then hardened:

  • Collation — data-driven tailoring from official CLDR rules, an alphabetic index (index_labels / index_bucket), and primary-strength string search (find / contains).
  • Transliteration — set quantifiers and match-references in the transform engine, plus Armenian and Georgian romanizations.
  • Number spell-out — RBNF ordinal spell-out (spell_ordinal), with adversarial-input stack-overflow fixes.
  • IDNA — full IDNA2008 validity (CheckBidi / ContextJ / Hyphens / V1 / V5 / V6), VerifyDnsLength, and an honest conformance bar.
  • Spoof detection — UTS #39 confusable skeletons and single-script resolution via Script_Extensions.
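As one concrete piece of the IDNA bullet above, the DNS length rule is simple enough to sketch. The function name verify_dns_length is hypothetical and this is not intl's API; it assumes the input is already an ASCII (A-label) domain and applies the limits UTS #46 gives for VerifyDnsLength: 1 to 253 bytes overall, excluding the root label and its dot, and 1 to 63 bytes per label.

```rust
/// DNS length check in the spirit of UTS #46 VerifyDnsLength.
/// Assumes `domain` is already in ASCII (A-label) form.
/// A single trailing dot (the root label) is ignored.
pub fn verify_dns_length(domain: &str) -> bool {
    let d = domain.strip_suffix('.').unwrap_or(domain);
    // Whole name: 1..=253 bytes; each label: 1..=63 bytes.
    (1..=253).contains(&d.len())
        && d.split('.').all(|label| (1..=63).contains(&label.len()))
}
```

Empty labels, as in "a..b", fail the per-label check, so the function needs no separate case for them.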

Then a dedicated security pass closed a row of denial-of-service holes the conformance work surfaced: quadratic scans in collation and Bidi made linear or O(1), memoized sentence-break lookahead, Punycode decode-bomb and domain-length caps, MessageFormat recursion-depth caps, and checked arithmetic across POSIX TZ offset parsing and GMT-offset formatting (several latent i32 overflow panics).
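The checked-arithmetic fixes follow a familiar pattern, sketched here under assumptions. parse_offset_seconds is a hypothetical function, not intl's code; it parses an offset of the form [+|-]hh[:mm[:ss]] and returns None instead of panicking when a hostile input would overflow an i32.

```rust
/// Parse an offset written as `[+|-]hh[:mm[:ss]]` into seconds.
/// Returns None on malformed input or on arithmetic overflow.
/// Hypothetical helper; field range checks (e.g. mm < 60) are left out.
pub fn parse_offset_seconds(s: &str) -> Option<i32> {
    let (sign, rest) = match s.as_bytes().first().copied()? {
        b'-' => (-1i32, &s[1..]),
        b'+' => (1, &s[1..]),
        _ => (1, s),
    };
    let mut total: i32 = 0;
    for (i, field) in rest.split(':').enumerate() {
        if i >= 3 || field.is_empty() {
            return None;
        }
        let mut value: i32 = 0;
        for b in field.bytes() {
            if !b.is_ascii_digit() {
                return None;
            }
            // Every step is checked, so a long digit run cannot wrap or panic.
            value = value.checked_mul(10)?.checked_add((b - b'0') as i32)?;
        }
        let scale = [3600, 60, 1][i];
        total = total.checked_add(value.checked_mul(scale)?)?;
    }
    total.checked_mul(sign)
}
```

An input such as "600000" parses as a number but overflows when scaled to seconds; with unchecked multiplication that is a panic in debug builds and a silent wrap in release builds.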

timezone-data + strtotime

Two smaller no_std crates landed in the same family, both ports of existing Go libraries:

timezone-data (a port of gotz) exposes the IANA timezone data most libraries keep private — transitions, zone types, POSIX TZ rules, leap seconds. All 598 zones are pre-parsed at build time into static Rust objects and embedded, so there's no runtime parsing, no build script, no host-tzdata dependency, and zero external crates: a lookup is a binary search over &'static data, allocation-free.
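A lookup of that kind can be sketched as follows. The Transition struct, the TRANSITIONS table, and offset_at are hypothetical names, and the values are made up for illustration rather than taken from IANA data; the real crate's types may look quite different.

```rust
/// Hypothetical layout: a UTC transition instant plus the offset
/// in effect from that instant onward.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Transition {
    pub at_unix: i64,
    pub utc_offset_secs: i32,
}

/// Illustrative data only (not real IANA values), sorted by `at_unix`.
pub static TRANSITIONS: &[Transition] = &[
    Transition { at_unix: 0, utc_offset_secs: 3600 },
    Transition { at_unix: 1_000, utc_offset_secs: 7200 },
    Transition { at_unix: 2_000, utc_offset_secs: 3600 },
];

/// Offset in effect at `t`: the last transition at or before `t`.
/// `partition_point` performs a binary search; nothing is allocated.
pub fn offset_at(table: &'static [Transition], t: i64) -> Option<i32> {
    let idx = table.partition_point(|tr| tr.at_unix <= t);
    // idx == 0 means `t` precedes the first transition in the table.
    idx.checked_sub(1).map(|i| table[i].utc_offset_secs)
}
```

Since the table lives in static memory and the search only compares integers, the same function works unchanged in a no_std build.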

strtotime (a port of strtotime, which mirrors PHP's strtotime()) parses PHP-style date/time expressions into a Unix timestamp without allocating — tokens borrow slices of the input, the working set lives on the stack. It validates against a 669-case corpus captured from real PHP and matches on every one, including DST-sensitive results, fall-forward across DST gaps, and extreme-year int64 overflow. DST awareness comes from timezone-data; microsecond sub-second fractions landed too.
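The borrowing-tokenizer approach looks roughly like the sketch below. Token, Tokens, and tokenize are hypothetical names and the grammar is far smaller than PHP's; the point is only that each token is a slice of the input and the iterator's sole state is the unread remainder, so no heap allocation is needed.

```rust
/// Hypothetical token type: each variant borrows from the input string.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Token<'a> {
    Number(&'a str),
    Word(&'a str),
    Sign(char),
}

/// Iterator over tokens; its only state is the unread rest of the input.
pub struct Tokens<'a> {
    rest: &'a str,
}

pub fn tokenize(input: &str) -> Tokens<'_> {
    Tokens { rest: input }
}

/// Byte length of the leading run of characters satisfying `pred`.
fn run_len(s: &str, pred: fn(char) -> bool) -> usize {
    s.find(|c: char| !pred(c)).unwrap_or(s.len())
}

impl<'a> Iterator for Tokens<'a> {
    type Item = Token<'a>;

    fn next(&mut self) -> Option<Token<'a>> {
        let s = self.rest.trim_start();
        let first = s.chars().next()?;
        let (token, len) = if first.is_ascii_digit() {
            let n = run_len(s, |c: char| c.is_ascii_digit());
            (Token::Number(&s[..n]), n)
        } else if first.is_alphabetic() {
            let n = run_len(s, |c: char| c.is_alphabetic());
            (Token::Word(&s[..n]), n)
        } else {
            (Token::Sign(first), first.len_utf8())
        };
        // Advance past the token; the slice still points into the input.
        self.rest = &s[len..];
        Some(token)
    }
}
```

A caller can drive this with a plain for loop over tokenize("+1 week 2 days") and keep any interpreted state in local variables on the stack.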

rsurl: curl parity, reached

rsurl, the pure-Rust curl on purecrypto + compcol, spent the week closing its feature-parity roadmap and declared functional curl parity complete under the no-C invariant (v0.0.4 → v0.0.6). The landings, roughly in roadmap order:

  • Protocols — SMTP/SMTPS sending and a minimal TELNET client, on top of the 14 from last week.
  • Auth — HTTP Digest, AWS SigV4 request signing (--aws-sigv4), --oauth2-bearer.
  • TLS — mTLS client certs, public-key pinning, --capath / --crlfile (CRL revocation), and --tlsv1.x / --tls-max version pinning, on both TLS backends.
  • Transfers — streaming HTTP/1.1, HTTP/2, HTTP/3 and FTP/FTPS downloads to a sink (so bodies never fully reside in memory), streaming decompression, -Z/--parallel concurrent transfers, URL globbing ({a,b} and [1-100] ranges), --unix-socket, --connect-to, redirect controls, and the retry family.
  • CLI surface — -w phase timers and %header{} interpolation, --json, low-speed abort (-y/-Y), -K config files, --next multi-operation, plus ten more libcurl-shaped options across the C ABI and a man/rsurl.1 page.
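To make the URL globbing item in the Transfers bullet concrete, here is a minimal sketch of expanding a single numeric range. expand_range is a hypothetical function, not rsurl's implementation, and it deliberately ignores the rest of curl's glob syntax: {a,b} sets, alphabetic ranges, zero padding, steps, and multiple globs per URL. It also places no cap on the range size, which a real implementation would likely want.

```rust
/// Expand the first numeric `[start-end]` range found in `pattern`.
/// Returns None if there is no well-formed ascending range.
/// Hypothetical helper covering only a small part of curl-style globbing.
pub fn expand_range(pattern: &str) -> Option<Vec<String>> {
    let open = pattern.find('[')?;
    let close = open + pattern[open..].find(']')?;
    let (lo, hi) = pattern[open + 1..close].split_once('-')?;
    let (lo, hi): (u32, u32) = (lo.parse().ok()?, hi.parse().ok()?);
    if lo > hi {
        return None;
    }
    let (head, tail) = (&pattern[..open], &pattern[close + 1..]);
    // One URL per value in the inclusive range.
    Some((lo..=hi).map(|n| format!("{head}{n}{tail}")).collect())
}
```

A bracketed IPv6 host such as [::1] contains no "-" separated pair of numbers, so this sketch returns None for it rather than misreading it as a range.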

puressh: ext-info and more zeroize

puressh (v0.0.4) added RFC 8308 extension negotiation — the ext-info-c / ext-info-s markers in KEXINIT, sending and accepting SSH_MSG_EXT_INFO at the legal protocol moments, and preferring the server's advertised server-sig-algs for public-key auth. It also continued the security grind: zeroizing the ChaCha20-Poly1305 K_1/K_2 and Poly1305 one-time key, wiping DH/ECDH shared-secret scratch, clamping pcssh_sftp_read copies to the caller's cap, and a signal-safe termios restore in the raw-mode guard. It moved onto the current crypto floor too: purecrypto 0.20.6.1.
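The key-wiping work rests on a general technique that can be sketched without any external crate. SecretKey and wipe are hypothetical names, and this is not puressh's code: it assumes the usual approach of volatile stores followed by a compiler fence, so the optimizer does not drop the writes as dead stores on a buffer that is about to be freed.

```rust
use core::ptr;
use core::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical wrapper for a 32-byte key that is wiped when dropped.
pub struct SecretKey([u8; 32]);

impl SecretKey {
    pub fn new(bytes: [u8; 32]) -> Self {
        SecretKey(bytes)
    }
}

/// Overwrite `buf` with zeros using volatile writes, so the compiler may
/// not elide the stores even though the buffer is about to go away.
pub fn wipe(buf: &mut [u8]) {
    for b in buf.iter_mut() {
        // SAFETY: `b` is a valid, aligned, exclusive reference to a u8.
        unsafe { ptr::write_volatile(b, 0) };
    }
    // Keep the compiler from reordering later operations before the wipe.
    compiler_fence(Ordering::SeqCst);
}

impl Drop for SecretKey {
    fn drop(&mut self) {
        wipe(&mut self.0);
    }
}
```

This only clears the bytes held by the wrapper; copies made earlier, for example when the array was moved into the constructor, are outside its reach, which is part of why such passes tend to be incremental.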

OxideAV: a renderer joins, and a terseness sweep

oxideav-render is new — a pure-Rust 3D-scene → raster image/video renderer for the framework. It came up in phases over the week: a Renderer trait + backend stub, a scanline backend behind make_renderer(Scanline), a named-backend RenderRegistry, and a RenderSource: FrameSource bridge so a rendered scene plugs into the transcode pipeline as a source (the raycaster is still pending). It's wired into the workspace under cli-convert + mesh3d.
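The layering described above can be pictured with a small sketch. Only the names Renderer, Scanline, RenderRegistry, RenderSource, and FrameSource come from the post; every signature, the Frame type, and the gradient "rendering" are assumptions made for illustration and are not oxideav-render's actual API.

```rust
use std::collections::BTreeMap;

/// Hypothetical frame type: dimensions plus one grey byte per pixel.
pub struct Frame {
    pub width: usize,
    pub height: usize,
    pub pixels: Vec<u8>,
}

/// Assumed shape of a rendering backend.
pub trait Renderer {
    fn render(&mut self, width: usize, height: usize) -> Frame;
}

/// Stand-in backend: fills the frame row by row with a vertical gradient.
pub struct Scanline;

impl Renderer for Scanline {
    fn render(&mut self, width: usize, height: usize) -> Frame {
        let mut pixels = Vec::with_capacity(width * height);
        for y in 0..height {
            for _x in 0..width {
                pixels.push((y % 256) as u8);
            }
        }
        Frame { width, height, pixels }
    }
}

/// Named-backend registry: maps a name to a constructor function.
pub struct RenderRegistry {
    makers: BTreeMap<&'static str, fn() -> Box<dyn Renderer>>,
}

impl RenderRegistry {
    pub fn new() -> Self {
        Self { makers: BTreeMap::new() }
    }
    pub fn register(&mut self, name: &'static str, make: fn() -> Box<dyn Renderer>) {
        self.makers.insert(name, make);
    }
    pub fn make(&self, name: &str) -> Option<Box<dyn Renderer>> {
        self.makers.get(name).map(|make| make())
    }
}

/// Assumed pipeline-side trait that a transcode stage would pull from.
pub trait FrameSource {
    fn next_frame(&mut self) -> Option<Frame>;
}

/// Bridge: adapts any renderer into a finite source of frames.
pub struct RenderSource {
    pub renderer: Box<dyn Renderer>,
    pub remaining: usize,
    pub size: (usize, usize),
}

impl FrameSource for RenderSource {
    fn next_frame(&mut self) -> Option<Frame> {
        self.remaining = self.remaining.checked_sub(1)?;
        Some(self.renderer.render(self.size.0, self.size.1))
    }
}
```

With this shape, a later backend such as the pending raycaster would only need a Renderer impl and a registry entry; the pipeline side would keep talking to FrameSource.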

The weekly multi-agent codec sweep continued (rounds → 253), and the workspace README got a "full-table terseness sweep" — stripping the per-round changelog chains and collapsing the 100%-done rows, now that so many crates are saturated.

Next week

intl keeps filling in the ICU surface above the Unicode core; timezone-data and strtotime settle their APIs. rsurl, parity reached, turns to hardening and the long tail of compatibility no-ops. puressh keeps filling in state machines past scaffolding. oxideav-render starts on the raycaster behind the scanline path.