Weekly open source: a pure-Rust ICU, and rsurl reaches curl parity
Last week the pure-Rust, no-C stack went public — crypto,
compression, SSH, a curl. This week it grew a new pillar:
internationalization. intl is a from-scratch analog of ICU,
no_std and pure Rust, with the official Unicode conformance suites
passing 100%. Two no_std date crates landed beside it. And the
stack consolidated: rsurl reached functional curl parity, and
puressh moved onto the current purecrypto.
intl: a pure-Rust ICU
intl — "pure-Rust,
#![no_std] internationalization primitives, a pure-Rust analog
of ICU" — is the headline. It targets Unicode 17.0.0, and the core
algorithms ship with their official conformance suites passing
100%: Normalization, Collation, Grapheme/Word/Sentence segmentation,
Line breaking, and Bidi. The design is unusual in a nice way: the
Unicode Character Database is compiled directly into Rust match
dispatch by an offline generator, so every property lookup is a
const fn that allocates nothing and needs no runtime init —
tables as code, not parsed at startup. It's no_std throughout
and usable with no allocator at all, with feature-selectable
codepoint ranges for embedded / kernel / WASM.
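To make "tables as code" concrete, here is a minimal sketch of the idea. It is not intl's generated code: the Category enum, the general_category function, and the handful of ranges are hypothetical stand-ins for what an offline generator might emit from the Unicode Character Database.

```rust
/// Hypothetical property enum; the real General_Category has many more values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Category {
    UppercaseLetter,
    LowercaseLetter,
    DecimalNumber,
    Other,
}

/// The "table" is the match itself: no static array to parse, no lazy
/// initialization, no allocation. Only a few illustrative ranges are listed;
/// everything else falls through to `Other` in this sketch.
pub const fn general_category(c: char) -> Category {
    match c as u32 {
        0x0030..=0x0039 => Category::DecimalNumber,   // '0'..='9'
        0x0041..=0x005A => Category::UppercaseLetter, // 'A'..='Z'
        0x0061..=0x007A => Category::LowercaseLetter, // 'a'..='z'
        // Greek capitals; U+03A2 is unassigned, hence the split range.
        0x0391..=0x03A1 | 0x03A3..=0x03A9 => Category::UppercaseLetter,
        0x03B1..=0x03C9 => Category::LowercaseLetter, // Greek small letters
        _ => Category::Other,
    }
}

/// Because the lookup is a `const fn`, it can also run at compile time.
pub const CATEGORY_OF_A: Category = general_category('A');
```

The compiler is free to lower a match like this into comparisons or jump tables, and the whole lookup uses nothing beyond core, which is what makes the approach attractive for kernel and embedded targets.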
Over the week it went from v0.1.3 to v0.3.1, first broadening and
then hardening:
- Collation: data-driven tailoring from official CLDR rules, an alphabetic index (index_labels/index_bucket), and primary-strength string search (find/contains).
- Transliteration: set quantifiers and match-references in the transform engine, plus Armenian and Georgian romanizations.
- Number spell-out: RBNF ordinal spell-out (spell_ordinal), with adversarial-input stack-overflow fixes.
- IDNA: full IDNA2008 validity (CheckBidi / ContextJ / Hyphens / V1 / V5 / V6), VerifyDnsLength, and an honest conformance bar.
- Spoof detection: UTS #39 confusable skeletons and single-script resolution via Script_Extensions.
Then a dedicated security pass closed a row of denial-of-service
holes the conformance work surfaced: quadratic scans in collation
and Bidi made linear or O(1), memoized sentence-break lookahead,
Punycode decode-bomb and domain-length caps, MessageFormat
recursion-depth caps, and checked arithmetic across POSIX TZ
offset parsing and GMT-offset formatting (several latent
i32 overflow panics).
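The checked-arithmetic part of that pass follows a familiar pattern. The sketch below is an illustration, not intl's parser: parse_offset_seconds and parse_digits are hypothetical names, and the function only handles the [+-]hh[:mm[:ss]] offset shape without range-checking the fields or applying POSIX's inverted sign convention.

```rust
/// Parses an offset such as "5", "-3:30" or "+10:00:00" into seconds.
/// Returns `None` on malformed input or arithmetic overflow instead of
/// panicking. Hypothetical helper for illustration only.
pub fn parse_offset_seconds(s: &str) -> Option<i32> {
    let (sign, rest) = match s.as_bytes().first()? {
        b'-' => (-1i32, &s[1..]),
        b'+' => (1, &s[1..]),
        _ => (1, s),
    };
    // At most hours, minutes and seconds are allowed.
    if rest.split(':').count() > 3 {
        return None;
    }
    let mut total: i32 = 0;
    for (part, unit) in rest.split(':').zip([3600i32, 60, 1]) {
        let value = parse_digits(part)?;
        // Every multiply and add is checked, so hostile digit runs
        // produce `None` rather than an overflow panic.
        total = total.checked_add(value.checked_mul(unit)?)?;
    }
    total.checked_mul(sign)
}

/// Parses a non-empty run of ASCII digits with overflow checks.
fn parse_digits(part: &str) -> Option<i32> {
    if part.is_empty() {
        return None;
    }
    let mut n: i32 = 0;
    for b in part.bytes() {
        if !b.is_ascii_digit() {
            return None;
        }
        n = n.checked_mul(10)?.checked_add((b - b'0') as i32)?;
    }
    Some(n)
}
```

An input like "2147483647" parses as a number but overflows once multiplied by 3600, so the function returns None at that step.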
timezone-data + strtotime
Two smaller no_std crates landed in the same family, both ports of existing Go libraries:
timezone-data
(a port of gotz) exposes the
IANA timezone data most libraries keep private — transitions, zone
types, POSIX TZ rules, leap seconds. All 598 zones are pre-parsed
at build time into static Rust objects and embedded, so there's no
runtime parsing, no build script, no host-tzdata dependency, and
zero external crates: a lookup is a binary search over &'static
data, allocation-free.
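As a rough picture of what "a binary search over &'static data" can look like, here is a small sketch. The Transition and Zone types, the offset_at function, and the zone contents are all invented for illustration; they are not timezone-data's API and the numbers are not real tzdata.

```rust
/// Hypothetical transition record: the instant a new UTC offset takes effect.
pub struct Transition {
    pub at_unix: i64,
    pub utc_offset_secs: i32,
}

/// Hypothetical zone record holding only borrowed static data.
pub struct Zone {
    pub name: &'static str,
    pub initial_offset_secs: i32,
    pub transitions: &'static [Transition],
}

/// Made-up data standing in for a zone pre-parsed into a static object.
/// Transitions must be sorted by `at_unix`.
pub static EXAMPLE_ZONE: Zone = Zone {
    name: "Example/Zone",
    initial_offset_secs: 3600,
    transitions: &[
        Transition { at_unix: 1_000, utc_offset_secs: 7200 },
        Transition { at_unix: 2_000, utc_offset_secs: 3600 },
        Transition { at_unix: 3_000, utc_offset_secs: 7200 },
    ],
};

/// Returns the UTC offset in effect at `unix`, without allocating.
pub fn offset_at(zone: &Zone, unix: i64) -> i32 {
    // `partition_point` is a binary search: it returns how many
    // transitions happened at or before `unix`.
    let idx = zone.transitions.partition_point(|t| t.at_unix <= unix);
    match idx {
        0 => zone.initial_offset_secs,
        n => zone.transitions[n - 1].utc_offset_secs,
    }
}
```

The sketch ignores the POSIX TZ rule that governs instants after the last recorded transition; a real lookup would presumably fall back to that rule there.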
strtotime (a port of the Go library of the same name,
which mirrors PHP's strtotime()) parses PHP-style date/time
expressions into a Unix timestamp without allocating — tokens
borrow slices of the input, the working set lives on the stack. It
validates against a 669-case corpus captured from real PHP and
matches on every one, including DST-sensitive results, fall-forward
across DST gaps, and extreme-year int64 overflow. DST awareness
comes from timezone-data; microsecond sub-second fractions landed
too.
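The "tokens borrow slices of the input" design can be sketched as an iterator whose items are views into the original string. This is a simplified illustration, not the crate's tokenizer: the Token enum, the Tokens iterator, and the tokenize function are hypothetical, and the real grammar is far richer.

```rust
/// Hypothetical token type: every variant either borrows from the input
/// or is a plain `char`, so producing tokens never allocates.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Token<'a> {
    Number(&'a str),
    Word(&'a str),
    Symbol(char),
}

/// Iterator state is a single borrowed slice: the unread remainder.
pub struct Tokens<'a> {
    rest: &'a str,
}

pub fn tokenize(input: &str) -> Tokens<'_> {
    Tokens { rest: input }
}

impl<'a> Iterator for Tokens<'a> {
    type Item = Token<'a>;

    fn next(&mut self) -> Option<Token<'a>> {
        let s: &'a str = self.rest.trim_start();
        let first = s.chars().next()?;
        let (token, len) = if first.is_ascii_digit() {
            // Take the longest run of ASCII digits.
            let len = s.find(|c: char| !c.is_ascii_digit()).unwrap_or(s.len());
            (Token::Number(&s[..len]), len)
        } else if first.is_ascii_alphabetic() {
            // Take the longest run of ASCII letters.
            let len = s.find(|c: char| !c.is_ascii_alphabetic()).unwrap_or(s.len());
            (Token::Word(&s[..len]), len)
        } else {
            // Anything else ('+', '-', ':', ...) is a one-character symbol.
            (Token::Symbol(first), first.len_utf8())
        };
        self.rest = &s[len..];
        Some(token)
    }
}
```

Feeding it "+1 week 2 days" yields a symbol, a number, a word, a number, and a word, each pointing back into the caller's buffer, so the only memory in play is the iterator itself on the stack.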
rsurl: curl parity, reached
rsurl — the pure-Rust
curl on purecrypto + compcol — spent the week closing its
feature-parity roadmap and declared functional curl parity
complete under the no-C invariant (v0.0.4 → v0.0.6). The
landings, roughly in roadmap order:
- Protocols: SMTP/SMTPS sending and a minimal TELNET client, on top of the 14 from last week.
- Auth: HTTP Digest, AWS SigV4 request signing (--aws-sigv4), --oauth2-bearer.
- TLS: mTLS client certs, public-key pinning, --capath / --crlfile (CRL revocation), and --tlsv1.x / --tls-max version pinning, on both TLS backends.
- Transfers: streaming HTTP/1.1, HTTP/2, HTTP/3 and FTP/FTPS downloads to a sink (so bodies never fully reside in memory), streaming decompression, -Z / --parallel concurrent transfers, URL globbing ({a,b} and [1-100] ranges), --unix-socket, --connect-to, redirect controls, and the retry family.
- CLI surface: -w phase timers and %header{} interpolation, --json, low-speed abort (-y / -Y), -K config files, --next multi-operation, plus ten more libcurl-shaped options across the C ABI and a man/rsurl.1 page.
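URL globbing is the easiest of these to show in a few lines. The sketch below is not rsurl's implementation: expand, parse_range, and MAX_RANGE are hypothetical, and it covers only plain {a,b} sets and [N-M] numeric ranges (no step syntax, no letter ranges, no zero padding). A bracket that does not parse as a range, such as an IPv6 literal, makes it return the URL unexpanded.

```rust
/// Illustrative cap so a pattern like [1-4000000000] cannot blow up memory.
const MAX_RANGE: u32 = 10_000;

/// Expands `{a,b}` sets and `[N-M]` ranges, left to right, into concrete URLs.
pub fn expand(url: &str) -> Vec<String> {
    // Find the first opening delimiter of either kind.
    let open = match url.find(|c: char| c == '{' || c == '[') {
        Some(i) => i,
        None => return vec![url.to_string()],
    };
    let closer = if url.as_bytes()[open] == b'{' { '}' } else { ']' };
    let close = match url[open..].find(closer) {
        Some(i) => open + i,
        None => return vec![url.to_string()],
    };
    let prefix = &url[..open];
    let body = &url[open + 1..close];
    let suffix = &url[close + 1..];

    let alternatives: Vec<String> = if closer == '}' {
        body.split(',').map(str::to_string).collect()
    } else {
        match parse_range(body) {
            Some((lo, hi)) => (lo..=hi).map(|n| n.to_string()).collect(),
            None => return vec![url.to_string()],
        }
    };

    // Recurse on the suffix so that later globs multiply out.
    let tails = expand(suffix);
    let mut out = Vec::with_capacity(alternatives.len() * tails.len());
    for alt in &alternatives {
        for tail in &tails {
            out.push(format!("{}{}{}", prefix, alt, tail));
        }
    }
    out
}

/// Parses "N-M" into an inclusive, size-capped range.
fn parse_range(body: &str) -> Option<(u32, u32)> {
    let (lo, hi) = body.split_once('-')?;
    let lo: u32 = lo.parse().ok()?;
    let hi: u32 = hi.parse().ok()?;
    if lo > hi || hi - lo >= MAX_RANGE {
        return None;
    }
    Some((lo, hi))
}
```

Unlike the crates above, this one allocates freely; for a CLI that expands a pattern once per invocation, that seems a reasonable trade.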
puressh: ext-info and more zeroize
puressh (v0.0.4)
added RFC 8308 extension negotiation — the ext-info-c / ext-info-s
markers in KEXINIT, sending and accepting SSH_MSG_EXT_INFO at the
legal protocol moments, and preferring the server's advertised
server-sig-algs for public-key auth. It also continued the
security grind: zeroizing the ChaCha20-Poly1305 K_1/K_2 and
Poly1305 one-time key, wiping DH/ECDH shared-secret scratch,
clamping pcssh_sftp_read copies to the caller's cap, and a
signal-safe termios restore in the raw-mode guard. It moved onto
the current crypto floor too: purecrypto 0.2 → 0.6.1.
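Preferring the server's advertised server-sig-algs comes down to intersecting two lists in the client's preference order. The function below is a sketch under assumptions, not puressh code: pick_sig_alg is a hypothetical name, and falling back to the client's first preference when the server sends no extension is just one plausible policy.

```rust
/// Picks the signature algorithm for public-key auth.
///
/// `client_prefs` is ordered most-preferred first. `server_sig_algs` is the
/// comma-separated name-list from SSH_MSG_EXT_INFO, or `None` if the server
/// never sent the extension. Borrow-only: nothing is allocated.
pub fn pick_sig_alg<'a>(
    client_prefs: &[&'a str],
    server_sig_algs: Option<&str>,
) -> Option<&'a str> {
    match server_sig_algs {
        // Assumed fallback policy: without the extension, try our first choice.
        None => client_prefs.first().copied(),
        // Otherwise take the first of our preferences the server also lists.
        Some(list) => client_prefs
            .iter()
            .copied()
            .find(|alg| list.split(',').any(|name| name == *alg)),
    }
}
```

With client preferences rsa-sha2-512, rsa-sha2-256, ssh-rsa and a server list of ssh-ed25519,rsa-sha2-256,ssh-rsa, the result is rsa-sha2-256: the strongest option both sides accept, rather than a guess the server then rejects.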
OxideAV: a renderer joins, and a terseness sweep
oxideav-render is
new — a pure-Rust 3D-scene → raster image/video renderer for the
framework. It came up in phases over the week: a Renderer trait
+ backend stub, a scanline backend behind make_renderer(Scanline),
a named-backend RenderRegistry, and a RenderSource: FrameSource
bridge so a rendered scene plugs into the transcode pipeline as a
source (the raycaster is still pending). It's wired into the
workspace under cli-convert + mesh3d.
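For readers who have not seen the pattern, a trait plus a named-backend registry can be sketched roughly as follows. Only the names Renderer, Scanline, and RenderRegistry come from the crate description above; the Frame type, every method signature, and the gradient "render" are assumptions made up for this illustration and say nothing about oxideav-render's real API.

```rust
/// Assumed output type: one grey byte per pixel.
pub struct Frame {
    pub width: u32,
    pub height: u32,
    pub pixels: Vec<u8>,
}

/// Assumed shape of the backend trait.
pub trait Renderer {
    fn name(&self) -> &'static str;
    fn render(&mut self, width: u32, height: u32) -> Frame;
}

/// Placeholder backend: draws a vertical gradient instead of a real scene.
pub struct Scanline;

impl Renderer for Scanline {
    fn name(&self) -> &'static str {
        "scanline"
    }

    fn render(&mut self, width: u32, height: u32) -> Frame {
        let mut pixels = Vec::with_capacity(width as usize * height as usize);
        for y in 0..height {
            // y < height here, so the shade stays in 0..=254.
            let shade = (y as u64 * 255 / height as u64) as u8;
            for _ in 0..width {
                pixels.push(shade);
            }
        }
        Frame { width, height, pixels }
    }
}

/// Factory function pointer that builds a boxed backend.
pub type Factory = fn() -> Box<dyn Renderer>;

/// Assumed registry shape: backends are looked up by name at runtime.
#[derive(Default)]
pub struct RenderRegistry {
    backends: Vec<(&'static str, Factory)>,
}

impl RenderRegistry {
    pub fn register(&mut self, name: &'static str, factory: Factory) {
        self.backends.push((name, factory));
    }

    pub fn make(&self, name: &str) -> Option<Box<dyn Renderer>> {
        self.backends
            .iter()
            .find(|(n, _)| *n == name)
            .map(|(_, factory)| factory())
    }
}
```

The point of the registry is that a later backend, such as the pending raycaster, can be added with one more register call while callers keep selecting by name.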
The weekly multi-agent codec sweep continued (rounds → 253), and the workspace README got a "full-table terseness sweep" — stripping the per-round changelog chains and collapsing the 100%-done rows, now that so many crates are saturated.
Next week
intl keeps filling in the ICU surface above the Unicode core; timezone-data and strtotime settle their APIs. rsurl, parity reached, turns to hardening and the long tail of compatibility no-ops. puressh keeps filling in state machines past scaffolding. oxideav-render starts on the raycaster behind the scanline path.