Quamina v2.0.0

There’ve been a few bugfixes and optimizations since 1.5, but the headline is: Quamina now knows regular expressions. This is roughly the fourth anniversary of the first check-in and the third of v1.0.0. (But I’ve been distracted by family health issues and other tech enthusiasms.) Open-source software, it’s a damn fine hobby.

Did I mention optimizations? There are (sob) also regressions; introducing REs had measurable negative impacts on other parts of the system. But it’s a good trade-off. When you ship software that’s designed for pattern-matching, it should really do REs. The RE story, about a year long, can be read starting here.

Quamina facts ·

About 18K lines of code (excluding generated code), 12K of which are unit tests. The RE feature makes the tests run slower, which is annoying.
Adding Quamina to your app will bulk your executable size up by about 100K, largely due to Unicode tables.
There are a few shreds of AI-assisted code, none of much importance.
A Quamina instance can match incoming data records on my 2023 M2 Mac at millions per second without much dependence on how many patterns are being matched at once. This assumes not too many horrible regular expressions. That’s per-thread of course, and Quamina does multithreading nicely.

Next? · The open issues are modest in number but some of them will be hard.

I think I’m going to ignore that list for a while (PRs welcome, of course) and work on optimization. The introduction of epsilon transitions was required for regular expressions, but they really bog the matching process down. At Quamina’s core is the finite-automaton merge logic, which contains fairly elegant code but generally throws up its hands when confronted with epsilons and does the simplest thing that could possibly work. Sometimes at an annoyingly slow pace.
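To make the cost of epsilons concrete, here is a minimal sketch of NFA matching with epsilon closure. It is not Quamina's code: the `state` type, the `closure`, `matches` and `buildABStar` functions, and the tiny `ab*` automaton are all made up for illustration. The point is only that every input byte now triggers an extra walk over the epsilon edges.

```go
package main

import "fmt"

// state is a hypothetical NFA state, not Quamina's actual data structure.
type state struct {
	name     string
	next     map[byte][]*state // transitions that consume one byte
	epsilons []*state          // transitions that consume no input
	accept   bool
}

// closure returns the given states plus everything reachable from them
// through epsilon transitions alone. This is the extra work that a pure
// DFA never has to do.
func closure(start []*state) []*state {
	seen := make(map[*state]bool)
	var out []*state
	stack := append([]*state(nil), start...)
	for len(stack) > 0 {
		s := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if seen[s] {
			continue
		}
		seen[s] = true
		out = append(out, s)
		stack = append(stack, s.epsilons...)
	}
	return out
}

// matches reports whether the automaton rooted at start accepts input.
func matches(start *state, input []byte) bool {
	current := closure([]*state{start})
	for _, b := range input {
		var stepped []*state
		for _, s := range current {
			stepped = append(stepped, s.next[b]...)
		}
		// One closure computation per input byte.
		current = closure(stepped)
		if len(current) == 0 {
			return false
		}
	}
	for _, s := range current {
		if s.accept {
			return true
		}
	}
	return false
}

// buildABStar builds a small automaton for the regular expression "ab*".
func buildABStar() *state {
	end := &state{name: "end", accept: true}
	loop := &state{name: "loop", next: map[byte][]*state{}}
	loop.next['b'] = []*state{loop}
	loop.epsilons = []*state{end}
	return &state{name: "start", next: map[byte][]*state{'a': {loop}}}
}

func main() {
	nfa := buildABStar()
	for _, in := range []string{"a", "abbb", "ba"} {
		fmt.Printf("%q: %v\n", in, matches(nfa, []byte(in)))
	}
}
```

A real matcher would presumably cache or precompute closures rather than rebuild them for each byte; the sketch leaves that out on purpose.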

Having said that, to optimize you need a good benchmark that pressures the software-under-test. Which is tricky, because Quamina is so fast that it’s hard to feed it enough data to stress it without the feed-the-data code dominating the runtime and memory use. If anybody has a bright idea for how to pull together a good benchmark I’d love to hear it. I’m looking at b.Loop() in Go 1.24, any reason not to go there?
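For what it's worth, a b.Loop() benchmark might be shaped roughly like the sketch below, assuming Go 1.24 or later. `matchEvent` and `makeEvents` are hypothetical stand-ins, not Quamina functions. The events are built before the loop so that the feed-the-data work stays out of the measurement; as I understand the Go 1.24 documentation, b.Loop() does not time setup done before its first call.

```go
package main

import (
	"bytes"
	"fmt"
	"testing"
)

// matchEvent is a hypothetical stand-in for the code under test;
// a real benchmark would call the matcher here instead.
func matchEvent(event []byte) int {
	return bytes.Count(event, []byte(`"x"`))
}

// makeEvents pre-builds n distinct events so that generating input
// does not happen inside the timed loop.
func makeEvents(n int) [][]byte {
	events := make([][]byte, n)
	for i := range events {
		events[i] = []byte(fmt.Sprintf(`{"id": %d, "x": "y"}`, i))
	}
	return events
}

func benchmarkMatch(b *testing.B) {
	events := makeEvents(1024) // setup, done before the first b.Loop() call
	b.ReportAllocs()
	i := 0
	for b.Loop() {
		matchEvent(events[i%len(events)])
		i++
	}
}

func main() {
	// testing.Benchmark runs the function outside of "go test".
	result := testing.Benchmark(benchmarkMatch)
	fmt.Println(result)
}
```

In a real repo this would live in a `_test.go` file as `func BenchmarkMatch(b *testing.B)` and run under `go test -bench`. It does not solve the harder problem of producing enough varied data to stress the matcher; it only keeps the setup cost out of the timing.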

Book? · It occurs to me that as I’ve wrestled with the hard parts of Quamina, I’ve done the obvious thing and trawled the Web for narratives and advice. And, more or less, been disappointed. Yes, there are many lectures and blogs and so on about this or that aspect of finite automata, but they tend to be mathemagical and theoretical and say little about how, practically speaking, you’d write code to do what they’re talking about.

The Quamina-diary ongoing posts now contain several tens of thousands of words. Also I’ve previously written quite a bit about Lark, the world’s first XML parser, which I wrote and which was automaton-based. So I think there’s a case for a slim volume entitled something like Finite-state Automata in the Code Trenches. It’d be a big money-maker, I betcha. I mean, when Apple TV brings it to the screen.

Why? · Let’s be honest. While the repo has quite a few stars, I truly have no idea who’s using Quamina in production. So I can’t honestly claim that this work is making the world better along any measurable dimension.

I don’t much care because I just can’t help it. I love executable abstractions for their own sake.

January 20, 2026

By Tim Bray.
