Biff 2.0 sneak peek

Jacob O'Bryant | 20 Apr 2026

For the past year or two I've been working on some large Biff changes, such as those discussed in Structuring large Clojure codebases with Biff and Biff support for XTDB v2 is in pre-release. Now that coding agents have gone mainstream (and in particular, now that I personally have started using them heavily), I've had a few more ideas for changes I'd like to make to Biff. Also thanks to coding agents, I've been able to make consistent progress instead of my Biff development time being bottlenecked by how late I can stay awake on weekend nights after my kids are asleep. So, fingers crossed, we're getting close to some major Biff updates, and I figure I may as well slap a 2.0 label on it.

Here's what I've got in the works.

SQLite will be the default database

This is the biggest change. Biff will retain first-class support for XTDB, but it'll also have first-class support for SQLite, and I'll update the starter project to use SQLite by default. There will still be a (non-default) starter project that uses XTDB.

Biff has used XTDB since its (Biff's) initial release in 2020, back when the database was still called Crux. About a year ago I started working on migrating Biff from XTDB v1 to XTDB v2, which brings a whole new architecture, including column-oriented indexes that make analytical queries faster. Besides writing some Biff-specific helper code for XTDB v2, I migrated Yakread (a 10k-LOC article recommender system) to v2 and did a bunch of benchmarking for Yakread's queries. (A big thank you to the XTDB team who responded to lots of my questions during this time and also made a bunch of query optimizations!)

Long story short: despite the optimizations, I had trouble getting Yakread's page load times to be as quick as I wanted. For the particular queries Yakread runs—which are mostly row-oriented—I've generally found v2's performance to be slower than v1. There is also a larger per-query latency overhead, perhaps another design tradeoff of the new architecture (you can still run v2 as an embedded node within your application process, but it’s designed primarily to be run on a separate machine like more traditional networked databases).

I'll also admit that before this benchmarking exercise I hadn't actually used SQLite much, and I was unaware of how ridiculously fast it is. And one of the main downsides of SQLite compared to XTDB—that SQLite is a mutable database—is mitigated by Litestream, which streams changes to object storage and lets you restore from (and even run ad-hoc queries on) historical snapshots saved with 30-second granularity.

I could see myself switching back to XTDB at some point in the future. It's still the early days for v2 and the XTDB team is doing lots of work, including on query performance. And SQLite's speed comes with tradeoffs:

  • Scaling beyond one machine is an unsolved problem. LiteFS can let you put SQLite nodes in a cluster where writes get forwarded to a single leader and changes are streamed to the other nodes. However, to use it with Litestream, you have to disable automatic leader failover. So you basically have to choose between high availability (HA) and point-in-time recovery (PITR).

  • SQLite only supports a few basic datatypes: ints, floats, strings, and blobs (byte arrays). A large part of my work in integrating SQLite into Biff has been to set up automatic data type coercion so you can use richer types (UUID, boolean, instant, enum, map/set/vector) in your schema without having to do manual coercion when reading and writing.

  • Litestream's snapshots-at-30-second-granularity is fine for recovering from bad transactions like a DELETE FROM without the WHERE, but it's less helpful than XTDB/Datomic for the debugging-weird-production-issues use case: you can't include a transaction ID or similar in your production logs and then re-run queries with 100% confidence that the results you're seeing are what the application saw when it e.g. threw an unexpected exception.
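To illustrate the type-coercion point above, here's roughly the kind of translation biff.sqlite automates. This is a hypothetical sketch in plain Clojure, not Biff's actual implementation; the function names and the type-tag scheme are made up for illustration:

```clojure
(require '[clojure.edn :as edn])

;; Hypothetical sketch (not Biff's actual implementation) of coercing
;; rich Clojure values down to SQLite's primitive types and back.
(defn ->sqlite [x]
  (cond
    (uuid? x)    (str x)     ; UUIDs stored as strings
    (boolean? x) (if x 1 0)  ; booleans stored as 0/1 ints
    (inst? x)    (inst-ms x) ; instants stored as epoch millis
    (keyword? x) (str x)     ; enums stored as strings like ":a/b"
    (coll? x)    (pr-str x)  ; maps/sets/vectors stored as EDN text
    :else        x))

(defn <-sqlite [column-type x]
  (case column-type
    :uuid    (parse-uuid x)           ; requires Clojure 1.11+
    :boolean (= 1 x)
    :instant (java.util.Date. (long x))
    :enum    (keyword (subs x 1))     ; drop the leading ":"
    :edn     (edn/read-string x)
    x))
```

With the real thing, the idea is that this round-tripping happens automatically based on your schema, so application code never sees the primitive representations.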

I was chatting with Jeremy from the XTDB team last week, and he mentioned they've been working on having XTDB ingest changes directly from Postgres. It sounds like it shouldn't be much work to make that work with SQLite too, which means that you could stick an XTDB node alongside your SQLite-powered Biff app and then get more granular historical queries. Maybe XTDB could be a replacement for Litestream?

That could get even more interesting if eventually we can do the inverse as well, where data from our immutable XTDB log could be sent both to a bitemporal index for historical queries and also to SQLite "indexes"/databases for the application servers to use. That would solve the HA problem too.

Anyway. However it happens, I'm looking forward to the glorious future when we finally have an inside-out database that's fast for all query shapes, highly available, models time correctly, and can even do advanced things like let you put a UUID in it. In the meantime, I think SQLite is a reasonable default given Biff's focus on solo developers, and I would absolutely consider XTDB today for situations in which modeling time correctly is a top concern.

Alternate starter projects will get easier

Biff consists of a starter project, a bunch of helper code exposed through a single com.biffweb namespace, tooling for CLI tasks and deployment, and a big pile of documentation. The com.biffweb namespace is on its way out: I'll be publishing Biff helper code as individual libraries like com.biffweb.sqlite (and com.biffweb.xtdb), com.biffweb.authentication, com.biffweb.middleware, com.biffweb.config, etc.

Part of the motivation for this change is that Biff is more mature than it was five years ago, and it's become clearer what the different cohesive parts of Biff should actually be. I started out with a single kitchen-sink library because splitting it up felt premature; I didn't think it would realistically make sense to use any individual piece outside a standard Biff project, which would already be depending on all the Biff libraries anyway.

But over the past few months, I've been developing a couple new side projects from scratch without even using Biff. As I've done this, I've started extracting various things into standalone libraries, and this time I do see them as useful libraries in their own right. For example, the new biff.authentication library will be an easy way to add email-based authentication to any Clojure web app that uses Reitit—it even comes with a default sign-in page.

The other factor behind this change is agent-driven development. Mixing and matching different libraries has become dramatically easier, to the point where I briefly wondered if Biff was even needed anymore. Developing those new side projects via agent has disabused me of that notion: agents still need a lot of structure (e.g. in the form of these Biff libraries) to guide them. Even for starting new projects, why have everyone generate a different starter project via some prompt when you could have a single person generate the starter project, make sure it actually works, and then publish it?

Still, the change is meaningful: the effort required to create and maintain new project templates has decreased significantly. So I think it makes more sense for Biff to be split into multiple libraries that can themselves be mixed and matched. I'll provide official Biff starter projects for SQLite and XTDB, respectively. If anyone else wants to make a Biff starter project variant with different library choices, they'll similarly be able to do that without much effort.

For vanity reasons, I'll need to continue having a single "main" Biff repo of some sort (did I mention Biff hit 1,000 GitHub stars recently?). Maybe I'll have that repo be the default starter project.

New approaches for structuring application logic

Two of these Biff libraries that happen to contain some new stuff—instead of being a splitting-out of code that was already in Biff—are biff.graph, which lets you structure your domain model as a queryable graph, inspired by Pathom; and biff.fx, which helps you remove effectful code from your application logic via state machines.

Both libraries help you write purer code (and thus code that's easier to understand and test). biff.graph is a higher-level abstraction that helps with code that reads data; biff.fx is a lower-level tool that I mostly use when writing data. They're also useful together, however: e.g. my GET request handlers are typically biff.fx machines that run a biff.graph query and pass the results to the (now pure) rendering code:

(def some-route
  ["/some-page/:id"
   {:get
    (fx/machine ::some-page

      :start
      (fn [{:keys [path-params] :as request}]
        {:stuff [:biff.fx/graph
                 {:stuff/id (parse-uuid (:id path-params))}
                 [:stuff/foo :stuff/bar]]
         :biff.fx/next :render-stuff})

      :render-stuff
      (fn [{:keys [stuff] :as request}]
        {:status 200
         :headers {"content-type" "text/html"}
         :body (render-html
                [:div "foo: " (:stuff/foo stuff)
                 ", bar: " (:stuff/bar stuff)])}))}])

biff.fx provides a defroute macro to make this kind of thing more concise, so the code I actually write looks more like this:

(fx/defroute some-page "/some-page/:id"
  [:biff.fx/graph
   {:params/stuff [:stuff/foo :stuff/bar]}]

  :get
  (fn [request stuff]
    [:div
     "foo: " (:stuff/foo stuff)
     ", bar: " (:stuff/bar stuff)]))

I'll save a fuller explanation for later; hopefully that gives you the flavor of what these libs do.

I've been using Pathom heavily over the past few years, both for work and pleasure. I've started referring to the code structure it enables as “data-oriented dependency injection.” It helps you structure your application in small easy-to-understand chunks that declare exactly what data they need as input and what data they provide as output. The main downside in my experience is that it can be difficult to understand exactly what Pathom is doing and debug when things go wrong.

For “serious” projects, that's a price worth paying. For the kinds of solo projects that Biff is aimed at, I've felt apprehensive about foisting another layer of abstraction on people for code structure benefits that they may or may not notice.

However, my own experience is that even for small apps, the benefit is real. So biff.graph is an attempt to provide the same graph computational model / “data-oriented dependency injection” with as small of an implementation as possible: biff.graph is about 400 lines of code currently, whereas Pathom is closer to 10k.

The main tradeoff I've made in service of that goal is to omit the query planning step that Pathom uses. biff.graph traverses directly over your input query, looking up which resolver(s) to call for each attribute as it goes. For each resolver, biff.graph runs what is more or less a separate query to get that resolver's inputs. This hopefully makes it easier to trace and understand what biff.graph is doing, but it also means biff.graph can't optimize the query plan the way Pathom does. (It does at least support batch resolvers and caching.)
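A toy illustration of that planner-less traversal, with made-up names (the real biff.graph API will differ): each attribute maps directly to a resolver, and the query walker just calls them in order against the entity resolved so far.

```clojure
;; Toy sketch of planner-less graph traversal: no planning step, just a
;; direct lookup of which resolver to call for each requested attribute.
;; (Hypothetical code; not the actual biff.graph implementation.)
(def resolvers
  {:stuff/foo (fn [{:stuff/keys [id]}] (str "foo-" id))
   :stuff/bar (fn [{:stuff/keys [id]}] (str "bar-" id))})

(defn q [entity attrs]
  (reduce (fn [e attr]
            ;; each resolver receives the entity accumulated so far
            (assoc e attr ((resolvers attr) e)))
          entity
          attrs))

;; (q {:stuff/id 1} [:stuff/foo :stuff/bar])
;; => {:stuff/id 1, :stuff/foo "foo-1", :stuff/bar "bar-1"}
```

Because there's no planner, what runs is exactly what you can read off the query, at the cost of missed optimization opportunities.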

biff.fx is more of an original creation. Instead of a single function, you have a map of functions, one for each state. Effects happen in the transitions. You define global “fx handlers” that do things like HTTP requests, database queries/transactions, etc, represented by keywords (e.g. :biff.fx/graph in the example). I’ve changed up the format for describing effects a few times; I think I've finally landed on something that feels ergonomic ([:do-something arg1 arg2] as a replacement for (do-something! ctx arg1 arg2)).
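The [:do-something arg1 arg2] idea—effects described as plain data and executed by a central interpreter—can be sketched like this. Again, this is a hypothetical toy, not the actual biff.fx implementation:

```clojure
;; Toy effects-as-data interpreter (hypothetical; not the real biff.fx).
;; State functions are pure and return effect descriptions as vectors;
;; the driver looks up a handler for each effect keyword and runs it.
(def fx-handlers
  {:add (fn [a b] (+ a b))})

(defn run-effects [ctx result]
  (reduce-kv
   (fn [ctx k v]
     (assoc ctx k
            (if (and (vector? v) (contains? fx-handlers (first v)))
              (apply (fx-handlers (first v)) (rest v))
              v)))
   ctx
   result))

(defn run-machine [states ctx]
  (loop [state :start, ctx ctx]
    (let [ctx (run-effects ctx ((states state) ctx))]
      (if-some [next-state (:next ctx)]
        (recur next-state (dissoc ctx :next))
        ctx))))

;; (run-machine {:start (fn [_] {:sum [:add 1 2], :next :done})
;;               :done  (fn [{:keys [sum]}] {:result sum})}
;;              {})
;; => {:sum 3, :result 3}
```

The payoff is that the state functions stay pure and trivially testable, while the effectful plumbing lives in one place.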

Authorization rules are so back

Biff entered this world as a replacement for Firebase, which I had enjoyed using but which left me wanting a regular long-lived Clojure backend. Firebase lets your frontend submit arbitrary transactions, which are then checked against some centralized authorization rules you define (e.g. “documents in the stuff table can only be edited if the current user's ID is the same as stuff.user_id”). I implemented a similar thing where you would submit transactions in a format similar to Firebase's, then I would translate them to XTDB's transaction format and pass a diff of the database changes to your authorization functions.

I ended up abandoning the SPA approach altogether for server-side rendering (with htmx), and that made authorization rules unnecessary since transactions were originating from the backend: I no longer needed to validate completely arbitrary transactions.

Once again, coding agents have changed the game. When working on mature codebases, of course we all read our generated code carefully before submitting a pull request. But when I've got a new app idea, I want to mostly just vibe code it until I get to the MVP. I'd like to be able to do a light review just to make sure the structure of the code is reasonable. With authorization rules, you can carefully review those central rules in the same way you'd carefully review the database schema, and then you can have confidence that the feature code isn't missing an authorization check. (Of course you still have to make sure the agent didn't bypass the authorization rules...)
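As a sketch of what such centralized rules could look like (hypothetical code; neither Firebase's format nor Biff's actual API): one predicate per table, checked against a diff of each document before any transaction is allowed through.

```clojure
;; Hypothetical sketch of centralized write-authorization rules.
;; Each rule sees the before/after state of one document plus the
;; current user's ID, mirroring the "stuff.user_id" example above.
(def rules
  {:stuff (fn [{:keys [uid before after]}]
            (= uid (:user-id (or after before))))})

(defn authorized? [uid diffs]
  (every? (fn [{:keys [table] :as diff}]
            (if-some [rule (rules table)]
              (rule (assoc diff :uid uid))
              false)) ; tables without a rule are denied by default
          diffs))

;; (authorized? "u1" [{:table :stuff, :after {:user-id "u1"}}]) => true
;; (authorized? "u2" [{:table :stuff, :after {:user-id "u1"}}]) => false
```

The point is that these few lines are the thing you review carefully; feature code (or an agent) can then write transactions freely without each one needing its own ad-hoc check.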

This is only for writing data. For reading data, I typically have a few Pathom/biff.graph resolvers that e.g. read an entity ID from the incoming request's path parameters and ensure the user has access to that entity (like the :params/stuff resolver alluded to in the example above). Other related entities are queried as joins against that root entity, so if the authorization check fails, the rest of the query will fail too. So once again you have a way to put authorization logic in a central place that can be reused by your feature-specific code.
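A sketch of that read-side pattern, with hypothetical names and an in-memory map standing in for the database: the root resolver returns the entity only if the current user owns it, so every join hanging off it inherits the check.

```clojure
;; Hypothetical root-entity resolver doubling as an authorization check
;; (illustration only; not the actual biff.graph resolver API).
;; If the ownership check fails it returns nil, so any downstream joins
;; against the root entity fail too.
(def fake-db
  {1 {:stuff/id 1, :stuff/user-id "u1", :stuff/foo "hello"}})

(defn params-stuff [{:keys [uid stuff-id]}]
  (let [stuff (get fake-db stuff-id)]
    (when (= uid (:stuff/user-id stuff))
      stuff)))

;; (params-stuff {:uid "u1", :stuff-id 1}) ; => the entity
;; (params-stuff {:uid "u2", :stuff-id 1}) ; => nil
```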

oh yeah and datastar

As mentioned above, Biff uses htmx. I like server-side rendering and I think it's a particularly good fit for Biff's solo developer focus. htmx, however, has a critical flaw: it's too popular. It has 47k GitHub stars—that's half of what Tailwind has.

Datastar fixes this problem by being a much younger project—a niche of a niche. There is a much smaller chance that your colleagues will have heard of it. Datastar also has some smaller but still tangible benefits:

  • It has some frontend reactivity built in. With htmx, you typically use another tool like _hyperscript or Alpine.js to provide interactivity in cases where you really don't want to wait for a server roundtrip (e.g. a dropdown menu). Datastar has a concept of "signals" baked in so you don't need a second tool.
  • It has a smaller API surface; much of what htmx offers is replaced by "just use out-of-band swaps." So it might be easier to learn?
  • It works well for fancy CQRS stuff (still on my list of things to try out).

Of the changes I've mentioned, this one is the most experimental. I actually haven't even made an official decision if I really will switch Biff from htmx to Datastar; at this point I'm just making a prediction that I probably will.

More broadly, I would like to explore how far I can push the server-side rendering model before I feel it breaking down. E.g., what approach would I use for forms with 50+ fields, lots of conditional logic, and complex validation? How about charts? (What I'm getting at: would I regret asking an LLM to migrate our large codebase at work over to htmx/Datastar?)


I’d like to give an honorable mention to inline snapshot testing, which I’ve been excited about for a year and a half but now find unnecessary—counterproductive, even—with coding agents. I had started working on some updates to my test code so you could do inline snapshot tests in plain .clj files instead of in .edn files (turns out that tooling support is best when you put your code in files meant for code). But with coding agents, I’ve found that I don’t want tests that auto-update when the actual result changes: it’s too easy for agents to ignore new results that are obviously incorrect. And of course I don’t care if my coding agent finds updating unit tests to be tedious. So the test-related stuff that Biff does will be limited to making your application code more pure so you (or your agent) can write dumb (is (= (f x) y)) tests. I might add some structure/patterns for integration tests, though.

Another change driven by coding agents, not a change to the code but a change to my philosophy: I'm more interested in smaller projects. As mentioned, my time for working on personal projects has been extremely limited until a few months ago. I've only ever had a single Biff project at a time that I have attempted to work on regularly; new projects started after the old one failed. So the primary use case I designed Biff for was “serious side projects,” applications that may be solo projects now but will definitely be bringing in a 6-figure income and fulfilling all your entrepreneurial desires at... some point. That one project is the only thing I've ever had a chance of having time for.

Now I can code up an MVP for something over a weekend without ever sitting down at my desk. I built an app that helps me find good Star Wars: Unlimited decks to play. I'm building a blogging platform next. After that maybe I'll build a music recommender system. Or a state legislation tracker/summarizer.

I'm having a blast. Maybe that will affect design decisions I make down the road? I certainly am interested in the use case of doing agent-driven development from a mobile device, so maybe expect something in that area.
