Biff September updates: Clojurists Together, documentation, in-memory queues

Two months ago I mentioned I had some plans for adding a bunch of documentation to Biff:

Right now Biff only has reference docs. I want to add a lot more, such as:

  • A series of tutorials that show you how to build some application with Biff, step-by-step. Perhaps a forum + real-time chat application, like Discourse and Slack in one.
  • A page that curates/recommends resources for learning Clojure and getting a dev environment set up. Aimed at those who are brand new to Clojure and want to use it for web dev. If needed I might write up some of my own articles to go along with it, though I'd prefer to curate existing resources as much as possible.
  • A series of tutorials/explanatory posts that teach the libraries Biff uses. [...] This is intended for those who prefer a bottom-up approach to learning, or for those who are familiar with Biff and want to deepen their understanding.

As part of that, I plan to restructure the website, while taking lessons from The Grand Unified Theory of Documentation into account.

This was secretly a copy-and-paste-and-slight-edit of my Clojurists Together application, which has been funded! (The grants were announced the day after my last monthly update went out, which is why I'm mentioning this a little late.) Huge thanks to them and everyone who donates! Also huge thanks to JUXT for their continuing sponsorship of Biff.

Documentation

I mentioned in my application that this is a long-term project (especially the third bullet), and so with the funding I'm mainly planning to complete at least the first bullet (the forum tutorial) along with the website restructuring. And then we'll see how far I get into the other bullet points. They'll happen eventually in any case.

Last month I completed the website restructuring. It's very spiffy. Previously the reference docs were on a big single-page thingy rendered with Slate, and the API docs were rendered with Codox. Now I've written custom code to render both of those alongside the rest of the Biff website. The site is more cohesive now, and it will be easier to add additional documentation sections. Currently there are three sections ("Get Started", "Reference", and "API"); ultimately I plan to have the following sections:

  • Get Started
  • Tutorial (i.e. the forum tutorial)
  • Reference
  • How-To
  • API
  • Background Info (this might have essays about design philosophy, for example)
  • Learn Clojure*

*About the last point: I'm currently waffling over whether this should stay as a single page under the "Get Started" section, or if I should combine it with my plans for "a series of tutorials/explanatory posts that teach the libraries Biff uses" as mentioned above. That is, if I do actually get around to writing a mini book/course thing that teaches Biff from the ground up (e.g. "here's how to start a new project", "here's how to render a static site", and so on), maybe it will be natural to make it accessible for people who are brand new to Clojure. 🤷‍♂️ No need to make a decision now I guess.

v0.5.0: in-memory queues

I cut a new Biff release:

  • Biff's XTDB dependency has been bumped to 1.22.0.
  • add-libs is now used to add new dependencies from deps.edn whenever you save a file; no need to restart the REPL.
  • Biff's feature maps now support a :queues key, which makes it convenient to create BlockingQueues and thread pools for consuming them:
(defn echo-consumer [{:keys [biff/job] :as sys}]
  (prn :echo job)
  (when-some [callback (:biff/callback job)]
    (callback job)))

(def features
  {:queues [{:id :echo
             :n-threads 1
             :consumer #'echo-consumer}]})

(biff/submit-job sys :echo {:foo "bar"})
=>
(out) :echo {:foo "bar"}
true

@(biff/submit-job-for-result sys :echo {:foo "bar"})
=>
(out) :echo {:foo "bar", :biff/callback #function[...]}
{:foo "bar", :biff/callback #function[...]}

I added these since I have a bunch of background job stuff in Yakread and it was getting out of hand. Especially since Yakread uses some JavaScript and Python code (specifically, Readability, Juice, and Surprise—they're opened as subprocesses, and communication happens over pipes) and I want to make sure there isn't more than one Node/Python process running at a time.

So far I've set up a queue + consumer for doing recommendations with Surprise (with more queues to come next week). Each job it receives has a user ID and a set of item IDs. The consumer opens a Python subprocess which loads the recommendation model into memory, takes in the user ID + item IDs over stdin, and spits out a list of predicted ratings on stdout. The queue consumer keeps the subprocess open until all the jobs currently on the queue have been handled.
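
The "keep the subprocess open until the queue is drained" idea can be sketched with plain Java queues. This is not Yakread's actual code: `handle-batch!`, `open-resource`, `close-resource`, and `handle-job` are hypothetical names, and the subprocess is replaced by a keyword so the example runs on its own.

```clojure
(import '(java.util.concurrent BlockingQueue LinkedBlockingQueue))

(defn handle-batch!
  "Blocks until one job is available, opens the resource once, then keeps
  handling jobs until the queue is empty. Returns the number of jobs handled.
  `open-resource` and `close-resource` are hypothetical stand-ins for
  starting and stopping the Python subprocess."
  [^BlockingQueue queue open-resource close-resource handle-job]
  (let [first-job (.take queue)
        resource  (open-resource)]
    (try
      (loop [job first-job
             n   0]
        (if (nil? job)
          n
          (do
            (handle-job resource job)
            ;; .poll returns nil immediately when the queue is empty,
            ;; which ends the loop.
            (recur (.poll queue) (inc n)))))
      (finally
        ;; The resource is closed even if a job throws.
        (close-resource resource)))))

;; Usage: three jobs share one "subprocess".
(def queue (LinkedBlockingQueue.))
(def opened (atom 0))
(def closed (atom 0))
(def handled (atom []))

(doseq [job [{:user 1} {:user 2} {:user 3}]]
  (.put queue job))

(def n-handled
  (handle-batch! queue
                 (fn [] (swap! opened inc) :fake-subprocess)
                 (fn [_resource] (swap! closed inc))
                 (fn [_resource job] (swap! handled conj job))))
```

The expensive part (loading the model) happens once per batch rather than once per job, and the resource is released as soon as the queue is empty.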

Having a priority queue will also be handy. Some of the recommendations happen in batch once per day, to make sure users always have something fresh (made with an up-to-date model) ready to go. But Yakread also needs to make additional recommendations while people use the app. For the latter, jobs can be given a higher priority, so they'll still get done quickly even if we're in the middle of a large batch thing.
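
As a rough illustration of the priority idea, here is a sketch using `java.util.concurrent.PriorityBlockingQueue` directly. The `:priority` key, its default of 10, and the `priority-queue` helper are assumptions made for this example; they are not taken from Biff's API.

```clojure
(import '(java.util.concurrent PriorityBlockingQueue))

(defn priority-queue
  "Returns a queue that hands out the job with the lowest :priority first.
  The :priority key and its default of 10 are assumptions for this sketch."
  []
  (PriorityBlockingQueue.
   11 ; initial capacity; the queue grows as needed
   (fn [a b]
     (compare (:priority a 10) (:priority b 10)))))

;; Usage: two batch jobs are already waiting when an interactive job arrives.
(def q (priority-queue))
(.put q {:id :batch-1})
(.put q {:id :batch-2})
(.put q {:id :interactive :priority 1})

;; The interactive job jumps ahead of the batch jobs.
(def first-out (.take q))
```

One caveat: `PriorityBlockingQueue` makes no ordering guarantee among jobs with equal priority, so batch jobs are not necessarily handled in submission order.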

(Eventually I'd really like to replace all the Python/JavaScript stuff with Clojure code so it takes fewer resources, but it's just not worth it at this stage.)

I wondered whether I should try to make something like yoltq but for XTDB instead of Datomic, so jobs could be persisted to the database, in order to facilitate retries + distributing to separate worker machines. I decided to stick with the current minimal in-memory implementation since that really is all I need personally at the moment. Persistence can be added to these queues from application code, though. All you have to do is:

  1. Instead of calling biff/submit-job directly, save the job as a document in XTDB.
  2. Create a transaction listener which calls biff/submit-job whenever a job document is created.
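
The shape of those two steps can be shown without a database. In the sketch below an atom stands in for XTDB, `add-watch` stands in for a transaction listener, and a `LinkedBlockingQueue` stands in for the queue that biff/submit-job would feed. All of those substitutions, along with the `save-job!` helper and the `:status` key, are assumptions made so the example runs on its own; a real setup would use XTDB's own transaction and listener APIs.

```clojure
(import '(java.util.concurrent LinkedBlockingQueue))

;; Stand-in for XTDB: a map of document ID -> job document.
(def db (atom {}))

;; Stand-in for the in-memory queue that biff/submit-job would feed.
(def queue (LinkedBlockingQueue.))

(defn save-job!
  "Step 1: persist the job instead of submitting it directly."
  [job]
  (swap! db assoc (:id job) (assoc job :status :pending)))

;; Step 2: a listener that enqueues every newly created job document.
;; Documents that already existed before the change are skipped.
(add-watch db ::enqueue-new-jobs
           (fn [_key _ref old-docs new-docs]
             (doseq [[id doc] new-docs
                     :when (not (contains? old-docs id))]
               (.put queue doc))))

;; Usage: saving a job causes it to show up on the queue.
(save-job! {:id 1 :foo "bar"})
(def queued (.poll queue))
```

Because the job document exists before anything is enqueued, a crash between the two steps loses nothing: the document is still there to be picked up later.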

The sky's the limit from there, I guess. You could:

  • On startup, load any unfinished jobs into the appropriate queues.
  • Add a wrapper to your consumer functions which catches exceptions and marks the job as failed (or marks the job as complete if there isn't an exception).
  • Create another transaction listener that watches for failed jobs and puts them into a DelayQueue for retrying.
  • Add a scheduled task that retries any jobs which have been in-progress for too long.
  • Scale out to a degree by creating a separate worker for each queue (there's a :biff.queues/enabled-ids option for this; you can specify which queues should be enabled on which machines).
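
The consumer wrapper from the list above might look something like this. `wrap-job-status` and `mark-job!` are hypothetical names: in a real app `mark-job!` would update the job document in the database, whereas here it writes to an atom so the sketch runs standalone.

```clojure
(defn wrap-job-status
  "Wraps `consumer` so every job ends up marked :complete or :failed.
  `mark-job!` is a hypothetical function of [job status]; in a real app it
  would update the persisted job document."
  [consumer mark-job!]
  (fn [{:keys [biff/job] :as ctx}]
    (try
      (let [result (consumer ctx)]
        (mark-job! job :complete)
        result)
      (catch Exception _
        ;; Swallow the exception so the worker thread keeps running;
        ;; the :failed status is what triggers any retry logic.
        (mark-job! job :failed)
        nil))))

;; Usage: one job succeeds, one throws.
(def statuses (atom {}))

(defn mark-job! [job status]
  (swap! statuses assoc (:id job) status))

(def safe-consumer
  (wrap-job-status
   (fn [{:keys [biff/job]}]
     (when (:explode job)
       (throw (ex-info "boom" {:job job})))
     :ok)
   mark-job!))

(def ok-result (safe-consumer {:biff/job {:id 1}}))
(def failed-result (safe-consumer {:biff/job {:id 2 :explode true}}))
```

A listener watching for `:failed` jobs could then decide whether to retry, which keeps retry policy out of the consumers themselves.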

The big thing you can't do (at least, not well) is have the same queue be consumed by multiple machines. See yoltq's Limitations section, which would also apply to an equivalent XTDB setup. My plan is that when I get to the point where I need more than one machine to consume a single queue (either for throughput or for high availability), I'll just throw in a Redis instance or something and use an already-written job queue library.

As such, while there is more functionality which could be built on top of Biff's in-memory queues, I'm not sure how much of it is really needed. We'll see.

Here's the implementation for anyone who would like to peruse the code.

Roadmap

  1. This weekend I plan to make some more code updates. Mainly I'll replace the task shell script with bb tasks, so that the task implementations can be stored as library code instead of needing to be copied into new projects.
  2. After that I'll work on the forum tutorial discussed above until it's complete. (This might happen mostly in November since baby #2 will arrive in a few weeks.)
  3. Then I'll add various other documentation, like a page with a curated list of resources for learning Clojure, some how-to articles, a reference page for the new queues feature, maybe an essay or two.
  4. Finally get a public Platypub instance deployed, and make some usability improvements in general. Update the GitHub issues, make a roadmap, and write some contributor docs so it's easier for people to help out.
  5. Take the forum thing mentioned in #2 and turn that into a real-world, useful application like Platypub. Unlike Platypub, I intend the forum to primarily be an educational resource (as the subject of a tutorial), but I do also think it would be fun to have a lightweight Slack/Discourse/Discord/etc alternative to experiment with.
  6. Start working on that "Learn Clojure/Biff from scratch" project I discussed above, unless I think of something better to work on by the time I get through #1-#5.

This should last me well into next year.

Meetups

We had two meetups in September: first we played around with Babashka tasks, and then we attempted to use the Fly Machines API as a sandbox for untrusted code. The second one turned out to be at the precise moment that Fly's Machines API was experiencing downtime, so the latter half of the recording might not be very interesting to watch 🙂.

There will be one meetup in October, on Thursday the 13th (RSVP here). After that, it's up in the air due to the "baby #2" thing mentioned above.

Reminders

  • Come chat with us in #biff on Clojurians Slack if you haven't joined already. Also feel free to create a discussion thread about anything.
  • Again, thank you to everyone who sponsors Biff. If you'd like to help support Biff, please consider becoming a sponsor too.
  • I'm available for short-term consulting engagements; email me if you have a project you'd like to discuss.

Published by Jacob O'Bryant on 3 Oct 2022
