Asynchronous task execution at scale

Where asynchronous task execution fits into a larger system is something I’ve been pondering for a while. It is an interesting class of problem, sitting at an unusually strong intersection of operability, guarantees, and fundamental system design principles. Much like system design itself, two extremes bookend the spectrum of common solutions.

On one end, we treat asynchronous task execution as a purely local concern. In this model, every framework, subdomain, or even service implements its own asynchronous task execution as just another part of how that subsystem runs. Operations likewise become a local concern.

On the other end, asynchronous execution is a fundamental property of the entire system. We tightly integrate frameworks and use them as first-tier hosts of the system. If that description is too abstract, Celery is the Python world’s canonical example. This design is familiar to most as the de facto approach in most monoliths[1]. While tightly coupled, it does offer some benefits compared with the localized design. Most prominently, it makes centralizing operations and tooling natural[2].
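
As a minimal illustration of that coupling, here is a generic Celery sketch of my own, not taken from any particular system; the broker URL and task are invented for illustration:

    # tasks.py – a hypothetical Celery application; broker URL and task are made up.
    from celery import Celery

    # The application itself hosts the task definitions and is wired directly to a
    # specific broker: the execution machinery is part of the framework running the code.
    app = Celery("tasks", broker="redis://localhost:6379/0")

    @app.task
    def send_welcome_email(user_id: int) -> None:
        # Business logic lives right next to the execution machinery.
        print(f"sending welcome email to user {user_id}")

    # Callers enqueue work by importing the task and calling .delay(), which couples
    # them to the same Celery app, broker, and deployment.
    # send_welcome_email.delay(42)

The upside is the one noted above: one broker and one set of workers to operate, monitor, and build tooling against.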

In between these two extremes lies a myriad of possible designs. In practice, though, the trade-offs become very nuanced. This kind of work is therefore rarely released, and little has been published about the topic. That only makes it all the more exciting that Dropbox yesterday published a design paper on their solution to this problem at scale. More exciting still, they went with my current personal local optimum: a loosely coupled service that relies on well-defined boundaries rather than shared implementations to execute tasks as if they were any other type of invocation.

This design beautifully centralizes the operations of scheduling and invocation while leaving decisions about, and operations of, execution localized — in other words, it localizes operational expertise and system impact. Specific implementation details, such as the polling-heavy design, clearly result from a particular set of trade-offs. But, at a high level, I genuinely believe this design offers a clear improvement over most approaches I’ve seen in the wild, namely by separating concerns both technically and organizationally. While nuances remain nuances, it would be truly exciting to see this shape of solution generalized well and open-sourced as a component in our shared toolbox.
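
To make that shape a bit more concrete, here is a toy sketch of my own – not Dropbox’s implementation – in which a central component owns scheduling and invocation, and each owning service sits behind an invented HTTP callback contract that it alone knows how to execute:

    # A toy sketch of the loosely coupled shape described above. The queue contents,
    # callback URL, and JSON contract are all invented for illustration.
    import json
    import time
    import urllib.error
    import urllib.request

    # Central side: a store of (due_time, callback_url, payload) records.
    # A real system would use a durable, replicated store, not an in-memory list.
    task_queue = [
        (time.time() + 5, "http://thumbnails.internal/tasks/resize", {"photo_id": 123}),
    ]

    def poll_and_invoke():
        """Polling-style dispatch: pick up due tasks and invoke their callbacks."""
        while True:
            now = time.time()
            for task in list(task_queue):
                due_time, callback_url, payload = task
                if due_time > now:
                    continue
                request = urllib.request.Request(
                    callback_url,
                    data=json.dumps(payload).encode("utf-8"),
                    headers={"Content-Type": "application/json"},
                )
                try:
                    # The scheduler knows nothing about *how* the task runs; it depends
                    # only on the callback contract (here, HTTP 200 means success).
                    with urllib.request.urlopen(request) as response:
                        if response.status == 200:
                            task_queue.remove(task)
                except urllib.error.URLError:
                    # Leave the task in place; it will be retried on the next poll.
                    pass
            time.sleep(1)

The owning service’s side is just an ordinary request handler; what the two sides share is the boundary, not a framework, which is what keeps the operational expertise and blast radius of execution local.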

Regardless, it’s incredibly exciting to see companies publish their solutions to very big design problems.


  1. At times, this is the implementation approach in a system where asynchronous task execution is solved locally. The big difference is obviously that the implementation is not centralized. When blowing out a monolith, this is a piece that is often kept stable and just localized instead. Dozens of Celery clusters are a special kind of scary to me. 

  2. Centralized along with everything else, of course. 

November 12, 2020

Stay in the game

I have pointed a few people to this story over the last month or so. There isn’t any one passage that stands out. It’s just a simultaneously incredibly frightening and heartwarming story about the nature of humans.

August 5, 2019

Better by default

Going through some old notes on technology choices, I came across Jason Moiron’s excellent commentary on iron.io’s performance gains after their switch to Go a few years ago:

People have commented that these savings could have been gained from writing critical sections in C or by going over the original code with an eye for performance. Putting the obvious parallelization benefits that Go has over most other languages aside, the point they’re missing here is that you more or less achieve these results by writing normal Go code.

While I still think Go lacks a lot in terms of language features that make productivity a priority, it’s hard to dispute that, for a number of cases, Go is better by default. And while my notes took Moiron’s point to heart, I still chose Python for the bulk of the backend work of Audacious for now, for exactly that productivity reason. There are presently a few services written in Go for critical paths, but for the most part, Go is incredibly unproductive, especially for simple Web service work, which is, by and large, CRUD with one or two bells and whistles.

June 12, 2019

Women in the room

Zach Holman makes a very solid point in the wake of the most recent wave of realisations that the tech industry is filled with misogynistic shitheads. The “throw women at the problem” approach to fixing the tech industry simply isn’t the right way to go about it, and it obviously isn’t working. It has to start somewhere else.

Fundamentally, the assholes who ruin it for everyone else need to either go, or change their behavior in major ways and own up to their past mistakes in a genuine manner. I must say, I’m pretty pleased to see a few of the most gnarly examples I’ve had the chance to interact with get theirs in recent times – including a few who have been smelling their own farts hard in public.

However, while this is all well and good, it shouldn’t be the women’s job to make this happen. There are enough “good guys” in this industry that we should be able to call these people out and solve it within our own “ranks”[1]. Or at least, that’s how it should be – and we should all be a bit embarrassed that this hasn’t been the case. Sure, that gun-wielding founder is mercurial and scary and all, but whatever happened to doing the right thing? The industry as a whole spends so much time talking about “changing the world for the better” that surely this must be part of the agenda too.

I mean, the whole industry can’t possibly be this deranged… right?


  1. No, I don’t mean that literally. The point here is: it shouldn’t be the job of women alone to call out misbehaving men. 

July 3, 2017

A much needed grain of salt

I remember reading Malcolm Gladwell’s “Outliers” a few years back and being unable to shake the eerie feeling that the whole premise was a bit too much “pseudo” and not enough “science.” The book is by all accounts an interesting and somewhat thought-provoking read on some of the factors that make up a select few “successful people.” But most of its conclusions were drawn from Fox News-esque summaries of research, and it left one wondering how heavily the entire book was influenced by selection bias.

Now, I know the book was never made out to be a scientific study of any sort, but it seems like Gladwell’s persuasiveness got the better of a lot of readers, as I’ve kept hearing the fabled “10,000 hour rule” – Gladwell’s rule that if you put in 10,000 hours of practice at something, you’ll become an expert at it – stated as fact ever since the book came out. As it turns out, it’s nowhere near as simple as that. In fact, earlier this year, a book excerpt by Anders Ericsson – the main author of “The Role of Deliberate Practice in the Acquisition of Expert Performance,” the study Gladwell based his rule on – saw Ericsson explain the extent to which Gladwell is incorrect (spoiler alert: Gladwell took a single average and made it a rule). The deconstruction of Gladwell’s Beatles example in particular is eye-opening, if you didn’t already balk at it in the book. But, for a wider audience, I think the distinction between deliberate and generic practice is probably the most important part of Ericsson’s rebuttal:

This distinction between deliberate practice aimed at a particular goal and generic practice is crucial because not every type of practice leads to the improved ability that we saw in the music students or the ballet dancers.

In other words, even if someone has been doing something for 10 hours a day for 20 years, that alone does not make them an expert – it’s their approach to it that does. This is not at all to say that I think Gladwell’s work should be dismissed altogether. It’s merely to say that I wish people would stop and think about what they read and are told, rather than just accepting it all at face value. For now, though, the “10,000 hour rule” serves as a good indicator of when to walk away from a conversation, or to counter with such equally fascinating and scientific facts as the average of eight spiders consumed in one’s sleep each year.

July 17, 2016