Infinitely Many Consumers

When you’re writing shared code, do you ever consider how many developers might end up using it? Just a few? Tens? Hundreds? You might be building a library, which could drive that number even higher - thousands, tens of thousands, more. What if I told you that your code will have infinitely many consumers, who will use it in infinitely many ways?

Murphy’s Law states that “if something could go wrong, it will go wrong”. In software engineering we often invoke it when looking at systems. If a system is brittle and flawed, those flaws will come out eventually and end up paging someone at 2am on a Sunday. We can apply it more granularly to code-level abstractions, too. If a piece of code could throw an Exception, it probably will at some point.

Murphy’s Law and API Design

Murphy’s Law applies to most settings if we think hard enough - what about API design? When writing shared code, it’s easy to think only about the immediate context: I work with five other developers in a single codebase, and we all want to be able to access some kind of functionality. I'll probably think about how my abstraction scales: if the company onboards another 20 developers who might want more from it, is it still fit for purpose? If the context is a big enough organisation, or an open-source project, I might have to broaden my thinking to consider multiple projects, hundreds of developers, different languages.

The more developers you have using your code, the higher the likelihood that someone, somewhere, will use it in an unintended way. This is Murphy’s Law applied to shared code - if your code can be used incorrectly, it will be, eventually. This misuse will slow your consumers down, and you’ll end up with bug reports, as well as many, many DMs.

When designing shared abstractions, I’ve taken to expanding my view of the ecosystem it’ll exist in to be as wide as possible. If my code will have infinitely many consumers, any code which is prone to misuse will cause a smaller infinity of bug reports.

An example: Powerful higher-order functions

In my team at work, we build libraries to support a bunch of common functionality in Android and iOS apps across the company. We’ve recently built a Telemetry library - a general-purpose pipeline for sending nebulous event data to any number of places.

Our first attempt at a design ended up producing a type called a Tracker. This tracker is just a function under the hood, taking Event.Details (the nebulous data), and returning a Receipt (where that data went). We ended up with a concept of a "modifier" - a function that modifies that nebulous data before it gets to where it was going. This let us write useful functionality on top of it - stream processing functions like filter and distinct. All kinds of functionality could fit into the generic (Event.Details) -> Event.Details shape we’d produced.

These modifiers seemed great. If a developer needed more than the out-of-the-box stream processing functions, they could write their own and plug it in. In testing, however, we noticed a problem - it was possible to do unexpected, or even adversarial, things with these modifiers. If I wanted to troll a colleague, I could install a modifier on their tracker which throws away all their event details. It was clear that if we had enough developers writing their own modifiers, someone was bound to write one incorrectly.
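As a rough sketch of why the modifier shape is so permissive, here is one way it could be modelled. Everything below is a hypothetical stand-in rather than the library's real code: the contents of Event.Details and Receipt, the fun interface used for Tracker, and the names modifiedBy, distinctDetails, dropEverything and ScreenViewed are all assumptions made for illustration.

```kotlin
// Hypothetical stand-ins: the post does not show the library's real definitions.
object Event {
    interface Detail
    data class Details(val items: List<Detail>)
}

// Assumed shape: a receipt simply records which details were delivered.
data class Receipt(val delivered: List<Event.Detail>)

// A single-method interface, so a tracker is effectively just a function.
fun interface Tracker {
    fun track(details: Event.Details): Receipt
}

// Installing a modifier wraps the tracker: the details are rewritten first,
// then handed on to the original tracker.
fun Tracker.modifiedBy(modifier: (Event.Details) -> Event.Details): Tracker {
    val downstream = this
    return Tracker { details -> downstream.track(modifier(details)) }
}

// A well-behaved modifier: removes duplicate details.
val distinctDetails: (Event.Details) -> Event.Details =
    { details -> Event.Details(details.items.distinct()) }

// An adversarial modifier with exactly the same shape: throws everything away.
val dropEverything: (Event.Details) -> Event.Details =
    { Event.Details(emptyList()) }

// Hypothetical detail type used to exercise the sketch.
data class ScreenViewed(val name: String) : Event.Detail
```

Both modifiers have the same type, so nothing in the signature separates the helpful one from the harmful one. A tracker wrapped with dropEverything still returns a Receipt, just an empty one.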

API Holes, and how to plug them

I’ve taken to calling these broad and potentially error-prone patterns “API holes”: gaps in your API that let a consumer get themselves into trouble. Our event modifiers made it easy to do more than you probably expect, so we classed them as API holes! Upon realising the hole we'd exposed, we considered its root cause, and came to the conclusion that our modifier wasn't specific enough.

Information that goes into and comes out of a modifier (Event.Details) is nebulous - it contains all manner of different types that the implementer of the modifier can't know about. User-facing modifiers should only be able to operate on types they know about. Let's look at an example.

Earlier I mentioned stream processing functions, which is something we found valuable in our Telemetry library. We were in need of a filter function: something that can drop portions of the nebulous Event.Details given a precondition. A problematic implementation of a filter function might look a little like this:

fun Tracker.filter(predicate: (Event.Detail) -> Boolean): Tracker

At the call site, developers would then be able to type-check the detail they care about dropping and perform any extra logic:

tracker.filter { detail -> 
    detail is ThingToFilter && detail.someCondition() 
}

While this looks elegant enough, it would be possible to filter too many things:

tracker.filter { detail -> 
    detail is ThingToFilter || true // Whoops, no more details
}

Even though the predicate here only references one type explicitly (ThingToFilter), it is able to drop details it doesn't explicitly reference! Turns out, this is problematic when tens or hundreds of developers are piping their own types through the same Tracker.
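To see where the hole comes from, here is a minimal sketch of how such an untyped filter might be implemented. The type definitions, the fun interface for Tracker, and the SomeoneElsesDetail type are assumptions made for illustration; only the filter signature and ThingToFilter come from the snippets above. As in those snippets, a detail is dropped when the predicate returns true.

```kotlin
// Hypothetical stand-ins: the post does not show the library's real definitions.
object Event {
    interface Detail
    data class Details(val items: List<Detail>)
}

data class Receipt(val delivered: List<Event.Detail>)

fun interface Tracker {
    fun track(details: Event.Details): Receipt
}

// The type a developer actually means to filter.
data class ThingToFilter(val noisy: Boolean) : Event.Detail {
    fun someCondition(): Boolean = noisy
}

// A type owned by another team, piped through the same tracker.
data class SomeoneElsesDetail(val id: Int) : Event.Detail

// Untyped filter: the predicate is asked about every detail, whatever its type,
// so it decides the fate of details its author has never heard of.
fun Tracker.filter(predicate: (Event.Detail) -> Boolean): Tracker {
    val downstream = this
    return Tracker { details ->
        downstream.track(Event.Details(details.items.filterNot(predicate)))
    }
}
```

With this sketch, the careless `|| true` predicate drops SomeoneElsesDetail along with every ThingToFilter, because the library hands the predicate every detail and trusts its answer.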

A better implementation which is more specific would look like this:

fun <Detail: Event.Detail> Tracker.filter(predicate: (Detail) -> Boolean): Tracker

When a developer then calls this function they have to specify exactly the type they want to operate on. The system doesn't allow them to step outside of that typed boundary:

tracker.filter<ThingToFilter> { detail: ThingToFilter -> 
    detail.someCondition()
}
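One way the typed version could be implemented in Kotlin is with an inline function and a reified type parameter, so that the library performs the type check instead of the caller. This is an assumption: the post only shows the signature, and the real implementation may differ. The type definitions and SomeoneElsesDetail are again hypothetical stand-ins.

```kotlin
// Hypothetical stand-ins: the post does not show the library's real definitions.
object Event {
    interface Detail
    data class Details(val items: List<Detail>)
}

data class Receipt(val delivered: List<Event.Detail>)

fun interface Tracker {
    fun track(details: Event.Details): Receipt
}

data class ThingToFilter(val noisy: Boolean) : Event.Detail {
    fun someCondition(): Boolean = noisy
}

data class SomeoneElsesDetail(val id: Int) : Event.Detail

// Typed filter: `reified` keeps the type argument available at runtime, so the
// `is` check lives inside the library rather than inside the caller's predicate.
inline fun <reified Detail : Event.Detail> Tracker.filter(
    crossinline predicate: (Detail) -> Boolean
): Tracker {
    val downstream = this
    return Tracker { details ->
        // Details of any other type never reach the predicate and always pass through.
        val kept = details.items.filterNot { it is Detail && predicate(it) }
        downstream.track(Event.Details(kept))
    }
}
```

Even a predicate that always returns true can now only remove ThingToFilter details; a SomeoneElsesDetail in the same batch is untouched.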

The more specific you can be with your public API, the more expected the behaviours are, and the lower the opportunity for misuse becomes!

Given this realisation, we made modifiers internal, and only exposed simple, well-tested, predictable and specific stream processing functions. This cut down the surface area for misuse, and gave us more confidence that everything we offer works only as intended.

Conclusions

When you’re writing shared code, assume an infinite number of developers will use it. Use that view to scrutinise your public API. Plug any API holes you’ve inadvertently created, or if they’re necessary, sign-post the risks associated with using them. You’ll never have infinitely many consumers, but if you design your APIs with infinity in mind, it’ll never not scale.

Jamie Sanson

London