Arsalan Zaidi's Blog

Leaky abstractions

TLDR: Abstractions are critical for reducing cognitive load, but you still need to know what’s going on under the hood.

The faulty plumbing behind leaky abstractions

Abstractions are critical for a software developer. There’s only so much complexity a human head can hold and the closer you are to the hardware, perversely the less you can do with it. If you’re forced to code right up against the silicon writing machine code, you’ll spend all your time and mental energy just getting simple input/output to work. All of the history of development has been about using the advantages of Moore’s law to climb up the abstraction ladder. We hide the details of how a sub-system or functionality works under a library, framework or language feature so we can ignore the specifics. We step further away from the guts of the system and climb higher up the ladder in order to get a wider view of the landscape. We try to treat functionality as opaque Lego blocks and the bigger the blocks we snap together, the bigger the structures we can build.

But there’s a problem with working with abstractions. They leak. The bones poke through and the details ooze out. As much as we want to ignore the specifics of the implementation and save our mental bandwidth for other work, we’re forced to confront the underlying details.

Drip drip drip…

Let’s take a few examples from real life to illustrate the problem. One of the classics is trying to abstract away the details of dealing with the network. Using RPCs or NFS, we attempt to reduce the complexity of dealing with an often unreliable network infrastructure. However, much like the monsters under our beds, the unreliability of the network doesn’t go away just because we decide to close our eyes and hide under the covers. Network calls may take longer than expected or fail completely, which means you need to be aware of those error conditions and handle them. To understand them, you need to be familiar with the details of making a call over the network. This defeats the purpose of the abstraction, since now you’re exposed to the underlying complexity anyway. Even if things work according to plan 90% of the time, a robust system has to handle the 10% when things go belly up.
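To make the leak concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: FlakyRemoteService stands in for an RPC stub and simulates timeouts instead of touching a real network, and call_with_retry is just one hypothetical way a caller might cope. The point is that a call which looks like a plain method call still forces the caller to think about timeouts, retries and giving up.

```python
import time


class FlakyRemoteService:
    """Hypothetical stand-in for an RPC stub. No real network is involved:
    it simply fails a fixed number of times before succeeding."""

    def __init__(self, failures_before_success):
        self.failures_left = failures_before_success
        self.calls = 0

    def get_user_name(self, user_id):
        # Looks like a local method call, but can fail like a network call.
        self.calls += 1
        if self.failures_left > 0:
            self.failures_left -= 1
            raise TimeoutError("simulated network timeout")
        return f"user-{user_id}"


def call_with_retry(fn, *args, attempts=3, base_delay=0.01):
    """Call fn(*args), retrying on network-style errors with exponential backoff."""
    last_error = None
    for attempt in range(attempts):
        try:
            return fn(*args)
        except (TimeoutError, ConnectionError) as err:
            last_error = err
            # Back off before the next try, but don't sleep after the last one.
            if attempt < attempts - 1:
                time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error


if __name__ == "__main__":
    service = FlakyRemoteService(failures_before_success=2)
    print(call_with_retry(service.get_user_name, 42), "after", service.calls, "calls")
```

None of the retry logic has anything to do with fetching a user name. It exists purely because the network is still there underneath the tidy method call.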

ORMs are another case in point. The promise is that you can ignore the details of the database and treat it like an object store. Using an ORM makes 80% of your work much easier (and those are the use cases you see in the demos!) but the final 20% can become significantly harder. Edge cases where you need to do complex joins or optimise particularly heavy queries will force you to drop into SQL to solve the problem. Debugging can become markedly more difficult too: you’re now one level removed from the actual database, and tracing a database error back through the ORM to the bug in your code can cause significant hair loss.
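Here is a rough sketch of one well-known way this bites, often called the N+1 query problem. It is not a real ORM: the Author class, its lazily loaded books property, the CountingDb wrapper and the author/book tables are all invented for this example, using only Python’s built-in sqlite3 module. The object-style code reads nicely but quietly runs one query per author, while the hand-written join gets the same answer in a single query.

```python
import sqlite3


class CountingDb:
    """Thin wrapper around an in-memory SQLite database that counts queries."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.queries = 0
        self.conn.executescript("""
            CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
            CREATE TABLE book (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
            INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
            INSERT INTO book VALUES (1, 1, 'A1'), (2, 1, 'A2'), (3, 2, 'B1'), (4, 3, 'C1');
        """)

    def run(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


class Author:
    """Hypothetical ORM-style object with a lazily loaded relationship."""

    def __init__(self, db, author_id, name):
        self.db, self.id, self.name = db, author_id, name

    @property
    def books(self):
        # Looks like attribute access, but hits the database every time.
        rows = self.db.run(
            "SELECT title FROM book WHERE author_id = ? ORDER BY id", (self.id,)
        )
        return [title for (title,) in rows]


def titles_via_lazy_objects(db):
    """Object-style access: 1 query for the authors plus 1 per author."""
    authors = [Author(db, *row) for row in db.run("SELECT id, name FROM author ORDER BY id")]
    return {author.name: author.books for author in authors}


def titles_via_join(db):
    """Dropping down to SQL: a single join returns the same data."""
    rows = db.run(
        "SELECT a.name, b.title FROM author a "
        "JOIN book b ON b.author_id = a.id ORDER BY a.id, b.id"
    )
    result = {}
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result


if __name__ == "__main__":
    db = CountingDb()
    titles_via_lazy_objects(db)
    print("lazy objects:", db.queries, "queries")
    db.queries = 0
    titles_via_join(db)
    print("join:", db.queries, "query")
```

With three authors the difference is trivial. With thirty thousand it is the sort of thing that sends you digging through query logs, which is exactly the detail the abstraction promised to hide.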

Put a bucket under it

So now that we understand the problem, what can we do about it? Other than whining on Substack of course! What impact should this have on the decisions we make?

We can’t do away with using an abstraction layer or layers. It’s a fundamental part of coding and it’s a very successful approach. However, we need to accept that we will need to dig at least one level down the abstraction stack. We might need to go two levels down in particularly gnarly cases, but that’s rare. This means we will need to have at least a basic understanding of the technology upon which the abstraction is built and be aware of its limitations and, to some extent, its complexities. We’ll have to take a call on whether the 80% of our work which becomes easier is worth the hassle of dealing with the 20% which will become more difficult.

For example, I personally feel that a full-fledged ORM like Hibernate gets in my way more than it helps, while something lighter like MyBatis or Spring’s JdbcTemplate is more to my taste. Recently, while deciding between Terraform and CloudFormation for my infrastructure-as-code needs, I decided to go with the lower-level complexity of using AWS CloudFormation directly. The reason? Looming deadlines and no time for shortcuts! I knew that if I went with Terraform instead of CloudFormation, I would have to deal with the intricacies of both.

Sometimes you just need to bite the bullet and drop down a level to get work done. Abstractions can be really useful, but good luck trying to ignore the piping hidden behind the walls.

#Software Development