Software development is a profession that often attracts those of a logical persuasion. It’s not uncommon for that reasoned view of the world to come with a deep need to find the correct answer to any given problem, even in the greyest of circumstances. It’s a point of view that I certainly started out my career with.
However, the truth is that the more complex the problems become, the less likely it is that there’s a “right” answer. More often there’s a set of perfectly valid options, each with its own drawbacks and benefits. The best choice usually depends on your own unique scenario.
It will solve everything!
The hype train of technology is powerful and never-ending. It’s very easy to be swept up by the waves of people selling the latest panacea - and that fervour doesn’t just apply to the products and ideas being sold, but also to the ways in which those products are built.
Today the waves are built of microservices, cloud and event-driven systems. But in times past it was SOAP, queues and monoliths. Commenting code was considered essential twenty years ago, but today it’s almost blasphemy. Inheritance was a core tenet of programming in the early 2000s, but more recently it’s seen as a problem and has been superseded by composition.
If you’ve only been in the industry a few years, it can be easy to see these changes as incremental improvements - we thought that inheritance and code comments were good, but now we know better. It’s far superior to break your code down into adequately small methods: the code should be self-documenting! But what if we’re dealing with highly optimised or complex mathematical code that can’t be sensibly broken up? What if we’re writing code for a largely known, fixed, shallow domain? Context matters a lot when it comes to making these kinds of choices.
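To make that concrete, consider the classic fast inverse square root (a well-known example of my own choosing, not one from this post) - no amount of extracting small, well-named methods will explain the magic constant; only a comment can:

```cpp
#include <cstdint>
#include <cstring>

// Approximates 1/sqrt(x). The bit-level trick below cannot be made
// self-documenting by renaming or extraction; the comments have to
// carry the explanation.
float fast_inverse_sqrt(float x) {
    const float half = 0.5f * x;
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof(bits));  // reinterpret the float's bits
    bits = 0x5f3759df - (bits >> 1);       // magic constant: a cheap first
                                           // guess built from the exponent
    std::memcpy(&x, &bits, sizeof(x));
    return x * (1.5f - half * x * x);      // one Newton-Raphson refinement
}
```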
Context matters
The current zeitgeist that permeates modern programming is driven by people looking at FAANG1 and attempting to emulate it without proper context - much like car companies copying Toyota in the ’60s. An event-driven, cloud-based, microservice architecture works well for companies that are essentially large websites with colossal amounts of traffic. They handle a high throughput of individually low-value transactions, with a focus on resilience of the overall system rather than the reliability of any given transaction. They are companies with the capital to fund large pools of talented developers. They are global and need 100% uptime.
So what does this mean when it comes to system architecture and code design?
First and foremost it means that if a part of the overall ecosystem goes down, it needs to recover quickly or have an instant fallback. This is harder to achieve in a traditional on-prem “pets”2 setup where servers are often heavyweight and hand-built. In a cloud deployment using lightweight Docker containers and infrastructure-as-code (IaC) you can quickly - and often automatically - drop the failing server, switch over to a new one, and then replace the old one. This approach also enables zero-downtime deployments, by rolling changes out to back-up instances and then quickly flipping the active instance - something which makes 24/7 uptime achievable. The trade-off is a large increase in the complexity of a typical developer’s role3, which in turn either slows development velocity or raises the calibre of developer you need in your team.
Secondly, an event-driven system is a great way to glue together an architecture that relies on horizontal scaling, without increasing the cognitive burden on developers too much4. A proper event-driven setup also enables quick reconstruction of the application state, a benefit that plays well with a disposable-server approach. The drawback of events is that most eventing frameworks don’t guarantee ordered delivery, often deliver duplicates, and in some cases don’t guarantee delivery at all. That inherent unreliability forces developers to carefully consider the idempotency of their application - a discipline which, usefully, pairs well with multi-instance setups that need to scale horizontally. The temporally-chaotic nature of events also tends to produce applications where tracing a single transaction through the system is very difficult, particularly without additional custom tooling.
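On that idempotency point, here’s a minimal sketch of a duplicate-tolerant consumer (my own illustration - the Event type, its id field and the in-memory seen-set are all assumptions; a production consumer would persist the processed ids):

```cpp
#include <functional>
#include <string>
#include <unordered_set>
#include <utility>

// Hypothetical event shape; real systems would carry a richer payload.
struct Event {
    std::string id;    // unique per logical event, repeated on redelivery
    std::string body;
};

// Remembers which event ids have already been applied and skips repeats,
// so receiving the same event twice leaves the state unchanged.
class IdempotentConsumer {
public:
    explicit IdempotentConsumer(std::function<void(const Event&)> apply)
        : apply_(std::move(apply)) {}

    void handle(const Event& e) {
        if (!seen_.insert(e.id).second) {
            return;  // duplicate delivery: already applied, ignore it
        }
        apply_(e);
    }

private:
    std::unordered_set<std::string> seen_;  // in production: a durable store
    std::function<void(const Event&)> apply_;
};
```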
Thirdly, the sheer number of developers working within FAANG companies creates a major social problem: how do you handle so many engineers making overlapping commits to the same code base? Events help somewhat with this, but microservices are the real technical solution to this managerial issue. Heavily componentising parts of the overall system and having them work (and deploy) independently removes the merge hell that is caused by large numbers of developers all working on branches at the same time. The main drawback of microservices is that they slow the system down with additional network calls. They can also create long change-dependency chains between teams if taken too far.
So why am I going on about all of this?
It’s because these approaches work well for the problems that these specific companies face - for the likes of FAANG, the benefits outweigh the drawbacks. But there are a lot of developers out there who’ve heard third-hand that CQRS, cloud and microservice architectures are the future and have willingly accepted that as a fundamental truth. The reality is that they are one approach among many, and one that may not be right for your specific situation.
A return to the old ways
I come from a background in game development. It encourages a style of programming that’s very different from a lot of other sectors. Born from the fact that you only have around 16 milliseconds to complete an entire frame if you want to run at 60 frames per second, we favour languages like C++. And if we do give up our control over memory management to stray into the realms of something like C#, we certainly don’t use things like LINQ.
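To make that frame budget concrete, here’s a minimal sketch of the loop every game effectively runs (update_simulation and render_frame are hypothetical stubs; the budget arithmetic is the real constraint):

```cpp
#include <chrono>
#include <cstdio>

// Hypothetical stand-ins for real engine work.
void update_simulation() { /* advance the game state by one frame */ }
void render_frame()      { /* submit this frame's draw calls */ }

int main() {
    using clock = std::chrono::steady_clock;
    // 1000ms / 60 frames ~= 16.6ms: everything must fit inside this window.
    constexpr auto frame_budget = std::chrono::microseconds(16667);

    for (int frame = 0; frame < 600; ++frame) {  // ten seconds at 60 fps
        const auto start = clock::now();

        update_simulation();
        render_frame();

        if (clock::now() - start > frame_budget) {
            // A blown budget is a visible stutter - this hard ceiling is
            // why allocation-heavy abstractions get rejected in game code.
            std::puts("dropped a frame");
        }
    }
}
```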
In the gaming community over the last five or so years there has been a groundswell of resistance against the heavy abstractions and syntactic sugar that most modern developers would encourage. If you’re particularly in touch with the programming community you may have seen signs of it too. People like Casey Muratori and Jonathan Blow5 view most abstractions as wasteful, and excessive function nesting as needlessly unperformant. There is a view that modern programmers are wasteful, and have ridden on the coattails of hardware performance improvements to their own detriment. As such, they talk of a style of engineering which is far closer to the memory- and CPU-conscious practices of the early ’90s than to anything you would commonly find today.
Outside of FAANG there are plenty of companies finding success with approaches that run counter to the norm too. In gaming, the concepts of data-oriented design are far more prevalent, with an emphasis on structuring code to prevent cache misses and increase performance (sketched after this paragraph). This is most evident in Unity’s move towards their DOTS system and Burst compiler. There has been a recent wave of companies in and around the finance sector moving back to on-prem or hybrid setups from cloud, because of the cost and complexity that cloud brought with it. There are even people starting to discuss favouring procedure calls between modules over microservices, as companies that blindly adopted the approach realise the impact that plugging several thousand tiny services together has on their end-to-end transaction times. This could easily foreshadow a move back to something that more closely resembles a monolith architecture for a reasonable number of companies over the next five to ten years.
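For a flavour of what data-oriented design means at the level of the code, here is an array-of-structs versus struct-of-arrays comparison (my own minimal illustration, not Unity’s DOTS code):

```cpp
#include <cstddef>
#include <vector>

// Array-of-structs: a pass over positions drags every particle's
// unrelated fields (colour, lifetime) through the cache as well.
struct Particle {
    float x, y, z;
    float r, g, b, a;
    float lifetime;
};

void update_aos(std::vector<Particle>& particles, float dt) {
    for (auto& p : particles) {
        p.x += dt;  // trivial update; still pulls in the whole struct
    }
}

// Struct-of-arrays: each field is contiguous, so the same pass streams
// through memory with far fewer cache misses.
struct Particles {
    std::vector<float> x, y, z;
    std::vector<float> r, g, b, a;
    std::vector<float> lifetime;
};

void update_soa(Particles& particles, float dt) {
    for (std::size_t i = 0; i < particles.x.size(); ++i) {
        particles.x[i] += dt;  // only position data enters the cache
    }
}
```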
Tools, not rules
There will always be plenty of people out there who will tell you that their way of doing things solves everything. Maybe it does for them. Maybe it will for you right now. But it’s far healthier to view each approach as a new tool on your belt, not an unbreakable tenet.
When someone professes that everything will be perfect just so long as you implement this one idea, take some time to really evaluate the pros and cons before you dive headlong into adopting it. It’s even worth looking into the biases of the person espousing the concept, so you can understand why they’re so keen on it.
In the end, there are many ways to be an effective engineer and there always will be. So just because you’re not doing it the way everyone else aspires to, it doesn’t mean you’re doing it wrong.
1. In case you’ve not come across the term, it refers to Facebook, Apple, Amazon, Netflix and Google. You may also prefer the more modern but less audibly appealing MAMAA/FAAMG. MAMAA removes Netflix in favour of Microsoft, and uses the parent companies Alphabet and Meta in place of Google and Facebook. FAAMG simply switches Netflix for Microsoft.
2. “Cattle versus pets” is a common term for two different approaches to server maintenance. Pets are servers that are pampered: if they go down, you do everything you can to bring them back up again, and you have only a select number of them. Cattle are disposable: if a cattle server goes down, you delete it and just get a new one.
3. The other option here is to fragment the role: developers just write code, DevOps handle deployments and infrastructure-as-code, and support handle any post-deployment issues. The trouble is that each individual now has a narrower view of the whole process, creating a need for handoffs between teams in order to solve problems or deliver on projects. In turn, this usually leads to a slowdown in the team’s deliveries.
4. There is also a fairly common argument that events help decouple systems. This isn’t something I agree with, but that needs a whole other post to talk about…
5. If you’ve not heard of either of these guys before, then you should definitely take a look on YouTube for some of their talks. They’re both extremely talented developers with a strong enough will to stand against the crowd if they think it justified.